前言
SQL (Structured Query Language) 是具有数据操纵和数据定义等多种功能的数据库语言,这种语言具有交互性特点,能为用户提供极大的便利,数据库管理系统应充分利用SQL语言提高计算机应用系统的工作质量与效率。
身边很多人工作中经常和SQL打交道, 可是每人的编写SQL风格都独树一帜。
刚好在githup上看到一个不错的编码风格, 在这里给大家推荐一下。
介绍
使用小写的SQL
小写SQL比大写SQL易读,而且不必一直按住shift键。
-- Good
select * from users
-- Bad
SELECT * FROM users
-- Bad
Select * From users
单行查询vs多行查询
以下情况最好将SQL写在同一行中:
- 查询所有列(*)或者只查询1列或者两列
- 查询语句没有额外复杂性
-- Good
select * from users
-- Good
select id from users
-- Good
select id, email from users
-- Good
select count(*) from users
这样做的原因很简单,当所有内容都在一行时,仍然很容易阅读。但一旦你开始添加更多的列或更复杂的代码,如果是多行代码就更容易阅读:
-- Good
select
id,
email,
created_at
from users
-- Good
select *
from users
where email = 'example@domain.com
对于具有1或2列的查询,可以将这些列放在同一行上。对于3+列,将每个列名放在它自己的行上,包括第一项:
-- Good
select id, email
from users
where email like '%@gmail.com'
-- Good
select user_id, count(*) as total_charges
from charges
group by user_id
-- Good
select
id,
email,
created_at
from users
-- Bad
select id, email, created_at
from users
-- Bad
select id,
email
from users
向左对齐
-- Good
select id, email
from users
where email like '%@gmail.com'
-- Bad
select id, email
from users
where email like '%@gmail.com'
使用单引号
一些SQL方言,如BigQuery支持使用双引号,但是对于大多数方言,双引号最终将引用列名。因此,单引号更可取:
-- Good
select *
from users
where email = 'example@domain.com'
-- Bad
select *
from users
where email = "example@domain.com"
使用 !=
而不用 <>
因为!=读起来像“不等于”,更接近我们大声说出来的方式。
-- Good
select count(*) as paying_users_count
from users
where plan_name != 'free'
-- Bad
select count(*) as paying_users_count
from users
where plan_name <> 'free'
逗号应该放在行尾
-- Good
select
id,
email
from users
-- Bad
select
id
, email
from users
缩进条件
当只有一个where条件时,将它保留在与where
相同的行上:
select email
from users
where id = 1234
当有多个缩进时,将每个缩进比where更深一层。将逻辑运算符放在前一个条件的末尾:
select id, email
from users
where
created_at >= '2019-03-01' and
vertical = 'work'
避免括号内的空格
-- Good
select *
from users
where id in (1, 2)
-- Bad
select *
from users
where id in ( 1, 2 )
将in
值的长列表分成多个缩进行
-- Good
select *
from users
where email in (
'user-1@example.com',
'user-2@example.com',
'user-3@example.com',
'user-4@example.com'
)
表名应该是名词的复数形式
-- Good
select * from users
select * from visit_logs
-- Bad
select * from user
select * from visitLog
列名应该是xxx_xxx格式
-- Good select id, email, timestamp_trunc(created_at, month) as signup_month from users
-- Bad select id, email, timestamp_trunc(created_at, month) as SignupMonth from users
列名的约定
- 布尔字段的前缀应该是is_、has_或does_。例如,is_customer、has_unsubscribe等。
- 仅限日期的字段应该以_date作为后缀。例如,report_date。
- Date+time字段应该以_at作为后缀。例如,created_at、posted_at等。
列排序约定
首先放置主键,然后是外键,然后是所有其他列。如果表中有任何系统列(created_at、updated_at、is_deleted等),那么将它们放在最后。
-- Good
select
id,
name,
created_at
from users
-- Bad
select
created_at,
name,
id,
from users
inner join
处理
最好显式,以便连接类型非常清楚:
-- Good
select
email,
sum(amount) as total_revenue
from users
inner join charges on users.id = charges.user_id
-- Bad
select
email,
sum(amount) as total_revenue
from users
join charges on users.id = charges.user_id
对于join
条件,将首先引用的表放在on
之后
-- Good
select
...
from users
left join charges on users.id = charges.user_id
-- primary_key = foreign_key --> one-to-many --> fanout
select
...
from charges
left join users on charges.user_id = users.id
-- foreign_key = primary_key --> many-to-one --> no fanout
-- Bad
select
...
from users
left join charges on charges.user_id = users.id
单个连接条件应该与连接位于同一行
-- Good
select
email,
sum(amount) as total_revenue
from users
inner join charges on users.id = charges.user_id
group by email
-- Bad
select
email,
sum(amount) as total_revenue
from users
inner join charges
on users.id = charges.user_id
group by email
当你有多个连接条件时,把它们放在各自的缩进行:
-- Good
select
email,
sum(amount) as total_revenue
from users
inner join charges on
users.id = charges.user_id and
refunded = false
group by email
除非必要,否则不要包含表名
-- Good
select
id,
name
from companies
-- Bad
select
companies.id,
companies.name
from companies
始终重命名聚合和函数包装的参数
-- Good
select count(*) as total_users
from users
-- Bad
select count(*)
from users
-- Good
select timestamp_millis(property_beacon_interest) as expressed_interest_at
from hubspot.contact
where property_beacon_interest is not null
-- Bad
select timestamp_millis(property_beacon_interest)
from hubspot.contact
where property_beacon_interest is not null
在布尔条件需显示写出
-- Good
select * from customers where is_cancelled = true
select * from customers where is_cancelled = false
-- Bad
select * from customers where is_cancelled
select * from customers where not is_cancelled
用as
重命名列名
-- Good
select
id,
email,
timestamp_trunc(created_at, month) as signup_month
from users
-- Bad
select
id,
email,
timestamp_trunc(created_at, month) signup_month
from users
group by
时写出列名而不要用序号
-- Good
select user_id, count(*) as total_charges
from charges
group by user_id
-- Bad
select
user_id,
count(*) as total_charges
from charges
group by 1
分组的列首先显示出来
-- Good
select
timestamp_trunc(com_created_at, year) as signup_year,
count(*) as total_companies
from companies
group by signup_year
-- Bad
select
count(*) as total_companies,
timestamp_trunc(com_created_at, year) as signup_year
from companies
group by signup_year
调整对齐case-when语句
每个when
应该在它自己的行上(case
行上没有任何内容),并且应该比case
行缩进更深一层。then
可以在同一条线上,也可以在它下面的自己的线上。
-- Good
select
case
when event_name = 'viewed_homepage' then 'Homepage'
when event_name = 'viewed_editor' then 'Editor'
end as page_name
from events
-- Good too
select
case
when event_name = 'viewed_homepage'
then 'Homepage'
when event_name = 'viewed_editor'
then 'Editor'
end as page_name
from events
-- Bad
select
case when event_name = 'viewed_homepage' then 'Homepage'
when event_name = 'viewed_editor' then 'Editor'
end as page_name
from events
尽量使用CTEs, 避免使用子查询
-- Good
with ordered_details as (
select
user_id,
name,
row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details
),
final as (
select user_id, name
from ordered_details
where details_rank = 1
)
select * from final
-- Bad
select user_id, name
from (
select
user_id,
name,
row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details
) ranked
where details_rank = 1
使用有意义的CTE名称
-- Good
with ordered_details as (
-- Bad
with d1 as (
使用over()函数
你可以把它放在同一行上,也可以根据它的长度把它分成多行:
-- Good
select
user_id,
name,
row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details
-- Good
select
user_id,
name,
row_number() over (
partition by user_id
order by date_updated desc
) as details_rank
from billingdaddy.billing_stored_details
喜欢可以关注公众号: 终身幼稚园