PostgreSQL优化器浅析康贤funnyxj@qq.com阿里云PostgresConferenceChina2016中国用户大会目录优化器介绍逻辑优化物理优化PostgresConferenceChina2016中国用户大会优化器介绍•什么是优化器•数据库的大脑•SQL处理过程PostgresConferenceChina2016中国用户大会优化器介绍PostgresConferenceChina2016中国用户大会优化器介绍•查看优化器生成计划•explainquery,如下postgres=#explainselect*fromtestwhereid=10limit1;QUERYPLAN------------------------------------------------------------------------------Limit(cost=0.28..8.29rows=1width=4)-IndexOnlyScanusingtest11ontest(cost=0.28..8.29rows=1width=4)IndexCond:(id=10)(3rows)PostgresConferenceChina2016中国用户大会优化器介绍•Plan结构PostgresConferenceChina2016中国用户大会逻辑查询优化•什么是逻辑优化•主要对SQL进行等价或者推倒变换,让其达到更好的执行计划•主要优化方向•避免重复工作•减少子集数据量PostgresConferenceChina2016中国用户大会逻辑查询优化•SQL组成SELECTs.nameFROMscoresc,studentsWHEREs.noIN(SELECTst_noFROMclasscWHEREc.name='1101')ANDsc.st_no=s.noORDERBYsc.scoreDESCLIMIT1;targetlistrangetablequalifiterIN-clausesubqueryJoinpredicatesortorderlimitexpressionPostgresConferenceChina2016中国用户大会逻辑查询优化•查询重写•消除view、rule等createviewv_t_1_2asSELECTt1.a1,t1.b1,t2.a2,t2.b2FROMt1,t2;postgres=explainselect*fromv_t_1_2,t1wherev_t_1_2.a1=10andt1.b1=20;QUERYPLAN-------------------------------------------------------------------------------------NestedLoop(cost=0.55..41.59rows=1000width=24)-NestedLoop(cost=0.55..16.60rows=1width=16)-IndexScanusingt1_a1_keyont1t1_1(cost=0.28..8.29rows=1width=8)IndexCond:(a1=10)-IndexScanusingb1_1ont1(cost=0.28..8.29rows=1width=8)IndexCond:(b1=20)-SeqScanont2(cost=0.00..15.00rows=1000width=8)(7rows)view又被重写回原来的queryPostgresConferenceChina2016中国用户大会•提升子链•将IN和exists子句递归提升select*fromt1wheret1.a1in(selectt2.a2fromt2wheret2.b2=10);假设t2.a2为unique转化为:selectt1.a1,t1,a2fromt1joint2wheret1.a1=t2.a2andt2.b2=10;执行计划如下:postgres=explainselect*fromt1wheret1.a1in(selectt2.a2fromt2wheret2.b2=10);QUERYPLAN--------------------------------------------------------------------------NestedLoop(cost=0.28..25.80rows=1width=8)-SeqScanont2(cost=0.00..17.50rows=1width=4)Filter:(b2=10)-IndexScanusingt1_a1_keyont1(cost=0.28..8.29rows=1width=8)IndexCond:(a1=t2.a2)PostgresConferenceChina2016中国用户大会•提升子链explainselect*fromt1whereexists(selectt2.a2fromt2wheret2.a2=t1.a1);假设t2.a2为unique转化为:selectt1.a1,t1.b1fromt1,t2wheret1.a1=t2.a1;执行计划如下:postgres=explainselect*fromt1whereexists(selectt2.a2fromt2wheret2.a2=t1.a1);QUERYPLAN-----------------------------------------------------------------HashJoin(cost=26.42..54.69rows=952width=8)HashCond:(t2.a2=t1.a1)-SeqScanont2(cost=0.00..15.00rows=1000width=4)-Hash(cost=14.52..14.52rows=952width=8)-SeqScanont1(cost=0.00..14.52rows=952width=8)(5rows)PostgresConferenceChina2016中国用户大会•提升子链•部分IN和exists子句不能提升postgres=explainselect*fromt1wheret1.a1in(selectt2.a2fromt2wheret1.b1=10);QUERYPLAN--------------------------------------------------------------------SeqScanont1(cost=0.00..8349.28rows=476width=8)Filter:(SubPlan1)SubPlan1-Result(cost=0.00..15.00rows=1000width=4)One-TimeFilter:(t1.b1=10)-SeqScanont2(cost=0.00..15.00rows=1000width=4)(6rows)这里IN子句中使用了主查询的表字段,和主查询有相关性这里也是可以优化的,把t1.b1移到父查询,但是PG没有支持PostgresConferenceChina2016中国用户大会•提升子查询•子查询和子链接区别:子查询是不在表达式中的子句,子链接在表达式中的子句select*fromt1,(select*fromt2)ascwheret1.a1=c.a2;转化为:select*fromt1,t2wheret1.a1=t2.a2;postgres=explainselect*fromt1,(select*fromt2)ascwheret1.a1=c.a2;QUERYPLAN-----------------------------------------------------------------HashJoin(cost=26.42..54.69rows=952width=16)HashCond:(t2.a2=t1.a1)-SeqScanont2(cost=0.00..15.00rows=1000width=8)-Hash(cost=14.52..14.52rows=952width=8)-SeqScanont1(cost=0.00..14.52rows=952width=8)(5rows)PostgresConferenceChina2016中国用户大会•提升子查询•同样子查询也有不支持的情况postgres=explainselectt1.a1fromt1,(selecta2fromt2limit1)ascwherec.a2=10;QUERYPLAN------------------------------------------------------------------------NestedLoop(cost=0.00..24.07rows=952width=4)-SubqueryScanonc(cost=0.00..0.03rows=1width=0)Filter:(c.a2=10)-Limit(cost=0.00..0.01rows=1width=4)-SeqScanont2(cost=0.00..15.00rows=1000width=4)-SeqScanont1(cost=0.00..14.52rows=952width=4)(6rows)1.没有集合操作2.没有聚合操作3.不包含sort/limit/with/group4.没有易失函数5.from非空……PostgresConferenceChina2016中国用户大会•化简条件规则化简前化简后常量传递a1=a2anda2=100a1=100anda2=100表达式计算a1=1+2a1=3去除多余括号(aandb)and(candd)aandbandcandd简化orfalseora1a1PostgresConferenceChina2016中国用户大会•外连接消除(left/right/fulljoin)以leftjoin为例,leftjoin(左连接)返回包括左表中的所有记录和右表中连接字段相等的记录,如果右表没有匹配的记录,那么右表将会以NULL值代替,例如:A表B表ID1ID2112select*fromAleftjoinBonA.id1=B.id2;结果如下:ID1ID2112NULLPostgresConferenceChina2016中国用户大会•外连接消除(left/right/fulljoin)postgres=explainselect*fromt1leftjoint2ontrue;QUERYPLAN-------------------------------------------------------------------NestedLoopLeftJoin(cost=0.00..11932.02rows=952000width=16)-SeqScanont1(cost=0.00..14.52rows=952width=8)-Materialize(cost=0.00..20.00rows=1000width=8)-SeqScanont2(cost=0.00..15.00rows=1000width=8)(4rows)postgres=explainselect*fromt1leftjoint2ontruewheret1.a1=t2.a2;QUERYPLAN-----------------------------------------------------------------HashJoin(cost=26.42..54.69rows=952width=16)HashCond:(t2.a2=t1.a1)-SeqScanont2(cost=0.00..15.00rows=10