TPC-H Q07 Or执行计划优化
一、布尔运算推演
已知,
- snippet.txt
A∩(B∪C) = (A∩C)∪(A∪C) A∪(B∩C) = (A∪C)∩(A∪C) A∩A = A (A∩B)∩C = A∩B∩C = A∩C∩B
那么,
- snippet.txt
(A∩B)∪(C∩D) = ((A∩B)∪C) ∩ ((A∩B)∪D) = ((A∪C)∩(B∪C)) ∩ ((A∪D)∩(B∪D)) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D) => ((A∩B)∪(C∩D)) ∩ (A∪C) ∩ (B∪D) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D) ∩ (A∪C) ∩ (B∪D) => ((A∩B)∪(C∩D)) ∩ (A∪C) ∩ (B∪D) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D)
同理,
- snippet.txt
((A∩B)∪(C∩D)) ∩ (A∪C) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D) ∩ (A∪C) 或, ((A∩B)∪(C∩D)) ∩ (B∪D) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D) ∩ (B∪D) => ((A∩B)∪(C∩D)) ∩ (A∪C) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D) 或, ((A∩B)∪(C∩D)) ∩ (B∪D) = (A∪C) ∩ (B∪C) ∩ (A∪D) ∩ (B∪D)
所以,
(A∩B)∪(C∩D) = ((A∩B)∪(C∩D)) ∩ (A∪C) ∩ (B∪D) = ((A∩B)∪(C∩D)) ∩ (A∪C) = ((A∩B)∪(C∩D)) ∩ (B∪D)
二、Q07等价SQL
基于上述布尔等式,Q07,
- snippet.sql
…… SELECT n1.n_name AS supp_nation, n2.n_name AS cust_nation, EXTRACT(YEAR FROM l_shipdate) AS l_year, l_extendedprice * (1 - l_discount) AS volume FROM supplier, lineitem, orders, customer, nation n1, nation n2 WHERE s_suppkey = l_suppkey AND o_orderkey = l_orderkey AND c_custkey = o_custkey AND s_nationkey = n1.n_nationkey AND c_nationkey = n2.n_nationkey AND ( (n1.n_name = 'ROMANIA' AND n2.n_name = 'INDONESIA') OR (n1.n_name = 'INDONESIA' AND n2.n_name = 'ROMANIA') ) AND l_shipdate BETWEEN DATE '1995-01-01' AND DATE '1996-12-31'; ……
等价改写方法为:
- snippet.sql
…… SELECT n1.n_name AS supp_nation, n2.n_name AS cust_nation, EXTRACT(YEAR FROM l_shipdate) AS l_year, l_extendedprice * (1 - l_discount) AS volume FROM supplier, lineitem, orders, customer, nation n1, nation n2 WHERE s_suppkey = l_suppkey AND o_orderkey = l_orderkey AND c_custkey = o_custkey AND s_nationkey = n1.n_nationkey AND c_nationkey = n2.n_nationkey AND ( (n1.n_name = 'ROMANIA' AND n2.n_name = 'INDONESIA') OR (n1.n_name = 'INDONESIA' AND n2.n_name = 'ROMANIA') ) AND (n1.n_name = 'ROMANIA' OR n1.n_name = 'INDONESIA') AND (n2.n_name = 'INDONESIA' OR n2.n_name = 'ROMANIA') AND l_shipdate BETWEEN DATE '1995-01-01' AND DATE '1996-12-31'; ……
以上如果在执行计划层优化,则,
- 相当于增加了两组单表条件。
- (n1.n_name = 'ROMANIA' and n2.n_name = 'INDONESIA')or (n1.n_name = 'INDONESIA' and n2.n_name = 'ROMANIA')可以作为复杂Join条件一次执行,避免or算子拆分条件导致两次Join运算。
打赏作者以资鼓励: