Greenplum参数调优
本文适用于GPDB 6.X。
一、内存参数
GPDB推荐参数配置
statement_mem
每条查询单个segment可以使用的内存大小。
Allocates segment host memory per query. The amount of memory allocated with this parameter cannot exceed max_statement_mem or the memory limit on the resource queue or resource group through which the query was submitted. If additional memory is required for a query, temporary spill files on disk are used.
不同资源管理方式下,其不同作用。
估算方法: $$ \frac {gp\_vmem\_protect\_limit * 0.9} {期望的最大并发数} $$
gp_vmem_protect_limit
资源队列时有效。
设置单个活动的segment实例的所有进程可以使用的内存总量(单位:MB)。
官方文档对这个参数的估算比较复杂,一方面考虑的over commit,另一方面考虑了镜像实例数。不考虑镜像、SWAP和over commit,一个简单估算方法如下: $$ \frac {\frac {RAM – 7.5GB} {1.7}} {实例数} $$
二、算子参数
enable_hashagg
诸如enable_hashagg、enable_hashjoin等参数,属于原生PostgreSQL的查询算子参数,通常,当optimizer=on时,不起作用。
例如:
- snippet.sql
tpch10s=# SET enable_hashagg=off; tpch10s=# SET optimizer=ON; tpch10s=# EXPLAIN SELECT SUM(l_quantity), l_partkey, l_suppkey FROM lineitem GROUP BY l_partkey, l_suppkey; QUERY PLAN -------------------------------------------------------------------------------------------------------- Gather Motion 8:1 (slice2; segments: 8) (cost=0.00..6012.40 ROWS=59986052 width=16) -> HashAggregate (cost=0.00..3351.42 ROWS=7498257 width=16) GROUP KEY: l_partkey, l_suppkey -> Redistribute Motion 8:8 (slice1; segments: 8) (cost=0.00..1457.66 ROWS=7498257 width=13) Hash KEY: l_partkey, l_suppkey -> Seq Scan ON lineitem (cost=0.00..971.25 ROWS=7498257 width=13) Optimizer: Pivotal Optimizer (GPORCA)
上述计划,ORCA优化器坚持使用HashAggregate,而不理会enable_hashagg参数配置。
- snippet.sql
tpch10s=# SET enable_hashagg=off; tpch10s=# SET optimizer=off; tpch10s=# EXPLAIN SELECT SUM(l_quantity), l_partkey, l_suppkey FROM lineitem GROUP BY l_partkey, l_suppkey; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Gather Motion 8:1 (slice2; segments: 8) (cost=14249316.35..14444271.04 ROWS=5998606 width=40) -> GroupAggregate (cost=14249316.35..14444271.04 ROWS=749826 width=40) GROUP KEY: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=14249316.35..14264312.86 ROWS=749826 width=40) Sort KEY: lineitem.l_partkey, lineitem.l_suppkey -> Redistribute Motion 8:8 (slice1; segments: 8) (cost=12175746.69..12970561.91 ROWS=749826 width=40) Hash KEY: lineitem.l_partkey, lineitem.l_suppkey -> GroupAggregate (cost=12175746.69..12850589.79 ROWS=749826 width=40) GROUP KEY: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=12175746.69..12325711.82 ROWS=7498257 width=13) Sort KEY: lineitem.l_partkey, lineitem.l_suppkey -> Seq Scan ON lineitem (cost=0.00..654972.52 ROWS=7498257 width=13) Optimizer: Postgres query optimizer
打赏作者以资鼓励: