一、数据分布

查看表的分布键方式

snippet.sql
tpch1000s=# SELECT c.relname, policytype, numsegments, distkey 
FROM pg_class c, pg_catalog.gp_distribution_policy gpp 
WHERE c.oid = gpp.localoid;
 relname  | policytype | numsegments | distkey 
----------+------------+-------------+---------
 nation   | p          |         192 | 1
 region   | p          |         192 | 1
 part     | p          |         192 | 1

注:如果为 RANDOMLY分布,则 distkey为空。

查看数据分布

snippet.sql
tpch1000s=# SELECT gp_segment_id, COUNT(*) FROM supplier GROUP BY gp_segment_id ORDER BY 1;
 gp_segment_id | COUNT 
---------------+-------
             0 | 51716
             1 | 52062
             2 | 51880

数据分布倾斜程度分析

snippet.sql
tpch1s=# SELECT * FROM gp_toolkit.gp_skew_coefficients;
 skcoid | skcnamespace | skcrelname |         skccoeff         
--------+--------------+------------+--------------------------
  16386 | public       | nation     | 28.284271247461900800000
  16393 | public       | region     | 28.284271247461900976000
  16400 | public       | part       |  0.118793939239339984000
  16407 | public       | supplier   |  0.367695526217004712000

skccoeff 表示变异系数(coefficient of variation, CV)值,是标准差除以均值。值越小越好(分布越均匀),值越大表示分布越倾斜。 $$ {标准差\ } \sigma = \sum_{i=1}^N\sqrt{\frac{(x_i-\overline{x})^2}{n}} $$

$$ {变异系数\ } C_V = \frac{\sigma}{\overline{x}} $$

二、集群状态

segment状态

snippet.sql
tpch1000s=# SELECT * FROM gp_segment_configuration ORDER BY dbid;
 dbid | content | ROLE | preferred_role | mode | STATUS | port | hostname | address |         datadir          
------+---------+------+----------------+------+--------+------+----------+---------+--------------------------
    1 |      -1 | p    | p              | n    | u      | 7456 | test-4   | gp10g_7 | /opt/gp/gpmaster/gpseg-1
    2 |       0 | p    | p              | n    | u      | 6345 | test-1   | gp10g_4 | /opt/gp/gpdata/gpseg0
    3 |       1 | p    | p              | n    | u      | 6346 | test-1   | gp10g_4 | /opt/gp/gpdata/gpseg1
    4 |       2 | p    | p              | n    | u      | 6347 | test-1   | gp10g_4 | /opt/gp/gpdata/gpseg2

参考:PostgreSQL、Greenplum 日常监控 和 维护任务

查询当前session的backend进程id

snippet.sql
tpch1s=> SELECT pg_backend_pid();
 pg_backend_pid 
----------------
           2543

可以用ps命令验证。

三、配置参数

reload config

有些配置参数修改后,需要刷新参数值,即重新加载postgres.conf配置文件,但无需重启集群。则使用“gpstop -u”命令,例如:

snippet.bash
[gpadmin@bogon ~]$ gpconfig -s gp_resqueue_memory_policy
Values on all segments are consistent
GUC          : gp_resqueue_memory_policy
Master  value: none
Segment value: none
[gpadmin@bogon ~]$ gpconfig -c gp_resqueue_memory_policy -v eager_free
……
[gpadmin@bogon ~]$ gpconfig -s gp_resqueue_memory_policy
Values on all segments are consistent
GUC          : gp_resqueue_memory_policy
Master  value: none
Segment value: none
[gpadmin@bogon ~]$ gpstop -u
……
[gpadmin@bogon ~]$ gpconfig -s gp_resqueue_memory_policy
Values on all segments are consistent
GUC          : gp_resqueue_memory_policy
Master  value: eager_free
Segment value: eager_free