云环境准备过程见:使用Kaggle云GPU及SSH登录
注:云主机的环境是Ubuntu。
安装依赖:
apt install libreadline-dev apt install flex apt install bison
下载REL_16_STABLE
分支源码:
mkdir /opt/postgres mkdir /opt/postgres/pg_bin mkdir /opt/postgres/pg_db cd /opt/postgres git clone https://git.postgresql.org/git/postgresql.git cd postgresql git checkout -b REL_16_STABLE origin/REL_16_STABLE
编译Debug版本的PG:
./configure --enable-cassert --enable-debug --without-icu --prefix=/opt/postgres/pg_bin/ CFLAGS="-ggdb -O0" make -j4 make install
安装PG:
# 准备一个普通用户,因为PG不能使用root用户启动服务。 adduser yz chown yz:yz -R /opt/postgres/pg_bin/ chown yz:yz -R /opt/postgres/pg_db/ # 切换到普通用户下操作 su - yz vim ~/pg_env.sh # pg_env.sh脚本文件内容如下: export PGHOME=/opt/postgres/pg_bin export PGDATA=/opt/postgres/pg_db PATH=$PGHOME/bin:$PATH:$HOME/bin PG_CONFIG=$PGHOME/bin/pg_config source ~/pg_env.sh initdb -D /opt/postgres/pg_db # 可以这样启动PG: pg_ctl -D /opt/postgres/pg_db -l logfile start
下载pg-strom
源码:
su - yz mkdir pg-strom cd pg-strom git clone https://github.com/heterodb/pg-strom.git
编译:
source ~/pg_env.sh cd pg-strom/src make make install
配置pg-strom:
经测试pg-strom在多款GPU环境下宕机,PG-Strom GPU Service
后端进程无法启动。
问题出在nvcc选项不合适,需要去掉--source-in-ptx
选项。
需要修改一下代码:
diff --git a/src/Makefile.cuda b/src/Makefile.cuda index ab92a7a6..39056325 100644 --- a/src/Makefile.cuda +++ b/src/Makefile.cuda @@ -38,7 +38,7 @@ endif # flags to build GPU libraries __NVCC_CFLAGS += -I $(shell $(PG_CONFIG) --includedir-server) \ diff --git a/src/Makefile.cuda b/src/Makefile.cuda index ab92a7a6..39056325 100644 --- a/src/Makefile.cuda +++ b/src/Makefile.cuda @@ -38,7 +38,7 @@ endif # flags to build GPU libraries __NVCC_CFLAGS += -I $(shell $(PG_CONFIG) --includedir-server) \ --maxrregcount=$(MAXREGCOUNT) \ - --source-in-ptx -lineinfo \ + -lineinfo \ -DHAVE_FLOAT2 \ -DCUDA_MAXTHREADS_PER_BLOCK=$(MAXTHREADS_PER_BLOCK) \ $(__NVCC_TARGET) diff --git a/src/gpu_service.c b/src/gpu_service.c index 3d714568..f6a1a895 100644 --- a/src/gpu_service.c +++ b/src/gpu_service.c @@ -677,7 +677,7 @@ __rebuild_gpu_fatbin_file(const char *fatbin_dir, appendStringInfo(&cmd, " /bin/sh -x -c '%s/bin/nvcc" " --maxrregcount=%d" - " --source-in-ptx -lineinfo" + " -lineinfo" " -I. -I%s " " -DHAVE_FLOAT2 " " -DCUDA_MAXTHREADS_PER_BLOCK=%u "
问题修复后,PG进程如下:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND yz 10972 0.0 0.0 4248 3624 pts/1 S 11:43 0:00 -bash yz 13610 0.1 0.4 5958516 150996 ? Ss 12:04 0:00 postgres -D pg_db yz 13613 0.0 0.0 5958516 3396 ? Ss 12:04 0:00 postgres: checkpointer yz 13614 0.0 0.1 5958676 43340 ? Ss 12:04 0:00 postgres: background writer yz 13616 0.1 0.5 16557636 168532 ? Ssl 12:04 0:00 postgres: PG-Strom GPU Service yz 13629 0.0 0.0 5958516 20828 ? Ss 12:04 0:00 postgres: walwriter yz 13631 0.0 0.0 5960140 5760 ? Ss 12:04 0:00 postgres: autovacuum launcher yz 13633 0.0 0.0 5960144 6156 ? Ss 12:04 0:00 postgres: logical replication launcher
SQL示例:
postgres=# CREATE EXTENSION pg_strom ; CREATE EXTENSION postgres=# EXPLAIN SELECT t1.a FROM t1 JOIN t2 ON t1.a=t2.a; QUERY PLAN ----------------------------------------------------------------------- Custom Scan (GpuJoin) ON t1 (cost=167.38..503.29 ROWS=32512 width=4) GPU Projection: t1.a GPU JOIN Quals [1]: (t1.a = t2.a) ... [nrows: 2550 -> 32512] GPU OUTER Hash [1]: t1.a GPU INNER Hash [1]: t2.a -> Seq Scan ON t2 (cost=0.00..35.50 ROWS=2550 width=4) (6 ROWS)