云环境准备过程见:使用Kaggle云GPU及SSH登录

注:云主机的环境是Ubuntu。

编译安装PostgreSQL

安装依赖:

snippet.bash
apt install libreadline-dev
apt install flex
apt install bison

下载REL_16_STABLE分支源码:

snippet.bash
mkdir /opt/postgres
mkdir /opt/postgres/pg_bin
mkdir /opt/postgres/pg_db
cd /opt/postgres
 
git clone https://git.postgresql.org/git/postgresql.git
cd postgresql
git checkout -b REL_16_STABLE origin/REL_16_STABLE

编译Debug版本的PG:

snippet.bash
./configure --enable-cassert --enable-debug --without-icu --prefix=/opt/postgres/pg_bin/ CFLAGS="-ggdb -O0"
make -j4
make install

安装PG:

snippet.bash
# 准备一个普通用户,因为PG不能使用root用户启动服务。
adduser yz
chown yz:yz -R /opt/postgres/pg_bin/
chown yz:yz -R /opt/postgres/pg_db/
 
# 切换到普通用户下操作
su - yz
 
vim ~/pg_env.sh
# pg_env.sh脚本文件内容如下:
export PGHOME=/opt/postgres/pg_bin
export PGDATA=/opt/postgres/pg_db
PATH=$PGHOME/bin:$PATH:$HOME/bin
PG_CONFIG=$PGHOME/bin/pg_config
 
source ~/pg_env.sh
initdb -D /opt/postgres/pg_db
 
# 可以这样启动PG:
pg_ctl -D /opt/postgres/pg_db -l logfile start

编译安装pg-strom

下载pg-strom源码:

snippet.bash
su - yz
mkdir pg-strom
cd pg-strom
git clone https://github.com/heterodb/pg-strom.git

编译:

snippet.bash
source ~/pg_env.sh
cd pg-strom/src
make
make install

配置pg-strom:

https://heterodb.github.io/pg-strom/install/

宕机问题修复

经测试pg-strom在多款GPU环境下宕机,PG-Strom GPU Service后端进程无法启动。

问题出在nvcc选项不合适,需要去掉--source-in-ptx选项。

需要修改一下代码:

snippet.diff
diff --git a/src/Makefile.cuda b/src/Makefile.cuda
index ab92a7a6..39056325 100644
--- a/src/Makefile.cuda
+++ b/src/Makefile.cuda
@@ -38,7 +38,7 @@ endif
 # flags to build GPU libraries
 __NVCC_CFLAGS += -I $(shell $(PG_CONFIG) --includedir-server) \
diff --git a/src/Makefile.cuda b/src/Makefile.cuda
index ab92a7a6..39056325 100644
--- a/src/Makefile.cuda
+++ b/src/Makefile.cuda
@@ -38,7 +38,7 @@ endif
 # flags to build GPU libraries
 __NVCC_CFLAGS += -I $(shell $(PG_CONFIG) --includedir-server) \
                  --maxrregcount=$(MAXREGCOUNT)           \
-                 --source-in-ptx -lineinfo               \
+                 -lineinfo               \
                  -DHAVE_FLOAT2                           \
                  -DCUDA_MAXTHREADS_PER_BLOCK=$(MAXTHREADS_PER_BLOCK) \
                  $(__NVCC_TARGET)
diff --git a/src/gpu_service.c b/src/gpu_service.c
index 3d714568..f6a1a895 100644
--- a/src/gpu_service.c
+++ b/src/gpu_service.c
@@ -677,7 +677,7 @@ __rebuild_gpu_fatbin_file(const char *fatbin_dir,
                appendStringInfo(&cmd,
                                                 " /bin/sh -x -c '%s/bin/nvcc"
                                                 " --maxrregcount=%d"
-                                                " --source-in-ptx -lineinfo"
+                                                " -lineinfo"
                                                 " -I. -I%s "
                                                 " -DHAVE_FLOAT2 "
                                                 " -DCUDA_MAXTHREADS_PER_BLOCK=%u "

问题修复后,PG进程如下:

snippet.bash
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
yz         10972  0.0  0.0   4248  3624 pts/1    S    11:43   0:00 -bash
yz         13610  0.1  0.4 5958516 150996 ?      Ss   12:04   0:00 postgres -D pg_db
yz         13613  0.0  0.0 5958516 3396 ?        Ss   12:04   0:00 postgres: checkpointer
yz         13614  0.0  0.1 5958676 43340 ?       Ss   12:04   0:00 postgres: background writer
yz         13616  0.1  0.5 16557636 168532 ?     Ssl  12:04   0:00 postgres: PG-Strom GPU Service
yz         13629  0.0  0.0 5958516 20828 ?       Ss   12:04   0:00 postgres: walwriter
yz         13631  0.0  0.0 5960140 5760 ?        Ss   12:04   0:00 postgres: autovacuum launcher
yz         13633  0.0  0.0 5960144 6156 ?        Ss   12:04   0:00 postgres: logical replication launcher

SQL示例:

snippet.sql
postgres=# CREATE EXTENSION pg_strom ;
CREATE EXTENSION
postgres=# EXPLAIN SELECT t1.a FROM t1 JOIN t2 ON t1.a=t2.a;
                              QUERY PLAN
-----------------------------------------------------------------------
 Custom Scan (GpuJoin) ON t1  (cost=167.38..503.29 ROWS=32512 width=4)
   GPU Projection: t1.a
   GPU JOIN Quals [1]: (t1.a = t2.a) ... [nrows: 2550 -> 32512]
   GPU OUTER Hash [1]: t1.a
   GPU INNER Hash [1]: t2.a
   ->  Seq Scan ON t2  (cost=0.00..35.50 ROWS=2550 width=4)
(6 ROWS)