Download the installation package and unzip it to obtain a .bin file, then extract the embedded tar.gz archive from the .bin file:
|
# The first 960 lines are the installer script; the archive starts at line 961
[root@hadoop2 ~]# tail -n +961 greenplum-db-5.16.0-rhel7-x86_64.bin > gpdb.tar.gz
|
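The 960-line offset is specific to this installer build. As a minimal sketch of the same technique, the helper below splits a self-extracting file at a given header length and checks that the result is a valid gzip stream. The name `extract_payload` and the fake installer it is exercised on are illustrative assumptions, not part of Greenplum.

```shell
set -eu

# Hypothetical helper: write everything after the first N header lines
# of a self-extracting installer to a separate archive file.
extract_payload() {
    installer=$1
    header_lines=$2
    out=$3
    tail -n +"$((header_lines + 1))" "$installer" > "$out"
    # gzip -t fails if the payload is not a valid gzip stream,
    # which catches a wrong header line count early.
    gzip -t "$out"
}

# Demo on a tiny fake installer: 3 script lines followed by a tar.gz payload.
workdir=$(mktemp -d)
echo "hello" > "$workdir/file.txt"
tar czf "$workdir/payload.tar.gz" -C "$workdir" file.txt
{
    printf '#!/bin/sh\necho installer\nexit 0\n'
    cat "$workdir/payload.tar.gz"
} > "$workdir/fake.bin"

extract_payload "$workdir/fake.bin" 3 "$workdir/out.tar.gz"
tar tzf "$workdir/out.tar.gz"   # lists file.txt
```

For the installer used here, the equivalent call would be `extract_payload greenplum-db-5.16.0-rhel7-x86_64.bin 960 gpdb.tar.gz`.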
If the cluster does not yet have a gpadmin user, gpssh can create it on all nodes in one go. First log in to the master node as root and extract gpdb.tar.gz:
|
[root@hadoop2 ~]# mkdir gpdb-5.16.0
[root@hadoop2 ~]# tar zxf gpdb.tar.gz -C gpdb-5.16.0
|
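Edit gpdb-5.16.0/greenplum_path.sh (presumably so that GPHOME points to the extracted directory; the change itself is not shown), then source it so that the gpdb environment variables take effect: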
|
[root@hadoop2 ~]# source gpdb-5.16.0/greenplum_path.sh
[root@hadoop2 ~]# which gpssh
/root/gpdb-5.16.0/bin/gpssh
|
Create a hosts file listing the hostnames of all cluster nodes. Each hostname must also be present in /etc/hosts:
|
[root@hadoop2 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.203.12 hadoop2.lw hadoop2
192.168.203.13 hadoop3.lw hadoop3
192.168.203.14 hadoop4.lw hadoop4
[root@hadoop2 ~]# cat hosts
hadoop2
hadoop3
hadoop4
|
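With more than a handful of nodes, the hosts file can be derived from /etc/hosts instead of being typed by hand. The sketch below assumes that all cluster nodes sit on the 192.168.203.x subnet and that the short hostname is the last field of each entry, as in the listing above. It works on a sample copy rather than the real /etc/hosts.

```shell
set -eu

workdir=$(mktemp -d)
# Sample copy of the cluster entries from /etc/hosts.
cat > "$workdir/etc_hosts" <<'EOF'
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.203.12 hadoop2.lw hadoop2
192.168.203.13 hadoop3.lw hadoop3
192.168.203.14 hadoop4.lw hadoop4
EOF

# Keep only lines on the cluster subnet and print the last field (short name).
awk '$1 ~ /^192\.168\.203\./ { print $NF }' "$workdir/etc_hosts" > "$workdir/hosts"
cat "$workdir/hosts"
```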
Exchange SSH keys between the nodes:
|
[root@hadoop2 ~]# gpssh-exkeys -f hosts
[STEP 1 of 5] create local ID and authorize on local host
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... send to hadoop3
*** Enter password for hadoop3:
... send to hadoop4
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with hadoop3
... finished key exchange with hadoop4
[INFO] completed successfully
|
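After the key exchange it can be worth confirming that every host really accepts key-based logins. The following is a minimal sketch, not part of the Greenplum tooling: `check_passwordless` and the `SSH_CMD` override are hypothetical names introduced here. It relies on the standard OpenSSH options `-n`, `BatchMode` and `ConnectTimeout`.

```shell
set -eu

# Command used to reach a host; overridable so the check can be dry-run.
SSH_CMD=${SSH_CMD:-ssh}

# Print OK/FAIL for every host listed in the given file.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless() {
    hosts_file=$1
    failed=0
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        # -n keeps ssh from consuming the rest of the hosts file on stdin.
        if $SSH_CMD -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done < "$hosts_file"
    return "$failed"
}
```

Calling `check_passwordless hosts` on the master returns a non-zero status if any node still asks for a password.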
Now the gpadmin user can be created on all nodes at once:
|
[root@hadoop2 ~]# gpssh -f hosts
=> groupadd -g 530 gpadmin
[hadoop2]
[hadoop4]
[hadoop3]
=> useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
[hadoop2]
[hadoop4]
[hadoop3]
=> echo gpadmin:gpadmin | chpasswd
[hadoop2]
[hadoop4]
[hadoop3]
|
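Raise the open-file and process limits on every node. Only one node is shown here; the others are handled the same way. Because the keys have already been exchanged, you can connect to the other nodes with ssh hostname without a password and edit the configuration there. This step could also be done with gpssh, but to be safe it is better to edit each node by hand.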
|
[root@hadoop2 ~]# vi /etc/security/limits.conf
# End of file
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
|
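When the same four lines have to be added on several nodes, an idempotent append avoids duplicate entries if the step is repeated. This is a sketch under the assumption that exact-line matching is good enough; `ensure_line` is a hypothetical helper, and the demo writes to a scratch file instead of /etc/security/limits.conf.

```shell
set -eu

# Append a line to a file only if that exact line is not already present.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file; on a real node the target would be
# /etc/security/limits.conf and the script would run as root.
limits_file=$(mktemp)
for entry in \
    '* soft nofile 65536' \
    '* hard nofile 65536' \
    '* soft nproc 131072' \
    '* hard nproc 131072'
do
    ensure_line "$limits_file" "$entry"
    ensure_line "$limits_file" "$entry"   # second call is a no-op
done
cat "$limits_file"
```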
Move the database installation directory to /home/gpadmin:
|
[root@hadoop2 ~]# mv gpdb-5.16.0 /home/gpadmin
[root@hadoop2 ~]# chown -R gpadmin:gpadmin /home/gpadmin/gpdb-5.16.0
[root@hadoop2 ~]# mv hosts /home/gpadmin
[root@hadoop2 ~]# chown gpadmin:gpadmin /home/gpadmin/hosts
|
Next, switch to the gpadmin user to install gpdb:
|
[root@hadoop2 ~]# su - gpadmin
|
Edit .bash_profile and add the following:
|
GREENPLUM_PATH=~/gpdb-5.16.0/greenplum_path.sh
source $GREENPLUM_PATH
# DATA_DIR holds the master data; if you store the data elsewhere, remember to update this path
DATA_DIR=~/data/master/gpseg-1/
export MASTER_DATA_DIRECTORY=$DATA_DIR
export PGPORT=5432
export PGDATABASE=postgres
|
Apply the changes in .bash_profile:
|
[gpadmin@hadoop2 ~]$ source .bash_profile
[gpadmin@hadoop2 ~]$ which gpssh
~/gpdb-5.16.0/bin/gpssh
|
Because we are now working as the gpadmin user, the SSH keys have to be exchanged between the nodes again:
|
[gpadmin@hadoop2 ~]$ gpssh-exkeys -f hosts
[STEP 1 of 5] create local ID and authorize on local host
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... send to hadoop3
*** Enter password for hadoop3:
... send to hadoop4
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with hadoop3
... finished key exchange with hadoop4
[INFO] completed successfully
|
Create a segs file containing the hostnames of all segment nodes, excluding the master:
|
[gpadmin@hadoop2 ~]$ cat segs
hadoop3
hadoop4
|
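Since segs is simply hosts without the master, it can be generated rather than maintained separately. A minimal sketch, using a scratch copy of the hosts file and assuming the master is hadoop2 as in this walkthrough:

```shell
set -eu

workdir=$(mktemp -d)
printf 'hadoop2\nhadoop3\nhadoop4\n' > "$workdir/hosts"

master=hadoop2
# -v inverts the match, -x matches whole lines, -F treats the name literally.
grep -vxF -- "$master" "$workdir/hosts" > "$workdir/segs"
cat "$workdir/segs"
```

Matching whole lines matters here: without `-x`, a master named hadoop2 would also remove a host such as hadoop20.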
Distribute the database installation directory to the segment nodes:
|
[gpadmin@hadoop2 ~]$ gpseginstall -f segs
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-Installation Info:
link_name None
binary_path /home/gpadmin/gpdb-5.16.0
binary_dir_location /home/gpadmin
binary_dir_name gpdb-5.16.0
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-check cluster password access
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-de-duplicate hostnames
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-master hostname: hadoop2
20190411:13:52:20:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-rm -f /home/gpadmin/gpdb-5.16.0.tar; rm -f /home/gpadmin/gpdb-5.16.0.tar.gz
...
20190411:13:54:42:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-SUCCESS -- Requested commands completed
|
Create the data directories on all nodes at once:
|
[gpadmin@hadoop2 ~]$ gpssh -f hosts
=> mkdir data
[hadoop2]
[hadoop4]
[hadoop3]
=> cd data
[hadoop2]
[hadoop4]
[hadoop3]
=> mkdir p1 p2 m1 m2 master
[hadoop2]
[hadoop4]
[hadoop3]
|
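The interactive session above fails with "File exists" if it is repeated. As a sketch of a re-runnable variant, `mkdir -p` creates the same layout and tolerates existing directories. `make_data_dirs` is a hypothetical helper, and the demo uses a temporary directory as a stand-in for /home/gpadmin.

```shell
set -eu

# Create the data directory layout under a base directory.
# mkdir -p creates missing parents and does not fail if a directory exists.
make_data_dirs() {
    base=$1
    for d in p1 p2 m1 m2 master; do
        mkdir -p "$base/data/$d"
    done
}

base=$(mktemp -d)        # stand-in for /home/gpadmin
make_data_dirs "$base"
make_data_dirs "$base"   # safe to re-run
ls "$base/data"
```

On the cluster, the same `mkdir -p` command could be passed to gpssh non-interactively, for example `gpssh -f hosts -e 'mkdir -p ~/data/p1 ~/data/p2 ~/data/m1 ~/data/m2 ~/data/master'`.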
Copy a template of the database initialization config file from the installation directory:
|
cp gpdb-5.16.0/docs/cli_help/gpconfigs/gpinitsystem_config .
|
Edit the initialization config file. The main options to change are:
|
PORT_BASE=40000 # do not set this port too low, or it may conflict with other services
declare -a DATA_DIRECTORY=(/home/gpadmin/data/p1 /home/gpadmin/data/p2)
MASTER_HOSTNAME=hadoop2
MASTER_DIRECTORY=/home/gpadmin/data/master
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/data/m1 /home/gpadmin/data/m2)
|
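To see what these settings imply: gpinitsystem creates one primary segment per DATA_DIRECTORY entry on each host in the host file, and numbers their ports upward from PORT_BASE. The arithmetic below is a sketch for the three-host file used in this walkthrough; the variable names are illustrative.

```shell
set -eu

workdir=$(mktemp -d)
printf 'hadoop2\nhadoop3\nhadoop4\n' > "$workdir/hosts"

# One primary segment is created per DATA_DIRECTORY entry on every host.
set -- /home/gpadmin/data/p1 /home/gpadmin/data/p2
primaries_per_host=$#
host_count=$(grep -c . "$workdir/hosts")
port_base=40000

echo "primary segments in total: $((primaries_per_host * host_count))"
echo "primary ports on each host: $port_base-$((port_base + primaries_per_host - 1))"
```

With two data directories and three hosts, this gives six primary segments, listening on ports 40000 and 40001 on each host.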
Finally, initialize the database:
|
gpinitsystem -c gpinitsystem_config -h hosts
|
Change the password of the database user gpadmin:
|
postgres=# ALTER USER gpadmin PASSWORD 'gpadmin';
|
Edit pg_hba.conf in the master data directory and append the following at the end:
|
host all all 192.168.0.0/16 md5
|
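The entry above lets any user connect to any database from an address inside 192.168.0.0/16, using md5 password authentication. To illustrate which addresses that range covers, here is a small sketch that tests membership with integer arithmetic. `ip_to_int` and `in_cidr` are hypothetical helpers written for this example and handle IPv4 only.

```shell
set -eu

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (for example 192.168.0.0/16).
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

for ip in 192.168.203.12 192.168.1.50 10.0.0.1; do
    if in_cidr "$ip" 192.168.0.0/16; then
        echo "$ip matches 192.168.0.0/16"
    else
        echo "$ip does not match"
    fi
done
```

The cluster nodes on 192.168.203.x fall inside the range, and so does every other 192.168.x.x address. A narrower block such as 192.168.203.0/24 would restrict access to the cluster subnet.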
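Reload the configuration files, for example with `gpstop -u`, which re-reads pg_hba.conf without restarting the database.
At this point, the gpdb database accepts connections from the internal network.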