This post records how I installed Greenplum successfully by following the guide linked here.
Create the Docker nodes
Pull the CentOS image
[xiaoyu@xiaoyu ~]$ docker pull centos
Create several containers to serve as the Greenplum nodes
[xiaoyu@xiaoyu ~]$ docker run -it --name gp-master centos /bin/bash
[xiaoyu@xiaoyu ~]$ docker run -it --name gp-segment1 centos /bin/bash
[xiaoyu@xiaoyu ~]$ docker run -it --name gp-segment2 centos /bin/bash
[xiaoyu@xiaoyu ~]$ docker run -it --name gp-segment3 centos /bin/bash
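Each `docker run -it` command above attaches an interactive shell, so every container needs its own terminal. A minimal sketch of the same step as a loop, assuming detached containers are acceptable. The `-d` flag, the function name, and the `DOCKER` override variable are additions for illustration and are not part of the original steps:

```shell
# Container names come from this post. Set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

create_gp_nodes() {
    # -d starts each container in the background (assumption: the original
    # commands attach an interactive shell instead).
    for name in gp-master gp-segment1 gp-segment2 gp-segment3; do
        "$DOCKER" run -dit --name "$name" centos /bin/bash
    done
}
```

After calling `create_gp_nodes`, a shell in any node can be opened with `docker exec -it gp-master /bin/bash`.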
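Configure the base environment
Enter each Greenplum node and configure its base environment.
The CentOS image for Docker is a stripped-down CentOS in which many packages are not installed. This would interfere with deploying Greenplum later, so install the required dependencies on every Docker node.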
[root@00bcc0ba2b3f ~]# yum install -y net-tools which openssh-clients openssh-server less zip unzip iproute
sshd is not started by default in a Docker container. So that the nodes can connect to each other, generate the SSH host keys and start sshd on every Docker node.
[root@00bcc0ba2b3f ~]# ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key
[root@00bcc0ba2b3f ~]# ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key
[root@00bcc0ba2b3f ~]# ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key
[root@00bcc0ba2b3f ~]# /usr/sbin/sshd
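Add the following entries to /etc/hosts on every Docker node; the Greenplum cluster configuration files use these names later. The IPs are the addresses of the individual Docker nodes.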
[root@00bcc0ba2b3f ~]# vi /etc/hosts
172.17.0.2 dw-greenplum-1 mdw
172.17.0.3 dw-greenplum-2 sdw1
172.17.0.4 dw-greenplum-3 sdw2
172.17.0.5 dw-greenplum-4 sdw3
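Editing the file by hand on four nodes is easy to get wrong, so the entries can also be appended by a small function. This is a sketch only: the function name is hypothetical, and the IPs are the ones shown above, which may differ in your environment since Docker assigns them.

```shell
# Append the cluster's name mappings to a hosts file, skipping names already present.
# To look up a container's real address, one option is:
#   docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' gp-master
add_gp_hosts() {
    hosts_file="${1:-/etc/hosts}"
    while read -r ip fqdn alias; do
        # -w matches whole words only, so "sdw1" does not match "sdw10"
        grep -qw "$alias" "$hosts_file" 2>/dev/null || echo "$ip $fqdn $alias" >> "$hosts_file"
    done <<'EOF'
172.17.0.2 dw-greenplum-1 mdw
172.17.0.3 dw-greenplum-2 sdw1
172.17.0.4 dw-greenplum-3 sdw2
172.17.0.5 dw-greenplum-4 sdw3
EOF
}
```

Run as root inside each node, `add_gp_hosts` with no argument targets /etc/hosts; passing a path lets you try it on a scratch file first.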
Also edit /etc/sysconfig/network on every node so that HOSTNAME matches that node's host name (the example below is for the master, mdw).
[root@00bcc0ba2b3f ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=mdw
To simplify installing the Greenplum cluster, and to keep the Python bundled with Greenplum from conflicting with the system Python, create a Greenplum user and group on every node.
[root@00bcc0ba2b3f ~]# groupadd -g 530 gpadmin
[root@00bcc0ba2b3f ~]# useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
[root@00bcc0ba2b3f ~]# chown -R gpadmin:gpadmin /home/gpadmin
[root@00bcc0ba2b3f ~]# passwd gpadmin
Raise the open-file and process limits on every node
[root@00bcc0ba2b3f ~]# vi /etc/security/limits.conf
# End of file
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
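Since the same four lines go into limits.conf on every node, they can be appended with a re-runnable function instead of an editor. A minimal sketch with a hypothetical function name; the values are the ones listed above:

```shell
# Append the limits from this post to a limits file, skipping lines already present.
append_gp_limits() {
    limits_file="${1:-/etc/security/limits.conf}"
    for entry in "soft nofile 65536" "hard nofile 65536" \
                 "soft nproc 131072" "hard nproc 131072"; do
        # -x matches the whole line, -F treats "*" literally
        grep -qxF "* $entry" "$limits_file" 2>/dev/null || echo "* $entry" >> "$limits_file"
    done
}
```

Because existing lines are skipped, calling the function a second time leaves the file unchanged.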
Stop the firewall and disable SELinux on every node
[root@00bcc0ba2b3f ~]# service iptables stop
[root@00bcc0ba2b3f ~]# chkconfig iptables off
[root@00bcc0ba2b3f ~]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
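The only line that changes in the SELinux config is `SELINUX=`, so the edit can be scripted rather than made in `vi`. A sketch under that assumption, with a hypothetical function name:

```shell
# Set SELINUX=disabled in an SELinux config file, leaving every other line as is.
disable_selinux_in() {
    cfg="${1:-/etc/selinux/config}"
    tmp_cfg=$(mktemp)
    # "^SELINUX=" does not match "SELINUXTYPE=" or the commented "# SELINUX=" line
    if sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp_cfg"; then
        cat "$tmp_cfg" > "$cfg"   # overwrite in place to keep the file's permissions
    fi
    rm -f "$tmp_cfg"
}
```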
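Download the Greenplum package
On the Greenplum website, open Greenplum Database Server and download the package for your operating system. I downloaded greenplum-db-5.10.2-rhel7-x86_64.zip, the latest version at the time of writing, and copied it to /home/gpadmin on the master node.
Install Greenplum on the master node
Switch to the gpadmin user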
[root@00bcc0ba2b3f ~]# su gpadmin
Unzip the downloaded archive
[gpadmin@mdw ~]$ unzip greenplum-db-5.10.2-rhel7-x86_64.zip
Run the installer
[gpadmin@mdw ~]$ ./greenplum-db-5.10.2-rhel7-x86_64.bin
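When the installer asks for the installation directory, enter /home/gpadmin/greenplum-db-5.10.2
To simplify cluster installation, Greenplum provides commands that operate on many nodes at once; each takes a host list file as input. Create the two host list files: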
[gpadmin@mdw ~]$ vi ./conf/hostlist
mdw
sdw1
sdw2
sdw3
[gpadmin@mdw ~]$ vi ./conf/seg_hosts
sdw1
sdw2
sdw3
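The two files are plain lists of host names, one per line, so they can be written without an editor. A small sketch with a hypothetical function name, which also creates the conf directory if it does not exist yet:

```shell
# Write the host list files used by gpssh, gpscp and gpinitsystem in this post.
write_host_files() {
    conf_dir="${1:-$HOME/conf}"
    mkdir -p "$conf_dir"
    # hostlist: every node, master included
    printf '%s\n' mdw sdw1 sdw2 sdw3 > "$conf_dir/hostlist"
    # seg_hosts: segment nodes only
    printf '%s\n' sdw1 sdw2 sdw3 > "$conf_dir/seg_hosts"
}
```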
greenplum-db/greenplum_path.sh stores the environment variables needed to run Greenplum, including GPHOME and PYTHONHOME. As gpadmin, load these variables, then exchange the master node's SSH key with every segment node.
[gpadmin@mdw ~]$ source /home/gpadmin/greenplum-db/greenplum_path.sh
[gpadmin@mdw ~]$ gpssh-exkeys -f /home/gpadmin/conf/hostlist
[STEP 1 of 5] create local ID and authorize on local host
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... send to mdw
... send to sdw1
***
*** Enter password for sdw1:
... send to sdw2
... send to sdw3
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with mdw
... finished key exchange with sdw1
... finished key exchange with sdw2
... finished key exchange with sdw3
[INFO] completed successfully
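After the exchange succeeds, batch commands can be run across the nodes.
Note: gpssh-exkeys must be run as gpadmin, because it generates the passwordless-login SSH keys under /home/gpadmin/.ssh. If it is run under another account, the keys are generated for that account instead, and the gpssh batch commands will not work for gpadmin.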
[gpadmin@mdw ~]$ gpssh -f /home/gpadmin/conf/hostlist
=> pwd
[sdw1] /home/gpadmin
[sdw3] /home/gpadmin
[ mdw] /home/gpadmin
[sdw2] /home/gpadmin
=> ls
[sdw1]
[sdw3]
[ mdw] conf greenplum-db greenplum-db-5.10.2
[sdw2]
=> exit
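pwd is the Linux command that prints the current directory; here it shows the working directory on each node during the batch session, which confirms that all 4 nodes are reachable.
Distribute the installation to every segment node
Pack the installation directory on the master node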
[gpadmin@mdw ~]$ tar -czf gp.tar.gz greenplum-db-5.10.2
Copy the archive to every segment node with gpscp
[gpadmin@mdw ~]$ gpscp -f /home/gpadmin/conf/seg_hosts gp.tar.gz =:/home/gpadmin
Extract the archive on all segment nodes in one batch, then create the symlink
[gpadmin@mdw ~]$ gpssh -f /home/gpadmin/conf/seg_hosts
=> tar -zxf gp.tar.gz
[sdw3]
[sdw1]
[sdw2]
=> ln -s greenplum-db-5.10.2 greenplum-db
[sdw3]
[sdw2]
[sdw1]
This completes the installation on all segment nodes.
Initialize the database
Create the data directories on every node
[gpadmin@mdw ~]$ gpssh -f /home/gpadmin/conf/hostlist
=> mkdir gpdata
[sdw1]
[sdw3]
[ mdw]
[sdw2]
=> cd gpdata
[sdw1]
[sdw3]
[ mdw]
[sdw2]
=> mkdir gpmaster gpdatap1 gpdatap2 gpdatam1 gpdatam2
[sdw1]
[sdw3]
[ mdw]
[sdw2]
=> exit
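The interactive session above creates one master directory, two primary directories and two mirror directories under gpdata. The same layout for a single node can be sketched as a function; the function name is hypothetical, and `mkdir -p` is used so that re-running it is harmless:

```shell
# Create the data directory layout used in this post under a base directory.
make_gp_data_dirs() {
    base="${1:-$HOME/gpdata}"
    for d in gpmaster gpdatap1 gpdatap2 gpdatam1 gpdatam2; do
        mkdir -p "$base/$d"
    done
}
```

The directory names must match the DATA_DIRECTORY, MIRROR_DATA_DIRECTORY and MASTER_DIRECTORY values in the gpinitsystem configuration.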
On the master node, add the environment variables to .bash_profile, copy the file to the segment nodes, and source it on each node so that the variables take effect.
[gpadmin@mdw ~]$ vi .bash_profile
source /home/gpadmin/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/gpdata/gpmaster/gpseg-1
export PGPORT=2345
export PGDATABASE=testDB
[gpadmin@mdw ~]$ source .bash_profile
[gpadmin@mdw ~]$ gpscp -f /home/gpadmin/conf/seg_hosts /home/gpadmin/.bash_profile =:/home/gpadmin
[gpadmin@sdw1 ~]$ source .bash_profile
[gpadmin@sdw2 ~]$ source .bash_profile
[gpadmin@sdw3 ~]$ source .bash_profile
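The profile lines can also be appended with a function rather than typed in `vi`. A sketch only: the function name is hypothetical, and the default Greenplum path assumes the /home/gpadmin/greenplum-db install location used earlier in this post.

```shell
# Append the Greenplum environment settings from this post to a profile file.
add_gp_env() {
    profile="${1:-$HOME/.bash_profile}"
    gp_home="${2:-/home/gpadmin/greenplum-db}"   # assumed install location
    cat >> "$profile" <<EOF
source $gp_home/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/gpdata/gpmaster/gpseg-1
export PGPORT=2345
export PGDATABASE=testDB
EOF
}
```

PGPORT has to agree with MASTER_PORT in the gpinitsystem configuration, which is 2345 in this post.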
Create the gpinitsystem configuration file
[gpadmin@mdw ~]$ vi /home/gpadmin/conf/gpinitsystem_config
ARRAY_NAME="Greenplum"
MACHINE_LIST_FILE=/home/gpadmin/conf/seg_hosts
# Prefix for segment names
SEG_PREFIX=gpseg
# Starting port number for primary segments
PORT_BASE=33000
# Data directories for primary segments (one entry per primary segment on each host)
declare -a DATA_DIRECTORY=(/home/gpadmin/gpdata/gpdatap1 /home/gpadmin/gpdata/gpdatap2)
# Hostname of the master machine
MASTER_HOSTNAME=mdw
# Data directory for the master
MASTER_DIRECTORY=/home/gpadmin/gpdata/gpmaster
# Master port
MASTER_PORT=2345
# Remote shell used to connect to the hosts
TRUSTED_SHELL=/usr/bin/ssh
# Starting port number for mirror segments
MIRROR_PORT_BASE=43000
# Starting replication port number for primary segments
REPLICATION_PORT_BASE=34000
# Starting replication port number for mirror segments
MIRROR_REPLICATION_PORT_BASE=44000
# Data directories for mirror segments
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/gpdata/gpdatam1 /home/gpadmin/gpdata/gpdatam2)
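A typo in a key name is easy to miss in a file like this, so a quick check before running gpinitsystem can save a failed attempt. The sketch below only verifies that the non-mirror keys set in this post's config are present; the function name is hypothetical, and the key list is taken from the file above rather than from an authoritative list of required parameters.

```shell
# Report keys from this post's config that are absent from a gpinitsystem config file.
check_gpinit_config() {
    cfg="$1"
    missing=0
    for key in ARRAY_NAME MACHINE_LIST_FILE SEG_PREFIX PORT_BASE DATA_DIRECTORY \
               MASTER_HOSTNAME MASTER_DIRECTORY MASTER_PORT TRUSTED_SHELL; do
        # Array settings are written as "declare -a KEY=(...)"
        if ! grep -Eq "^(declare -a )?$key=" "$cfg"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

The function returns 0 when every key is found and 1 otherwise, so it can gate the next step, for example `check_gpinit_config /home/gpadmin/conf/gpinitsystem_config && echo ok`.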
Run gpinitsystem to initialize the cluster
[gpadmin@mdw ~]$ gpinitsystem -c /home/gpadmin/conf/gpinitsystem_config -s sdw3
Here -s sdw3 configures sdw3 as the master's standby node. Follow the prompts to complete the installation.
If gpinitsystem fails, check the gpinitsystem_*.log files under /home/gpadmin/gpAdminLogs on the master node, fix the cause, and then run gpinitsystem again.
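To find the relevant lines quickly, the newest log can be filtered for error entries. This is a sketch with a hypothetical function name, and it assumes that failures are logged with the words ERROR or FATAL, which may not cover every message worth reading:

```shell
# Print ERROR/FATAL lines from the most recent gpinitsystem log in a directory.
show_gpinit_errors() {
    log_dir="${1:-/home/gpadmin/gpAdminLogs}"
    # ls -t sorts newest first; take the first match
    latest=$(ls -t "$log_dir"/gpinitsystem_*.log 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no gpinitsystem log found in $log_dir"
        return 1
    fi
    echo "== $latest"
    grep -E 'ERROR|FATAL' "$latest" || echo "no ERROR or FATAL lines in $latest"
}
```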