Building a CDH Big Data Platform

1.  Introduction to CDH

From the official website:

     CDH is Cloudera’s 100% open source platform distribution, including Apache Hadoop and built specifically to meet enterprise demands. CDH delivers everything you need for enterprise use right out of the box. By integrating Hadoop with more than a dozen other critical open source projects, Cloudera has created a functionally advanced system that helps you perform end-to-end Big Data workflows.

www.cloudera.com/

2.  Environment preparation

2.1   Machine preparation

Three virtual machines are enough. I used three VMs carved out of a single XenServer server.

| IP address   | Hostname           | OS version | Software                                   |
| 192.168.1.80 | devcdh1.juejin.org | CentOS 7.6 | Cloudera Manager, MySQL                    |
| 192.168.1.81 | devcdh2.juejin.org | CentOS 7.6 | Hadoop NameNode, HBase Master, NodeManager |
| 192.168.1.82 | devcdh3.juejin.org | CentOS 7.6 | Hadoop node, HRegionServer, Spark slave    |

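2.2 Software preparation (offline installation)

Installing over the public network is slow and downloads often drop, so an offline installation works best: download the rpm packages to the servers beforehand.

2.2.1 Download all of the rpm packages listed at the following address

archive.cloudera.com/cm6/6.1.1/r…

2.2.2 Download the required rpm packages from the following address (they were marked with a red box in the original screenshot)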

archive.cloudera.com/cdh6/6.1.1/…

2.3 System initialization (run on all three machines)

1. Install common tools
yum install wget iptables-services telnet net-tools git curl unzip sysstat lsof ntpdate lrzsz vim  -y
2. Install the NTP time service, start it, and enable it at boot
yum install ntp -y
systemctl start ntpd
systemctl enable ntpd
3. To simplify later steps, stop the firewalls and disable them at boot (iptables was installed in step 1; firewalld is the default firewall on CentOS 7)
systemctl stop iptables
systemctl disable iptables
systemctl stop firewalld
systemctl disable firewalld

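2.4 Install the JDK and mysql-connect (run on all three machines)

Upload the downloaded oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm to each server.

Install the JDK:

rpm -ivh oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm

Append the variables to the system-wide profile and load them:

# single quotes keep $JAVA_HOME, $CLASSPATH and $PATH literal, so they are expanded when /etc/profile is sourced
echo 'export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera'  >> /etc/profile
echo 'export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib'  >> /etc/profile
echo 'export PATH=$PATH:$JAVA_HOME/bin'  >> /etc/profile

source /etc/profile   # load the new variables
java -version         # check the version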
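
The quoting of those echo lines matters: inside double quotes the shell expands variables at the moment echo runs, not later when /etc/profile is sourced. The following is a minimal sketch of the difference. It writes to a temporary file rather than /etc/profile, and the names DEMO_HOME, DEMO_PATH and DEMO_PATH2 are illustrative only.

```shell
#!/bin/sh
# Shows how quoting changes what ends up in a profile file.
# DEMO_HOME and the temp file are hypothetical; /etc/profile is not touched.
demo_profile=$(mktemp)
DEMO_HOME=

# Double quotes: $DEMO_HOME is expanded right now (it is empty),
# so the variable reference is lost in the written line.
echo "export DEMO_PATH=$DEMO_HOME/bin" >> "$demo_profile"

# Single quotes: the text is written verbatim and is only expanded
# when the file is sourced later.
echo 'export DEMO_PATH2=$DEMO_HOME/bin' >> "$demo_profile"

cat "$demo_profile"
```

The first line in the file ends up without any reference to the variable, while the second keeps it for later expansion.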

Install mysql-connect

mkdir -p /usr/share/java/  && cd /usr/share/java/
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
tar -zxf mysql-connector-java-5.1.46.tar.gz   # unpack the archive before copying the jar
cp ./mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar

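2.5 Set up hosts (run on all three machines)

The server hostnames look like subdomains but have no DNS records. The official CDH documentation requires hostnames of this form:

docs.cloudera.com/documentati…

Note: where possible, base the hostnames on your own company's domain.

echo "192.168.1.80 devcdh1.juejin.org" >> /etc/hosts
echo "192.168.1.81 devcdh2.juejin.org" >> /etc/hosts
echo "192.168.1.82 devcdh3.juejin.org" >> /etc/hosts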
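
Appending with echo adds a duplicate line every time the step is repeated. As a hedged alternative, here is a small sketch of an idempotent version. The helper name add_host_entry and the HOSTS_FILE variable are made up for this example, and the demo writes to a scratch file instead of /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP HOSTNAME
# Appends "IP HOSTNAME" to FILE only if that exact line is not already present,
# so re-running the setup does not create duplicate entries.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    if ! grep -qxF "$ip $name" "$file" 2>/dev/null; then
        echo "$ip $name" >> "$file"
    fi
}

# Demo against a scratch file; use /etc/hosts instead on a real node.
HOSTS_FILE=$(mktemp)
add_host_entry "$HOSTS_FILE" 192.168.1.80 devcdh1.juejin.org
add_host_entry "$HOSTS_FILE" 192.168.1.81 devcdh2.juejin.org
add_host_entry "$HOSTS_FILE" 192.168.1.82 devcdh3.juejin.org
cat "$HOSTS_FILE"
```

Calling the helper again with an entry that already exists leaves the file unchanged.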

2.6 Install MySQL (several databases are officially supported; I use MySQL most often, so I chose it here)

Supported databases:

docs.cloudera.com/documentati…

Run the following commands to install MySQL:

wget http://dev.mysql.com/get/mysql57-community-release-el7-11.noarch.rpm  # download the official repository rpm
yum localinstall mysql57-community-release-el7-11.noarch.rpm -y            # install the repository rpm
yum repolist enabled | grep "mysql.*-community.*"                          # confirm the MySQL repository is enabled
yum install mysql-community-server -y                                      # install mysql-server
systemctl start mysqld                                                     # start MySQL
grep 'temporary password' /var/log/mysqld.log                              # show the initial MySQL password
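
The grep above prints the whole log line. If only the password itself is wanted, for example to paste into a first login, a sketch like the following can pull it out. The helper name extract_temp_password is made up, and the sample log line (including its password) is fabricated for the demo, modelled on the MySQL 5.7 message format.

```shell
#!/bin/sh
# extract_temp_password LOGFILE
# Prints the last whitespace-separated field of the most recent
# "temporary password" line, which is where the password sits.
extract_temp_password() {
    grep 'temporary password' "$1" | tail -n 1 | awk '{print $NF}'
}

# Demo against a sample line; on a real server pass /var/log/mysqld.log.
sample_log=$(mktemp)
echo '2019-03-01T08:00:00.000000Z 1 [Note] A temporary password is generated for root@localhost: aB3#exampleXY' > "$sample_log"
extract_temp_password "$sample_log"
```

This assumes the password contains no spaces, which holds for the generated temporary passwords but would not for an arbitrary string.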

Create the databases and accounts the CDH cluster needs (choose your own credentials):

CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'XXXXXXX#####';

CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'XXXXXXX#####';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'XXXXXXX#####';
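
Since the statements above all follow one pattern, they can also be generated rather than typed. This is only a sketch under a few assumptions: the helper names gen_cdh_sql and gen_all_cdh_sql are invented here, every account shares one password, and 'ChangeMe#2019' is a placeholder. The database and user pairs are copied from the statements above.

```shell
#!/bin/sh
# gen_cdh_sql DB USER PASSWORD
# Prints the CREATE DATABASE / GRANT pair for one CDH service database.
gen_cdh_sql() {
    printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$1"
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$2" "$3"
}

# db:user pairs taken from the statements above (metastore belongs to hive).
CDH_DBS="scm:scm amon:amon rman:rman metastore:hive sentry:sentry nav:nav navms:navms oozie:oozie hue:hue"

# gen_all_cdh_sql PASSWORD
# Prints the statements for every pair in CDH_DBS using one shared password.
gen_all_cdh_sql() {
    password=$1
    for pair in $CDH_DBS; do
        gen_cdh_sql "${pair%%:*}" "${pair##*:}" "$password"
    done
}

# Print the statements for review; the password is a placeholder.
gen_all_cdh_sql 'ChangeMe#2019'
```

The output can be redirected to a file and reviewed before it is fed to the mysql client. A password containing a single quote would break the generated SQL, so the sketch is only suitable for simple passwords.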

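2.7 Install cloudera-manager

Upload the downloaded cloudera-manager packages to the servers.

devcdh1.juejin.org  installs  ->   cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
devcdh2.juejin.org  installs  ->   cloudera-manager-daemons cloudera-manager-agent
devcdh3.juejin.org  installs  ->   cloudera-manager-daemons cloudera-manager-agent

The rpm file names carry the version and build number, so the commands below use globs. Run them from the directory that holds the uploaded rpm files.

Log in to devcdh1.juejin.org and run:

yum localinstall cloudera-manager-daemons-6.1.1-*.rpm cloudera-manager-agent-6.1.1-*.rpm cloudera-manager-server-6.1.1-*.rpm -y

Log in to devcdh2.juejin.org and run:

yum localinstall cloudera-manager-daemons-6.1.1-*.rpm cloudera-manager-agent-6.1.1-*.rpm -y

Log in to devcdh3.juejin.org and run:

yum localinstall cloudera-manager-daemons-6.1.1-*.rpm cloudera-manager-agent-6.1.1-*.rpm -y

Note: yum localinstall installs from local rpm files, whereas yum install downloads packages from a yum repository and then installs them.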
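
The host-to-package mapping in this step can be captured in a small helper, which is handy if the installation is ever scripted. This is a sketch only: the function name cm_packages_for and the role labels server and agent are invented for the example, and the loop just prints the mapping instead of installing anything.

```shell
#!/bin/sh
# cm_packages_for ROLE
# Prints the Cloudera Manager package names for a host role:
#   server -> daemons + agent + server (devcdh1 in this article)
#   agent  -> daemons + agent          (devcdh2 and devcdh3)
cm_packages_for() {
    case $1 in
        server) echo "cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server" ;;
        agent)  echo "cloudera-manager-daemons cloudera-manager-agent" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print which packages each host needs.
for entry in devcdh1.juejin.org:server devcdh2.juejin.org:agent devcdh3.juejin.org:agent; do
    host=${entry%%:*}
    role=${entry##*:}
    echo "$host: $(cm_packages_for "$role")"
done
```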

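2.8 Start the manager

Initialize the database (the script prompts for the scm user's password):

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm

Run the start command on devcdh1.juejin.org:

service cloudera-scm-server restart

Map the hostnames on your local machine as well; I use SwitchHosts for this:

192.168.1.80 devcdh1.juejin.org
192.168.1.81 devcdh2.juejin.org
192.168.1.82 devcdh3.juejin.org

Visit:

devcdh1.juejin.org:7180 to open the admin console. The default CDH credentials are admin / admin.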
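
The server can take a while to start listening after the restart, so the page may not load immediately. As a rough sketch, the helper below polls a TCP port until it accepts connections. The function name wait_for_port and its default retry values are assumptions made for this example; it relies on bash's /dev/tcp feature and on timeout from coreutils.

```shell
#!/bin/sh
# wait_for_port HOST PORT [TRIES] [DELAY_SECONDS]
# Returns 0 once a TCP connection to HOST:PORT succeeds, 1 if every try fails.
# The connection attempt uses bash's /dev/tcp pseudo-device, bounded by timeout(1).
wait_for_port() {
    host=$1
    port=$2
    tries=${3:-30}
    delay=${4:-10}
    i=1
    while [ "$i" -le "$tries" ]; do
        if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage on a machine that can resolve the hostname:
#   wait_for_port devcdh1.juejin.org 7180 && echo "port 7180 is accepting connections"
```

A successful connection only shows that something is listening on the port; the login page is still the real check.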

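3. Cluster installation

3.1 Form the hosts into a cluster and install the CDH agent

3.1.1  Open the admin console, log in, and accept the license agreement. Click Continue.

3.1.2  Select the Cloudera Express edition. Click Continue.

3.1.3  Specify Hosts: enter the three hostnames, search, and select all of them. Click Continue.

3.1.4  Select Repository: the defaults are fine. Click Continue.

3.1.5  JDK installation options: the JDK is already installed, so leave this unchecked. Click Continue.

3.1.6  Provide SSH login credentials: enter the server password. Click Continue.

3.1.7  Install Agents. Click Continue.

3.1.8  Install Parcels. Click Continue.

3.1.9  Inspect hosts for correctness: if errors are reported, fix them as prompted and run the check again. Click Continue.

3.2 Configure the CDH service deployment

3.2.1  Select services: choose a custom installation with hadoop, hbase, spark, hive, hue, and zookeeper. Others can be added later.

3.2.2  Customize role assignments, that is, choose which node each role runs on. This depends on the server specifications. Because Cloudera Manager is installed on devcdh1 here, hadoopNamenode, hbaseMaster, sparkMaster, and the like all go on devcdh2.

3.2.3  Database setup: enter the credentials and host address configured earlier, then click Test.

3.2.4  Accept the defaults for the remaining steps until the installation completes.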

4. Building a private yum repository offline

cloudera-repo.repo

[cloudera-repo]
name=cloudera-repo
baseurl=http://ip:port/cloudera-repos/cm6/6.1.0/
enabled=1
gpgcheck=0

cloudera-repo-cdh.repo

[cloudera-repo-cdh]
name=cloudera-repo-cdh
baseurl=http://ip:port/cloudera-repos/cdh6/6.1.1/
enabled=1
gpgcheck=0
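
Both repo files share the same shape, so they can be rendered from a template. The sketch below is illustrative only: the function name render_repo is made up, and REPO_HOST is a placeholder standing in for the ip:port of whichever machine serves the files (192.168.1.80:8900 is merely an assumed value).

```shell
#!/bin/sh
# render_repo ID BASEURL
# Prints a yum .repo stanza in the same shape as the two files above.
render_repo() {
    printf '[%s]\nname=%s\nbaseurl=%s\nenabled=1\ngpgcheck=0\n' "$1" "$1" "$2"
}

# REPO_HOST is a placeholder for the ip:port of the machine serving the files.
REPO_HOST="192.168.1.80:8900"
render_repo cloudera-repo     "http://$REPO_HOST/cloudera-repos/cm6/6.1.0/"
render_repo cloudera-repo-cdh "http://$REPO_HOST/cloudera-repos/cdh6/6.1.1/"
```

On a real node the output of each call would be redirected into its own file under /etc/yum.repos.d/.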

mkdir -pv /var/www/html/cloudera-repos
# start a web server; run it from /var/www/html so that the /cloudera-repos/ path in baseurl resolves
cd /var/www/html
python -m SimpleHTTPServer 8900

Then download the following files into that directory.

CM 6.1 packages:
cloudera-manager-agent-6.1.0-769885.el7.x86_64.rpm
cloudera-manager-daemons-6.1.0-769885.el7.x86_64.rpm
cloudera-manager-server-6.1.0-769885.el7.x86_64.rpm
cloudera-manager-server-db-2-6.1.0-769885.el7.x86_64.rpm
allkeys.asc
repomd.xml

CDH 6.1 packages:
repomd.xml
manifest.json

References:

Some available CDH6 mirrors:

ro-bucharest-repo.bigstepcloud.com/cloudera-re…

Official installation documentation:

docs.cloudera.com/documentati…

Server requirements in the official documentation:

docs.cloudera.com/documentati…