Install Ambari and Deploy HDP in CentOS

Introduction

While studying, I set up a Hadoop environment with Ambari and kept these notes along the way.

Host	IP	CPU/RAM/HDD	Service
hadoop-00	192.168.20.20	4vCPU / 4GB / 16GB	Ambari
hadoop-01	192.168.20.21	4vCPU / 4GB / 120GB	NameNode
hadoop-02	192.168.20.22	4vCPU / 4GB / 120GB	Hive / DataNode
hadoop-03	192.168.20.23	4vCPU / 4GB / 120GB	DataNode
hadoop-04	192.168.20.24	4vCPU / 4GB / 120GB	DataNode

Preparing the environment

The following items must be applied to every server.

Configuring the FQDN[1] and hostname

Register the hosts (repeat on all N servers)

# vi /etc/hosts

# Add the following lines.

192.168.20.20 hadoop-00 hadoop-00.antop.org

192.168.20.21 hadoop-01 hadoop-01.antop.org

192.168.20.22 hadoop-02 hadoop-02.antop.org

192.168.20.23 hadoop-03 hadoop-03.antop.org

192.168.20.24 hadoop-04 hadoop-04.antop.org
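Since the same entries go on every server, they can be generated rather than typed by hand. The following is a minimal sketch, not part of the original setup: gen_hosts is a hypothetical helper name, and it assumes the addresses start at 192.168.20.20 and the domain is antop.org, as in the table above.

```shell
#!/bin/sh
# Print /etc/hosts entries for hadoop-00 .. hadoop-(N-1).
# Assumptions: addresses start at 192.168.20.20 and the domain is antop.org;
# gen_hosts is a hypothetical helper name.
gen_hosts() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        # Same field order as the entries above: IP, short name, FQDN.
        printf '192.168.20.%d hadoop-%02d hadoop-%02d.antop.org\n' \
            "$((20 + i))" "$i" "$i"
        i=$((i + 1))
    done
}
```

Running `gen_hosts 5` prints the five lines for review; as root, `gen_hosts 5 >> /etc/hosts` would append them.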

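Changing the hostname

Rename each server with its own number (repeat on all N servers). Note that hostname -f prints the short name below because the /etc/hosts entries above list the short name first; listing the FQDN first would make hostname -f return the FQDN.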

# hostname hadoop-00.antop.org

# hostname -f

hadoop-00

# hostname

hadoop-00.antop.org

The hostname also appears in the network configuration. Change it there as well (repeat on all N servers).

# vi /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=hadoop-00.antop.org

GATEWAY=192.168.20.1

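Password-less SSH setup

Configure SSH so that it can be used without a password prompt.

Access only needs to be opened from the Ambari server to each of the other servers.

Generating the public/private key pair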

[root@hadoop-00 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

Generating public/private dsa key pair.

Created directory '/root/.ssh'.

Your identification has been saved in /root/.ssh/id_dsa.

Your public key has been saved in /root/.ssh/id_dsa.pub.

The key fingerprint is:

de:3a:1a:9b:cb:5d:ee:b4:a1:b6:e3:8a:64:a9:d2:54 root@hadoop-00.antop.org

The key's randomart image is:

.. (omitted) ..

Installing the public key on each server

# Do this for the local server (hadoop-00) as well.

[root@hadoop-00 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@hadoop-00.antop.org

The authenticity of host 'hadoop-00.antop.org (192.168.20.20)' can't be established.

RSA key fingerprint is 93:02:94:19:21:aa:99:a8:94:87:75:36:06:08:3f:12.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'hadoop-00.antop.org,192.168.20.20' (RSA) to the list of known hosts.

root@hadoop-00.antop.org's password:

Now try logging into the machine, with "ssh 'root@hadoop-00.antop.org'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[root@hadoop-00 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@hadoop-01.antop.org

[root@hadoop-00 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@hadoop-02.antop.org

[root@hadoop-00 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@hadoop-03.antop.org

[root@hadoop-00 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@hadoop-04.antop.org

From the Ambari server, SSH connections to the other servers must succeed without asking for a password.

[root@hadoop-00 ~]# ssh root@hadoop-01.antop.org 'ls /'

# The directory listing should be printed without a password prompt.
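Checking every host by hand gets tedious, so the verification can be looped. This is a sketch under assumptions rather than part of the original procedure: check_passwordless and the SSH_CMD override are hypothetical names, and the only ssh options used are BatchMode=yes (which makes ssh fail instead of prompting for a password) and ConnectTimeout.

```shell
#!/bin/sh
# Verify key-based login to each host without ever prompting for a password.
# SSH_CMD is a hypothetical override so the loop can be dry-run with a stub.
check_passwordless() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so a missing key is a failure.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 \
                "root@$host" true </dev/null >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero exit status if any host could not be reached without a password.
    return "$failed"
}
```

For example, `check_passwordless hadoop-01.antop.org hadoop-02.antop.org` prints one OK or FAIL line per host.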

※ Download the /root/.ssh/id_dsa file generated on hadoop-00 (Ambari) and keep it. It is needed when creating the cluster in the Ambari Web UI.

Installing and starting the NTP[2] service

Install NTP if it is not already installed.

# yum -y install ntp

# service ntpd start

# chkconfig ntpd on

Disabling the firewall

Hadoop uses a large number of ports. Managing them strictly would be better, but here the firewall is simply turned off entirely.

# chkconfig iptables off

# /etc/init.d/iptables stop

Disabling SELinux

# setenforce 0

# vi /etc/selinux/config

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

# enforcing - SELinux security policy is enforced.

# permissive - SELinux prints warnings instead of enforcing.

# disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

# targeted - Targeted processes are protected,

# mls - Multi Level Security protection.

SELINUXTYPE=targeted
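setenforce 0 only changes the running state, while the config file decides what happens after a reboot, so it can help to read back what the file actually says. A small sketch, assuming the file format shown above; selinux_config_mode is a hypothetical helper name.

```shell
#!/bin/sh
# Print the SELINUX= value configured in the given file.
# Comment lines start with '#', and SELINUXTYPE= does not match '^SELINUX='.
selinux_config_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$1" | tail -n 1
}
```

`selinux_config_mode /etc/selinux/config` shows the persistent setting; the getenforce command shows the current runtime state for comparison.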

Disabling THP[3]

If THP is enabled on CentOS, a warning appears saying that performance issues may occur.

Apply this to the hosts where HDP will be installed (hadoop-01 through hadoop-04).

[root@hadoop-01 ~]# vi /etc/rc.local

#!/bin/sh

# This script will be executed *after* all the other init scripts.

# You can put your own initialization stuff in here if you don't

# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# Added section

if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled; fi

if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag; fi

Check the value after rebooting

[root@hadoop-01 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled

※ I am not sure this is the correct way to do it. After some trial and error the warning went away.
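The kernel reports the THP setting as a list of choices with the active one in square brackets (for example "always madvise [never]"), which is easy to misread. The sketch below extracts just the active value and checks both the redhat_transparent_hugepage path used in rc.local and the transparent_hugepage path used in the check above. The function names are hypothetical, and which of the two paths exists depends on the kernel.

```shell
#!/bin/sh
# Read a THP status line on stdin and print only the bracketed (active) value.
thp_active_value() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Report the active value for whichever THP paths exist on this host.
report_thp() {
    for f in /sys/kernel/mm/redhat_transparent_hugepage/enabled \
             /sys/kernel/mm/transparent_hugepage/enabled; do
        if [ -r "$f" ]; then
            echo "$f: $(thp_active_value < "$f")"
        fi
    done
}
```

After a reboot, `report_thp` should show never on each path it finds if the rc.local change took effect.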

Changing the umask value

Open /etc/profile and, below the existing umask logic, add a line that forces the umask value.

# vi /etc/profile

# (omitted)

# By default, we want umask to get set. This sets it for login shell

# Current threshold for system reserved uid/gids is 200

# You could check uidgid reservation validity in

# /usr/share/doc/setup-*/uidgid file

if [ $UID -gt 199 ] && [ "`id -gn`" = "`id -un`" ]; then

umask 002

else

umask 022

fi

# Added: force the umask to 022

umask 022
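To see what the two umask values in this file actually do, one can create a file under each and look at the resulting permissions. This is only an illustrative sketch; mode_under_umask is a hypothetical helper and it works in a throwaway directory.

```shell
#!/bin/sh
# Create a file in a scratch directory under the given umask and print its
# permission string. The subshell keeps the caller's umask untouched.
mode_under_umask() {
    (
        umask "$1"
        dir=$(mktemp -d) || exit 1
        : > "$dir/probe"
        # The first 10 characters of ls -l are the type and permission bits.
        ls -l "$dir/probe" | cut -c1-10
        rm -rf "$dir"
    )
}
```

`mode_under_umask 022` prints -rw-r--r-- and `mode_under_umask 002` prints -rw-rw-r--, so the forced 022 removes group write permission from newly created files.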

Installing the JDK

Install version 1.7 here.

# Install wget if it is missing

# yum install -y wget

# Remove any other JDKs that are installed

# yum remove -y java-1.6.0-openjdk

# yum remove -y java-1.7.0-openjdk

# Download the RPM file

# cd ~

# wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u80-b15/jdk-7u80-linux-x64.rpm"

# Install

# yum localinstall -y jdk-7u80-linux-x64.rpm

# rm -f ~/jdk-7u80-linux-x64.rpm

# Set the environment variables

# echo 'export JAVA_HOME=/usr/java/jdk1.7.0_80' >> /etc/profile

# echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile

# Verify the installation

# java -version
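Because the JDK has to be present at the same path on every host, a quick check of the JAVA_HOME directory is useful before moving on. A minimal sketch, with check_java_home as a hypothetical helper name; it only tests that bin/java exists and is executable, not which Java version it is.

```shell
#!/bin/sh
# Succeed if the given directory looks like a usable JAVA_HOME,
# i.e. it contains an executable bin/java.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "valid: $1"
    else
        echo "invalid: $1 (no executable bin/java)"
        return 1
    fi
}
```

For this setup the call would be `check_java_home /usr/java/jdk1.7.0_80`, run on each host.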

Installing Ambari Server

Registering the public repositories[4]

Register the repository so that Ambari can be installed with yum.

[root@hadoop-00 ~]# cd /etc/yum.repos.d/

[root@hadoop-00 yum.repos.d]# wget http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo

Install

[root@hadoop-00 ~]# yum install ambari-server

Setup

[root@hadoop-00 ~]# ambari-server setup

Using python /usr/bin/python

Setup ambari-server

Checking SELinux...

SELinux status is 'disabled'

Customize user account for ambari-server daemon [y/n] (n)? n

Adjusting ambari-server permissions and ownership...

Checking firewall status...

Checking JDK...

[1] Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8

[2] Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7

[3] Custom JDK

==============================================================================

Enter choice (1): 3

WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.

WARNING: JCE Policy files are required for configuring Kerberos security. If you plan to use Kerberos,please make sure JCE Unlimited Strength Jurisdiction Policy Files are valid on all hosts.

Path to JAVA_HOME: /usr/java/jdk1.7.0_80

Validating JDK on Ambari Server...done.

Completing setup...

Configuring database...

Enter advanced database configuration [y/n] (n)? n

Configuring database...

Default properties detected. Using built-in database.

Configuring ambari database...

Checking PostgreSQL...

Running initdb: This may take upto a minute.

Initializing database: [ OK ]

About to start PostgreSQL

Configuring local database...

Connecting to local database...done.

Configuring PostgreSQL...

Restarting PostgreSQL

Extracting system views...

.....ambari-admin-2.2.2.0.460.jar

Adjusting ambari-server permissions and ownership...

Ambari Server 'setup' completed successfully.

Start

[root@hadoop-00 ~]# ambari-server start

Using python /usr/bin/python

Starting ambari-server

Ambari Server running with administrator privileges.

Organizing resource files at /var/lib/ambari-server/resources...

Server PID at: /var/run/ambari-server/ambari-server.pid

Server out at: /var/log/ambari-server/ambari-server.out

Server log at: /var/log/ambari-server/ambari-server.log

Waiting for server start....................

Ambari Server 'start' completed successfully.

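Deploy Cluster using Ambari Web UI

Open http://<hostname>:8080 in a browser.

The initial login is admin/admin.

Click the Launch Install Wizard button under Create a Cluster in the center of the page.

Set the cluster name. I used MicroscopeHadoop.

Select the service stack version.

The latest version, 2.4, would also work. In my case Toad for Hadoop 1.5.0 is said to support HDP 2.3, so I installed 2.3.

Specify the hosts to install on. Just list their FQDNs one after another.

Attach the id_dsa file downloaded earlier here.

Ambari Agent is installed on each host and various checks are run.

※ If THP was not handled as described above, a warning appears at this step.

Select the services.

I selected only the minimum needed to use Hive. More services can be added later.

Choose which components are installed on which host.

This part takes experience and know-how.

Since I have four servers, I put everything on the first, Hive on the second, and planned to use the second, third, and fourth as DataNodes.

Select the hosts to use as DataNodes.

This step is for adjusting configuration values.

Nothing in particular seems to need changing. Go to the tabs marked with a red number and fill in the required fields.

Final review before installation (Deploy).

Installation starts, followed by the tests.

Installation complete!

Installation summary

Done... I think.

Additionally: changing the HTTP port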

https://ambari.apache.org/1.2.3/installing-hadoop-using-ambari/content/ambari-chap2-2a.html
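As far as I recall from the Ambari documentation, the port is controlled by the client.api.port property in /etc/ambari-server/conf/ambari.properties, followed by a server restart. The sketch below is a generic helper for setting one key in a properties file. set_property is a hypothetical name and the port 8081 in the usage note is only an example value, so check both the property name and the path against the linked page before relying on it.

```shell
#!/bin/sh
# set_property FILE KEY VALUE
# Drop any existing "KEY=" line, then append "KEY=VALUE" at the end.
# Note: KEY is used as a regular expression, so '.' matches any character.
set_property() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # grep -v exits non-zero when nothing is left to print; that is not an error here.
    grep -v "^$key=" "$file" > "$tmp" || true
    echo "$key=$value" >> "$tmp"
    # Overwrite in place so the original file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Under those assumptions, the change would be `set_property /etc/ambari-server/conf/ambari.properties client.api.port 8081` followed by `ambari-server restart`.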

References

https://ambari.apache.org/1.2.2/installing-hadoop-using-ambari

http://guruble.com/?p=147

[1] https://wiki.kldp.org/KoreanDoc/html/PoweredByDNS-KLDP/fqdn.html
[2] http://www.terms.co.kr/NTP.htm
[3] Transparent Huge Pages
[4] https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.2+from+Public+Repositories


Brain to Blog

From brain to blog... antop@naver.com
