Saturday, Aug 27, 2011

Tutorial: Installing Hadoop on Ubuntu

Installing Hadoop on Ubuntu

Definition:

Hadoop is a Java-based software framework for processing large data sets, up to petabytes, in a distributed fashion on a cluster that may span thousands of computers. Hadoop is designed to be reliable, efficient, and scalable across the many machines of a cluster. Reliable: it tolerates failures of individual computers and storage media. Efficient: it processes data in parallel. Scalable: it can grow to thousands of machines and petabytes of data. Hadoop runs on commodity servers, which makes it affordable for many people. Hadoop has two core parts:

  1. HDFS: the filesystem that stores data in Hadoop

  2. MapReduce: the engine that processes the data stored in HDFS

In short, Hadoop is a distributed filesystem plus a MapReduce engine.
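To get a feel for what the MapReduce engine does, the map, sort, and reduce phases can be imitated on one machine with an ordinary shell pipeline. This is only an analogy and not how Hadoop actually runs jobs; the function name wordcount is made up for this illustration.

```shell
# Word count as a single-machine analogy of MapReduce.
#   map:     emit one word per line
#   shuffle: sort so identical words end up next to each other
#   reduce:  count each run of identical words
wordcount() {
    tr -s '[:space:]' '\n' |   # map: split the input into words
    grep -v '^$' |             # drop empty lines left by leading whitespace
    sort |                     # shuffle: group identical words
    uniq -c |                  # reduce: count each group
    awk '{print $2, $1}'       # print "word count"
}
```

For example, printf 'hadoop hdfs hadoop\n' | wordcount prints "hadoop 2" and "hdfs 1". Hadoop applies the same idea, but spreads the map and reduce work over many machines and reads its input from HDFS.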


Required Software:

# hadoop-0.20.2.tar.gz --> extract the archive

# Java version 6 (Hadoop 0.20.x requires Java 1.6.x)

Initial Configuration:

$ sudo nano /etc/profile

export JAVA_HOME="/opt/jdk1.6.0_23"

export PATH=${JAVA_HOME}/bin:$PATH
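Once the JDK is in place, it is worth checking that JAVA_HOME really points at a directory containing bin/java, because a wrong path (for example a missing leading slash) only shows up later when Hadoop fails to start. A minimal sketch; check_java_home is a hypothetical helper, not part of Java or Hadoop.

```shell
# check_java_home DIR
# Succeeds only if DIR/bin/java exists and is executable.
# (Hypothetical helper for illustration, not part of Java or Hadoop.)
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java found"
    else
        echo "ERROR: no executable java under $1/bin" >&2
        return 1
    fi
}
```

After logging in again so that /etc/profile is re-read, run it as: check_java_home "$JAVA_HOME"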

Hadoop Installation Commands:

$ sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"

$ sudo apt-get update
$ sudo apt-get install sun-java6-jdk

$ sudo update-java-alternatives -s java-6-sun

update-alternatives: error: no alternatives for mozilla-javaplugin.so.

update-alternatives: error: no alternatives for xulrunner-1.9-javaplugin.so.
update-alternatives: error: no alternatives for mozilla-javaplugin.so.

update-alternatives: error: no alternatives for xulrunner-1.9-javaplugin.so.

dewi@dewi-laptop:~$ java -version

  java version "1.6.0_23"

  Java(TM) SE Runtime Environment (build 1.6.0_23-b05)   

  Java HotSpot(TM) Client VM (build 19.0-b09, mixed mode, sharing)       

Create a New User

$ sudo addgroup hadoop

$ sudo adduser --ingroup hadoop hadoop

SSH Configuration

dewi@dewi-laptop:~$ su - hadoop

hadoop@dewi-laptop:~$ ssh-keygen -t rsa -P ""

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
a6:3a:fe:0e:f8:41:6f:bd:c5:14:2a:5b:c7:4f:84:7d hadoop@dewi-laptop
The key's randomart image is:
      +--[ RSA 2048]----+
      |                 |
      |           o     |
      |          o o E  |
      |         o o .   |
      |    . . S + .    |
      |   o . B + o     |
      |  . o = . o .    |
      |   ..=   o       |
      |   .++o .        |
      +-----------------+

hadoop@dewi-laptop:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys   

hadoop@dewi-laptop:~$ ssh localhost

Linux dewi-laptop 2.6.32-33-generic #70-Ubuntu SMP Thu Jul 7 21:09:46 UTC 2011 i686 GNU/Linux
Ubuntu 10.04.3 LTS

Welcome to Ubuntu!
 * Documentation:  https://help.ubuntu.com/

Last login: Tue Aug  9 15:44:52 2011 from localhost
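If ssh localhost still asks for a password, one common cause is file modes that are too open: with its default StrictModes setting, sshd ignores keys when the user's .ssh files are writable by others. The sketch below tightens the modes; fix_ssh_perms is a hypothetical helper name, and the modes 700 and 600 are the usual conservative choice rather than something this tutorial prescribes.

```shell
# fix_ssh_perms DIR
# Restrict an .ssh directory and its authorized_keys file to the owner.
# (Hypothetical helper for illustration.)
fix_ssh_perms() {
    chmod 700 "$1" &&
    chmod 600 "$1/authorized_keys"
}
```

As the hadoop user, run: fix_ssh_perms "$HOME/.ssh"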

Disable IPv6

hadoop@dewi-laptop:~$ sudo nano /etc/sysctl.conf

#disable ipv6

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

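$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6

A value of 0 means IPv6 is enabled; a value of 1 means it is disabled. Changes to /etc/sysctl.conf take effect after a reboot or after running sudo sysctl -p.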
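The check above can be wrapped in a small function that turns the raw 0 or 1 into a readable message. This is just a convenience sketch; ipv6_status is a hypothetical name.

```shell
# ipv6_status VALUE
# Interpret the content of /proc/sys/net/ipv6/conf/all/disable_ipv6.
# (Hypothetical helper for illustration.)
ipv6_status() {
    case "$1" in
        0) echo "IPv6 enabled" ;;
        1) echo "IPv6 disabled" ;;
        *) echo "unknown value: $1" >&2; return 1 ;;
    esac
}
```

Usage: ipv6_status "$(cat /proc/sys/net/ipv6/conf/all/disable_ipv6)"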

hadoop@dewi-laptop:/$ cd /opt/hadoop/conf/

hadoop@dewi-laptop:/opt/hadoop/conf$ nano hadoop-env.sh

export JAVA_HOME=/opt/jdk1.6.0_23

Installation:

hadoop@dewi-laptop:/opt/hadoop$ sudo chown -R hadoop:hadoop /opt/hadoop

hadoop@dewi-laptop:/opt/hadoop$ nano /home/hadoop/.bashrc

export JAVA_HOME="/opt/jdk1.6.0_23"

export PATH=${JAVA_HOME}/bin:$PATH

#Set HADOOP_HOME to the Hadoop installation directory

export HADOOP_HOME=/opt/hadoop

#Add Hadoop bin/ directory to PATH

export PATH=$PATH:$HADOOP_HOME/bin

hadoop@dewi-laptop:/opt/hadoop/conf$ nano core-site.xml

      <property>
        <name>hadoop.tmp.dir</name>
        <value>/app/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>

      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:54310</value>
        <description>The name of the default file system.  A URI whose
        scheme and authority determine the FileSystem implementation.  The
        uri's scheme determines the config property (fs.SCHEME.impl) naming
        the FileSystem implementation class.  The uri's authority is used to
        determine the host, port, etc. for a filesystem.</description>
      </property>
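The hadoop.tmp.dir value above points at /app/hadoop/tmp, a directory that none of the steps so far create. A minimal sketch for preparing it, assuming it should belong to the hadoop user and group created earlier; prepare_tmp_dir is a hypothetical helper, and mode 750 is an assumption that can be loosened or tightened.

```shell
# prepare_tmp_dir DIR OWNER
# Create DIR (and any missing parents), hand it to OWNER (user:group),
# and make it inaccessible to other users.
# (Hypothetical helper for illustration; mode 750 is an assumption.)
prepare_tmp_dir() {
    mkdir -p "$1" &&
    chown "$2" "$1" &&
    chmod 750 "$1"
}
```

Creating a directory under / needs root, so run it from a root shell, for example: prepare_tmp_dir /app/hadoop/tmp hadoop:hadoop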

hadoop@dewi-laptop:/opt/hadoop/conf$ nano hdfs-site.xml

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

hadoop@dewi-laptop:/opt/hadoop/conf$ nano mapred-site.xml

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

Format Namenode

hadoop@dewi-laptop:/opt/hadoop$ bin/hadoop namenode -format

10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/08 16:59:56 INFO namenode.FSNamesystem: fsOwner=hduser,hadoop
10/05/08 16:59:56 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/08 16:59:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/08 16:59:56 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/08 16:59:57 INFO common.Storage: Storage directory .../hadoop-hduser/dfs/name has been successfully formatted.
10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/

hadoop@dewi-laptop:/opt/hadoop$ bin/start-all.sh

starting namenode, logging to /opt/hadoop2/logs/hadoop-coba-namenode-dewi-laptop.out
localhost: starting datanode, logging to /home/dewi/Work2/hadoop-cdh3uo/bin/../logs/hadoop-coba-datanode-dewi-laptop.out
localhost: starting secondarynamenode, logging to /home/dewi/Work2/hadoop-cdh3uo/bin/../logs/hadoop-coba-secondarynamenode-dewi-laptop.out
starting jobtracker, logging to /opt/hadoop2/logs/hadoop-coba-jobtracker-dewi-laptop.out
localhost: starting tasktracker, logging to /home/dewi/Work2/hadoop-cdh3uo/bin/../logs/hadoop-coba-tasktracker-dewi-laptop.out

hadoop@dewi-laptop:/opt/hadoop/bin$ jps
5128 SecondaryNameNode
5438 Jps
4731 NameNode
4927 DataNode
5396 TaskTracker
5203 JobTracker
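All five daemons (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker) should appear in the jps listing. Rather than checking by eye, the output can be piped through a small filter like the one below. It is a sketch only; missing_daemons is a hypothetical name, and it assumes the "PID Name" line format shown above.

```shell
# missing_daemons
# Read `jps` output on stdin and print every expected Hadoop daemon
# that does not appear in it. Returns 0 only if all five are present.
# (Hypothetical helper for illustration.)
missing_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # Match " Name" at end of line so that NameNode does not
        # accidentally match the SecondaryNameNode line.
        if ! printf '%s\n' "$jps_output" | grep -q " $daemon\$"; then
            echo "not running: $daemon"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: jps | missing_daemons && echo "all Hadoop daemons are up"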

hadoop@dewi-laptop:/$ netstat -plten | grep java

    tcp        0      0 127.0.0.1:56770         0.0.0.0:*               LISTEN      1002       429466      13364/java
    tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      1002       429488      12898/java     
    tcp        0      0 127.0.0.1:54310        0.0.0.0:*               LISTEN      1002       427282     12710/java     
    tcp        0      0 127.0.0.1:54311        0.0.0.0:*               LISTEN      1002       428510     13187/java     
    tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      1002       428939      13113/java     
    tcp        0      0 0.0.0.0:50060           0.0.0.0:*               LISTEN      1002       429459      13364/java     
    tcp        0      0 0.0.0.0:50030           0.0.0.0:*               LISTEN      1002       429060      13187/java     
    tcp        0      0 0.0.0.0:57841           0.0.0.0:*               LISTEN      1002       427679      12898/java     
    tcp        0      0 0.0.0.0:39766           0.0.0.0:*               LISTEN      1002       428508      13187/java     
    tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      1002       427874      12710/java     
    tcp        0      0 0.0.0.0:53816           0.0.0.0:*               LISTEN      1002       427280      12710/java     
    tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      1002       428062      12898/java     
    tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      1002       428819      12898/java     
    tcp        0      0 0.0.0.0:41020           0.0.0.0:*               LISTEN      1002       428283      13113/java
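The interesting entries in this listing are the ports set in the configuration files (54310 for fs.default.name, 54311 for mapred.job.tracker) and the web UI ports 50030, 50060, and 50070. To pull just the port numbers out of the netstat output, a filter along these lines can help. It is a sketch that assumes the column layout shown above (local address in column 4, state in column 6); listening_ports is a hypothetical name.

```shell
# listening_ports
# Read `netstat -plten` lines on stdin and print the local port of every
# socket in LISTEN state, sorted numerically and without duplicates.
# (Hypothetical helper; assumes local address is field 4, state is field 6.)
listening_ports() {
    awk '$6 == "LISTEN" { n = split($4, parts, ":"); print parts[n] }' |
    sort -n |
    uniq
}
```

Usage: netstat -plten | grep java | listening_ports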

hadoop@dewi-laptop:/opt/hadoop/bin$ ./stop-all.sh

stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
hadoop@dewi-laptop:/opt/hadoop/bin$

Test in the Browser (with the daemons running):

    * http://localhost:50030/ – web UI for MapReduce job tracker(s)
    * http://localhost:50060/ – web UI for task tracker(s)
    * http://localhost:50070/ – web UI for HDFS name node(s)
