How to Install Hadoop 2.7.1 on CentOS, LinuxMint and Ubuntu

Apache Hadoop 2.7.1 is a minor release in the 2.x.y release line, building on 2.7.0, and is the first stable release after Apache Hadoop 2.6.0. This release drops support for the JDK6 runtime and works with JDK7+ only. On the HDFS side it adds support for files with variable-length blocks, file truncation, and quotas per storage type. On the MapReduce side it can speed up FileOutputCommitter for very large jobs with many output files.

In this post I walk through the step-by-step process to install Apache Hadoop 2.7.1 on CentOS, LinuxMint, and Ubuntu.


Step 1: Install Java

To install Hadoop 2.7.1 you must have Java (JDK 7 or later) on your system. Check whether Java is available using the following command.

# java -version 

java version "1.8.0_66"

Java(TM) SE Runtime Environment (build 1.8.0_66-b17)

Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)



Step 2: Create an Account for Hadoop

It is recommended to run Hadoop under a dedicated, non-root account, so first create the account using the following commands.


# adduser hadoop

# passwd hadoop



After creating the account, set up key-based SSH for it with the following commands.


# su - hadoop

$ ssh-keygen -t rsa

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

$ chmod 0600 ~/.ssh/authorized_keys


Now verify the key-based login. The command below should not ask for a password, but the first time it will prompt to add the host's RSA key to the list of known hosts.


$ ssh localhost

$ exit


Step 3: Download the Hadoop 2.7.1 Archive File

Download the Hadoop 2.7.1 archive file using the command below. You can also select an alternate download mirror to improve download speed.

$ cd ~

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz

$ tar xzf hadoop-2.7.1.tar.gz

$ mv hadoop-2.7.1 hadoop


Step 4: Configure Hadoop Pseudo-Distributed Mode

Now we have to set up the environment variables. Edit the ~/.bashrc file and append the following lines at the end of the file (the standard Hadoop environment variables, assuming Hadoop was extracted to /home/hadoop/hadoop).

export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
Apply the above changes in the current environment.

$ source ~/.bashrc

And now edit the $HADOOP_HOME/etc/hadoop/hadoop-env.sh file and set the JAVA_HOME environment variable. Change the Java path to match the installation on your system; it may vary with your operating system version and installation source, so make sure you are using the correct path.

Set the variable as follows (example path for the Oracle JDK 8 package):

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
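Not sure which path to use? One way to derive it is to resolve the java binary behind its symlink chain. This is a quick sketch, assuming a GNU userland; the resulting path differs between OpenJDK and Oracle JDK packages.

```shell
# Resolve the real java binary behind any symlinks (path varies by distro/JDK).
java_bin="$(readlink -f "$(command -v java)")"
# Strip the trailing /bin/java to get the JDK home directory.
echo "export JAVA_HOME=${java_bin%/bin/java}"
```

Paste the printed export line into the file you edited above.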

Step 5: Setup Hadoop Configuration Files

Hadoop has many configuration files, which need to be configured according to the requirements of your Hadoop infrastructure. Let us start with the configuration for a basic Hadoop single-node cluster setup.

Navigate to the following directory.

$ cd $HADOOP_HOME/etc/hadoop

And add the following configuration (the values shown are the standard ones for a single-node setup).

Edit core-site.xml

Type the following code

<configuration>
   <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
   </property>
</configuration>

Edit hdfs-site.xml

Type the following code (a replication factor of 1 and local directories for the namenode and datanode data; adjust the paths to your system)

<configuration>
   <property>
      <name>dfs.replication</name>
      <value>1</value>
   </property>
   <property>
      <name>dfs.name.dir</name>
      <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
   </property>
   <property>
      <name>dfs.data.dir</name>
      <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
   </property>
</configuration>

Edit mapred-site.xml

This file does not exist by default; create it from the template first.

$ cp mapred-site.xml.template mapred-site.xml

Type the following code to run MapReduce on YARN

<configuration>
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>

Edit yarn-site.xml

Type the following code

<configuration>
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>
</configuration>

Step 6: Format the Namenode

Now format the namenode using the following command.

$ hdfs namenode -format

Step 7: Start the Hadoop Cluster

After formatting the namenode, start the Hadoop cluster.

Move to the Hadoop sbin directory and execute the scripts one by one.

$ cd $HADOOP_HOME/sbin/

First run the DFS startup script.

$ ./start-dfs.sh

After the DFS script finishes, start YARN with the following command.

$ ./start-yarn.sh

Step 8: Access the Hadoop Services

By default, the namenode web interface is available on port 50070 in a web browser.

For example: http://localhost:50070/

Next, access port 8088 for information about the cluster and all applications.

For example: http://localhost:8088/

Access port 50090 for details about the secondary namenode.

For example: http://localhost:50090/

And finally, access port 50075 to get details about the datanode.

For example: http://localhost:50075/
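You can also probe the same services from the shell. Here is a small sketch, assuming the default ports and that the daemons from the previous step are running; curl must be installed.

```shell
# Probe each Hadoop web UI and print the HTTP status code (000 = not reachable).
for port in 50070 8088 50090 50075; do
  curl -s -o /dev/null -w "port $port: HTTP %{http_code}\n" --max-time 5 "http://localhost:$port/"
done
```

A status of 200 means the daemon for that port is up and serving its web UI.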

Step 9: Test the Hadoop Single-Node Setup

Create the required HDFS directories using the following commands.

$ bin/hdfs dfs -mkdir /user

$ bin/hdfs dfs -mkdir /user/hadoop

Now copy some log files from the local file system to the Hadoop distributed file system using the command below. Here /var/log/apache2 is used; on CentOS the equivalent directory is /var/log/httpd.

$ bin/hdfs dfs -put /var/log/apache2 logs

Now browse the Hadoop distributed file system by opening the URL below in a browser. You will see the logs folder in the list. Click on the folder name to open it and you will find all the log files there.

For example: http://localhost:50070/explorer.html#/user/hadoop

Now copy the logs directory from the Hadoop distributed file system back to the local file system.

$ bin/hdfs dfs -get logs /tmp/logs

$ ls -l /tmp/logs/
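As a further check, you can run one of the bundled MapReduce examples against the uploaded logs. This is a sketch assuming the default examples jar location for version 2.7.1 and the logs directory created above; the output directory (wcout here) must not already exist.

```shell
# Run the bundled wordcount example over the uploaded logs.
cd $HADOOP_HOME
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount logs wcout
# Inspect the first few lines of the result.
bin/hdfs dfs -cat wcout/part-r-00000 | head
```

If the job completes and the output lists word counts, both HDFS and YARN are working end to end.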

And with that, Hadoop 2.7.1 is installed on your system.





