How to Set Up Hadoop Cluster with HDFS High Availability

SacTiw says:
Dec 12, 2017 at 1:05 pm GMT
Normally a client would send a get/put file request to a particular “namenode”, right? So once a failover has happened, how would the client get to know about it?
Assuming it is the client's responsibility to retry on failure, is there a way for the client to first query for the currently active namenode and then send the request to that one?
Reply
Barış says:
Nov 29, 2017 at 7:59 am GMT
It would be really good to show how to restart this system.
Thank you for sharing this valuable information.
Reply
- EdurekaSupport says:
  Jan 5, 2018 at 11:38 am GMT
  Thank you @Baris for appreciating our work. We will look into your suggestions as well. Cheers :)
  Reply
Hassan Asghar says:
Nov 4, 2017 at 1:41 pm GMT
My Hadoop cluster is set up and working fine.
I ran the word count example.
Can anybody provide me with the formulas to calculate the following parameters:
Response Time:
Throughput:
Average I/O Rate:
Execution Time:
Thanks in advance
Reply
Den Kushnerik says:
Jan 9, 2017 at 9:42 am GMT
Hello. These are very helpful instructions for me!
Do we need to format the ZKFC on the Standby NameNode too?
According to this page: http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Initializing_HA_state_in_ZooKeeper we must do it only once: “…next step is to initialize required state in ZooKeeper. You can do so by running the following command from one of the NameNode hosts.”
Reply
aagnasoft says:
Dec 15, 2016 at 5:20 am GMT
Wow, this is very helpful information. Thank you so much.
Reply
Sanjay says:
Nov 19, 2016 at 12:37 pm GMT
Normally when we set up a Hadoop cluster (non-HA), we need to configure YARN by modifying its yarn-site.xml. For HA, don’t we need any HA-specific modifications to yarn-site.xml?
Reply
- Ashish Bakshi says:
  Nov 29, 2016 at 8:11 am GMT
  Thanks Sanjay for going through the blog.
  In this blog, we modify hdfs-site.xml because we are enabling the HA feature only for the NameNode. And yes, you are absolutely correct: you can have HA for the ResourceManager as well, in which case you will have to modify yarn-site.xml similarly. You can follow the Hadoop documentation given below to set up HA for the ResourceManager:
  https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
  Reply
Rakibul hassan Rakib says:
Sep 5, 2016 at 7:17 am GMT
I am just correcting your HA Architecture image
Reply
Rakibul hassan Rakib says:
Sep 3, 2016 at 5:10 am GMT
After killing the active or standby namenode, I cannot open the web view of the killed namenode. Is it possible to get the web view after killing a namenode? In the blog you show the web view of both namenodes after killing one of them. How is that possible? I am facing some problems with my namenode.
Thank you
Rakib
Reply
- Mani says:
  Sep 9, 2016 at 7:36 pm GMT
  Hey Rakib,
  If the namenode is manually transitioned from active to standby, you should be able to see the web UI of that namenode, as its process is still running. But if the active namenode fails and there is an automatic transition to the standby namenode, you can’t reach the web UI of the failed one, for the obvious reason that the namenode is down. Once you fix the dead namenode, you can see its UI again, with STANDBY shown in the UI. Hope this helps
  Thanks,
  MK
  Reply
- EdurekaSupport says:
  Sep 15, 2016 at 6:55 am GMT
  Hey Rakibul, thanks for checking out the blog. Please follow the steps given below:
  -> Please check your hdfs-site.xml configuration file and make sure that you have set up automatic failover as described in the blog.
  -> If you are still facing the issue, change the directories for the namenode, datanode, JN and ZooKeeper, and set permission 755 on these directories:
  chmod 755 directory_path
  -> Format the Active NameNode and start the services as described in the blog.
  Hope this helps.
  Reply
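The directory step in the reply above can be sketched as a small shell helper. This is only an illustration: `prepare_dirs` is a hypothetical function name, and the paths in the usage comment are placeholders, not locations taken from the blog.

```shell
# Sketch only: create each directory given as an argument (if it is
# missing) and set its mode to 755, as suggested in the reply above.
prepare_dirs() {
  for dir in "$@"; do
    mkdir -p "$dir" && chmod 755 "$dir" || return 1
  done
}

# Example call with placeholder paths (replace with the directories
# configured in your own hdfs-site.xml and zoo.cfg):
# prepare_dirs /path/to/namenode /path/to/datanode /path/to/journalnode /path/to/zookeeper
```

The helper stops at the first directory it cannot create or change, so a permissions problem shows up immediately instead of at NameNode start-up.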
anil kumar says:
Dec 10, 2015 at 5:50 am GMT
I am installing high availability with nn1, nn2 and dn1 … Both nn1 and nn2 are in standby mode only. What should I do now?
Reply
- Mani says:
  Jun 9, 2016 at 10:39 am GMT
  Hope you got the solution by now, Anil. The likely reason is that you did not enable the automatic failover property in hdfs-site.xml. From what you describe, your cluster is in manual failover mode. In this scenario you have to designate explicitly which NameNode should be active and which should be standby.
  hdfs haadmin -transitionToActive nn1
  (nn1 – Active , nn2 – Standby)
  hdfs haadmin -transitionToStandby nn1
  (nn1 – Standby , nn2 – Standby)
  hdfs haadmin -transitionToActive nn2
  (nn1 – Standby , nn2 – Active)
  hdfs haadmin -transitionToStandby nn2
  (nn1 – Standby , nn2 – Standby)
  Check the state of each NameNode using the command:
  hdfs haadmin -getServiceState nn1
  If you make both of them active by mistake, you might encounter a split-brain scenario, where edits are in progress on both nodes, resulting in corrupted metadata.
  Hope this helps!
  Thanks,
  MK
  Reply
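As a rough illustration of the state check described in the reply above, the sketch below queries both NameNodes and classifies the result. It assumes the NameNode IDs are `nn1` and `nn2` as in this thread; `nn_state` and `diagnose_pair` are hypothetical helper names, and the messages are only suggestions.

```shell
# Sketch only: nn1 and nn2 are the NameNode IDs used in this thread.

# Print the HA state reported for one NameNode ID, or "unreachable"
# if the haadmin call fails (for example, the NameNode process is down).
nn_state() {
  hdfs haadmin -getServiceState "$1" 2>/dev/null || echo "unreachable"
}

# Classify a pair of states into a one-line diagnosis.
diagnose_pair() {
  case "$1/$2" in
    active/standby|standby/active)
      echo "healthy: one active, one standby" ;;
    standby/standby)
      echo "no active NameNode: transition one to active" ;;
    active/active)
      echo "split-brain risk: both NameNodes report active" ;;
    *)
      echo "degraded: unexpected or unreachable state ($1/$2)" ;;
  esac
}

# Usage on a cluster node (left commented out so the functions can be
# sourced without contacting the cluster):
# diagnose_pair "$(nn_state nn1)" "$(nn_state nn2)"
```

The `standby/standby` branch corresponds to the situation in the question above, where neither NameNode has been made active yet.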
sureseh says:
Nov 8, 2015 at 9:00 am GMT
I am getting the error below when I follow the above configuration settings.
15/11/08 01:58:34 ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
I can't find a solution for this on Google.
Can someone help?
regards
suresh bk
Reply
- EdurekaSupport says:
  Nov 19, 2015 at 11:05 am GMT
  Hi Suresh bk
  Thank you for reaching out to us.
  You can connect with our 24/7 support team with all your queries and doubts regarding Hadoop once you enroll for the course.
  You can also get in touch with us by contacting our sales team on +91-8880862004 (India) or 1800 275 9730 (US toll free). You can mail us on sales@edureka.co.
  Reply
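The error quoted in the question above says that a shared edits directory is configured while the NameNode does not consider HA enabled. One common cause is that the nameservice or its list of NameNode IDs is missing from hdfs-site.xml. The sketch below is a rough check for that case only. It assumes a single nameservice (no federation) and that `hdfs getconf -confKey` is available on the node; `conf_get` and `check_ha_keys` are hypothetical helper names.

```shell
# Sketch only: look up one configuration key; prints nothing if unset.
conf_get() {
  hdfs getconf -confKey "$1" 2>/dev/null
}

# Report whether the keys that HA depends on are present.
# Assumes dfs.nameservices holds a single nameservice name.
check_ha_keys() {
  ns="$(conf_get dfs.nameservices)"
  if [ -z "$ns" ]; then
    echo "dfs.nameservices is not set"
    return 1
  fi
  ids="$(conf_get "dfs.ha.namenodes.$ns")"
  case "$ids" in
    *,*)
      echo "HA keys present for $ns: $ids" ;;
    *)
      echo "dfs.ha.namenodes.$ns must list at least two NameNode IDs"
      return 1 ;;
  esac
}

# Usage on the NameNode host (commented out so the file can be sourced):
# check_ha_keys
```

If the check passes and the error persists, the next things to compare against the blog are the per-NameNode `dfs.namenode.rpc-address.*` entries for the same nameservice.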

Virtual machine	IP address	Host name
Active NameNode	192.168.1.81	nn1.cluster.com or nn1
Standby NameNode	192.168.1.58	nn2.cluster.com or nn2
DataNode	192.168.1.82	dn1.cluster.com or dn1

HDFS 2.x High Availability Cluster Architecture

Introduction:

NameNode Availability:

HDFS HA Architecture:

Implementation of HA Architecture:

1. Using Quorum Journal Nodes:

Fencing of NameNode:

2. Using Shared Storage:

Automatic Failover:

Setting Up and Configuring High Availability Cluster in Hadoop:
