How to set properties for secondary namenode in Hadoop


Hi Guys,

I am trying to set properties for the Secondary NameNode so that my cluster keeps running even if my NameNode goes down.

Thank You

Mar 31, 2020 in Big Data Hadoop by akhtar
• 38,260 points
4,278 views

2 answers to this question.


Hi @akhtar,

Before setting the properties, make sure your cluster is not running. Then set the following property in the hdfs-site.xml file.

<property>
    <name>dfs.secondary.http.address</name>
    <value>master:50090</value>
</property>

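Now start your cluster again. Note that in Hadoop 2.x and later this property is named dfs.namenode.secondary.http-address; the older name shown above is deprecated but still accepted.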

answered Mar 31, 2020 by MD
• 95,460 points

Secondary NameNode in HDFS

The Secondary NameNode in Hadoop is a helper to the NameNode; it is not a backup NameNode server that can quickly take over if the NameNode fails.

Before going into the details of the Secondary NameNode in HDFS, recall the two files the NameNode uses to persist file system metadata: FsImage and EditLog.

  • EditLog: All file write operations performed by client applications are first recorded in the EditLog.
  • FsImage: This file holds the complete file system metadata as of the time the NameNode starts. All operations after that are recorded in the EditLog.

When the NameNode is restarted, it first loads the metadata from the FsImage and then applies all the transactions recorded in the EditLog. NameNode restarts are infrequent, so the EditLog can grow quite large. Merging the EditLog into the FsImage at startup then takes a long time, and the whole file system stays offline during that process.

If some other entity could take over the job of merging FsImage and EditLog and keep the FsImage current, startup would be much faster. That is exactly what the Secondary NameNode does in Hadoop: its main function is to checkpoint the file system metadata stored on the NameNode.

The Secondary NameNode periodically merges the FsImage and EditLog files as follows:

  1. Secondary NameNode gets the latest FsImage and EditLog files from the primary NameNode.
  2. Secondary NameNode applies each transaction from EditLog file to FsImage to create a new merged FsImage file.
  3. Merged FsImage file is transferred back to primary NameNode.
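
The three steps above can be sketched in a few lines of Python. This is only a toy model, not Hadoop's implementation: the FsImage is assumed to be a plain dictionary from path to metadata, the EditLog a list of tuples, and the operation names ("create", "delete", "rename") and the function `apply_edits` are hypothetical names chosen for illustration.

```python
from typing import Dict, List


def apply_edits(fsimage: Dict[str, dict], edit_log: List[tuple]) -> Dict[str, dict]:
    """Replay every EditLog entry on a copy of the FsImage (toy model).

    fsimage:  path -> metadata, e.g. {"/a": {"size": 1}}
    edit_log: ordered operations, e.g. ("create", "/b", 2)
    """
    merged = dict(fsimage)  # step 1: work on a fetched copy, not the original
    for edit in edit_log:   # step 2: apply each transaction in order
        op = edit[0]
        if op == "create":
            _, path, size = edit
            merged[path] = {"size": size}
        elif op == "delete":
            _, path = edit
            merged.pop(path, None)
        elif op == "rename":
            _, src, dst = edit
            merged[dst] = merged.pop(src)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return merged           # step 3: this merged image is what gets sent back


if __name__ == "__main__":
    image = {"/a": {"size": 1}}
    log = [("create", "/b", 2), ("rename", "/a", "/c"), ("delete", "/b")]
    print(apply_edits(image, log))
```

Because the merge happens on a copy, the original image stays untouched until the merged result is returned, which mirrors why the NameNode can keep serving requests while a checkpoint is being built elsewhere.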

The start of the checkpoint process on the secondary NameNode is controlled by two configuration parameters which are to be configured in hdfs-site.xml.

  • dfs.namenode.checkpoint.period - This property specifies the maximum delay between two consecutive checkpoints, in seconds. The default is 3600 (1 hour).
  • dfs.namenode.checkpoint.txns - This property defines the number of uncheckpointed transactions on the NameNode that will force an urgent checkpoint, even if the checkpoint period has not been reached. The default is 1000000 (1 million).
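
How the two parameters combine can be shown with a small Python sketch. It is a simplified model of the decision, not Hadoop's actual code: the function `should_checkpoint` and its argument names are hypothetical, and only the two default values come from the properties described above.

```python
def should_checkpoint(
    seconds_since_last: float,
    uncheckpointed_txns: int,
    period_s: int = 3600,        # dfs.namenode.checkpoint.period default
    txn_limit: int = 1_000_000,  # dfs.namenode.checkpoint.txns default
) -> bool:
    """Return True when either threshold has been reached (simplified model)."""
    period_elapsed = seconds_since_last >= period_s
    too_many_txns = uncheckpointed_txns >= txn_limit
    return period_elapsed or too_many_txns


if __name__ == "__main__":
    print(should_checkpoint(3600, 0))        # an hour has passed
    print(should_checkpoint(10, 1_000_000))  # transaction limit reached early
    print(should_checkpoint(10, 5))          # neither threshold reached
```

The key point is that the conditions are joined with "or": a busy cluster checkpoints early because of the transaction count, while an idle one still checkpoints once the period elapses.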

[Image: HDFS architecture showing the communication among the NameNode, Secondary NameNode, DataNodes and client application.]

answered Mar 31, 2020 by anonymous
