What are the differences between NameNode and Secondary NameNode?

Question

Is the Secondary NameNode a backup of the NameNode? If the NameNode goes down, will the Secondary NameNode take over its responsibilities? What are the differences between the NameNode and the Secondary NameNode?

Shubham · Answer 1 · Mar 23, 2018

No, the Secondary NameNode is not a backup of the NameNode, and it does not take over if the NameNode goes down. You can think of it as a helper of the NameNode.

NameNode is the master daemon which maintains and manages the DataNodes. It regularly receives a Heartbeat and a block report from all the DataNodes in the cluster to ensure that the DataNodes are live.
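To make the liveness check concrete, here is a minimal sketch of heartbeat tracking. The class and method names are hypothetical and are not Hadoop APIs. The 630-second timeout mirrors the commonly cited HDFS default (about 10.5 minutes of silence), so treat it as an assumption for your own cluster.

```python
class HeartbeatMonitor:
    """Toy model of how a NameNode could track DataNode liveness (not Hadoop code)."""

    def __init__(self, timeout_s=630.0):
        # Assumed timeout: a node is considered dead after this much silence.
        self.timeout_s = timeout_s
        self.last_seen = {}  # DataNode id -> timestamp of last heartbeat

    def heartbeat(self, node_id, now):
        """Record a heartbeat received from node_id at time now (seconds)."""
        self.last_seen[node_id] = now

    def dead_nodes(self, now):
        """Return the sorted ids of nodes whose last heartbeat is too old."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout_s)


monitor = HeartbeatMonitor()
monitor.heartbeat("dn1", now=0.0)
monitor.heartbeat("dn2", now=0.0)
monitor.heartbeat("dn1", now=600.0)   # dn1 keeps reporting, dn2 goes silent
print(monitor.dead_nodes(now=700.0))  # ['dn2']
```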

If a DataNode fails, the NameNode chooses new DataNodes for the replacement replicas, balances disk usage, and manages the communication traffic to the DataNodes.
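A rough sketch of the replica-recovery idea is shown here. It is a simplification under stated assumptions: the function name and data layout are made up for illustration, and target selection just takes the first live nodes in sorted order, whereas real HDFS placement is rack-aware.

```python
def plan_re_replication(block_locations, live_nodes, replication=3):
    """Return {block_id: [target nodes]} for blocks with too few live replicas.

    block_locations: block id -> set of DataNode ids holding a replica.
    live_nodes: set of DataNode ids that are still sending heartbeats.
    """
    plan = {}
    for block_id, holders in sorted(block_locations.items()):
        live_holders = holders & live_nodes
        missing = replication - len(live_holders)
        if missing <= 0 or not live_holders:
            # Fully replicated, or no live copy left to replicate from.
            continue
        # Simplification: pick any live nodes without the block, in sorted order.
        candidates = sorted(live_nodes - live_holders)
        plan[block_id] = candidates[:missing]
    return plan


blocks = {"blk_1": {"dn1", "dn2", "dn3"}, "blk_2": {"dn2", "dn3", "dn4"}}
live = {"dn2", "dn3", "dn4", "dn5"}        # dn1 has failed
print(plan_re_replication(blocks, live))   # {'blk_1': ['dn4']}
```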

It stores the metadata of all the files stored in HDFS, e.g. the location of the blocks, the size of the files, permissions, the directory hierarchy, etc.

It maintains 2 files:

FsImage: Contains the complete state of the file system namespace as of the most recent checkpoint.
EditLogs: Contains all the recent modifications made to the file system with respect to the most recent FsImage.
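The relationship between the two files can be sketched as follows: the current namespace is the FsImage with the EditLogs replayed on top, in order. This is a toy model, not the real on-disk format; the record layout and the function names are assumptions made for illustration.

```python
def apply_edit(namespace, edit):
    """Apply one edit-log record to an in-memory namespace dict (path -> metadata)."""
    op, path, *args = edit
    if op == "create":
        namespace[path] = {"size": args[0]}
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        namespace[args[0]] = namespace.pop(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def load_namespace(fsimage, edit_log):
    """Rebuild the current namespace: start from the image, then replay the edits."""
    namespace = {path: dict(meta) for path, meta in fsimage.items()}
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace


fsimage = {"/data/a.txt": {"size": 128}}
edit_log = [
    ("create", "/data/b.txt", 64),
    ("rename", "/data/a.txt", "/data/c.txt"),
    ("delete", "/data/b.txt"),
]
print(load_namespace(fsimage, edit_log))  # {'/data/c.txt': {'size': 128}}
```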

The Secondary NameNode, by contrast, is a separate daemon that periodically fetches the FsImage and EditLogs from the NameNode and writes a merged, up-to-date FsImage to disk. It does not read the NameNode's RAM directly, and it does not serve client requests.

It is responsible for combining the EditLogs with FsImage from the NameNode.
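The merge step (a checkpoint) can be sketched roughly like this. In a real cluster the Secondary NameNode fetches the files from the NameNode, merges them on its own machine, and sends the new image back; the snippet below only models the merge, with a hypothetical function name and a simplified record format.

```python
def checkpoint(fsimage, edit_log):
    """Merge edit_log into fsimage; return (new_fsimage, empty_edit_log).

    Each edit is (path, metadata); metadata None means the path was deleted.
    """
    merged = dict(fsimage)
    for path, metadata in edit_log:
        if metadata is None:
            merged.pop(path, None)
        else:
            merged[path] = metadata
    return merged, []


fsimage = {"/logs/day1": {"size": 10}}
edit_log = [("/logs/day2", {"size": 20}), ("/logs/day1", None)]

new_fsimage, new_edit_log = checkpoint(fsimage, edit_log)
print(new_fsimage)        # {'/logs/day2': {'size': 20}}
print(len(new_edit_log))  # 0
```

After the checkpoint, a restart only has to load the new image, because there are no pending edits left to replay.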


nitinrawat895 · Answer 2 · Mar 26, 2019

The NameNode stores the metadata of the HDFS filesystem in a file called FSimage.

Changes that you make in HDFS are never written directly into FSimage. Instead, they are logged into a separate temporary file.

On startup, the NameNode reads the FSimage file, then reads the temporary file, and updates its in-memory state.

This temporary file, which stores the intermediate changes, is the edit log; it is not the Secondary NameNode. The Secondary NameNode is a separate helper process that periodically merges the edit log into FSimage. This keeps the edit log short and speeds up NameNode startup, because rewriting FSimage on every small change would take a lot of time and would not be efficient.


I hope my answer was informative. If not, please read this article, which explains HDFS and its architecture in more detail.


score +1 · Answer 3 · Apr 8, 2019

The NameNode stores file metadata in two files:

fsimage – Contains a snapshot of the file system metadata and is read by the NameNode when it starts.
edit log – Any change made to the filesystem after the NameNode has started is recorded in the edit log.

When the NameNode is eventually restarted, it first has to load the fsimage and then apply all the changes recorded in the edit log. The longer the edit log has grown, the longer the restart takes.

The Secondary NameNode periodically merges the fsimage and the edit log files and copies the newly created fsimage back to the NameNode.
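How often "periodically" is depends on configuration. In HDFS the relevant settings are `dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns`; a checkpoint is started when either limit is reached. The sketch below models that rule with a hypothetical helper function, using the commonly documented defaults (3600 seconds and 1,000,000 transactions) as assumptions.

```python
def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period_s=3600, txn_limit=1_000_000):
    """Return True when either the time or the transaction threshold is reached."""
    return seconds_since_last >= period_s or uncheckpointed_txns >= txn_limit


print(should_checkpoint(120, 5_000))      # False
print(should_checkpoint(4000, 5_000))     # True
print(should_checkpoint(120, 2_000_000))  # True
```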