No, the Secondary NameNode is not a backup of the NameNode. You can think of it as a helper for the NameNode.
NameNode is the master daemon which maintains and manages the DataNodes. It regularly receives a Heartbeat and a block report from all the DataNodes in the cluster to ensure that the DataNodes are live.
If a DataNode fails, the NameNode chooses new DataNodes for new replicas, balances disk usage, and manages the communication traffic to the DataNodes.
It stores the metadata of all the files stored in HDFS, e.g. the location of blocks, file sizes, permissions, and the directory hierarchy.
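To make the heartbeat and re-replication idea concrete, here is a minimal sketch in Python. It is a toy model, not Hadoop's actual code: the class `ToyNameNode`, its method names, and the timeout value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy model of NameNode liveness tracking (hypothetical, not Hadoop code)."""

    timeout_s: float = 630.0  # assumed timeout for this sketch
    last_heartbeat: dict = field(default_factory=dict)  # DataNode id -> last heartbeat time
    block_map: dict = field(default_factory=dict)       # block id -> set of DataNode ids

    def heartbeat(self, node, now):
        # Record that this DataNode was alive at time `now`.
        self.last_heartbeat[node] = now

    def block_report(self, node, blocks, now):
        # A block report replaces everything we knew about this node's blocks.
        self.heartbeat(node, now)
        for holders in self.block_map.values():
            holders.discard(node)
        for block in blocks:
            self.block_map.setdefault(block, set()).add(node)

    def live_nodes(self, now):
        # A node counts as live if its last heartbeat is within the timeout.
        return {n for n, t in self.last_heartbeat.items() if now - t <= self.timeout_s}

    def plan_rereplication(self, now, replication=3):
        # For each under-replicated block, pick live nodes that do not hold it yet.
        live = self.live_nodes(now)
        plan = {}
        for block, holders in self.block_map.items():
            live_holders = holders & live
            missing = replication - len(live_holders)
            if missing > 0:
                candidates = sorted(live - live_holders)
                plan[block] = candidates[:missing]
        return plan
```

For example, if `dn1` stops sending heartbeats while holding block `b1`, `plan_rereplication` proposes another live DataNode as the target for a new replica. A real NameNode also weighs rack placement and disk usage, which this sketch leaves out.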
It maintains 2 files:
- FsImage: Contains the complete state of the file system namespace since the start of the NameNode.
- EditLogs: Contains all the recent modifications made to the file system with respect to the most recent FsImage.
The Secondary NameNode, by contrast, does not serve clients or DataNodes. Instead, it periodically fetches the current FsImage and EditLogs from the NameNode.
It is responsible for combining the EditLogs with the FsImage into a new, up-to-date FsImage (a checkpoint) and copying it back to the NameNode, which keeps the EditLogs small and shortens NameNode restart time.
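The merge step can be sketched roughly as follows. This is a simplified illustration in Python, assuming the FsImage is a dictionary of paths and the EditLogs are a list of numbered operations; the function names `apply_edit` and `checkpoint` and the record layout are hypothetical, not Hadoop's on-disk format. In a real cluster, how often this runs is governed by the `dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns` properties.

```python
import copy


def apply_edit(namespace, edit):
    """Apply one logged operation to an in-memory namespace (toy format)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"size": edit.get("size", 0)}
    elif op == "delete":
        namespace.pop(edit["path"], None)
    elif op == "rename":
        namespace[edit["dst"]] = namespace.pop(edit["src"])
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage, edit_log):
    """Merge edits newer than the image into a new image; inputs are not modified."""
    merged = copy.deepcopy(fsimage["namespace"])
    last_txid = fsimage["txid"]
    for edit in edit_log:
        if edit["txid"] <= last_txid:
            continue  # already reflected in the existing image
        apply_edit(merged, edit)
        last_txid = edit["txid"]
    return {"txid": last_txid, "namespace": merged}
```

Calling `checkpoint(old_image, edits)` returns a new image whose `txid` is that of the last applied edit. Once the NameNode has that image, edits at or below that transaction ID no longer need to be replayed on startup.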