This time, I will elaborate on the internal principles of NameNode and SecondaryNameNode.

HOW DOES NameNode WORK?

First, we need to consider where the metadata is stored in the NameNode.

If the metadata were stored only on the NameNode's disk, it would be too slow, because serving client requests requires frequent random access. Therefore, the metadata needs to be kept in memory. But if it exists only in memory, it is lost as soon as the power is cut, and the entire cluster can no longer work. So there is an FsImage that backs up the metadata on disk.

But this brings a new problem. If the FsImage were rewritten every time the metadata in memory is updated, efficiency would be too low; if it were not updated, the two copies would become inconsistent, and metadata would be lost once the NameNode loses power. Therefore, the Edits file is introduced (it is append-only, which makes writes very efficient). Whenever metadata is added or updated, the operation is appended to Edits and the metadata in memory is modified. In this way, if the NameNode loses power, the metadata can be reconstructed by merging FsImage and Edits.

The definitions of the two are summarized as follows.

  • FsImage: A persistent checkpoint of the HDFS file system metadata, containing the serialized information of all directory and file inodes in the HDFS file system.
  • Edits: The file that records all update operations on the HDFS file system. Every write operation performed by a file system client is first recorded in the Edits file.

Every time the NameNode starts, the FsImage file is read into memory, and then the update operations in Edits are replayed to ensure that the metadata in memory is up to date. In other words, the FsImage and Edits files are merged whenever the NameNode starts.
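To make the merge concrete, here is a minimal sketch of loading a checkpoint and replaying a log on top of it. It is only an illustration: the JSON layout, the operation names (`create`, `delete`, `rename`) and the functions `apply_edit` and `load_namespace` are my own assumptions, and Hadoop's actual on-disk formats are different.

```python
import json

def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace (path -> metadata)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"replication": edit["replication"]}
    elif op == "delete":
        namespace.pop(edit["path"], None)
    elif op == "rename":
        namespace[edit["dst"]] = namespace.pop(edit["src"])
    else:
        raise ValueError(f"unknown op: {op}")

def load_namespace(fsimage_text, edits_lines):
    """Rebuild the metadata: load the checkpoint, then replay the log in order."""
    namespace = json.loads(fsimage_text)
    for line in edits_lines:
        apply_edit(namespace, json.loads(line))
    return namespace

# A toy checkpoint holding one file, plus three operations logged after it.
fsimage_text = json.dumps({"/a.txt": {"replication": 3}})
edits_lines = [
    json.dumps({"op": "create", "path": "/b.txt", "replication": 2}),
    json.dumps({"op": "rename", "src": "/a.txt", "dst": "/c.txt"}),
    json.dumps({"op": "delete", "path": "/b.txt"}),
]

namespace = load_namespace(fsimage_text, edits_lines)
print(namespace)  # {'/c.txt': {'replication': 3}}
```

The order matters: the log is only meaningful relative to the checkpoint it was written after, so the operations must be replayed in the order they were appended.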

Now we can get a preliminary understanding of how the NameNode updates its data through this diagram.

[Figure: NameNode (NN) workflow]

Step 1: When the NameNode is started for the first time after formatting, it creates the FsImage and Edits files. If it is not the first startup, it loads the edit log and image file directly into memory.

Step 2: A client sends a request to add, delete, or modify metadata.

Step 3: The NameNode records the operation in the edit log, updating the rolling log.

Step 4: The NameNode applies the operation to the metadata in memory.
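The four steps above can be sketched as a toy write path. This is an assumption-laden illustration, not Hadoop code: the class `ToyNameNode`, the file names `fsimage` and `edits`, and the JSON-lines log format are all hypothetical. What it tries to show is the ordering: the operation is made durable in the log before memory is changed, so a restart can always recover it.

```python
import json
import os
import tempfile

class ToyNameNode:
    """Hypothetical stand-in for the NameNode write path."""

    def __init__(self, meta_dir):
        self.fsimage_path = os.path.join(meta_dir, "fsimage")
        self.edits_path = os.path.join(meta_dir, "edits")
        if not os.path.exists(self.fsimage_path):
            # Step 1 (first start after format): create an empty image and log.
            with open(self.fsimage_path, "w") as f:
                json.dump({}, f)
            open(self.edits_path, "w").close()
        # Step 1 (every start): load the image, then replay the log.
        with open(self.fsimage_path) as f:
            self.namespace = json.load(f)
        with open(self.edits_path) as f:
            for line in f:
                self._apply(json.loads(line))

    def _apply(self, edit):
        if edit["op"] == "create":
            self.namespace[edit["path"]] = edit["meta"]
        elif edit["op"] == "delete":
            self.namespace.pop(edit["path"], None)

    def handle(self, edit):
        # Step 2: a client request arrives as `edit`.
        # Step 3: append it to the log and force it to disk first.
        with open(self.edits_path, "a") as f:
            f.write(json.dumps(edit) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # Step 4: only then change the metadata in memory.
        self._apply(edit)

meta_dir = tempfile.mkdtemp()
nn = ToyNameNode(meta_dir)
nn.handle({"op": "create", "path": "/logs/app.log", "meta": {"replication": 3}})
nn.handle({"op": "create", "path": "/tmp/x", "meta": {"replication": 1}})
nn.handle({"op": "delete", "path": "/tmp/x"})

# Simulate a power loss: drop the in-memory state and start again from disk.
restarted = ToyNameNode(meta_dir)
print(restarted.namespace)  # {'/logs/app.log': {'replication': 3}}
```

Note that the image file is never touched while requests are served; only the log grows, which is exactly why it eventually needs to be merged.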


HOW DOES SecondaryNameNode WORK?

In my earlier post introducing HDFS, I emphasized that 2NN is not a backup node of NN, but an assistant to it. Here I will explain in detail how 2NN shares the work of NN.

While the NameNode is running, if operations keep being appended to Edits for a long time, the file grows too large, efficiency drops, and after a power failure it takes too long to restore the metadata. Therefore, FsImage and Edits must be merged periodically. If the NameNode did this merge itself, it would take resources away from serving requests. Therefore, a separate node, the SecondaryNameNode, is introduced and dedicated to merging FsImage and Edits.

[Figure: SecondaryNameNode (2NN) checkpoint workflow]

Step 1: 2NN asks NN whether a CheckPoint is needed, and NN returns the result.

Step 2: 2NN requests the execution of a CheckPoint.

Step 3: NN rolls the Edits log that is currently being written.

Step 4: The edit log written before the roll and the image file are copied to 2NN.

Step 5: 2NN loads the edit log and image file into memory and merges them.

Step 6: 2NN generates the new image file fsimage.chkpoint.

Step 7: 2NN copies fsimage.chkpoint to NN.

Step 8: NN renames fsimage.chkpoint to fsimage.
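Here is a rough sketch of one checkpoint round using two local directories in place of the two nodes. Everything in it is illustrative: `checkpoint_needed`, `run_checkpoint`, the file name `edits.rolled` and the JSON format are my own assumptions. The default thresholds are taken from Hadoop's `dfs.namenode.checkpoint.period` (3600 seconds) and `dfs.namenode.checkpoint.txns` (1,000,000 transactions), but the decision logic is simplified.

```python
import json
import os
import shutil
import tempfile

def checkpoint_needed(seconds_since_last, txns_since_last,
                      period_s=3600, max_txns=1_000_000):
    """Steps 1-2: checkpoint when enough time has passed or enough edits piled up."""
    return seconds_since_last >= period_s or txns_since_last >= max_txns

def run_checkpoint(nn_dir, snn_dir):
    fsimage = os.path.join(nn_dir, "fsimage")
    edits = os.path.join(nn_dir, "edits")
    rolled = os.path.join(nn_dir, "edits.rolled")  # hypothetical name

    # Step 3: roll the log, so new writes go to a fresh, empty edits file.
    os.replace(edits, rolled)
    open(edits, "w").close()

    # Step 4: copy the rolled log and the image to the 2NN directory.
    for name in ("fsimage", "edits.rolled"):
        shutil.copy(os.path.join(nn_dir, name), os.path.join(snn_dir, name))

    # Step 5: load both on the 2NN side and merge them.
    with open(os.path.join(snn_dir, "fsimage")) as f:
        namespace = json.load(f)
    with open(os.path.join(snn_dir, "edits.rolled")) as f:
        for line in f:
            edit = json.loads(line)
            if edit["op"] == "create":
                namespace[edit["path"]] = edit["meta"]
            elif edit["op"] == "delete":
                namespace.pop(edit["path"], None)

    # Step 6: write the merged result as fsimage.chkpoint.
    chk = os.path.join(snn_dir, "fsimage.chkpoint")
    with open(chk, "w") as f:
        json.dump(namespace, f)

    # Step 7: copy fsimage.chkpoint back to the NN directory.
    shutil.copy(chk, os.path.join(nn_dir, "fsimage.chkpoint"))

    # Step 8: rename it to fsimage. The rolled log is now covered by the
    # new image, so this toy version simply deletes it.
    os.replace(os.path.join(nn_dir, "fsimage.chkpoint"), fsimage)
    os.remove(rolled)
    return namespace

nn_dir, snn_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
with open(os.path.join(nn_dir, "fsimage"), "w") as f:
    json.dump({"/a": {"replication": 3}}, f)
with open(os.path.join(nn_dir, "edits"), "w") as f:
    f.write(json.dumps({"op": "create", "path": "/b", "meta": {"replication": 2}}) + "\n")
    f.write(json.dumps({"op": "delete", "path": "/a"}) + "\n")

if checkpoint_needed(seconds_since_last=4000, txns_since_last=2):
    merged = run_checkpoint(nn_dir, snn_dir)
    print(merged)  # {'/b': {'replication': 2}}
```

Rolling the log first is what lets NN keep accepting writes during the merge: the new edits file only contains operations that happened after the snapshot handed to 2NN.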


Summary

The NameNode is very cleverly designed. It persists both the metadata and the log of operations on it, combining the speed of memory with the durability of disk storage, which improves the stability of HDFS.

Last modification: March 22, 2024