Name node RAM metadata

I have a small doubt. Suppose we have 128 files, each 1 MB in size. How can I find the total size of the metadata? Is there a formula for calculating NameNode metadata size?
Jul 23, 2019 in Big Data Hadoop by Rajini

1 answer to this question.

The memory consumption depends on the HDFS setup, that is, on the overall amount of data in HDFS and on the block size. Please refer to the explanation below.

Each NameNode object uses about 150 bytes to store metadata. Assume a 128 MB block size; you should increase the block size if you have a lot of data (PB scale, or even 500+ TB in some cases).

Assume a file size of 150 MB. The file will be split into two blocks: the first with 128 MB and the second with 22 MB. For this file, the NameNode stores the following information:

1 file inode and 2 blocks.

That is 3 NameNode objects, which take about 450 bytes on the NameNode. Compare this with a 1 MB block size: the same file is split into 150 blocks, so the NameNode holds one inode and 150 block entries. That is 151 NameNode objects for the same data: 151 x 150 bytes = 22,650 bytes. Even worse would be 150 files of 1 MB each. That would require 150 inodes and 150 blocks: 300 x 150 bytes = 45,000 bytes. By the same reasoning, the 128 files of 1 MB each from the question need 128 inodes and 128 blocks: 256 x 150 bytes = 38,400 bytes. This is why small files are not recommended for Hadoop.
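The object counting above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb: the function names are made up for this example, and the 150 bytes per object figure is the approximation quoted above, not a value read from a real NameNode.

```python
import math

# Rule-of-thumb figure from the answer above; real usage varies.
BYTES_PER_OBJECT = 150


def namenode_objects(file_size_mb, block_size_mb):
    """Objects for one file: 1 inode plus one object per block."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return 1 + blocks


def metadata_bytes(num_files, file_size_mb, block_size_mb):
    """Estimated NameNode memory for num_files files of equal size."""
    objects = num_files * namenode_objects(file_size_mb, block_size_mb)
    return objects * BYTES_PER_OBJECT


print(metadata_bytes(1, 150, 128))   # one 150 MB file, 128 MB blocks -> 450
print(metadata_bytes(1, 150, 1))     # same file, 1 MB blocks -> 22650
print(metadata_bytes(150, 1, 128))   # 150 files of 1 MB each -> 45000
print(metadata_bytes(128, 1, 128))   # 128 files of 1 MB each -> 38400
```

The last call covers the case asked about in the question: 128 inodes plus 128 blocks at roughly 150 bytes each.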

Now, assuming 128 MB blocks, about 1 GB of NameNode memory is required per 1 million blocks on average.

Now let's do this calculation at PB scale.

Assume 6000 TB of data. That's a lot.

Imagine 30 TB capacity for each node. This will require 200 nodes.

Assume a 128 MB block size and a replication factor of 3.

Cluster capacity in MB = 30 TB x 1000 (convert to GB) x 1000 (convert to MB) x 200 nodes = 6,000,000,000 MB (6000 TB)

How many blocks can we store in this cluster?

6,000,000,000 MB / 128 MB = 46,875,000 (that's about 47 million blocks)

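Assuming 1 GB of memory per million blocks, you need a mere 46,875,000 blocks / 1,000,000 blocks per GB = about 47 GB of memory. Because this divides the raw cluster capacity by the block size, it counts replicas as well, so treat it as a conservative estimate.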
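The cluster-scale arithmetic can be checked with a short script. This is a sketch under the same assumptions as above (decimal units, 1 GB of heap per million blocks); the function and constant names are hypothetical and chosen only for this example.

```python
# Rule of thumb from the answer: 1 GB of NameNode heap per million blocks.
BLOCKS_PER_GB_HEAP = 1_000_000


def cluster_blocks(node_capacity_tb, nodes, block_size_mb):
    """Number of full blocks that fit in the raw cluster capacity."""
    capacity_mb = node_capacity_tb * 1000 * 1000 * nodes  # TB -> GB -> MB
    return capacity_mb // block_size_mb


def heap_gb(blocks):
    """Estimated NameNode heap in GB for the given block count."""
    return blocks / BLOCKS_PER_GB_HEAP


blocks = cluster_blocks(node_capacity_tb=30, nodes=200, block_size_mb=128)
print(blocks)           # 46875000
print(heap_gb(blocks))  # 46.875
```

The result, 46.875 GB, rounds to about 47 GB.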

Namenodes with 64-128 GB memory are quite common. You can do a few things here.

1. Increase the block size to 256 MB, which will save quite a bit of NameNode memory. At large scale, you should do that regardless.

2. Get more memory for the NameNode, probably 256 GB (I have never had a customer go this far; maybe someone else can chime in).
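To get a feel for the first option, the estimate can be repeated for both block sizes. This sketch assumes that the rule of 1 GB per million blocks still holds at 256 MB blocks, which the answer only states for 128 MB blocks; the names are hypothetical.

```python
CAPACITY_MB = 6_000_000_000     # 200 nodes x 30 TB, in decimal MB
BLOCKS_PER_GB_HEAP = 1_000_000  # assumed to apply at every block size


def heap_gb_for_block_size(capacity_mb, block_size_mb):
    """Estimated NameNode heap in GB when the cluster is full of blocks."""
    blocks = capacity_mb // block_size_mb
    return blocks / BLOCKS_PER_GB_HEAP


for block_size_mb in (128, 256):
    print(block_size_mb, heap_gb_for_block_size(CAPACITY_MB, block_size_mb))
# 128 46.875
# 256 23.4375
```

Under that assumption, doubling the block size halves the number of blocks and therefore the estimated memory.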
answered Jul 23, 2019 by Reshma
