To explain how data distribution is done in Hadoop, let me walk through how the process works:
Hadoop stores data in HDFS (Hadoop Distributed File System), a distributed file system that follows a master-slave architecture for data distribution. In this architecture, a cluster consists of a single NameNode (the master node) and multiple DataNodes (the slave nodes).
The NameNode and the DataNodes work together to distribute each file in a structured way: the file is split into fixed-size blocks (128 MB by default in Hadoop 2.x and later, 64 MB in Hadoop 1.x), and those blocks are spread across the DataNodes.
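To make the block splitting concrete, here is a minimal sketch of the arithmetic HDFS applies; the 128 MB block size is the Hadoop 2.x default, and the 300 MB file size is just an assumed example:

```java
public class BlockSplitDemo {
    public static void main(String[] args) {
        // Default HDFS block size in Hadoop 2.x+ (dfs.blocksize)
        final long blockSize = 128L * 1024 * 1024; // 128 MB
        // Hypothetical file size, chosen only for illustration
        final long fileSize = 300L * 1024 * 1024;  // 300 MB

        // HDFS cuts the file into full blocks plus one final partial block
        long fullBlocks = fileSize / blockSize;
        long remainder = fileSize % blockSize;
        long totalBlocks = fullBlocks + (remainder > 0 ? 1 : 0);

        // A 300 MB file becomes 3 blocks: 128 MB + 128 MB + 44 MB.
        // The last block only occupies 44 MB on disk; HDFS does not pad it.
        System.out.println(fileSize + " bytes -> " + totalBlocks + " blocks");
    }
}
```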
The NameNode's job is to manage and maintain the DataNodes. It records the metadata of the actual data, i.e. information such as the location of each block and the block size. The NameNode is also responsible for recording any modification made to the files. For example, if a file is deleted, the NameNode records that change immediately.
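You can ask the NameNode for this metadata from a client through the Hadoop FileSystem API. Below is a minimal sketch that prints the block locations of a file; the path /user/demo/sample.txt is a hypothetical example, and it assumes the client's Configuration points at your cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path, for illustration only
        Path file = new Path("/user/demo/sample.txt");
        FileStatus status = fs.getFileStatus(file);

        // Ask the NameNode which DataNodes hold each block of the file
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());

        for (BlockLocation block : blocks) {
            System.out.println("Block at offset " + block.getOffset()
                + ", length " + block.getLength()
                + ", hosts: " + String.join(", ", block.getHosts()));
        }
        fs.close();
    }
}
```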
DataNodes are the slave nodes responsible for storing the actual data, with the blocks of a file spread across different DataNodes (and each block replicated, 3 times by default, for fault tolerance). Each DataNode also periodically sends a heartbeat, a kind of status report, to the NameNode so that the NameNode knows the DataNode is working properly. By default this repeats every 3 seconds.
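The heartbeat frequency is controlled by the dfs.heartbeat.interval property. As a small sketch, a client could read the effective value like this (the "3" fallback mirrors the documented default of 3 seconds, and this assumes the cluster's config files are on the classpath):

```java
import org.apache.hadoop.conf.Configuration;

public class HeartbeatIntervalDemo {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Pull in the HDFS settings too; new Configuration() by itself
        // only loads core-default.xml and core-site.xml
        conf.addResource("hdfs-site.xml");

        // Read the configured heartbeat interval, falling back to the
        // documented default of 3 seconds if the property is not set
        String interval = conf.get("dfs.heartbeat.interval", "3");
        System.out.println("DataNode heartbeat interval: " + interval + " seconds");
    }
}
```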
I hope this helps.