Contributing a limited/specific amount of storage as a slave to the cluster

Srinivasreddy
4 min read · Nov 17, 2020

Hadoop:

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

Hadoop NameNode:

  • NameNode is the centerpiece of HDFS.
  • NameNode is also known as the Master.
  • NameNode stores only the metadata of HDFS — the directory tree of all files in the file system — and tracks the files across the cluster.
  • NameNode does not store the actual data or the dataset. The data itself is stored in the DataNodes.
  • NameNode knows the list of blocks and their locations for any given file in HDFS. With this information, NameNode knows how to construct the file from its blocks.
  • NameNode is so critical to HDFS that when it is down, the HDFS/Hadoop cluster is inaccessible and considered down.
  • NameNode is a single point of failure in a Hadoop cluster.
  • NameNode is usually configured with a lot of memory (RAM), because the block locations are held in main memory.

Hadoop DataNode:

DataNode is the daemon that stores and manages data in a Hadoop cluster. File data is replicated across multiple DataNodes for reliability and so that localized computation can be executed near the data. Within a cluster, DataNodes should be uniform.

Now, in this article I'm going to show how to provide a limited/specific amount of storage as a slave.

For this, we initially need the Hadoop and JDK packages installed, and then the NameNode and DataNodes have to be configured; for that you can see my past article through this link.

Once the NameNode and DataNode are configured successfully, we can focus on providing the storage.

From the above image we can confirm that the slave node is sharing 49 GB of storage with the cluster.
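A quick way to check how much storage the slave is currently contributing is the HDFS admin report (a sketch; the exact numbers will of course differ on your cluster):

# Run on the master (or any configured client) to see the capacity each DataNode contributes
hadoop dfsadmin -report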

Now, we can use the concept of partitions so that we can provide a specific amount of storage as a slave node.

Using fdisk -l we can see all the disks available to us.
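For example (a sketch; device names such as /dev/sda and /dev/sdb depend on how the extra disk was attached to your machine):

# List every disk and partition the kernel can see
fdisk -l

# lsblk gives a more compact tree view of the same information
lsblk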

Using fdisk /dev/sdb we can enter the drive; here I have a drive, /dev/sdb, with 8 GB of storage.

For creating a new partition we need to type “n”, then for a primary partition we type “p”, and then provide the size of storage we want for the partition.
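The interactive session looks roughly like this (a sketch assuming the new disk is /dev/sdb and we want a 3 GB partition; prompts may vary slightly between fdisk versions):

fdisk /dev/sdb

# Inside fdisk, the keystrokes are roughly:
#   n        -> create a new partition
#   p        -> make it a primary partition
#   1        -> partition number
#   <Enter>  -> accept the default first sector
#   +3G      -> last sector, given as a size of 3 GB
#   w        -> write the partition table and exit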

Once the partition is created, we need to format and mount it so that we can use that storage for our DataNode.

To format, we use mkfs.ext4 /dev/sdb1; here ext4 is our filesystem type.
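A minimal sketch of the format step (assuming the partition created above shows up as /dev/sdb1; partprobe is optional and only needed if the new partition is not visible yet):

# Ask the kernel to re-read the partition table if /dev/sdb1 does not show up
partprobe /dev/sdb

# Format the new partition with ext4
mkfs.ext4 /dev/sdb1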

Once the formatting is complete, we can mount the drive on our slave node.

Using mount /dev/sdb1 /dn2 we can mount it on the /dn2 directory, which is my DataNode's sharable directory.
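Putting the mount step together (a sketch assuming /dn2 is the directory configured as the DataNode data directory in hdfs-site.xml; substitute whatever path your setup uses):

# Create the mount point if it does not already exist
mkdir -p /dn2

# Mount the 3 GB partition on the DataNode's data directory
mount /dev/sdb1 /dn2

# Verify the mount and its size
df -h /dn2

Note that a plain mount does not survive a reboot; adding an entry to /etc/fstab makes it permanent.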

After all this, we can check whether the specified size, i.e. 3 GB, is allotted to the slave node or not using the command hadoop dfsadmin -report.
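If the DataNode daemon was already running when the partition was mounted, it may need a restart before the new capacity shows up. A minimal sketch (hadoop-daemon.sh ships with Hadoop 1.x/2.x; on Hadoop 3.x the equivalent is hdfs --daemon start datanode):

# On the slave node: restart the DataNode so it picks up the newly mounted directory
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode

# On the master (or any client): the report should now show roughly 3 GB of
# configured capacity for this slave
hadoop dfsadmin -report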

Thus, we have successfully shared a specific amount of storage with the master node as a slave.

🌟 Thank you for reading…
