Before starting the tutorial on setting up a virtual environment for Hadoop, let’s briefly introduce the different modes in which a Hadoop cluster can run.
Introduction to Hadoop Cluster Modes:
Hadoop can run in any of the following 3 modes:
- Standalone or Local Mode – No daemons are running and everything runs in a single JVM. This mode is suitable for running MapReduce programs during development, since it is easy to test and debug.
- Pseudo Distributed Mode – Each Hadoop daemon runs in a separate Java process on the local machine, simulating a cluster on a small scale.
- Fully Distributed Mode – Hadoop runs on a cluster of machines. This is the mode used in industry for real-world data processing. Typically, one machine in the cluster is dedicated to running the NameNode and another to running the JobTracker.
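In practice, the mode is largely decided by a few configuration properties. The following is a minimal sketch, assuming Hadoop 1.x property names (consistent with the JobTracker mentioned above). The host names `namenode-host` and `jobtracker-host` and the ports 9000 and 9001 are placeholders chosen for illustration, and a real installation splits these properties across `core-site.xml`, `hdfs-site.xml` and `mapred-site.xml` rather than a single block.

```python
import xml.etree.ElementTree as ET

# Illustrative Hadoop 1.x settings for each mode.
# Host names and ports are placeholders, not values from this tutorial.
MODE_PROPERTIES = {
    "standalone": {
        "fs.default.name": "file:///",      # local filesystem, no HDFS
        "mapred.job.tracker": "local",      # jobs run inside a single JVM
    },
    "pseudo-distributed": {
        "fs.default.name": "hdfs://localhost:9000",
        "mapred.job.tracker": "localhost:9001",
        "dfs.replication": "1",             # only one DataNode is available
    },
    "fully-distributed": {
        "fs.default.name": "hdfs://namenode-host:9000",
        "mapred.job.tracker": "jobtracker-host:9001",
    },
}


def build_configuration(mode):
    """Return a Hadoop-style <configuration> XML string for the given mode."""
    root = ET.Element("configuration")
    for name, value in MODE_PROPERTIES[mode].items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for mode in MODE_PROPERTIES:
        print(mode)
        print(build_configuration(mode))
```

Comparing the three outputs shows the main difference between the modes: where the filesystem and the JobTracker live.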
Steps to Create a New Virtual Machine:
Let’s look at the steps for creating a virtual machine to host Hadoop. VMware (on Windows) or VirtualBox (on Mac) can be used for this purpose. The following demonstration uses VirtualBox:
Step 1: Provide the name ‘Lab1’ (it can be anything of your choice). Select the type as ‘Linux’ and the version as ‘Other Linux’.
Step 2: Select the amount of memory to be allocated to the virtual machine.
Step 3: Select the option ‘Create a virtual hard drive now’.
Step 4: Select the hard drive file type as ‘VMDK’ and click on ‘Continue’.
Step 5: For storage on the physical hard drive, select ‘Dynamically allocated’ and click on ‘Continue’.
Step 6: Specify file location and the size of the file and click on ‘Create’.
A new virtual machine has been created.
Step 7: Select ‘CentOS-6.5-x86_64’ in ‘CD/DVD Drive’. This specifies the installation image the virtual machine will boot from.
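The same setup can also be scripted with VirtualBox's `VBoxManage` command-line tool. The sketch below only builds the command lines that roughly correspond to Steps 1 to 7 and does not run them. The memory size, disk size, file paths and the controller name `IDE` are assumptions made for illustration, and the OS type `Linux_64` is assumed as the 64-bit counterpart of ‘Other Linux’. Note that `createmedium` is the VirtualBox 5.0+ name; older releases use `createhd`.

```python
import shlex


def create_vm_commands(name="Lab1", memory_mb=2048, disk_mb=20480,
                       disk_path="Lab1.vmdk",
                       iso_path="CentOS-6.5-x86_64.iso"):
    """Build VBoxManage argument lists mirroring Steps 1 to 7.

    All default values except the VM name and ISO name are assumptions.
    """
    controller = "IDE"  # hypothetical storage controller name
    return [
        # Step 1: name the VM and pick the OS type.
        ["VBoxManage", "createvm", "--name", name,
         "--ostype", "Linux_64", "--register"],
        # Step 2: allocate memory (in MB).
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb)],
        # Steps 3 to 6: create a dynamically allocated VMDK disk.
        ["VBoxManage", "createmedium", "disk", "--filename", disk_path,
         "--size", str(disk_mb), "--format", "VMDK", "--variant", "Standard"],
        # Add a controller and attach the new disk to it.
        ["VBoxManage", "storagectl", name, "--name", controller,
         "--add", "ide"],
        ["VBoxManage", "storageattach", name, "--storagectl", controller,
         "--port", "0", "--device", "0", "--type", "hdd",
         "--medium", disk_path],
        # Step 7: put the CentOS installation image in the DVD drive.
        ["VBoxManage", "storageattach", name, "--storagectl", controller,
         "--port", "1", "--device", "0", "--type", "dvddrive",
         "--medium", iso_path],
    ]


if __name__ == "__main__":
    for cmd in create_vm_commands():
        # Print each command in a form that can be pasted into a shell.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to review them before running anything against a real VirtualBox installation.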
Starting the Virtual Machine:
Let’s look at the steps involved in starting a virtual machine:
Step 1: Select ‘Enter’ in the confirmation box to initialize the start up.
Step 2: Select ‘Skip’ to start the installation and continue by clicking on ‘Ok’.
Step 3: Select ‘Re-initialize all’ and continue.
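For reference, the virtual machine can also be started from the command line before working through the installer prompts above. This is a small sketch using `VBoxManage startvm`. The helper names `start_vm_command` and `start_vm` are hypothetical, and the `dry_run` default means nothing is executed unless explicitly requested.

```python
import subprocess


def start_vm_command(name="Lab1", headless=False):
    """Return the VBoxManage argument list that starts the named VM."""
    # The installer prompts need a window, so the GUI frontend is the default.
    frontend = "headless" if headless else "gui"
    return ["VBoxManage", "startvm", name, "--type", frontend]


def start_vm(name="Lab1", headless=False, dry_run=True):
    """Print the start command, and run it only when dry_run is False."""
    cmd = start_vm_command(name, headless)
    if dry_run:
        print(" ".join(cmd))
        return cmd
    # Raises CalledProcessError if VBoxManage reports a failure.
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    start_vm()
```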