What is Distributed Cache in the MapReduce framework?
Distributed Cache is an important feature provided by the MapReduce framework. When you want to share files across all the nodes in a Hadoop cluster, you use DistributedCache. The files can be executable JAR files or simple properties files.
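As a rough illustration of the properties-file case, here is a minimal sketch of what a task might do with a cached file once it is available locally. It assumes the Hadoop 2.x+ API, where the driver registers a file with `Job.addCacheFile(URI)` and a mapper lists the files with `context.getCacheFiles()`; those calls appear only as comments because they need the Hadoop jars on the classpath. The class name `CachedLookup`, the HDFS path, and the file contents are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class CachedLookup {
    // Driver side (assumed Hadoop 2.x+ API, needs the Hadoop jars):
    //   job.addCacheFile(new URI("/shared/lookup.properties")); // hypothetical HDFS path
    // Mapper side, inside setup(Context context):
    //   URI[] cached = context.getCacheFiles();
    //   then read the localized copy with a helper such as load(...) below.

    /** Loads a local properties file into an in-memory map, intended to run once per task. */
    public static Map<String, String> load(Path localFile) throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            props.load(in);
        }
        Map<String, String> lookup = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            lookup.put(key, props.getProperty(key));
        }
        return lookup;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file the framework would have copied to the node.
        Path file = Files.createTempFile("lookup", ".properties");
        Files.write(file, "US=United States\nIN=India\n".getBytes(StandardCharsets.UTF_8));

        Map<String, String> lookup = load(file);
        System.out.println(lookup.get("IN"));

        Files.delete(file);
    }
}
```

Loading the file once in `setup()` rather than in every `map()` call is the usual reason to use the cache for small lookup data: each task reads its local copy a single time and then works from memory.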