Frequently used classes implementing InputFormat

0 votes

Which is the base class for all file-based InputFormats? And which are the most frequently used InputFormats?

Jul 24, 2019 in Big Data Hadoop by Piyush
600 views

1 answer to this question.

0 votes

FileInputFormat : Base class for all file-based InputFormats

Other frequently used Input Formats are:

KeyValueTextInputFormat : An InputFormat for plain text files. Files are broken into lines; either a line feed or a carriage return signals the end of a line. Each line is divided into key and value parts by a separator byte (a tab by default). If no such byte exists, the key is the entire line and the value is empty.

TextInputFormat : An InputFormat for plain text files. Files are broken into lines; either a line feed or a carriage return signals the end of a line. Keys are the byte offset of each line within the file, and values are the line of text.

NLineInputFormat : An InputFormat that treats every N lines of input as one split. In many "pleasantly" parallel applications, each process/mapper processes the same input file(s), but the computations are controlled by different parameters.

SequenceFileInputFormat : An InputFormat for SequenceFiles.
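
To make the record shapes described above more concrete, here is a rough plain-Java sketch. It does not use Hadoop at all; the class name `InputFormatSketch` and its methods are made up for illustration, and it assumes `\n` line endings and UTF-8 text. It only mimics what a mapper would see as (key, value) with TextInputFormat and KeyValueTextInputFormat, and how NLineInputFormat groups lines into splits.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical helper class, not part of Hadoop.
final class InputFormatSketch {

    // TextInputFormat-style records: key = byte offset of the line, value = the line.
    // Assumes lines end with '\n' and the text is UTF-8.
    static Map<Long, String> offsetRecords(String content) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        for (String line : content.split("\n")) {
            records.put(offset, line);
            // Advance past the line's bytes plus the one-byte '\n' terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    // KeyValueTextInputFormat-style records: split at the first separator.
    // With no separator, the whole line is the key and the value is empty.
    static String[] keyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    // NLineInputFormat-style grouping: every n lines form one split.
    static List<List<String>> nLineSplits(List<String> lines, int n) {
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += n) {
            splits.add(new ArrayList<>(lines.subList(i, Math.min(i + n, lines.size()))));
        }
        return splits;
    }

    public static void main(String[] args) {
        String content = "alpha\t1\nbeta\t2\ngamma\n";

        // Offset-keyed records, as with TextInputFormat.
        offsetRecords(content).forEach((k, v) -> System.out.println(k + " => " + v));

        // Separator-split records, as with KeyValueTextInputFormat.
        for (String line : content.split("\n")) {
            String[] kv = keyValue(line, '\t');
            System.out.println("key=" + kv[0] + " value=" + kv[1]);
        }

        // Two lines per split, as with NLineInputFormat and N = 2.
        System.out.println(nLineSplits(Arrays.asList(content.split("\n")), 2));
    }
}
```

In a real job you would not write any of this yourself; you would pick the format on the job (for example with `job.setInputFormatClass(...)`) and Hadoop's record readers produce the keys and values.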

Regarding the second query: first fetch the files from the remote servers, then use the appropriate InputFormat depending on the contents of the files. Hadoop works best when the data is local to the computation.

answered Jul 24, 2019 by Reshma
