I have a file that contains the ‘^’ character, and I have written a custom input format that uses ‘^’ as the delimiter. The custom input format extends FileInputFormat, and it uses a custom record reader that extends RecordReader. I am getting an error in the nextKeyValue() method, at the step that reads from the split. Can anyone help me with how to read the data, split it on the delimiter, and generate the key-value pairs?
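import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.util.LineReader;
public class CustomRecordReader extends RecordReader<LongWritable, Text>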
{
long start, current, end;
Text value;
LongWritable key;
LineReader reader;
FileSplit split;
Path path;
FileSystem fs;
FSDataInputStream in;
Configuration conf;
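@Override
public void initialize(InputSplit inputSplit, TaskAttemptContext cont) throws IOException, InterruptedException
{
conf = cont.getConfiguration();
split = (FileSplit)inputSplit;
path = split.getPath();
fs = path.getFileSystem(conf);
start = split.getStart();
end = start + split.getLength();
in = fs.open(path);
// Position the stream at the beginning of this split.
in.seek(start);
// Use '^' as the record delimiter instead of the default newline.
reader = new LineReader(in, conf, new byte[] { '^' });
// A split that does not start at offset 0 begins mid-record; skip that partial
// record, because the reader of the previous split reads it in full.
if(start != 0)
start += reader.readLine(new Text());
current = start;
}
@Override
public boolean nextKeyValue() throws IOException
{
if(key==null)
key = new LongWritable();
if(value==null)
value = new Text();
// Records that start at or before 'end' belong to this split; the last one may run past it.
if(current<=end)
{
// The key is the byte offset at which the record starts.
key.set(current);
// readLine() fills 'value' with the next record (delimiter stripped)
// and returns the number of bytes consumed, or 0 at end of file.
int readSize = reader.readLine(value);
if(readSize>0)
{
current+=readSize;
return true;
}
}
// No more records in this split.
key = null;
value = null;
return false;
}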
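@Override
public float getProgress() throws IOException
{
// Fraction of the split consumed so far, clamped to 1.0.
if(end==start)
return 0.0f;
return Math.min(1.0f, (current - start) / (float)(end - start));
}
@Override
public LongWritable getCurrentKey() throws IOException
{
return key;
}
@Override
public Text getCurrentValue() throws IOException
{
return value;
}
@Override
public void close() throws IOException
{
// Closing the LineReader also closes the underlying input stream.
if(reader!=null)
reader.close();
}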
}
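For reference, here is a minimal plain-Java sketch of the read step that nextKeyValue() needs. It is not Hadoop code: it only illustrates reading up to the next ‘^’, using the starting byte offset as the key, and advancing the offset by the number of bytes consumed. The names CaretRecordSketch, Record, readRecord and readAll are hypothetical and exist only for this illustration. Inside Hadoop, the same role can be played by LineReader, which has a constructor that accepts the record delimiter bytes and whose readLine(Text) returns the number of bytes consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CaretRecordSketch {

    /** One (byte offset, record text) pair, analogous to a LongWritable key and Text value. */
    public static final class Record {
        public final long offset;
        public final String text;

        public Record(long offset, String text) {
            this.offset = offset;
            this.text = text;
        }
    }

    /**
     * Reads bytes up to and including the next delimiter, or to end of stream.
     * The record body (without the delimiter) is left in 'out'.
     * Returns the number of bytes consumed; 0 means end of stream.
     */
    public static int readRecord(InputStream in, ByteArrayOutputStream out, int delimiter)
            throws IOException {
        out.reset();
        int consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            consumed++;
            if (b == delimiter) {
                break;
            }
            out.write(b);
        }
        return consumed;
    }

    /** Splits the whole stream into (offset, record) pairs. */
    public static List<Record> readAll(InputStream in, char delimiter) throws IOException {
        List<Record> records = new ArrayList<>();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        long current = 0;
        int readSize;
        while ((readSize = readRecord(in, buf, delimiter)) > 0) {
            // Key: offset where the record starts. Value: the record text.
            records.add(new Record(current, new String(buf.toByteArray(), StandardCharsets.UTF_8)));
            current += readSize;
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "alpha^beta^gamma".getBytes(StandardCharsets.UTF_8);
        for (Record r : readAll(new ByteArrayInputStream(data), '^')) {
            System.out.println(r.offset + "\t" + r.text);
        }
    }
}
```

For the input alpha^beta^gamma this yields the keys 0, 6 and 11 with the values alpha, beta and gamma. The sketch reads a whole stream; a real record reader additionally has to respect the split boundaries (start and end), which this example leaves out.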