What is the purpose of RecordReader in Hadoop?

+1 vote
posted Dec 2, 2014 by Kali Mishra


1 Answer

0 votes

The InputSplit defines a slice of work but does not describe how to access it. The RecordReader actually loads the data from its source and converts it into (key, value) pairs suitable for reading by the Mapper. The InputFormat determines which RecordReader is used: it creates the RecordReader instance for each split.
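To make the split-versus-reader distinction concrete, here is a minimal sketch in plain Java. It is a simplified stand-in, not Hadoop's actual class: SimpleLineRecordReader, RecordReaderSketch and readSplit are hypothetical names, and the "file" is just a byte array in memory. Only the method names nextKeyValue, getCurrentKey and getCurrentValue mirror Hadoop's RecordReader. The split is nothing more than a (start, length) byte range; the reader turns that range into (byte offset, line) pairs, skipping a partial first line and finishing a line that runs past the end of the split, which is roughly how a line-oriented reader avoids losing or duplicating records at split boundaries.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a line-oriented RecordReader (hypothetical, no Hadoop dependency).
class SimpleLineRecordReader {
    private final byte[] data; // the whole "file" (assumption: it fits in memory)
    private final int end;     // first byte offset past the split
    private int pos;           // offset of the next unread byte
    private long key;          // current key: byte offset of the current line
    private String value;      // current value: the line without its newline

    SimpleLineRecordReader(byte[] data, int start, int length) {
        this.data = data;
        this.end = start + length;
        this.pos = start;
        // A split that does not start at offset 0 skips its first (possibly partial) line.
        // The reader of the previous split is responsible for that line.
        if (start != 0) {
            int i = start;
            while (i < data.length && data[i] != '\n') i++;
            pos = i + 1;
        }
    }

    // Advances to the next record; returns false when the split is exhausted.
    boolean nextKeyValue() {
        if (pos > end || pos >= data.length) return false;
        int lineStart = pos;
        int i = lineStart;
        while (i < data.length && data[i] != '\n') i++; // may read past 'end' to finish the line
        key = lineStart;
        value = new String(data, lineStart, i - lineStart, StandardCharsets.UTF_8);
        pos = i + 1; // step over the newline
        return true;
    }

    long getCurrentKey() { return key; }
    String getCurrentValue() { return value; }
}

class RecordReaderSketch {
    // Plays the role of the framework: pulls every record of one split and
    // hands it on (here it just collects "key=value" strings).
    static List<String> readSplit(byte[] data, int start, int length) {
        SimpleLineRecordReader reader = new SimpleLineRecordReader(data, start, length);
        List<String> records = new ArrayList<>();
        while (reader.nextKeyValue()) {
            records.add(reader.getCurrentKey() + "=" + reader.getCurrentValue());
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "a\nbb\nccc\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 4)); // prints [0=a, 2=bb]
        System.out.println(readSplit(data, 4, 5)); // prints [5=ccc]
    }
}
```

Note that the second split starts in the middle of the "bb" line's terminator, yet each line is emitted exactly once across the two splits. In real Hadoop the Mapper receives these pairs through the framework rather than through a helper like readSplit.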

answer Dec 6, 2014 by Amit Kumar Pandey
Similar Questions
+2 votes

I'm currently developing an application that will ingest logs on the order of 70-80 GB of data per day and then run some analysis on them.

The infrastructure I have is a 4-node cluster (all nodes are virtual machines); each node has 4 GB of RAM.

But when I run a sample dataset of about 30 GB, it takes about 3 hours to process all of it.

I would like to know whether it is normal for this kind of infrastructure to take this amount of time.
