What is InputSplit in Hadoop?

An InputSplit represents the data to be processed by an individual Mapper. It presents a byte-oriented view of the input, and it is the responsibility of the job's RecordReader to process those bytes and present a record-oriented view to the Mapper.
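To make the byte-oriented versus record-oriented distinction concrete, here is a minimal sketch in plain Java. It is not Hadoop's RecordReader code: the class name `LineSplitReaderSketch`, the method `readRecords`, and the in-memory byte array are all assumptions made for illustration. It shows one plausible way a line-based reader can turn an arbitrary byte range into whole lines, even when the range starts or ends in the middle of a line.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hadoop's RecordReader API.
class LineSplitReaderSketch {

    // Returns the complete lines "owned" by the byte range [start, start + length).
    static List<String> readRecords(byte[] data, int start, int length) {
        int end = Math.min(start + length, data.length);
        int pos = start;

        // A range that does not begin at offset 0 may begin mid-line.
        // Skip through the first newline; the reader of the previous
        // range is responsible for that line.
        if (start != 0) {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline itself
        }

        List<String> records = new ArrayList<>();
        // Read every line that begins at or before the end of the range,
        // running past the end if needed to finish the last line.
        while (pos <= end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        // Two byte ranges of 8 bytes each; the boundary falls inside "beta".
        System.out.println(readRecords(data, 0, 8));
        System.out.println(readRecords(data, 8, 8));
    }
}
```

With these rules every line is returned by exactly one range, even though the ranges themselves are cut at arbitrary byte offsets.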

Put simply, when a Hadoop job runs, it divides the input files into logical chunks and assigns each chunk to a mapper for processing. Each such chunk is an InputSplit.
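The chunking described above can be sketched as follows. This is a simplified illustration in plain Java, not Hadoop's actual split logic: the names `SplitSketch`, `Range`, and `computeSplits`, and the fixed split size, are assumptions introduced here. The point is that a split is only a description of a byte range (start offset and length); no data is copied when the splits are computed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hadoop's InputSplit class.
class SplitSketch {

    // A logical chunk of a file: where it starts and how many bytes it covers.
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "start=" + start + ", length=" + length;
        }
    }

    // Divides a file of fileLength bytes into ranges of at most splitSize bytes.
    static List<Range> computeSplits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Range> splits = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            // The last range may be shorter than splitSize.
            splits.add(new Range(offset, Math.min(splitSize, fileLength - offset)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // A 300-byte file with a 128-byte split size yields three ranges,
        // each of which would be handed to its own mapper.
        for (Range r : computeSplits(300, 128)) {
            System.out.println(r);
        }
    }
}
```

In this sketch, a 300-byte input with a 128-byte split size produces three ranges, so three mappers would each process one range.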
