Hadoop is an open-source software framework. Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Cutting, who was working at Yahoo! at the time, named it after his son’s toy elephant. It was originally developed to support distribution for the Nutch search engine project. It is written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures (of individual machines, or racks of machines) are commonplace and thus should be automatically handled in software by the framework. Prominent users of HADOOP includes Yahoo, Facebook and most of fortune 50 companies.