Before going into Hadoop, learn what Apache is.
Apache Hadoop is a free, open-source software framework written in Java for the distributed storage and processing of large data sets across clusters of computers.
- Apache Hadoop™ was born out of a need to process an avalanche of big data.
- The web was generating more and more information on a daily basis, and it was becoming very difficult to index over one billion pages of content. In order to cope, Google invented a new style of data processing known as MapReduce.
- A year after Google published a white paper describing the MapReduce framework, Doug Cutting and Mike Cafarella, inspired by that paper, created Hadoop to apply these concepts in an open-source software framework that supported distribution for the Nutch search engine project.
- Given this original use case, Hadoop was designed with a simple write-once storage infrastructure.
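
To make the MapReduce idea mentioned above a little more concrete, here is a minimal single-machine sketch of the pattern in plain Python, counting words across a few documents. It does not use Hadoop's actual API, and it runs in one process rather than across a cluster; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions only.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input standing in for crawled web pages.
    pages = ["big data big clusters", "data everywhere"]
    print(word_count(pages))
```

In a real cluster, the map and reduce steps would run in parallel on many machines, with the framework handling the grouping and data movement; the sketch only shows how the work is split into the two kinds of step.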
Read more:
You can find Hadoop installation files here: