Apache Hadoop is an open-source software framework for the storage and large-scale processing of data sets on clusters of commodity hardware. The Apache Hadoop framework is composed of the following modules:
Hadoop Common - contains the libraries and utilities needed by the other Hadoop modules.
Hadoop Distributed File System (HDFS) - a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.
Hadoop YARN - a resource-management platform responsible for managing compute resources in the cluster and scheduling users' applications onto them.
Hadoop MapReduce - a programming model for large-scale data processing; a short sketch follows this list.
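To make the MapReduce programming model concrete, the following is a minimal word-count sketch written against the standard Hadoop MapReduce Java API (org.apache.hadoop.mapreduce). It is illustrative only: the class names (WordCount, TokenizerMapper, IntSumReducer) and the input/output paths are assumptions, not anything defined in the text above.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // optional local aggregation before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory (assumed)
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory (assumed)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

In this sketch the input and output live in HDFS, YARN allocates the containers that run the map and reduce tasks, and Hadoop Common supplies the shared I/O and configuration classes, so the example exercises all four modules listed above.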