MapReduce is a programming model and implementation for processing and producing large data sets on a cluster using a distributed, parallel method. A MapReduce programme is made up of two parts: a map procedure for filtering and sorting, and a reduction method for summarising the results.