Advantages of MapReduce

 

One advantage of MapReduce is that it allows the map and reduce operations to be distributed. Provided that each mapping operation is independent of the others, all maps can be performed in parallel, although in practice this is limited by the data source and the number of CPUs near that data. Similarly, a set of ‘reducers’ can perform the reduction phase. All that is required is that all outputs of the map operation that share the same key are presented to the same reducer at the same time. While this process can seem inefficient compared to more sequential algorithms, MapReduce can be applied to significantly larger datasets than a single “commodity” server can handle: a large server farm can use MapReduce to sort a petabyte of data in only a few hours.
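
The three phases described above (independent maps, grouping by key, and per-key reduction) can be illustrated with a small single-machine sketch. It is a toy word count, not a real MapReduce framework: the function names `map_words`, `shuffle`, `reduce_counts`, and `word_count` are hypothetical, and a thread pool merely stands in for the separate machines a real cluster would use.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(mapped_chunks):
    """Group emitted values by key so one reducer sees every value for its key."""
    groups = defaultdict(list)
    for chunk in mapped_chunks:
        for key, value in chunk:
            groups[key].append(value)
    return groups


def reduce_counts(item):
    """Reduce step: combine all values that share a key."""
    key, values = item
    return key, sum(values)


def word_count(lines, workers=4):
    # The thread pool is an assumption standing in for a cluster of workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each map call depends only on its own line, so calls are independent.
        mapped = list(pool.map(map_words, lines))
        groups = shuffle(mapped)
        # Each key is reduced independently of every other key.
        return dict(pool.map(reduce_counts, groups.items()))


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog", "The fox"]))
```

The `shuffle` step is what enforces the requirement that all map outputs sharing a key reach the same reducer. In a distributed setting this grouping is typically the expensive part, since it moves data between machines.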