|
Executive Summary
The emergence of cloud computing is rapidly transforming processing techniques to support distributed platforms with large cluster sizes.
With the announcement of Microsoft Azure, we are not only seeing a shift of internet scale services to cloud clusters, but also basic
functionality of Operating Systems slowly becoming available on such distributed platforms. This quantum shift in computation
requires a distributed framework which consumers and the industry alike could use for large amounts of computation.
Google’s seminal framework for computation, called MapReduce, provides an ideal base for such distributed processing.
The framework has been used for years at Google for large computations, and, now, its open-source implementation finds wide use
at many other companies like Facebook. Yet there are some hurdles in making it a feasible cloud computation framework.
This project plans to solve and implement such issues in MapReduce, as identified in this proposal. The solutions will be implemented
as part of an open-source plug-in for Hadoop (MapReduce’s Apache implementation) which could make it a viable choice as a cloud
computational framework.Since cloud platforms are the next wave of change, the project also aims to train the local software industry
for using such frameworks. Three workshops will be organized as part of the project to train IT professionals, educators and
researchers from other universities to use MapReduce.
|