VMware has announced a new open source project to enable Apache Hadoop to run on both private and public clouds.
The new project, named Serengeti, will enable enterprises to quickly deploy, manage and scale Apache Hadoop in virtual and cloud environments. In addition, VMware is working with the Apache Hadoop community to contribute extensions that will make key components ‘virtualisation-aware’ to support elastic scaling and further improve Hadoop performance in virtual environments.
Making Project Serengeti freely available under an Apache licence also continues a trend by VMware of embracing open standards. For example, its platform-as-a-service (PaaS) offering, Cloud Foundry, is also open source.
Available for free download under the Apache 2.0 licence, Serengeti is a ‘one-click’ deployment toolkit that allows enterprises to leverage the VMware vSphere platform to deploy a highly available Apache Hadoop cluster in minutes, including common Hadoop components such as Apache Pig and Apache Hive.
With names like these, Apache Hadoop needs further explanation. It is an open source software framework for managing massive amounts of unstructured data, and it is still in the early stages of adoption across most mid- to large-size enterprises. Jerry Chen, vice president, Cloud and Application Services, VMware, commented: “Hadoop has the potential to transform business by allowing enterprises to harness very large amounts of data for competitive advantage. It represents one dimension of a sweeping change that is taking place in applications, and enterprises are looking for ways to incorporate these new technologies into their portfolios. VMware is working with the Apache Hadoop community to allow enterprise IT to deploy and manage Hadoop easily in their virtual and cloud environments.”
VMware also plans updates to Spring to make it easier for enterprise developers to build distributed processing solutions with Apache Hadoop. To further simplify and accelerate enterprise use of Apache Hadoop, VMware is working with the Apache Hadoop community to contribute changes to the Hadoop Distributed File System (HDFS) and Hadoop MapReduce projects to make them ‘virtualisation-aware’, so that data and compute jobs can be optimally distributed across a virtual infrastructure.
Together, these projects and contributions are intended to speed up Hadoop adoption and enable enterprises to leverage big data analytics applications.
For more information on Project Serengeti go to: http://serengeti.cloudfoundry.com/
For more information on Apache Hadoop courses and certification, visit the Global Knowledge website at www.globalknowledge.co.uk/cloudera