By Rob Mitchum // August 20, 2014
Cloud computing has changed the way we work, the way we communicate online, even the way we relax at night with a movie. But even as “the cloud” starts to cross over into popular parlance, the full potential of the technology to directly impact science, medicine, transportation, and other industries has yet to be realized.
To help investigate and develop this promising future, the National Science Foundation today awarded $10 million to a group of institutions led by the Computation Institute for the development of Chameleon, a new experimental testbed for cloud architecture and applications. The color-changing lizard is a fitting mascot for the project, which will create a customizable, large-scale facility for cloud computing research.
“Like its namesake, the Chameleon testbed will be able to adapt itself to a wide range of experimental needs, from support for bare metal reconfiguration to popular cloud resources,” said Kate Keahey, a senior fellow at the Computation Institute and principal investigator for Chameleon.
Much of today’s cloud computing is hosted at massive, private data centers owned by vendors such as Amazon, Google, and Microsoft. For security and proprietary reasons, these services don’t allow users direct access to the computing infrastructure, instead providing a layer of “virtualization” where customers can assemble their own compute power. Cloud providers also rarely release metrics and other information that would be useful for researchers studying new architectures and applications, leaving scientists handicapped in the search for the next generation of cloud capabilities.
Chameleon will offer a public cloud computing resource for research use, with more customizability and transparency than commercial services provide. It will be an experimental data center where researchers will have full access to tinker with cloud hardware and software and see the results firsthand.
“Right now, cloud computing is roughly where the Internet was about twenty or so years ago,” Keahey said. “We had basic connectivity, email, web pages, et cetera, but we did not have social networks, data streaming, and e-commerce. This is where we are with cloud computing right now: we have a basic understanding and applications of this new technology but we do not yet fully understand how it will change our lives.”
“We want people to have the ability to discover, experiment with new ideas, and put together new innovations in different areas. There are a lot of open problems in cloud computing right now,” Keahey said.
Keahey describes the new testbed in terms of four adjectives: large-scale, reconfigurable, connected, and complementary. These design choices were informed by a series of interviews Keahey conducted with research groups about what was needed to seed tomorrow’s cloud computing innovations.
The large scale is embodied in Chameleon’s more than 650 multi-core cloud nodes, located at the University of Chicago and the Texas Advanced Computing Center and connected by a 100 Gbps network between the sites. This system will allow researchers to develop and test new high-performance, low-noise virtualization solutions at scale. Such solutions might make high-performance computing in the cloud possible, creating virtual supercomputers on demand for research.
Many researchers asked for the ability to work with large datasets in the cloud, which Chameleon will address with a total of 5 petabytes of disk space for data. Currently, the upload times alone for large datasets can use up hours or days of research time, so Chameleon will keep “big data” sets pre-loaded for researchers to use in their experiments on massive-scale cloud computing.
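To put those upload times in perspective, here is a rough back-of-the-envelope sketch in Python. The 10-terabyte dataset size and the 100 Mbps and 1 Gbps link speeds are illustrative assumptions, not figures from the Chameleon project, and the calculation ignores protocol overhead, disk speed, and network congestion.

```python
def transfer_time_hours(dataset_bytes: float,
                        link_bits_per_second: float,
                        efficiency: float = 1.0) -> float:
    """Idealized time, in hours, to move a dataset over a network link.

    efficiency is the fraction of the nominal link speed actually achieved
    (1.0 means a perfect link; real transfers are usually lower).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = dataset_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


TERABYTE = 10**12  # bytes, decimal convention

# Hypothetical dataset size and link speeds, chosen only for illustration.
dataset = 10 * TERABYTE
links = [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("100 Gbps", 100e9)]

for label, rate in links:
    hours = transfer_time_hours(dataset, rate)
    print(f"{label:>8}: {hours:.1f} hours")
```

Under these assumptions, the same dataset takes more than nine days to move at 100 Mbps and most of a day at 1 Gbps, which is why keeping large datasets pre-loaded next to the compute nodes saves so much research time.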
Reconfigurability is reflected in Chameleon’s programming environment, which will offer support for bare metal reconfiguration and flexible network topology configuration via OpenFlow switches. It will allow researchers to experiment with combining programmable resources and programmable networks — an important element for creating reliable, on-demand computing power for applications in medicine, science, and other fields.
The “connectedness” emphasizes partnerships with production clouds in both science and industry to understand and express relevant problems. The project will partner with commercial cloud providers such as Rackspace and with existing research clouds operated by CERN and the Open Science Data Cloud, which can provide a level of openness that most private cloud centers cannot.
In addition to providing a focus on requirements, these partnerships will provide traces and workloads used in production settings to allow scientists to validate their solutions against real-life data. And finally, Chameleon will complement and partner with existing facilities such as the GENI project, a virtual laboratory for networking and distributed systems research and education.
Chameleon will initially rely on the resources of the FutureGrid project at the CI and TACC. In the fall of 2014, those resources will become available in the same way as they are currently available under FutureGrid, providing a seamless transition for FutureGrid users. As the Chameleon infrastructure is assembled, it will become available as highly reconfigurable resources, and it will eventually be supplanted by the first new hardware purchase in the summer or fall of 2015.
“In a project like this, you would typically have to wait for roughly a year as the new hardware is built,” said Keahey. “Leveraging FutureGrid resources allows us to hit the ground running and start serving the community right away while also providing transition for FutureGrid users.”
Other academic partners on the project include the International Center for Advanced Internet Research at Northwestern University, the Ohio State University, and the University of Texas at San Antonio.
Chameleon joins a second project, CloudLab, as a recipient of grants from the NSFCloud program, which hopes to propel cloud computing technology into its next era.
“Just as NSFNet laid some of the foundations for the current Internet, we expect that the NSFCloud program will revolutionize the science and engineering for cloud computing,” said Suzi Iacono, acting head of NSF’s Directorate for Computer and Information Science and Engineering. “We are proud to announce support for these two new projects, which build upon existing NSF investments in the Global Environment for Network Innovations (GENI) testbed and promise to provide unique and compelling research opportunities that would otherwise not be available to the academic community.”