By Rob Mitchum // April 16, 2014
The explosion of data across disciplines has opened up vast new possibilities for scientific discovery. But many researchers do not yet have access to the advanced infrastructure needed to work with Big Data and realize its full potential.
With new support from the Alfred P. Sloan Foundation, Globus can expand its mission to bring the advanced data management infrastructure used by massive science collaborations to small laboratories and individual researchers around the world. The foundation’s $500,000 grant will help Globus, part of the Computation Institute at the University of Chicago and Argonne National Laboratory, evolve from a free service to a sustainable non-profit model serving hundreds of thousands of resource providers, scientists, educators, and students.
“Our mission is to accelerate discovery across the sciences by ensuring that aspiring data scientists can enter the age of computation- and data-intensive science without having to acquire and operate local IT expertise and infrastructure,” said Ian Foster, director of the Computation Institute and Globus co-inventor. “The goal of Globus is to democratize science by making advanced data management capabilities broadly accessible to researchers at all levels.”
The Globus project was originally supported by grants from federal funding agencies. However, these grants are not intended to fund the ongoing operating costs of supporting users and expanding the service to make it self-sufficient. The grant from the Sloan Foundation will help Globus bridge this critical gap and bring powerful computational capabilities to researchers around the world.
“Most academic projects end after developing software, but that’s insufficient,” said Steve Tuecke, Globus co-inventor and Deputy Director of the Computation Institute. “In order to provide a production-grade service to our large and rapidly growing user base, it is vital to build out the ‘whole product,’ including user support and other operational functions. We are happy that Sloan recognizes this challenge and supports us in our mission.”
As scientific datasets grow larger and larger, management of that data becomes a challenging bottleneck. Researchers face complex IT challenges in moving data to high-performance computing resources for analysis, or from remote scientific instruments (such as next-gen genome sequencers) to their laboratory. Research collaborations spread out across multiple institutions and countries struggle to share and synchronize datasets, slowing progress towards discovery.
In November 2010, Globus Online launched as a cloud-based, software-as-a-service solution for the transfer of large scientific datasets. In its first three years, more than 15,000 registered users moved over 40 petabytes of data using the service — 20 times the amount of data stored in American academic libraries. Researchers at UCLA, NYU Langone Medical Center, Brookhaven National Laboratory, and the University of Colorado have used Globus for research on genetics, cancer, the origin of the universe, and oceanography.
In 2013, Globus began offering services to computing resource providers at universities and national laboratories, providing additional functions such as a management console, priority support, and a branded interface. The service enables campus research computing centers to offer big data transfer and sharing to their faculty and students, with Dropbox-like simplicity, directly from their existing systems. Early subscribers include Indiana University, Michigan State University, the University of Exeter, the National Energy Research Scientific Computing Center (NERSC), and the DOE Systems Biology Knowledgebase (KBase).
“We recognized early on that we can use the economies of scale inherent in software-as-a-service to offer advanced capabilities to all researchers at a very modest cost,” said Vas Vasiliadis, director of products at the Computation Institute. “The Sloan grant gives us the means to attract more research institutions as paying customers, en route to making Globus self-sustaining.”
Last fall, Globus also launched Globus Plus, a for-fee service that gives individual researchers expanded capabilities, such as securely sharing data between computers and receiving priority support. The basic Globus file transfer service remains free for non-profit research and educational use. The Sloan Foundation grant will help Globus expand these services to support thousands of new subscribers by building new payment infrastructure and user support services.
“Moving and managing large amounts of data is a key technical challenge facing twenty-first century science,” said Joshua M. Greenberg, Program Director of the Alfred P. Sloan Foundation’s Digital Information Technology program. “Globus is meeting that challenge head on. It’s an exciting platform, and a dynamic organization that the Foundation is proud to be able to help empower research in the digital age.”