Can We Use All the Bandwidth? IceCube Project Update
By John Hicks, Internet2 Research Network Engineer, and Matt Zekauskas, Internet2 Senior Engineer
Last year, it was “can we use all the GPUs?” This year it’s “can we use all the bandwidth?” Really, it’s about moving data fast enough to support a computational workflow: can one cost-effectively do large-scale, data-intensive, high-throughput computing in commercial clouds with data streaming directly to and from on-premises storage? The IceCube project worked with Internet2’s Cloud Connect service to stream data to and from the University of Wisconsin-Madison and the San Diego Supercomputer Center (SDSC) for a science run that used 100 PFLOPS of cloud compute, producing 130 TB of data and streaming data collectively at close to 100 Gbps.
Internet2 has been working with Igor Sfiligoi (SDSC/OSG) in support of the IceCube project through the Open Science Grid (OSG) for a number of years. OSG is a framework for large-scale distributed resource sharing, including compute and storage. The IceCube project uses the IceCube Neutrino Observatory, a detector built into a cubic kilometer of ice at the South Pole, to investigate subatomic particles called neutrinos. In November 2019, at the SC19 supercomputing conference in Denver, IceCube researchers used around 51,000 cloud-based GPUs to compute on the data collected by the observatory. That effort was unprecedented, but it focused primarily on computation, using as many cloud-based GPUs as were available from public cloud offerings at the time of the run.
This article, authored by Igor Sfiligoi, describes the latest computational run. The Internet2 Cloud Connect service was used to establish peerings with AWS, Azure, and Google to facilitate data movement through the network. Internet2 engineers worked with engineers at UW-Madison and the University of California San Diego to configure the network infrastructure for this effort.