SahasraT, India’s first petascale computer housed at IISc, is powering diverse research pursuits across campus
When IISc brought home SahasraT in 2015, everyone on campus was excited, recalls Lakshmi J, Chief Research Scientist at the Supercomputer Education and Research Centre (SERC). It was India’s fastest supercomputer, a veritable beast capable of carrying out a quadrillion (10¹⁵) calculations per second. “Within three days of commissioning the machine, we were able to saturate 80% of the system resources in terms of the number of jobs executed,” she says.
Now there are two supercomputers in India that are faster than SahasraT (“Pratyush” and “Mihir”, specialised for climate and weather studies). IISc itself will add another supercomputer next year ‒ under the National Supercomputing Mission, of which it is a lead partner. Yet SahasraT will still retain a unique position in India’s academic landscape, says Sathish Vadhiyar, Chair of SERC. “The number of applications from different domains that SahasraT caters to is vast compared to these,” he says. “It is a real ‘general purpose’ machine.”
Supercomputers like SahasraT have thousands of processors ‒ the ‘brains’ of computers ‒ working on different parts of the same problem in parallel. This tremendously cuts down the time needed to sift through large amounts of data. SahasraT has 33,000 processor cores arranged in clusters called nodes, which are in turn stacked on blades arranged in racks. It also has custom-built hardware and software that allow the different units to “talk” to each other without time lag, explains Lakshmi.
Over the past five years, SahasraT has enabled researchers from various departments to study a host of topics, from monsoons and materials to black holes and biomolecules. Prabal Maiti’s lab in the Department of Physics, for example, is using SahasraT for at least four different projects, one of which is analysing DNA nanostructures for applications such as drug delivery. These structures are huge compared to carbon-based molecules, with over 300,000 atoms. Simulating how they behave or assemble can take years using conventional computers.
Another area of Maiti’s research is HIV. A protein called gp41 helps the HIV particle fuse with the immune cell membrane. To understand this process and design drugs that can block it, Maiti’s team is using a combination of atom-level and large-scale 3D simulations run on SahasraT. These involve solving math equations for millions of atoms to understand how exactly the atoms move and interact, and what forces act on them. “The force calculations are the heart of these simulations … they are [also] the most time-consuming part. Some of them run for months or years,” he says. His lab is also developing dendrimer polymers which can help inhibit HIV infection.
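To give a feel for why the force calculations dominate the running time, here is a minimal sketch of a pairwise force loop. It is not the code used by Maiti's lab: the Lennard-Jones potential, the function name `lennard_jones_forces` and the parameters `epsilon` and `sigma` are assumptions chosen purely for illustration.

```python
def lennard_jones_forces(positions, epsilon=1.0, sigma=1.0):
    """Return the net force vector on each particle from all pairwise interactions.

    Illustrative only: uses a plain Lennard-Jones potential (an assumption),
    U(r) = 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6).
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # visit every pair exactly once
            # Displacement vector pointing from particle j to particle i
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            sr6 = (sigma * sigma / r2) ** 3
            # Force on i is -dU/dr along d: 24*eps*(2*(s/r)^12 - (s/r)^6) / r^2 * d
            scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                forces[i][k] += scale * d[k]  # force on i
                forces[j][k] -= scale * d[k]  # equal and opposite force on j
    return forces


def pair_count(n_atoms):
    """Number of distinct atom pairs an all-pairs loop would visit."""
    return n_atoms * (n_atoms - 1) // 2
```

In this naive form the work grows with the square of the number of atoms: `pair_count(300_000)` is about 45 billion pairs for a single time step, and a simulation needs a very large number of steps. Production molecular dynamics codes typically reduce this with distance cutoffs and neighbour lists, and spread what remains across many cores.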
In recent months, SahasraT has also been helping in the fight against COVID-19. A multi-institutional team (IISc, ICTS, JNCASR and KTH, Sweden) led by Sourabh Diwan at the Department of Aerospace Engineering is using it to analyse the dispersal of droplets released by a person coughing or speaking, by adapting a numerical code originally developed to study how cumulus cloud flows evolve.
Both cloud and respiratory flows are chaotic (‘turbulent’), with droplets of different sizes behaving differently. The dynamics are governed by a set of partial differential equations which need to be solved computationally. To accurately replicate the turbulence in the flow, the researchers carry out a direct numerical simulation ‒ a heavy-duty process involving 50,000 to 400,000 core hours on 2,048 to 16,660 cores for a single run. “The immediate objective is to see how far the moist air travels, what is the effect of evaporation, how long these droplets linger in the air, and so on,” says Diwan. “A long-term goal is to understand this flow more fundamentally.”
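A core hour is one processor core kept busy for one hour, so these figures can be turned into a rough wall-clock time. The short calculation below is only a back-of-the-envelope sketch: it assumes that the smaller core-hour figure goes with the smaller core count and the larger with the larger, and that every core is busy for the whole run. Neither assumption comes from the researchers, and `wall_clock_hours` is a name made up for this example.

```python
def wall_clock_hours(core_hours, cores):
    """Elapsed hours if `cores` cores are all busy for the entire run (idealised)."""
    return core_hours / cores


# Assumed pairings of the quoted ranges (not stated by the researchers)
small_run = wall_clock_hours(50_000, 2_048)
large_run = wall_clock_hours(400_000, 16_660)

print(f"Smaller run: about {small_run:.1f} hours")  # about 24.4 hours
print(f"Larger run:  about {large_run:.1f} hours")  # about 24.0 hours
```

Under these assumptions, a single run at either end of the range would occupy the machine for roughly a day. The same 400,000 core hours on one core would take more than 45 years.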
SahasraT is also helping researchers like Bratati Kahali at the Centre for Brain Research to analyse the genomic sequences of 10,000 individuals from across India as part of the GenomeIndia initiative. Identifying genetic variations unique to Indians will help understand the genetic basis of many diseases. Early results from studies involving about 100 individuals show that there are more than a million variations that are completely novel in the Indian population and currently not accounted for in global databases, says Kahali.
Such studies are not feasible in a reasonable time using normal computers. Each DNA sequence has more than 3 billion base pairs or letters. The raw data from a single experimental run covering 24 individuals can take up 1.5 terabytes (TB) of storage space ‒ rising to 70 TB during analysis ‒ and about 20 hours to analyse on a normal computer. “Whereas in SahasraT, I can parallelise it in such a manner that I can analyse 24 individuals’ data in eight to ten hours,” explains Kahali.
Scores of similar projects are ongoing, including simulating extreme weather events, modelling materials using machine learning, studying properties of glasses, insulators and semiconductors, and designing drugs using crystallography. In 2018, SERC organised a Grand Challenge which allowed three teams to utilise SahasraT’s entire capacity for eight hours each. As part of this, an astrophysics team studied how particles in space aggregate to objects such as black holes.
These projects keep SahasraT running through the year; about 90% of the system’s resources are almost always in use, says Vadhiyar. “Even on a Saturday evening, you will see that the machine is full and many jobs are waiting in the queue.” To ensure that the machine keeps running without a pause, a dedicated team of engineers works continuously behind the scenes, helping IISc researchers execute their projects.
SERC also has more plans for this workhorse on the horizon. “We plan to set up teams that provide good visualisation services, high performance computing products, large scale scientific libraries and training courses,” says Vadhiyar. “Overall, we would like SERC to act as a centre of excellence for the country.”