College of Engineering launches “Arjuna,” a multi-departmental GPU cluster

Vidya Pelapu

May 31, 2017

Image: GPU circuit board (Source: Department of Mechanical Engineering)

Carnegie Mellon University’s College of Engineering has installed a new, multi-departmental GPU cluster at a Pittsburgh Supercomputing Center facility in Monroeville, Pennsylvania. The cluster, up and running since March, is in use by researchers in several engineering departments across campus.

Equipped with GPUs donated by Nvidia, the cluster has a total of 1,792 CPU cores and 112 GPUs. Following the tradition of other Carnegie Mellon computer clusters named after mythological figures, such as Hercules and Gilgamesh, the new cluster is named Arjuna, after a famous archer and charioteer in ancient Hindu mythology. Arjuna is one of the largest GPU-capable clusters of its kind hosted at a research institution.

The project was spearheaded by Venkat Viswanathan, assistant professor of mechanical engineering, who secured financial contributions from other faculty across the College of Engineering. Vital to its completion were Zachary Ulissi, assistant professor of chemical engineering; Ian Lane, CMU Silicon Valley assistant research professor; John Kitchin, professor of chemical engineering; Franz Franchetti, associate professor of electrical and computer engineering; and Chad Dougherty, Parallel Data Lab principal research programmer.

Viswanathan’s interest in building the cluster stemmed from his research on next-generation batteries. Working with a wide array of inorganic and organic materials that could potentially be put to use in a battery, Viswanathan realized that progressing from the discovery of a new material to its deployment was taking far too long; essentially, the most time-consuming aspect of his research lay in the experimental testing of a new material.

“The key is to have extremely accurate predictions associated with those new material discoveries,” says Viswanathan. “We’re trying to describe the battery performance of one of the most popular anode materials, graphite, which is used in all lithium-ion batteries. But even for a system as important as that, we still don’t have accuracy levels to the point that we would like.”

Hoping to expedite this testing process, Viswanathan began researching GPU-accelerated computing, which uses central processing units (CPUs) and graphics processing units (GPUs) together to perform fast, accurate computations. Outside of a research context, GPUs are used in video games to provide high-resolution graphical components such as pixel shaders, which add texture and depth to two-dimensional images. Both the PlayStation 4 and the Xbox One use GPUs as part of their hardware.

Arjuna works like this: when an algorithm is sent to a GPU cluster, the CPU runs the high-level code while offloading the computationally intensive portions, those made up of many simple math operations, to the GPU. The GPU has less memory than the CPU's large pool of random access memory (RAM), but it has many more cores, each built for simple arithmetic. Because those cores work in parallel, the GPU can perform simple computations at a rapid rate, essentially relieving the CPU of its busywork. The GPU then sends its results back to the CPU, which turns the data into a usable output, thus speeding up the overall computing process.
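
The offloading steps described above can be sketched in a few lines of Python. This is a CPU-only mock of the control flow, not code that runs on Arjuna or on any GPU: the function names (`copy_to_device`, `launch`, `copy_to_host`, `kernel`) and the math inside `kernel` are hypothetical stand-ins chosen for illustration.

```python
# CPU-only mock of the host/device workflow. Nothing here touches a real GPU;
# the "device" is an ordinary Python list and every name is illustrative.

def copy_to_device(host_data):
    """Stand-in for a host-to-device transfer: returns an independent copy."""
    return list(host_data)


def kernel(x):
    """One simple math operation, applied independently to each element."""
    return 2.0 * x * x + 1.0


def launch(kernel_fn, device_data):
    """Stand-in for a kernel launch.

    A real GPU would spread these calls across many threads at once;
    in this mock they simply run one after another.
    """
    return [kernel_fn(x) for x in device_data]


def copy_to_host(device_result):
    """Stand-in for the device-to-host transfer of the results."""
    return list(device_result)


def run(host_data):
    """High-level 'CPU' code: offload the per-element math, then combine."""
    device_data = copy_to_device(host_data)
    device_result = launch(kernel, device_data)
    host_result = copy_to_host(device_result)
    return sum(host_result)  # the CPU turns raw results into one output


total = run([0.0, 1.0, 2.0, 3.0])
print(total)
```

The point of the sketch is the shape of the program: the per-element work in `kernel` has no dependence between elements, which is what makes a portion of an algorithm a good candidate for offloading.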

“We’re exploring a class of methods that is most amenable to GPU acceleration, so you can potentially get a runtime that’s about forty times faster,” says Viswanathan. “A problem that, before, would take a year, we can attempt to solve in a week.”
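
As a rough check on the figures in the quote, the short Python sketch below divides a one-year runtime by a forty-fold speedup. It also applies Amdahl's law, a standard formula that is not mentioned in the article, to show how the overall speedup depends on the share of work that can be offloaded. The 90 percent fraction is a hypothetical value, not a measurement from Arjuna.

```python
DAYS_PER_YEAR = 365


def accelerated_days(baseline_days, speedup):
    """Runtime after dividing the baseline by a uniform speedup factor."""
    return baseline_days / speedup


def overall_speedup(offloaded_fraction, gpu_speedup):
    """Amdahl's law: only the offloaded fraction of the work gets faster."""
    remaining_cpu = 1.0 - offloaded_fraction
    return 1.0 / (remaining_cpu + offloaded_fraction / gpu_speedup)


# A year-long job at a 40x speedup: about nine days, on the order of a week.
days = accelerated_days(DAYS_PER_YEAR, 40)
print(f"{days:.1f} days")

# Hypothetical case: if only 90% of the work can be offloaded,
# a 40x GPU speedup yields a much smaller overall gain.
partial = overall_speedup(0.90, 40)
print(f"{partial:.1f}x overall")
```

Under these assumptions, the full forty-fold gain is reached only when nearly all of the computation can run on the GPU, which is consistent with the emphasis on methods that are "most amenable to GPU acceleration."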

Greg Houchins, a Ph.D. student in physics who works on Viswanathan’s research team, played a vital role in Arjuna’s completion, working remotely from his office in Wean Hall to configure the cluster’s software over the course of two weeks. His research focuses on density-functional theory (DFT), a method for approximating the electronic structure of materials. Houchins also researches the random phase approximation (RPA), which builds on a DFT calculation to describe electron behavior more precisely.

“RPA is important because it will provide us with a more accurate description of the materials in batteries, and ultimately lead to the optimization of current materials, as well as the prediction of possible candidate materials,” says Houchins. “So one of the big things that we’re interested in for this cluster is to be able to finally have enough compute power to do RPA.”

The cluster will support research in energy storage, catalysis, machine learning, and artificial intelligence, as well as the development and testing of advanced numerical techniques to make computations faster. The cluster will also help equip the next generation of graduate students with experience in GPU-accelerated computing, a hugely sought-after skill in STEM fields today.