NMSU research collaboration with Los Alamos National Lab earns team a spot at international conference

For more than four years, a team from the Klipsch School of Electrical and Computer Engineering at NMSU’s College of Engineering has been collaborating in high-performance computing research with researchers at Los Alamos National Laboratory.
One product of this partnership is a peer-reviewed technical paper that was accepted and presented at Supercomputing 2021, which took place in St. Louis, Missouri, in November 2021.
The paper, “Hybrid, scalable, trace-driven performance modeling of general-purpose computing on graphics processing units,” is the first from NMSU selected for Supercomputing, an international conference on high-performance computing, networking, storage and analysis, since 2012. The project is a joint effort between graduate students and faculty at NMSU and Los Alamos National Laboratory.
“We were very excited. This was the first submission for the paper,” said Abdel-Hameed Badawy, associate professor at the Klipsch School of Electrical and Computer Engineering and one of the paper’s authors. “Getting into Supercomputing is no easy feat. It needs a lot of hard work. It’s a feat to be admitted to the SC conference.”
To present at the hybrid-format conference, Badawy traveled to St. Louis with Yehia Arafa, the lead author who graduated from NMSU in December 2021 with a Ph.D. in Computer Engineering and joined QUALCOMM Research as a Senior Engineer in January 2022.
“I am very excited to see my Ph.D. work presented at a high-impact conference like SC,” Arafa said. “The key idea was to have a fast, accurate, and scalable tool that the high-performance computing community could use in its research on GPU modeling and simulation.”
The authors also include NMSU graduate student Ammar ElWazir, who currently works at AMD; Atanu Barai, who currently works at Intel; Ali Eker, senior software engineer at AMD; Gopinath Chennupati, formerly at LANL and currently at Amazon; and Nandakishore Santhi and Stephan Johannes Eidenbenz, both computer scientists in LANL’s Information Science Group.
The paper describes the Performance Prediction Toolkit-Graphics Processing Unit (PPT-GPU), a hardware-software co-design and performance prediction framework for HPC applications that run on general-purpose GPUs. It offers 10 to 30 times the performance and scalability of competing tools.
“The core value proposition of the PPT-GPU toolkit is that it offers significant speedup – up to three orders of magnitude over alternative tools with almost no penalty in prediction accuracy,” Eidenbenz said. “Taking advantage of GPU architectures effectively is a formidable challenge. It was a great experience for the lab to partner with NMSU and leverage New Mexico’s academic resources and capabilities.”
“Predicting the performance of scientific codes on general-purpose GPUs is a generally error-prone and time-consuming process,” Santhi said. “Our approach with the PPT-GPU toolkit means that scientific applications that previously took weeks to simulate and analyze can now take just hours, dramatically accelerating the software-hardware co-design process. Improved GPU performance will benefit research fields including machine learning and artificial intelligence, drug discovery and medicine, and many other applications where computing is an integral part.”
Badawy said he is extremely proud of the achievement and hopes that HPC research will become an important research focus at NMSU. He believes the publication and presentation can elevate NMSU’s national status and serve as a springboard for further opportunities and collaborations for researchers, faculty, staff, and students on campus.
Author: Tiffany Acosta