
Stijn Heldens defended his PhD

(Photo by Sanne Heldens)

On March 6, 2024, Stijn Heldens, one of the PhD students I had the pleasure to supervise, successfully defended his PhD thesis on scalable scientific computing at the University of Amsterdam. This was made possible by a collaboration with the Netherlands eScience Center, which funded the PhD position. Stijn worked on two projects: the EU-funded project “PROCESS: Providing computing solutions for exascale challenges”, and the Netherlands eScience Center-funded project “A methodology and ecosystem for many-core programming”.

Many thanks to the other members of the “Gang of Four”: Jason Maassen, Ben van Werkhoven, and Pieter Hijma, who co-supervised Stijn with me. Always a pleasure, guys!

Many thanks also to the committee members who read and judged the thesis, and in some cases traveled far to attend the defense.

Stijn’s thesis is titled “Parallel programming systems for scalable scientific computing”. Here is the abstract:

High-performance computing (HPC) systems are more powerful than ever before. However, this rise in performance brings with it greater complexity, presenting significant challenges for researchers who wish to use these systems for their scientific work. This dissertation explores the development of scalable programming solutions for scientific computing. These solutions aim to be effective across a diverse range of computing platforms, from personal desktops to advanced supercomputers.

To better understand HPC systems, this dissertation begins with a literature review on exascale supercomputers, massive systems capable of performing 10¹⁸ floating-point operations per second. This review combines both manual and data-driven analyses, revealing that while traditional challenges of exascale computing have largely been addressed, issues like software complexity and data volume remain. Additionally, the dissertation introduces the open-source software tool (called LitStudy) developed for this research.

Next, this dissertation introduces two novel programming systems. The first system (called Rocket) is designed to scale all-versus-all algorithms to massive datasets. It features a multi-level software-based cache, a divide-and-conquer approach, hierarchical work-stealing, and asynchronous processing to maximize data reuse, exploit data locality, dynamically balance workloads, and optimize resource utilization. The second system (called Lightning) aims to scale existing single-GPU kernel functions across multiple GPUs, even on different nodes, with minimal code adjustments. Results across eight benchmarks on up to 32 GPUs show excellent scalability.
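To give a feel for the kind of problem Rocket targets, below is a minimal, illustrative Python sketch of a tiled all-versus-all computation. It is not Rocket’s actual API or implementation; the function names, the tile-based decomposition, and the toy distance computation are assumptions chosen only to show how dividing the work into tile pairs improves data reuse and yields independent work units that a scheduler (or work-stealing worker) could pick up.

```python
# Illustrative sketch only: a toy tiled all-versus-all computation.
# Names and the comparison function are hypothetical; this is not Rocket.
import itertools
import numpy as np

def compare(a, b):
    # Hypothetical pairwise comparison: Euclidean distance between items.
    return np.linalg.norm(a - b)

def all_versus_all_tiled(items, tile_size):
    """Compare every item against every other item, processing tile pairs
    so that a tile of data, once loaded, is reused for many comparisons."""
    n = len(items)
    results = np.zeros((n, n))
    tiles = [range(i, min(i + tile_size, n)) for i in range(0, n, tile_size)]
    for tile_i, tile_j in itertools.product(tiles, repeat=2):
        # In a distributed setting, each tile pair would be one independent
        # unit of work that an idle worker could be assigned or could steal.
        for i in tile_i:
            for j in tile_j:
                results[i, j] = compare(items[i], items[j])
    return results

if __name__ == "__main__":
    data = np.random.rand(100, 8)                     # 100 items, 8 features
    distances = all_versus_all_tiled(data, tile_size=25)
    print(distances.shape)                            # (100, 100)
```

The point of the tiling is that the number of comparisons grows quadratically with the dataset size while the data itself grows only linearly, so grouping comparisons into tile pairs lets each loaded chunk serve many comparisons before it is evicted, which is the kind of data reuse the abstract describes.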

The dissertation concludes by proposing a set of design principles for developing parallel programming systems for scalable scientific computing. These principles, based on lessons from this PhD research, represent significant steps forward in enabling researchers to efficiently utilize HPC systems.