High Performance Parallel Scientific Computing
September 4 - 5, 2008
Purdue University
Organizers: Alejandro Strachan, School of Materials Engineering and Faisal Saied, Rosen Center for Advanced
Computing & CRI
The first Purdue School on High Performance Parallel Scientific Computing was held at Purdue, September 4 - 5, 2008. This
event was jointly sponsored by
The goal of the workshop was to provide training in the area of high performance scientific computing for graduate
students and researchers interested in scientific computing. The School addressed current hardware and software
technologies and trends for parallel computing and their application to solve scientific problems. It also included
lab sessions where participants got hands-on experience with parallel computing including the use of performance
evaluation and debugging tools on state of the art simulation codes.
High performance computing resources on campus, and their use by Purdue researchers in a wide variety of
disciplines, have been growing. The strong interest in this event was evidenced by the 52 people
who signed up, of whom only 34 could be accommodated in the lab. The participants came from 10 departments
in four colleges.
An opening, overview session at the Burton D. Morgan Center for Entrepreneurship in Purdue’s Discovery Park, which was
open to anyone, attracted about 40 faculty, staff and students.
Blaise Barney, Rob Cunningham and Barbara Jennings spoke about large scale computational resources at the tri-labs:
Livermore, Los Alamos and Sandia.
Professor Andrew Lumsdaine from Indiana University discussed MPI, the Message Passing Interface, a programming system
for using many processors to solve parts of a problem simultaneously (the essence of parallel computing) in order to reach a solution
faster, take on bigger questions, or both. Purdue computer science Professor Ananth Grama, whose research
focuses on parallel and distributed computing, talked about the SPIKE linear solver, a novel way of efficiently using hundreds,
even thousands, of processors, developed by Purdue computer science Professors Ahmed Sameh and Ananth Grama.
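The idea of giving each processor its own part of a problem can be sketched without MPI itself. The Python sketch below is an illustration only, not material from the School. It computes the block of indices a given rank would own and then combines the per-rank partial sums, which is the role a reduction plays in a message-passing program. The names `block_range` and `simulated_parallel_sum` are hypothetical, and the ranks are simulated in an ordinary loop rather than run as separate processes.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open index range [start, stop) owned by `rank`.

    Items are split as evenly as possible: the first `n_items % n_ranks`
    ranks each take one extra item.
    """
    if n_ranks <= 0 or not 0 <= rank < n_ranks:
        raise ValueError("need n_ranks > 0 and 0 <= rank < n_ranks")
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def simulated_parallel_sum(values, n_ranks):
    """Sum `values` by blocks, one block per simulated rank.

    Assumption: this loop stands in for n_ranks independent processes.
    In a real message-passing program each rank would compute only its
    own partial sum, and a reduction would combine them.
    """
    partial_sums = []
    for rank in range(n_ranks):
        start, stop = block_range(len(values), n_ranks, rank)
        partial_sums.append(sum(values[start:stop]))
    # Combining the partial results corresponds to the reduction step.
    return sum(partial_sums)
```

Because every rank can work out its own range from just its rank number and the total count, no coordination is needed to divide the work; only the final combination requires communication.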
Steve Plimpton of Sandia, known for his work with the widely used LAMMPS molecular dynamics simulation software, and Purdue
mechanical engineering Professor Steve Frankel, an expert in computational fluid dynamics, covered some cutting-edge
research applications of high performance computing, Plimpton from the perspective of atomistic modeling and Frankel from
a continuum modeling perspective. Plimpton talked about problems involving colloids and colloidal solutions: tiny particles
suspended in a liquid, as in paints, pastes and gels, which also feature in thin films and self-assembling
nanostructures. Frankel looks at flows such as jet engine exhaust, which he studies using large eddy
simulation.
Students in the hands-on sessions used the Rosen Center’s Steele cluster as a platform, guided by center staff. They
received lessons in logging in, scheduling jobs using the PBS system commonly employed on high performance computing
clusters, and using modules, makefiles and compilers to configure a parallel program to run. They then set up and ran
small test programs. They also got an introduction to MPI programming the first afternoon. The next morning, the
participants worked with LAMMPS as an example of a real-world high performance, parallel computing program, including
downloading, installing and configuring it. They then ran simple scaling tests on an increasing number of processors to
get a feel for how a program scales up and needs to be adjusted to take maximum advantage of a larger computational
resource base.
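A scaling test of this kind is usually summarized as speedup and parallel efficiency. The Python sketch below shows one way to compute both from measured run times, assuming the problem size is held fixed as processors are added (strong scaling). The function name `scaling_table` is hypothetical, and the timings in the example are invented for illustration; they are not results from the School.

```python
def scaling_table(timings):
    """Compute speedup and efficiency from wall-clock timings.

    `timings` maps processor count -> run time in seconds for the same
    problem. Results are relative to the smallest processor count given.
    Returns a list of (procs, speedup, efficiency) tuples.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Efficiency of 1.0 means the run time fell in proportion
        # to the number of processors added.
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency))
    return rows


# Hypothetical timings, invented purely to show the calculation.
example_timings = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}
for procs, speedup, efficiency in scaling_table(example_timings):
    print(f"{procs:>3} procs  speedup {speedup:5.2f}  efficiency {efficiency:4.0%}")
```

Efficiency that drops as processors are added is the usual sign that communication or serial work is starting to dominate, which is when a program or its input needs adjusting.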
Later, they worked with TotalView, software for debugging parallel programs, and went into more depth with PBS and
other setup tasks. They also touched on OpenMP, a programming model for high performance systems
with shared memory, which can reduce interprocessor communication and, for some programs, run faster.
Many of the presentations were recorded and will be available through the nanoHUB maintained by the Network for
Computational Nanotechnology.
Agenda
Thursday, September 4, 2008 - Burton Morgan Center 121
(The Thursday morning session was open to the public)
Breakfast snacks
8:00 AM
Welcoming remarks
8:15 - 8:30 AM
Alejandro Strachan, Purdue University
Overview of High Performance Computing at the NNSA/ASC National Labs
8:30 - 9:10 AM
Blaise Barney, Lawrence Livermore National Labs
Rob Cunningham, Los Alamos National Labs
Barbara Jennings, Sandia Labs
Speaker: Blaise Barney, Lawrence Livermore National Labs, Rob Cunningham, Los Alamos National Labs, and Barbara Jennings, Sandia Labs
Title: Overview of High Performance Computing at the NNSA/ASC National Labs
Abstract: The Lawrence Livermore, Los Alamos, and Sandia National Laboratories have been at the leading edge
of High Performance Computing (HPC) since its inception, and have been instrumental in its ongoing evolution. This
presentation will provide a brief overview of HPC at the Labs, followed by discussions on the current architectures
sited at each Lab. We will provide a look into future petascale systems now being planned, and discuss some of the
challenges faced in deploying such systems. The presentation will conclude with a discussion on the resources that
are available to our University Partners through the academic Alliance Program hosted at the Labs.
Speaker: Andrew Lumsdaine, Indiana University
Title: MPI for the Next Generation of Supercomputing
Abstract: Despite premature rumours of its demise, MPI continues to be the de facto standard for
high-performance parallel computing. Nonetheless, supercomputing software and the high-end ecosystem continue to
advance, creating challenges to several aspects of MPI. In this talk we will review the design and functionality of
MPI and discuss the reasons for its historical success. In light of current trends in hardware and software, we will
discuss recent efforts that are intended to keep MPI relevant, productive, and efficient for the next generation of
supercomputing.
Speaker: Ananth Grama, Purdue University
Title: Scalable Parallel Preconditioners for Linear System Solvers
Abstract: The emergence of multicore architectures and highly scalable platforms motivates novel algorithms
and techniques that emphasize concurrency and are tolerant to deep memory hierarchies, as opposed to those that focus
primarily on minimizing raw FLOP counts. In this talk, we present a novel class of banded preconditioners and solvers
that have highly desirable concurrency characteristics, while delivering high aggregate FLOP counts. These methods are
shown to achieve excellent scalability on various architectures. In this talk, we present (i) reordering schemes that
allow extraction of a narrow central band that can be used as a banded preconditioner, (ii) a parallel solver, Spike,
used as the inner banded solver, and (iii) a parallel iterative outer solver. Our results demonstrate that (i) our
banded preconditioners are more robust than the broad class of incomplete factorization based methods, (ii) they
deliver better convergence results (iteration counts) than incomplete factorization methods, (iii) they deliver
higher processor performance, resulting in faster time to solution, and (iv) they deliver excellent parallel
performance and scalability on diverse parallel platforms. We also show how we can derive models of performance that
characterize the performance of our solvers accurately. We demonstrate these results experimentally on a variety of
problems selected from diverse application domains.
Speaker: Steve Plimpton, Sandia National Labs
Title: Nanoparticle and Colloidal Simulations with Molecular Dynamics
Abstract: Modeling nanoparticle or colloidal systems in a molecular dynamics (MD) code requires coarse-graining
on several levels to achieve meaningful simulation times for study of rheological and other manufacturing properties.
These include treating colloids as single particles, moving from explicit to implicit solvent, and capturing hydrodynamic
effects. These changes impact parallel algorithms for tasks such as finding neighbor particles and interprocessor
communication. I'll describe enhancements we've made to our MD code LAMMPS to make nanoparticle simulations more
efficient and highlight some preliminary modeling results for nanoparticles in solution.
Speaker: Steve Frankel, Purdue University
Title: Towards Petascale High-Fidelity Turbulent Combustion Simulations
Abstract: Due to the wide range of length and time scales associated with turbulent reacting flows, as well
as the importance of large-scale flame dynamics, large eddy simulation (LES) has become a vital tool for both
fundamental and applied studies. Parallel computing is essential for performing such computations. Modeling and
numerical aspects of LES of turbulent flows will be presented with a focus on the use of various high-performance
computing platforms to study several canonical flows. Future directions and challenges to achieve petascale computing
levels will be discussed.