Exascale Computing is Here: El Capitan

Quinn: El Capitan is the result of a second collaboration of the CORAL partners, which are Oak Ridge, Argonne, and Lawrence Livermore National Laboratories. The first collaboration resulted in Sierra, which is sited at our Lab. The NNSA program looked out beyond 2023 at our nuclear weapons needs and came up with a number of requirements that we wanted to achieve with the El Capitan system. One of the key ones is that it perform an order of magnitude better than Sierra on some important calculations in that timeframe to meet those objectives.

Clouse: I think exascale computing is the next step in improving our computational capabilities. Why do we need to improve our computational capabilities? Well, the NNSA complex is facing several large modernization programs right now, and these programs will introduce the most significant changes to both the nuclear explosive package and the delivery system since the end of nuclear testing.

Pudliner: Ultimately, the world is in 3D, and there are serious questions about the stockpile that have to be addressed with 3D simulation. Another aspect is our physical models: some physics is fundamentally 3D, and we have to make compromises when we run it in lower dimensions. We'd like to be in a position where we're not making those compromises, and to model fundamentally 3D things with 3D simulation. A machine like El Capitan will enable us, in some cases, to turn those 3D simulations around on the same kind of time scale as today's 2D simulations. That makes them a design-iteration tool and enables us to answer the questions we need to answer for the stockpile in a workday kind of turnaround.

Rieben: Preparing for an exascale system like El Capitan is a major challenge.
Fortunately, the Department of Energy ASC program anticipated this and got the process started by investing in the necessary pieces. This includes getting our current production codes ready, but also starting new, next-generation codes.

Quinn: We have awarded two contracts with Cray. One is a non-recurring engineering contract, which you might think of as an R&D contract, and we have also awarded a contract that actually builds and delivers the El Capitan system. The purpose of that R&D contract is to develop and pull forward some innovations that Cray and their subcontractors have in mind for the system. In order to meet some of our objectives, we couldn't just wait and buy off the shelf; we really needed to push the technology in some key ways. So we have both hardware and software development going on in that NRE contract.

Rieben: Similar to how we handled the Sierra platform, we will be engaging with a Center of Excellence for the El Capitan system.
This enables us to work directly with hardware experts to get our algorithms and our software ready for this new platform.

Clouse: If you look at the increase in computational capability over the last couple of decades, it's orders of magnitude, but we can't increase our power use by orders of magnitude. That's really what's driving these new architectures: delivering more computational capability without driving up the power costs. I think El Capitan will be a signature system in delivering exaflop capability within a power envelope that we can afford.

Bailey: For material science in general, we have large collaborations across the national labs, within the NNSA and also with the Office of Science, and those are really big efforts involving a lot of researchers. I really do think El Capitan will enable those collaborations to take the next steps in learning more about the physics, and I think it's going to benefit the entire DOE organization.
