> El Capitan, housed at Lawrence Livermore National Laboratory in Livermore, Calif., can perform over 2700 quadrillion operations per second at its peak. The previous record holder, Frontier, could do just over 2000 quadrillion peak operations per second.
> El Capitan uses AMD’s MI300a chip, dubbed an accelerated processing unit, which combines a CPU and GPU in one package. In total, the system boasts 44,544 MI300As, connected together by HPE’s Slingshot interconnects.
Seems like a nice win for AMD.
> Seems like a nice win for AMD
Yep! They've been part of the Exascale project for a long time, and it's good to see that their commitment to HPC actually paid off, unlike Intel's over the same period.
Fun fact: the FFT was discovered back in 1965, driven in part by the urgent need to detect clandestine nuclear tests, just two years after the Partial Test Ban Treaty (PTBT) was signed in 1963 [1].
The article's opening claim that the United States and other nuclear powers committed to the Comprehensive Nuclear-Test-Ban Treaty in 1965 is wrong; the treaty was only signed in 1996, not 1965 [2].
[1] The Algorithm That Almost Stopped The Development Of Nuclear Weapons:
https://www.iflscience.com/the-algorithm-that-almost-stopped...
[2] The Comprehensive Nuclear-Test-Ban Treaty:
https://www.ctbto.org/our-mission/the-treaty
Another fun fact: the priority on detecting nuclear testing led to seismometers all over the planet, so detection of the exact position and nature of any disturbance on the planet became radically better. This was quite the boon to anyone interested in earthquakes: not only can an earthquake be located in 2D, but accurately in 3D. The number and accuracy of stations is enough that you can see where each fault is, the thickness of the crust, and the outline of subduction zones in 3D. Pretty crazy to see enough detail to watch where plates enter the mantle and melt.
Said sensors can also track sonic booms from secret supersonic planes, but governments don't like to talk about that.
Cool factoids! If you have any more to share, please do so.
This is great, but I absolutely love that poster of El Capitan on the supercomputer racks! Also TIL there is a top500 list at https://www.top500.org/lists/top500/2024/11/
I've always loved these charts. The Numerical Wind Tunnel, #1 in 1993, achieved 124.2 gigaflops on the Linpack benchmark.
In comparison, the iPhone 15 Pro Max cellphone, released in 2023, delivers approximately 2150 gigaflops.
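For the curious, the ratio works out like this (a trivial back-of-the-envelope sketch; both Gflop/s figures are the ones quoted above, not fresh measurements):

```python
# Rough ratio of a 2023 phone to the 1993 #1 supercomputer (figures from above).
nwt_1993_gflops = 124.2          # Numerical Wind Tunnel, Linpack, 1993
iphone_15_pro_max_gflops = 2150  # approximate figure quoted above

speedup = iphone_15_pro_max_gflops / nwt_1993_gflops
print(f"A 2023 phone is roughly {speedup:.0f}x the 1993 #1 machine")  # ~17x
```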
I once drew the chart backwards. I think my PC in 2013 would have been the fastest on Earth in 1990. And faster than every computer combined in about 1982.[0]
[0] might not be accurate
2Tflops was #500 in 2006. Still a supercomputer. Crazy.
https://www.top500.org/statistics/perfdevel/
That's a pretty standard Cray feature for systems larger than a few cabinets. El Capitan has the landscape, Hopper at NERSC had a photo of Grace Hopper, Aurora at ANL has a creamy gradient reminiscent of the Borealis, and on and on. Gives them a bit of character beyond the bad-ass Cray label on the doors.
Some may not want to hear this, but this “fastest supercomputer” list is now meaningless because all the Chinese labs have started obfuscating their progress.
A while ago there were a few Chinese labs in the top 10, and they all attracted sanctions / bad attention. Now no Chinese lab reports any data.
They are in good company, with X, Meta, Microsoft and others not reporting theirs either.
The basis for the ranking was a cumulative tracking of benchmark results that were required as part of commissioning bespoke computers. A contract would be written to buy a computer that could achieve a certain performance in operations per second, and in order to satisfy that the benchmarks were agreed to and codified in the contracts. Government contracts are to a certain extent public information so the goals and clout of successive performance were tracked in this way.
If you don’t need to satisfy a government contract, or don’t need the clout to attract engineers or funding, submitting results draws unwanted attention to what you’re cooking up.
Microsoft has the #4 cluster on the top 500 list. Sure, not everyone reports, but it still seems like a useful list for watching trends in computing, and in HPC in particular.
Keep in mind the average hyperscaler cloud is not a particularly good setup for the top500. HPC tends towards more bandwidth, lower latency, and no virtualization.
Folding-at-home reports theirs. 2020: "Folding@home project passes 2.4 ExaFLOPS, more than the top 500 supercomputers combined" https://www.techspot.com/news/84832-foldinghome-project-pass...
I wouldn't say meaningless... just incomplete.
I doubt the US Government is telling everyone about their fastest computer.
Unless it's, like, air gapped and powered by a naval nuclear reactor, I feel like someone would question why a random US gov building is drawing 20-30 MW of power and exhausting most of that as heat...
Aside from the utility, who would know? There's a lot of land out there, relative to the size of even a very large building.
Not disagreeing with you but how could any building draw power without radiating most of it as heat?
The DOE has entered the chat.
(after the nuclear test ban treaty, they run a LOT of simulations)
Isn't that the open secret for El Cap? "Classified workloads" aka weapons sims.
Not a secret, from IEEE:
The NNSA—which oversees Lawrence Livermore as well as Los Alamos National Laboratory and Sandia National Laboratories—plans to use El Capitan to “model and predict nuclear weapon performance, aging effects, and safety,”
I fail to understand how these nuclear bomb simulations require so much compute power.
Are they trying to model every single atom?
Is this a case where the physicists in charge get away with programming the most inefficient models possible, and the administration simply replies, "oh, I guess we'll need a bigger supercomputer"?
It literally requires simulating each subatomic particle, individually. The increases in compute power have been used for the twin goals of reducing simulation time (letting you run more simulations) and increasing the size and resolution.
The alternative is to literally build and detonate a bomb to get empirical data on a given design, which might have problems with replicability (important when applying the results to the rest of the stockpile) or with how exact the data is.
And remember that there is more than one user of every supercomputer deployed at such labs, whether it be multiple "paying" jobs like research simulations, smaller jobs run to educate, test, and optimize before running full scale work, etc.
AFAIK supercomputers have, for a considerable amount of time now, run more than one job at a time, too.
> It literally requires simulating each subatomic particle, individually.
Citation needed.
1 gram of Uranium-235 contains about 2.6e21 atoms, which would take this supercomputer roughly 15 minutes just to count.
"nuclear bomb simulations" do not need to simulate every atom.
I speculate that there will be some simulations at the subatomic scale, and they will be used to inform other simulations of larger quantities at lower resolutions.
https://www.wolframalpha.com/input?i=atoms+in+1+gram+of+uran...
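For what it's worth, the 15-minute figure checks out as a back-of-the-envelope estimate (a minimal sketch; the ~2,700 quadrillion op/s peak is the figure quoted from the article, and "counting" assumes one atom per operation):

```python
# How long would El Capitan take just to count the atoms in 1 g of U-235,
# at one atom per operation? (Peak rate taken from the article quote above.)
AVOGADRO = 6.022e23          # atoms per mole
U235_MOLAR_MASS = 235.0      # grams per mole
PEAK_OPS_PER_SEC = 2.7e18    # ~2700 quadrillion operations per second

atoms_per_gram = AVOGADRO / U235_MOLAR_MASS      # ~2.6e21 atoms
seconds = atoms_per_gram / PEAK_OPS_PER_SEC
print(f"{atoms_per_gram:.2e} atoms, ~{seconds / 60:.0f} minutes just to count them")
```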
Subatomic scale would be the perfect option, but we tend not to have time for that, so we sample and average and do other things. At least that's the situation with aerospace's hunger for CFD; I figure nuclear has similar approaches.
I would like a citation for anyone in aerospace using (or even realistically proposing) subatomic fluid dynamics.
Ok, that misreading is on me - in aerospace you generally care down to the level of molecules, and I've met many people who would love to be able to just brute force it that way. Hypersonics do, however, end up dealing with simulating subatomic particle behaviours (because of things like air turning into plasma).
> in aerospace generally you care to level of molecules
I would like a citation for this.
> Hypersonics do however end up dealing with simulating subatomic particle behaviours
And this.
---
For example, you could choose to cite "A Study on Plasma Formation on Hypersonic Vehicles using Computational Fluid Dynamics" DOI: 10.13009/EUCASS2023-492 Aerospace Europe Conference 2023 – 10ᵀᴴ EUCASS – 9ᵀᴴ CEAS
At sub-orbital altitudes, air can be modelled as a continuous flow governed by the Navier-Stokes equations for a multicomponent gas mixture. At hypersonic speeds, however, this physical model must account for various non-equilibrium phenomena, including vibrational and electronic energy relaxation, dissociation and ionization.
https://www.eucass.eu/doi/EUCASS2023-492.pdf
"I wish I could give the finger to Navier-Stokes and brute force every molecules kinematics" does not make for a paper that will get to publication if not accompanied with actually doing that at speed and scale that makes it usable, no matter how many tenured professors dream of it. So instead they just ramp up resolution whenever you give them access to more compute
(younger generations are worse at it, because the problems that forced elder ones into more complex approaches can now be an overnight job on their laptop in ANSYS CFX)
So unfortunately my only source on that is the bitching of post-docs and professors, with and without tenure (or rather its equivalent here), at the premier such institutions in Poland.
Are they always designing new nuclear bombs? Why the ongoing work to simulate?
Multiple birds with one stone.
* It's a jobs program to avoid the knowledge loss created by the end of the Cold War. The US government poured a lot of money into recreating the institutional knowledge needed to build weapons (e.g. materials like FOGBANK), and it prefers to maintain that knowledge by having people work on nuclear programs that aren't quite so objectionable as weapon design.
* It helps you better understand the existing weapons stockpiles and how they're aging.
* It's an obvious demonstration of your capabilities and funding for deterrence purposes.
* It's political posturing to have a big supercomputer and the DoE is one of the few agencies with both the means and the motivation to do so publicly. This has supposedly been a major motivator for the Chinese supercomputers.
There's all sorts of minor ancillary benefits that come out of these efforts too.
Because even conventional explosives degrade over time, and the fissile material in nuclear devices is even worse about it - remember that unstable elements are constantly undergoing fission events; critical mass is just the point where they trigger each other's fission fast enough for a runaway process.
So in order to verify that the weapons are still useful and won't fail in random ways, you have to test them.
Which either involves actually exploding them (banned by various treaties that carry enough weight that even the USA doesn't break them), or numerical simulations.
Basically yes, we are always designing new nuclear bombs. This isn't done to increase yield, we've actually been moving towards lower yield nuclear bombs ever since the mid Cold War. In the 60s the US deployed the B41 bomb with a maximum yield of 25 megatons, making it the most powerful bomb ever deployed by the US. When the B41 was retired in the late 70s, the most powerful bomb in the US arsenal was the B53 with a yield of 9 megatons. The B53 was retired in 2011, leaving the B83 as the most powerful bomb in the US arsenal with a yield of only 1.2 megatons.
There are two kinds of targeting that can be employed in a nuclear war: counterforce and countervalue. Counterforce is targeting enemy military installations, and especially enemy nuclear installations. Countervalue is targeting civilian targets like cities and infrastructure. In an all-out nuclear war, counterforce targets are saturated with nuclear weapons, with each target receiving multiple strikes to hedge against the risks of weapon failure, weapon interception, and general target survival due to being in fortified underground positions. Any weapons that are not needed for counterforce saturation strike countervalue targets. It turns out that having a yield greater than a megaton is basically just overkill for both counterforce and countervalue. If you're striking an underground military target (like a missile silo) protected by air defenses, your odds of destroying that target are higher if you use three one-megaton yield weapons than if you use a single 20-megaton yield weapon. If you're striking a countervalue target, the devastation caused by a single nuclear detonation will be catastrophic enough to make optimizing for maximum damage pointless.
Thus, weapons designers started to optimize for things other than yield. Safety is a big one: an American nuclear weapon going off on US soil would have far-reaching political effects and would likely cause the president to resign. Weapons must fail safely when the bomber carrying them bursts into flames on the tarmac, or when the rail carrying the bomb breaks unexpectedly. They must be resilient against both operator error and malicious sabotage. Oh, and none of these safety considerations are allowed to get in the way of the weapon detonating when it is supposed to. This is really hard to get right!
Another consideration is cost. Nuclear weapons are expensive to make, so a design that can get a high yield out of a small amount of fissile material is preferred. Maintenance, and the cost of maintenance, is also relevant. Will the weapon still work in 30 years, and how much money is required to ensure that?
The final consideration is flexibility and effectiveness. Using a megaton-yield weapon on the battlefield to destroy enemy troop concentrations is not a viable tactic because your own troops would likely get caught in the strike. But lower-yield weapons suitable for battlefield use (often referred to as tactical nuclear weapons) aren't useful for striking counterforce targets like missile silos. Thus, modern weapon designs are variable yield. The B83 mentioned above can be configured to detonate with a yield in the low kilotons, or up to 1.2 megatons. Thus a single B83 weapon in the US arsenal can cover multiple contingencies, making it cheaper and more effective than maintaining a larger arsenal of single-yield weapons. This is in addition to special-purpose weapons designed to penetrate underground bunkers or destroy satellites via EMP, which have their own design considerations.
Great comment - I have only one thing to add. Many people will enjoy reading "Command and Control", which covers the history of nuclear weapons accidents in the US and how they were managed/mitigated. It's always interesting to learn that a missile silo can explode, popping the warhead up and out (but without it exploding due to fission/fusion), and that, from the perspective of the nuclear warhead, the safety controls worked.
> Another consideration is cost. Nuclear weapons are expensive to make, so a design that can get a high yield out of a small amount of fissile material is preferred. Maintenance, and the cost of maintenance, is also relevant. Will the weapon still work in 30 years, and how much money is required to ensure that?
I've seen speculation that Russia's (former Soviet) nuclear weapons are so old and poorly maintained that they probably wouldn't work. Not that anyone wants to find out.
Small addition: weapon precision has drastically increased since the days of the monster bombs
Less need for 9 megatons against a hardened silo if you have a 1.2 megaton weapon with a 120 m CEP.
How do you know all this?
The euphemistic term used in the field is "stockpile stewardship", which is a catch-all term involving a wide range of activities, some of them forward-looking.
It's also to check that the ones they have will still work, now that there are test bans.
Well, there's a fair bit of chemistry related to the explosions that bring the sub-critical bits together. Time scales are in the nanosecond range. Then, as the subcritical bits get closer, the nuclear effects obviously start to dominate. Things like beryllium are used to reflect and intensify the chain reaction. All of that is basically just a starter for the fusion reaction, which often involves uranium, lithium deuteride, and more plutonium.
So it involves very small time scales, chemistry, fission, fusion, creating and channeling plasmas, high neutron fluxes, extremely high pressures, and of course the exponential release of amazing amounts of energy as matter is literally converted to energy at temperatures exceeding those in the sun.
Then add to all of that the reality of aging. Explosives can degrade, the structure can weaken (age and radiation), radioactive materials have half-lives, etc. What should the replacement rate be? What kind of maintenance would lengthen the useful lives of the weapons? What fraction of the arsenal should work at any given time? How will vibration during delivery impact the above?
Seems like plenty to keep a supercomputer busy.
I'd never considered this, but do the high temperatures impose additional computational requirements on the chemical portions?
I'd assume computing atomic behavior at 0K is a lot simpler than at 800,000,000K, over the same time step. ;)
> Are they trying to model every single atom?
Given all nuclear physics happens inside atoms, I'd hope they're being more precise.
Note that a frontier of fusion physics is characterising plasma flows. So even at the atom-by-atom level, we're nowhere close to a solved problem.
Or maybe it suffices to model the whole thing as a gas. It all depends on what they're trying to compute.
> maybe it suffices to model the whole thing as a gas
What are you basing this on? Plasmas don't flow like gases even absent a magnetic field. They're self interacting, even in supersonic modes. This is like saying you can just model gases like liquids when trying to describe a plane--they're different states of matter.
Modern weapon codes couple computationally heavy physics like radiation and neutron transport, hydrodynamics, plasma, and chemical physics. While a single 1-D or 2-D simulation might not be too heavy in compute, large ensembles of simulations are often run for UQ (uncertainty quantification) or sensitivity analysis in design work.
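As a toy illustration of the "large ensembles for UQ" point (a minimal sketch, not any real weapons code: `black_box_model` is a hypothetical stand-in for an expensive multiphysics run, and the input distributions are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box_model(density, temperature):
    """Hypothetical stand-in for one expensive multiphysics simulation."""
    return density ** 1.5 * np.exp(-1.0 / temperature)

# Monte Carlo UQ: sample the uncertain inputs, run the model once per sample,
# and look at the spread of the output.
n_runs = 10_000
density = rng.normal(1.0, 0.05, n_runs)       # made-up nominal value and spread
temperature = rng.normal(2.0, 0.10, n_runs)   # made-up nominal value and spread

outputs = black_box_model(density, temperature)
print(f"output mean = {outputs.mean():.4f}, std = {outputs.std():.4f}")
# In real design work each ensemble member is itself a large 2-D/3-D multiphysics
# run, which is where the compute budget goes.
```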
>Are they trying to model every single atom?
Modelling a single nucleus, even one much lighter than uranium, is a capital-H Hard Problem involving many subject matter experts and a lot of optimisation work far beyond 'just throw it on a GPU'. Quantum systems become intractable very quickly without very clever approximations and a lot of compute, and quantum chromodynamics is by far the worst at this. Look up lattice QCD for a relevant keyword.
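To make "intractable without clever approximations" concrete: just storing the full state vector of a brute-force quantum many-body calculation blows up exponentially (a generic illustration, nothing specific to lattice QCD, which has its own cost structure):

```python
# Memory needed to store the full state vector of n two-level particles:
# 2**n complex amplitudes at 16 bytes each (complex128).
for n in (20, 40, 60):
    n_bytes = (2 ** n) * 16
    print(f"n = {n:2d}: {n_bytes / 1e9:.3g} GB")
# n = 20: ~0.017 GB    (fine on a laptop)
# n = 40: ~17,600 GB   (a large cluster's worth of RAM)
# n = 60: ~1.8e10 GB   (hopeless without approximations)
```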
These usually get split into nodes, and scientists can access some number of nodes at a time. The whole thing isn't working on a single problem.
It's because of the way the weapons are designed, which requires a CNWDI clearance to know, so your curiosity is not likely to be sated.
> It's because of the way the weapons are designed, which requires a CNWDI clearance to know, so your curiosity is not likely to be sated.
While that's true, the information that is online is surprisingly detailed.
For example, this series "Nuclear 101: How Nuclear Bombs Work"
https://www.youtube.com/watch?v=zVhQOhxb1Mc
https://www.youtube.com/watch?v=MnW7DxsJth0
Having once had said clearance limits my answers.
And if they really want to know: https://www.energy.gov/nnsa/working-nnsa
Pot, meet kettle? It's usually industry that leads with the "write inefficient code, hardware is cheaper than dev time" approach. If anything, I'd expect a long-running physics research project to have well-optimized code. After all, that's where all the optimized math routines come from.
I bet the bulk of it is still super-fast Fortran code.
> I fail to understand how these nuclear bomb simulations require so much compute power
I wrote a previous HN comment explaining this:
Tl;dr - Monte Carlo Simulations are hard and the NPT prevents live testing similar to Bikini Atoll or Semipalatinsk-21
https://news.ycombinator.com/item?id=39515697
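For a flavour of why Monte Carlo transport eats compute, here's a toy random-walk particle history in a uniform 1-D slab (a minimal sketch with made-up cross-sections; real codes track energy, angle, full 3-D geometry, and billions of histories):

```python
import random

def one_history(slab_thickness=10.0, sigma_total=1.0, absorb_prob=0.3):
    """Follow one particle through a uniform 1-D slab (toy model, made-up numbers)."""
    x, direction = 0.0, 1.0
    while True:
        x += direction * random.expovariate(sigma_total)  # distance to next collision
        if not 0.0 <= x <= slab_thickness:
            return "leaked"
        if random.random() < absorb_prob:
            return "absorbed"
        direction = random.choice((-1.0, 1.0))             # isotropic scatter (1-D)

random.seed(1)
histories = [one_history() for _ in range(100_000)]
print({fate: histories.count(fate) / len(histories) for fate in ("absorbed", "leaked")})
```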
My brother in Christ, it's a supercomputer. What an odd question.
Anybody know what "Inertial Confinement Fusion" is in the referenced article?
> what "Inertial Confinement Fusion" is
The experimental fusion approach used by the NIF [1][2].
It's conveniently simultaneously an approach to fusion power, a way to study fusion plasmas and a tiny nuclear explosion.
[1] https://en.wikipedia.org/wiki/Inertial_confinement_fusion
[2] https://en.wikipedia.org/wiki/National_Ignition_Facility
This is a major milestone for Oxide Computer team. Congrats
Is El Capitan entirely made of Oxide components?
Do super computers need proximity to other compute nodes in order to perform this kind of computations?
I wonder what would happen if Apple offered people something like iCloud+ in exchange for using their idle M4 compute at night time for a distributed super computer.
The thing that sets these machines apart from something you could set up in AWS (to some degree), or in a distributed sense like you're suggesting, is the interconnect - how the compute nodes communicate. For a large system like El Capitan, you're paying a large chunk of the cost for connecting the nodes together: low latency and interesting topologies that Ethernet, and even InfiniBand, can't get close to. Code that requires a lot of DMA or message passing really will take up all of the bandwidth that's available; that becomes the primary bottleneck in these systems.
The interconnect has been Cray's bread and butter for multiple decades: Slingshot, Dragonfly, Aries, Gemini, SeaStar, NUMAlink via SGI, etc., and those for the less massively parallel systems before those.
I've seen nothing showing that Slingshot has any particular advantage over IB for HPC. Sure, HPE pushes Slingshot (an HPE interconnect) over giving bags of money to Nvidia, but that's a business decision. Eagle (the #4 cluster on the list) is InfiniBand NDR.
I believe 306 of the top 500 clusters use InfiniBand. Pretty sure the advanced topologies like dragonfly are supported on IB as well as Slingshot. From what I can tell, Slingshot is much like Ultra Ethernet: trying to take the best of IB and Ethernet and making a new standard. From what I can tell, Slingshot 11 latency is much like what I got with Omni-Path/PathScale back when dual-core Opterons were the cutting edge.
Yes, supercomputers need low-latency communication between nodes. If a problem is "embarrassingly parallel" (like folding@home, mentioned by sibling comment) then you can use loosely coordinated nodes. Those sorts of problems usually don't run on supercomputers in the first place, since there are cheaper ways to solve them.
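A rough way to see this is the usual alpha-beta communication model: time ≈ latency + size / bandwidth. A minimal sketch with illustrative, made-up numbers (not vendor specs) for an HPC-class link versus a commodity one:

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Alpha-beta model: time to send one message of the given size."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

# Illustrative numbers only, not vendor specs.
hpc_link = dict(latency_s=2e-6, bandwidth_bytes_per_s=25e9)       # ~2 us, ~200 Gb/s
commodity = dict(latency_s=50e-6, bandwidth_bytes_per_s=12.5e9)   # ~50 us, ~100 Gb/s

for size in (8, 64 * 1024, 16 * 1024 * 1024):   # tiny halo value, small block, big block
    t_hpc = message_time(size, **hpc_link)
    t_com = message_time(size, **commodity)
    print(f"{size:>10d} B: hpc {t_hpc * 1e6:8.1f} us   commodity {t_com * 1e6:8.1f} us")
# For tiny messages the latency term dominates completely, which is exactly where
# tightly coupled simulations (lots of small halo exchanges) live.
```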
If you weren’t aware - https://en.m.wikipedia.org/wiki/Folding@home
More of a SETI@home man myself.
This new upstart to the name may win in search results today, but in a few years the first and true El Cap will reclaim its place. It will outlast all of us.
https://en.wikipedia.org/wiki/El_Capitan
So, they built this supercomputer to test new and more deadly nuclear weapons. That makes me so "happy". I am absolutely not worried about two nuclear powers being close to the brink of direct war, even as we speak; nor about the abandonment of the course of nuclear disarmament treaty; nor about the repeated talk of a coming war against certain Asian powers. Everything is great and I'll just fawn over the colorful livery and the petaflops figure.
> they built this supercomputer to test new and more deadly nuclear weapons
If you are afraid of nuclear war, the thing to fear is a nuclear state's capacity to retaliate being questioned. These supercomputers are the alternative to live tests. Taking them away doesn't poof nuclear weapons, it means you are left with a half-assed deterrent or must resume live tests.
> the abandonment of the course of nuclear disarmament treaty
North Korea, the American interventions in the Middle East and Ukraine set the precedent that nuclear sovereignty is in a separate category from the treaty-enforced kind. Non-proliferation won't be made or broken on the back of aging, degrading weapons.
> repeated talk of a coming war against certain Asian powers
One invites war by refusing to prepare for it.
The whole point of testing (and making) deadly nuclear weapons is to ensure they are never used again. The Mutually Assured Destruction doctrine has kept us alive through the darkest days of the Cold War (also keeping the Cold War cold). The only way to credibly threaten anyone who tries to annihilate you with certain annihilation is with lots of such doomsday weapons. We have lived in this Mexican standoff for longer than we can remember.
We are living in the darkest days of the Cold War right now.
I would reference an older article on super computers and the nuclear weapon arsenal.
https://www.techtarget.com/searchdatacenter/news/252468294/C...
> "The Russians are fielding brand new nuclear weapons and bombs," said Lisa Gordon-Hagerty, undersecretary for nuclear security at the DOE. She said "a very large portion of their military is focused on their nuclear weapons complex."
> It's the same for China, which is building new nuclear weapons, Gordon-Hagerty said, "as opposed to the United States, where we are not fielding or designing new nuclear weapons. We are actually extending the life of our current nuclear weapons systems." She made the remarks yesterday in a webcast press conference.
> ...
> Businesses use 3D simulation to design and test new products in high performance computing. That is not a unique capability. But nuclear weapon development, particularly when it involves maintaining older weapons, is extraordinarily complex, Goldstein said.
> The DOE is redesigning both the warhead and nuclear delivery system, which requires researchers to simulate the interaction between the physics of the nuclear system and the engineering features of the delivery system, Goldstein said. He characterized the interaction as a new kind of problem for researchers and said 2D development doesn't go far enough. "We simply can't rely on two-dimensional simulations -- 3D is required," he said.
> Nuclear weapons require investigation of physics and chemistry problems in a multidimensional space, Goldstein said. The work is a very complex statistical problem, and Cray's El Capitan system, which can couple this computation with machine learning, is ideally suited for it, he said.
---
This isn't designing new ones. Or blowing things up ( https://www.reuters.com/article/us-usa-china-nuclear/china-m... ) to see if they still work. It is simulating them to have the confidence that they still work - and that the adversaries of the US know that the scientists are confident that they still work without having to blow things up.
> to see if they still work. It is simulating them to have the confidence that they still work
The Armageddon scenario is some nuclear states conduct stockpile stewardship, some don’t, and those who do discover that warheads come with a use-by date.
I'd guess it's unlikely to be the real use case. The real one is classified. Plus it's not like more deadly nuclear weapons would change anything, we can do bad enough with what we already have.
> it's unlikely to be the real use case. The real one is classified.
What are you basing this on?
> it's not like more deadly nuclear weapons would change anything
We haven't been chasing yield in nuclear weapons since the 60s.
Our oldest warheads date from the 60s [1]. For obvious reasons, the experimental track record on half-century old pits is scarce. We don't know if novel physics or chemistry is going on in there, and we don't want to be the second ones to find out.
[1] https://en.wikipedia.org/wiki/B61_nuclear_bomb
Maybe there is research not on bigger bangs, but on smaller packages?
Think about a baseball-size device able to take out a city block.
Then think about a squadron of drones able to transport those baseballs to very precise city blocks...
> I'd guess it's unlikely to be the real use case
I can safely say that nuclear simulations are one of the major drivers for HPC research globally.
It is not the only one (genomics, simulations, fundamental research are also major drivers) but it is a fairly prominent one.
I'd rather have a few supercomputers doing stockpile stewardship than live testing. As much as I hate it personally, these weapons are a part of our society, for better or for worse, until we (as in the people) decide they won't be, by electing those who will help dismantle the programs. They should be maintained, and these tools help with that.
Eh, we have all the nukes we need and we already know how to build them. This is going to help more with fusion power than fusion explosives.
Noting here that 2700 quadrillion operations per second is less than the estimated sustained throughput of productive bfloat16 compute during the training of the large llama3 models, which IIRC was about 45% of 16,000 quadrillion operations per second, i.e. 16k H100s in parallel at about 0.45 MFU. The compute power of national labs has fallen far behind industry in recent years.
A 64 bit float operation is >4X as expensive as a 16 bit float operation.
Agreed. However, also note that if it were only matrix multiplies and no full transformer training, the performance of that Meta cluster would be closer to 16k PFlop/s, still much faster than the El Capitan performance measured on Linpack and multiplied by 4. Other companies have presumably cabled 100k H100s together, but they don't yet publish training data for their LLMs. It is good to have competition; I just didn't expect the tables to flip so dramatically over the last two decades, from a time when governments still ruled the top spots in computer centers with ease to nowadays, where the assumption is that there are at least ten companies with larger clusters than the most powerful governments.
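For anyone wanting to reproduce the arithmetic in this subthread (a back-of-the-envelope sketch; the ~0.989 Pflop/s dense bf16 peak per H100 is the commonly quoted spec, the other figures are the ones given above):

```python
# Back-of-the-envelope flop comparison using the figures from this subthread.
h100_bf16_peak_pflops = 0.989   # commonly quoted dense bf16 peak per H100
n_gpus = 16_000
mfu = 0.45

llama3_peak = n_gpus * h100_bf16_peak_pflops     # ~15,800 Pflop/s peak bf16
llama3_sustained = mfu * llama3_peak             # ~7,100 Pflop/s sustained

el_capitan_fp64_peak = 2_700                     # Pflop/s, from the article quote
el_capitan_bf16_equiv = 4 * el_capitan_fp64_peak # crude "fp64 is >4x the cost" scaling

print(f"llama3 cluster: ~{llama3_peak:,.0f} Pflop/s peak, ~{llama3_sustained:,.0f} sustained")
print(f"El Capitan: {el_capitan_fp64_peak:,} Pflop/s fp64 peak, ~{el_capitan_bf16_equiv:,} bf16-equivalent")
```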
I'd expect Linpack to be much closer to a user research application than training LLMs. My understanding of LLM training is that it's more about throughput, with very predictable communication patterns: not latency sensitive, but bandwidth intensive.
Most parallel research code, especially at this scale, has a different balance of operations to memory bandwidth, and is much more worried about interconnect latency.
I wouldn't assume that just because various corporations have large training clusters that they could dominate HPC if they wanted to. Hyperscalers have dominated throughput for many years now, but HPC is a different beast.
All HPC codes and LLMs tend to get fully optimized to their hardware specs. When you train models with over 405B parameters and process about 2 million tokens per second, calculating derivatives on all these parameters every few seconds, you do end up at the boundary of latency and bandwidth at all scales (host to host, host to device, and the multiple rates within each device). Typical LLM training at these scales multiplexes three or more different types of parallelism to avoid keeping the devices idle, and of course they also have to deal with redundancy and frequent failures of this erratic hardware (if a single H100 fails once every five years, 100K of them would see more than two failures per hour.)
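The failure-rate aside at the end is just arithmetic (a sketch using the hypothetical one-failure-per-GPU-per-five-years rate from the comment above):

```python
# If one GPU fails on average once every five years, how often does a
# 100,000-GPU cluster see a failure? (Hypothetical rate from the comment above.)
gpus = 100_000
years_between_failures_per_gpu = 5.0

failures_per_hour = gpus / (years_between_failures_per_gpu * 365 * 24)
print(f"~{failures_per_hour:.1f} failures per hour")  # ~2.3
```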
In terms of heat dissipation, maybe, yes. But not necessarily in time.
Any idea how that stacks up with GPT-4?
If I knew, I wouldn’t be able to disclose it :-)
Training an LLM (basically Transformers) is different workflow from Nuclear Simulations (basically Monte Carlo simulations)
There are a lot of intricates, but at a high level they require different compute approaches.
Absolutely. Though the performance of El Capitan is only measured by a Linpack benchmark, not the actual application.
I thought modern supercomputers use benchmarks like HPCG instead of LINPACK?
The top 500 includes both. There is no HPCG result for El Capitan yet:
https://top500.org/lists/hpcg/2024/11/
This is about the raw compute, no matter the workflow.
Can you expand on why the operations per second is not an apt comparison?
When you're doing scientific simulations, you're generally a lot more sensitive to FP precision than ML training which is very, very tolerant of reduced precision. So while FP8 might be fine for transformer networks, it would likely be unacceptably inaccurate/unusable for simulations.
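A tiny, generic illustration of the point (a minimal sketch, nothing to do with any real simulation code): naively accumulating many small increments in half precision versus double precision.

```python
import numpy as np

# Add 1e-4 one hundred thousand times; the exact answer is 10.0.
increment = 1e-4
n = 100_000

total_fp64 = np.float64(0.0)
total_fp16 = np.float16(0.0)
for _ in range(n):
    total_fp64 += np.float64(increment)
    total_fp16 = np.float16(total_fp16 + np.float16(increment))

print(f"float64: {total_fp64:.6f}")          # ~10.000000
print(f"float16: {float(total_fp16):.6f}")   # stalls far below 10: once the running sum
                                             # is large relative to half-precision
                                             # resolution, the increments simply vanish
```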