
What if we gave scientists machines that dwarf today’s most powerful supercomputers? What could they tell us about the nature of, say, a nuclear explosion? Indeed, what else could they discover about the world? This is the story of the quest for an exascale computer – and how it might change our lives.
One exaflop is 1,000 times faster than a petaflop. The fastest computer in the world is currently the IBM-based Roadrunner, which is located in Los Alamos, New Mexico. Roadrunner runs at an astounding one petaflop, which equates to more than 1,000 trillion operations per second. The supercomputer has 129,600 processing cores and takes up more room than a small house, yet it’s still not quite fast enough to run some of the most intense global weather simulations, nuclear tests and brain modelling tasks that modern science demands. For example, the lab currently uses the processing power of Roadrunner to run complex visual cortex and cellular modelling experiments in almost real-time. In the next six months, the computer will be used for nuclear simulation and stockpile tests to make sure that the US nuclear weapon reserves are safe. However, when exascale calculations become a reality in the future, the lab could step up to running tests on ocean and atmosphere interactions. These are not currently possible because the data streams involved are simply too large. The move to exascale is therefore critical, because researchers require increasingly fast results from their experiments.

“Current models represent a balance between the resolution of the model, which can be represented as the distance between the geographical data points used, and the time the model takes to run,” says Ed Turkel, a product manager for scalable computing and infrastructure at HP. “Increasing the resolution – moving the data points closer together – increases the accuracy of the models but dramatically increases the time to compute a solution with a system of a given size. So you need bigger and bigger systems to cope with the resolution and accuracy, while making sure you have the result in a reasonable amount of time.”
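The trade-off Turkel describes can be put into rough numbers. The sketch below is a toy cost model, not taken from any real weather code: it assumes the work is proportional to the number of grid points multiplied by the number of time steps, and that the time step has to shrink in step with the grid spacing (a common constraint for explicit numerical schemes). The function name and parameters are made up for illustration.

```python
def relative_cost(refinement, dims=3, timestep_scales=True):
    """Compute-cost multiplier when the grid spacing is divided by `refinement`.

    Assumption: cost ~ (number of grid points) x (number of time steps).
    `dims` is the number of spatial dimensions being refined.
    """
    points = refinement ** dims                      # more points in every refined dimension
    steps = refinement if timestep_scales else 1     # smaller spacing -> smaller time step
    return points * steps


for r in (1, 2, 4, 10):
    print(f"spacing divided by {r}: roughly {relative_cost(r):,}x the compute")
```

Under these assumptions, merely halving the distance between data points costs about 16 times as much computation, which is why higher resolution pushes so hard on machine size.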
Being demanding sorts, scientists don’t just want the answers more quickly; they also want more accurate answers. Equations covering wind turbulence, material strength and how substances behave under stress are all begging for improvement, explains Mark Seager, the Assistant Department Head for Advanced Technology at Lawrence Livermore. Talking to PC Plus, he made his ambitions for the area clear: “We want to make predictive statements about the safety of the nuclear stockpile. As it ages, it behaves differently from the conditions under which it was originally tested. It’s like your car in the garage – if you leave it in the garage it starts to rust. We have to understand the changes [just like] you really need that car to start. Parts decay over time, and we have to predict the performance of the changes. We need high-resolution full-physics simulations in order to do that.” Of course, a car not starting and a bomb detonating due to rust and the ravages of time are very different things.

In terms of how exascale computing would help scientists to understand better how nuclear bombs age, Seager gave an example of how the nuclear data from before 1995 was presented. It was set out and interpreted on a flat map that you could lay on a table. Back then, scientists were only analysing the 2D data. After the weapons have aged for 30 years, however, they end up in different states of deterioration. If one faulty part of the stockpile is not detected and remedied, it could explode – and then set the other weapons off. Suddenly the problem of projecting behaviour becomes a whole lot more complex.
In order to understand how one part of the stockpile impacts on another, the data must be analysed in 3D. Working at petascale levels, researchers were finally able to simulate a nuclear stockpile and see the data as a whole. At exascale speeds, researchers would not only be able to examine all of the data, they would also be able to see the complex relationships between one part of the nuclear stockpile and the others. The current situation can be thought of as a 3D ‘tube’: today, petascale computers can see the entire tube in 3D, but they can’t compare one end of it with the other. Exascale processing would enable visualisation of the 3D data as a whole, as well as all of the different interrelationships between various components.

“We see a lot of opportunity to do multiscale simulations,” says Seager. “In biology, if you’re interested in cell division, the action takes place in tenths of seconds. Modelling the same thing at a molecular level takes femtoseconds (quadrillionths of a second). It’s 16 to 17 orders of magnitude faster [in simulation]. If you want to do cell division, you have to model how the DNA splits up. It’s a complicated molecular reaction, and multiscale helps you resolve the biology as fast as possible.”
However, exascale computing is more than just a question of calculation and simulation speed. There are memory and storage requirements too. You need to be able to feed data into the supercomputer, and store the results of its calculations, at amazing speeds. Supply the data too slowly, and the power of your compute engine will go unused. This throws up all manner of challenges. The need to pass data between components quickly enough makes designing and building the necessary interconnects very difficult.
In fact, suitably fast interconnects haven’t even been invented yet. At HP Labs, promising research into photonics – using fibre optics to move data at the speed of light between the components inside a computer and even inside the CPU itself – is in its early stages. Stan Williams, whose team at HP Labs built the first working memristor, a technology intended to improve memory speeds, is in charge of developing interconnects that will be used in future supercomputers.

Interconnects are important because of scale considerations. Scaling is a grand way of describing the process of bolting several Blue Gene/L supercomputers together. At the moment it’s not possible to do this properly because current interconnects aren’t up to the job – they’re just not fast enough. Finding a solution is essential due to the type of work these supercomputers are doing. Increasingly, the machines need to be capable of performing calculations that are not what the industry calls ‘embarrassingly parallel’.
An embarrassingly parallel workload is one where it’s very easy to break the central problem down into separate elements, all of which can then be worked on in parallel. This type of work is perfect fodder for cluster computing. In the SETI@home project, for example, each PC can beaver away on its slice of the supplied data in splendid isolation. When it comes to analysing the effects of aging on a bomb, the different nuances and variables of the whole problem must also, by their very nature, be worked on in parallel. But doing so is far from easy because there are lots of dependencies between processes. For example, scientists need to model the weather in order to explain the effects of wind and rain. This research can then generate variables essential to modelling the spread of rust. While all this is going on, they need to predict how the growing rust will affect the bomb mechanisms themselves. And all the while that the bomb is rusting, the weather will be changing too. The huge compute engines needed to handle all these factors also need communication channels that are incredibly fast.
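The difference between the two kinds of workload can be sketched in a few lines. Everything here is a toy: the function names, the made-up ‘weather’ and ‘rust’ formulas and their constants are assumptions for illustration only, and the thread pool shows the structure of the problem rather than any real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def analyse_chunk(chunk):
    # Independent work: this chunk needs nothing from any other chunk,
    # which is what makes a SETI@home-style job embarrassingly parallel.
    return sum(x * x for x in chunk)


def embarrassingly_parallel(chunks):
    # Each chunk can be handed to a different worker with no communication.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(analyse_chunk, chunks))


def coupled_simulation(steps, weather=0.0, rust=0.0):
    # Dependent work: rust at each step needs the weather at that step,
    # and every step needs the state left by the previous one.
    history = []
    for _ in range(steps):
        weather = 0.5 * weather + 1.0   # toy weather update
        rust = rust + 0.1 * weather     # toy rust growth driven by weather
        history.append((weather, rust))
    return history


print(embarrassingly_parallel([[1, 2], [3]]))
print(coupled_simulation(3))
```

In the first function the workers never talk to each other. In the second, splitting weather and rust across different processors would force them to exchange values at every step, which is where fast interconnects become essential.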

Scaling is also a problem because each individual supercomputer is built to handle the data bandwidth for its own particular design. For example, the memory storage used for the Blue Gene/L architecture relies on standard magnetic disk drives – thousands of them – but an exascale supercomputer would require radically new forms of memory storage.
“You would need not just two or three but several thousand Blue Gene computers to achieve an exaflop,” says Reza Rooholamini, the Director of Engineering at Dell. “Even if it were practical to bolt these computers together, there would be many other factors that limit scalability, such as the network bandwidth between these computers and I/O bandwidth to storage. The problem mainly lies in access to shared resources, such as network, storage and memory.”
When it comes to memory and storage requirements for an exascale computer, the story doesn’t get any easier. Current technology is neither fast nor reliable enough to handle the trillions upon trillions of calculations required for such a machine. While memory technology advances at a similar rate to processor speeds (both rely on silicon advancements), mechanical components like today’s hard disks have not scaled nearly as quickly. Seager notes that, in current supercomputers, hard disk failure has been one of the most common and serious sources of problems. At exascale level, more components mean higher failure rates, because the faster data moves, the more error-prone it becomes. This would also necessitate a move to 128-bit computing to avoid latency issues.

“To put it in perspective, in the time it takes for the mechanical head of a disk drive to find a piece of data (about five thousandths of a second), an exascale system could have executed 200,000,000,000,000 instructions,” says David Flynn, the CTO of Fusion-io, a company that makes solid state and high I/O products. “What this means in practical terms is that the system would sit idle for most of the time, having nothing to do but wait for storage. Indeed, even today’s petaflop-scale systems sit idle for about five to 30 per cent of the time waiting for access to storage. That means an utter waste of up to 30 per cent of the dollars spent on the system, because they aren’t getting work done.”
“On the hardware side, most people think about power requirements,” adds Seager. “But there’s also a memory wall. The density of DRAM has increased, but the speed has not had a relative increase. Today, we hide that latency problem with caches, but those techniques are running out of gas. We are now looking at innovative parallelism strategies along the bus so that memory is not getting hung up somewhere.”
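The scale of the storage problem is easy to check with back-of-envelope arithmetic. The sketch below simply multiplies an operation rate by a waiting time; it assumes a sustained rate of exactly one exaflop (or one petaflop) and the five-millisecond seek time mentioned above, and the function name is made up for illustration.

```python
EXAFLOP = 10**18    # operations per second
PETAFLOP = 10**15   # operations per second


def ops_during_wait(ops_per_second, wait_ms):
    """Operations a machine could have completed while waiting `wait_ms` milliseconds."""
    return ops_per_second * wait_ms // 1000


print(f"exascale, 5 ms seek: {ops_during_wait(EXAFLOP, 5):,} operations")
print(f"petascale, 5 ms seek: {ops_during_wait(PETAFLOP, 5):,} operations")
```

This simple product comes out larger than the number in the quote, which suggests the quoted figure rests on a different rate or latency that the article doesn’t state. Either way the conclusion is the same: a single disk seek costs a machine of this class trillions of operations.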
Systems with millions of hundred-core processors would be difficult to maintain, and reliability is a big worry. New fault-tolerance methods will be required to make sure that the processing cores can all operate in conjunction. Some experts have suggested that an exascale supercomputer would bring back the heady days of the earliest vacuum tube computers – when one tube died, operators had to shut everything down, replace it and start over. A future supercomputer processing many trillions of operations per second might only run for a few minutes before suffering an error and being forced to shut down. The solution to this problem may involve moving to simpler designs.
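To see why a machine might only run for minutes between errors, consider a simplified reliability model. It assumes every component fails independently at a constant rate, so that the system’s mean time between failures (MTBF) is the component MTBF divided by the number of components. The one-million-hour component figure is an illustrative assumption, not a number from the article.

```python
def system_mtbf_hours(component_mtbf_hours, n_components):
    """Mean time between failures for the whole system.

    Assumption: independent components with constant failure rates, and
    any single failure stops the machine.
    """
    return component_mtbf_hours / n_components


COMPONENT_MTBF = 1_000_000  # hours; hypothetical per-component figure

for n in (1_000, 100_000, 10_000_000):
    minutes = system_mtbf_hours(COMPONENT_MTBF, n) * 60
    print(f"{n:>10,} components: a failure every {minutes:,.0f} minutes on average")
```

With ten million such parts, the model predicts a failure roughly every six minutes, which is why fewer components and better fault tolerance both matter.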
At Lawrence Livermore, we held one of the new computers for IBM’s forthcoming Sequoia project in one hand. It’s about half the size of a netbook PC, and there are no exposed wires. A demonstration of the IBM Dawn supercomputer – which will be installed this year as a precursor to Sequoia – showed that the design is ultra-simple: each computer is installed in a row, and the rows are installed in multiple racks. The picture only gets complicated with the interconnects, the storage and the simulation software programming.

“Simplicity in design is critical,” says Seager. “An exascale supercomputer would likely have fewer parts than current-generation supercomputers. Today, an entire computer within Blue Gene/L has embedded DRAM, voltage regulatory modules and Gigabit Ethernet, but it’s extremely simple.”
You probably know about Moore’s Law – the well-known observation that the number of transistors in a processor doubles every two years, often restated as performance doubling every 18 months. Amdahl’s Law is a less commonly known principle that asserts a related idea – that the speed-up from adding more processors or processor cores is capped by the part of the work that can’t be split up, so performance eventually flat-lines. In practice, the communication required over the interconnects adds to that limit. Part of the issue is that each new processor introduces a computing overhead and creates new challenges for those who want to use the extended power.
Photonics would address this issue, but not negate it. Essentially, the law reveals a computing vagary that has been known about for some time: that doubling the number of processors in a cluster or in a supercomputer does not double computing power, and that each new addition just adds an incremental speed increase. This is also true with GPUs: in an SLI configuration, each new card does not double performance; in fact, a new unit only adds about a 10 to 20 per cent increase in graphics throughput. Amdahl’s Law applies to GPUs as well – eventually you would reach a point after which adding another GPU to the system would not increase throughput at all.
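In its usual textbook form, Amdahl’s Law gives the speed-up on n processors as 1 / ((1 − p) + p / n), where p is the fraction of the work that can run in parallel. The sketch below evaluates that formula; the 95 per cent parallel fraction is an arbitrary example value, and the formula ignores communication overhead, which only makes matters worse.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Ideal speed-up from Amdahl's Law, ignoring communication costs."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


P = 0.95  # example: 95 per cent of the work parallelises perfectly

for n in (1, 2, 16, 1_024, 1_000_000):
    print(f"{n:>9,} processors: {amdahl_speedup(P, n):6.2f}x faster")
```

However many processors are added, the speed-up in this example never reaches 20 times, because the five per cent of serial work always remains.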
One method of dealing with this is to increase the speed of each processor, which then allows you to add more processors before you reach the plateau again. HP’s Turkel says that the more common approach in supercomputer design is to use accelerators between processors, and to make sure that the communication over interconnects is as fast as possible. “You can use coprocessors, which are special-purpose processors that accelerate the compute-intensive part of an application,” he says. “These include general-purpose graphical processing units (GPGPUs), field-programmable gate arrays (FPGAs) and custom ASICs like Clearspeed. But not all applications lend themselves to acceleration.
“Some of the top systems use accelerators to achieve very high LINPACK numbers, but it’s not clear that they deliver substantial application speedup. Nevertheless, we’re seeing some significant interest in certain application segments for accelerators due to demonstration of real increases in speed.”
Many questions remain when it comes to exascale computing, not least how much such a machine will cost to build. Flynn from Fusion-io says that it would cost millions of dollars just to run the virtual experiments on an exascale supercomputer, let alone the cost of producing the equipment itself. Current supercomputers cost up to $100 million, and given the manufacturing scale required, an exascale supercomputer could cost 10 times as much. That would be a heavy investment to make. Yet the march of science demands the progression of computing power to solve some of the most complex problems facing humankind – such as climate change and the cure for cancer. And what’s beyond exascale? Believe it or not, zettaflop computing is already on the very distant – but just visible – horizon.
If you’ve been bitten by the supercomputing bug, check out our guide to building the ultimate cluster PC.
Copyright Future Publishing Limited (company registered number 2008885), a company registered in England and Wales whose registered office is at Beauford Court, 30 Monmouth Street, Bath, BA1 2BW, UK
1000 trillion flops is a petaflop, which is the performance of the biggest machines today.
1000 petaflops is an exaflop.
It won't save the world.
Submitted by Anonymous on 12 June 2009 - 7:21pm.
This bit:
"(a tenth to a fifteenth of a second)"
in "Modelling the same thing at a molecular level takes femtoseconds (a tenth to a fifteenth of a second)."
is wrong. It's one quadrillionth of a second.
Submitted by Anonymous on 29 June 2009 - 4:40am.
"customer ASICs like Clearspeed"
Maybe you meant "custom ASICs" ? But that would still be misleading. It's a processor. Highly parallel, but a processor. Programmed in C. And with that all important fault-tolerance stuff built in.
Submitted by Anonymous on 7 July 2009 - 10:07am.
Well, I know that I enjoyed this article in its entirety.
Submitted by StanislausBabalistic on 18 October 2009 - 12:09am.