Predicting the weather with Linux - FSL's cluster
As is the nature of government programs, the "prototype" program shed its drop-dead date and grew into something larger and more permanent. For a while it became the "Program for Regional Observing and Forecasting Systems" before morphing into its current incarnation as the NOAA Forecast Systems Laboratory. FSL's mission covers a lot of ground, but, in the end, they remain a technology transfer group, dedicated to developing and evaluating technology in the weather forecasting arena.
The VAXen are long gone, of course, replaced by high-end SGI servers and
such. FSL took a different turn, however, with its
announcement last September that it was installing a new,
$15 million supercomputing system provided by High Performance Technologies Inc., also
known as HPTi. This isn't just any supercomputer, though: it's a
Beowulf-style Linux cluster. It is, perhaps, the first system of its kind.
Government agencies have been piecing together clusters for years, but this
may be the first that was purchased as a supported commercial product.
Greg Lindahl, senior architect at HPTi and leader of the FSL cluster project, invited me over to have a look. It wouldn't be like me to turn down a chance to see one of the biggest Linux systems on the planet, especially since it's in my home town...
What Jet is made of

The FSL cluster (called "Jet") currently consists of 276 nodes, organized into three long banks. The nodes are unmodified, off-the-shelf Compaq Alpha systems with 667 MHz processors and 512 MB of memory. The current installation is simply the first phase of the system; the second phase, due to be deployed by late summer, will double the number of nodes. Then comes the third phase where, according to Mr. Lindahl, it "gets really big." The third phase also involves replacing the nodes currently being used, on the theory that they will be considered somewhat slow by then.

All of these nodes are tied together by a Myrinet interconnect, which is alleged to allow every single node to be talking to another one at full speed simultaneously. The Myrinet system, by virtue of its speed, also eliminates the need to set up complicated network topologies between the nodes. A simple topology means that users do not need to worry about which nodes their jobs are running on, which makes their lives easier. Run-time variance on this system is about 2% - a fraction of what can be encountered on clusters with complicated networking.
The software side

The nodes in the Jet cluster run Red Hat's Alpha distribution, almost straight out of the box. HPTi has applied the NFSv3 patch and added a module for the Myrinet networking; it is otherwise a stock system. Communication between nodes is done with MPI, though they use a version which has been hacked to work well with Myrinet.

The interesting software, of course, is at the higher levels. Numerical weather prediction involves dividing the world (or a subsection thereof) into many grid cells, then cranking through a number of really hairy partial differential equations on each cell. With suitably clever programming (to deal with interactions between cells), this is the sort of job that was just meant for clustered systems.
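To make the "interactions between cells" point a little more concrete, here is a toy sketch in Python. It is not FSL's code or any real forecast model: it assumes a one-dimensional grid, a simple diffusion equation standing in for the hairy ones, and plain lists standing in for cluster nodes. All of the names (`step`, `split`, `run_decomposed`) are made up for illustration. The idea it shows is that each "node" owns a strip of cells and, before every time step, only needs the edge values of its immediate neighbors.

```python
def step(cells, left, right, alpha=0.1):
    """One explicit diffusion step on a strip of cells.

    `left` and `right` are the values just outside the strip: either a
    fixed boundary value or the edge cell of a neighboring strip.
    """
    padded = [left] + cells + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def run_single(grid, steps, boundary=0.0):
    """Reference run: the whole grid is updated in one piece."""
    for _ in range(steps):
        grid = step(grid, boundary, boundary)
    return grid


def split(grid, parts):
    """Cut the grid into equal strips, one per hypothetical node."""
    if parts <= 0 or len(grid) % parts != 0:
        raise ValueError("grid length must be a multiple of parts")
    size = len(grid) // parts
    return [grid[i * size:(i + 1) * size] for i in range(parts)]


def run_decomposed(grid, parts, steps, boundary=0.0):
    """Same computation, with each strip updated independently.

    The only data shared between strips is one edge value per side per
    step. On a real cluster this exchange would be an MPI message; here
    it is just a list lookup.
    """
    strips = split(grid, parts)
    for _ in range(steps):
        # Gather neighbor edge values *before* any strip is updated.
        lefts = [boundary] + [s[-1] for s in strips[:-1]]
        rights = [s[0] for s in strips[1:]] + [boundary]
        strips = [step(s, l, r) for s, l, r in zip(strips, lefts, rights)]
    return [value for s in strips for value in s]


if __name__ == "__main__":
    grid = [0.0] * 12
    grid[5] = 100.0  # a single warm cell in a cold world
    whole = run_single(list(grid), steps=20)
    pieces = run_decomposed(list(grid), parts=4, steps=20)
    print(max(abs(a - b) for a, b in zip(whole, pieces)))
```

The decomposed run gives the same answer as the single-piece run, while each strip touches only its own cells plus two borrowed values per step. That small ratio of communication to computation is what makes this kind of problem a natural fit for a cluster with a fast interconnect.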
The business of Linux clusters

The Jet cluster is an important step toward the legitimization of commercial Beowulf clusters. The "big iron" business is a hard one to break into, and Linux-based clusters have not had the track record and high-profile deployments needed to be allowed to play on that field. After all, a manager who has finally gotten funding to buy a multi-million dollar system is not going to be inclined to take chances. Such people want security.