
SIAM News

Severe-weather Experiment Puts Leading-edge Numerical Model Technology to the Test

July 10, 2008

By Michelle Sipics

Operational forecasters at NOAA’s Storm Prediction Center create forecasts overnight, such as the one shown here---generated for Friday, June 6, 2008---referred to as “Day 1 Convective Outlooks.” Once such forecasts have been generated, participants in the Hazardous Weather Testbed select the general area, approximately 1500 km², over which human probabilistic forecasts will be generated.

Each year in mid-April, as procrastinating Americans file their tax returns, researchers in Norman, Oklahoma, embark on a much more interesting activity: the Hazardous Weather Testbed Spring Experiment of the National Oceanic and Atmospheric Administration. The researchers---NOAA scientists, faculty members and graduate students from the University of Oklahoma, and weather forecasters from around the world---gather to determine whether severe storms can be predicted with numerical models---and, if so, how those models can be implemented to help weather forecasters.

The story, says Steven Weiss, science and operations officer of NOAA's Storm Prediction Center (SPC), is set in a veritable "sea of acronyms": the Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma; the National Severe Storms Laboratory (NSSL); the Pittsburgh Supercomputing Center (PSC); and the National Weather Service's Environmental Modeling Center (EMC), along with other groups from the National Centers for Environmental Prediction (NCEP).

CAPS, based at the University of Oklahoma, was co-founded in 1989 by Kelvin Droegemeier, who is enthusiastic about the "melting pot of science, technology and operations" in Norman. "It's fun to watch dyed-in-the-wool theoreticians and researchers and hardcore modelers come together and have their eyes opened," he says with a laugh. "All these different groups can come together and work almost in an experimental lab where we're testing out new technology."

Droegemeier was director of CAPS for 12 years; although he stepped down in 2006 to take an appointment as associate vice president of research for the university, his involvement in the spring experiment continues.

Probabilistic Guidance
"The real notion of this is how to do warning based on numerical models," he explains. "Right now, warning is based on radar: 'I think I see a tornado.' What we're talking about is getting numerical model technology to the point where, maybe one to two hours ahead of time, we can predict when a storm might produce a tornado---and based on that forecast provide a warning."

Current operational weather forecasts for North America are based on a grid with a horizontal resolution of approximately 12 kilometers, while the highest-resolution ensemble forecasts produced by NCEP use an average resolution of 40 kilometers. Only the net effect of thunderstorms can be represented at those resolutions, says Ming Xue, current director of CAPS and leader of the day-to-day activities for the Hazardous Weather Testbed's forecasts and experiments.

"The actual prediction of thunderstorms has to be inferred by human forecasters, based on coarse-resolution model output," Xue explains. "The experiment [attempts to] evaluate and demonstrate the value, for the first time, of ensemble forecasts at storm-resolving resolutions in providing probabilistic forecast guidance." In 2007, the models were run at 4-km grid spacing in a 10-member ensemble, with a run at 2 km for comparison. The 4-km spacing, Droegemeier points out, barely represents the larger storms explicitly. "Ideally," he says, "we want to be using grid spacings of 1 km or less."
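One common way to turn an ensemble into probabilistic guidance is to count, at each grid point, the fraction of members that predict a given condition. The sketch below illustrates that idea only; it is not the method used in the spring experiment. The function name, the toy reflectivity values, and the 40 dBZ threshold are all assumptions made up for the example.

```python
from typing import List, Sequence


def exceedance_probability(
    members: Sequence[Sequence[float]], threshold: float
) -> List[float]:
    """Fraction of ensemble members at or above `threshold` at each grid point.

    `members` holds one sequence of values per ensemble member, all covering
    the same grid points in the same order.
    """
    n_members = len(members)
    if n_members == 0:
        raise ValueError("need at least one ensemble member")
    n_points = len(members[0])
    if any(len(m) != n_points for m in members):
        raise ValueError("all members must cover the same grid points")
    return [
        sum(1 for m in members if m[i] >= threshold) / n_members
        for i in range(n_points)
    ]


# Hypothetical 10-member ensemble of simulated reflectivity (dBZ)
# at three grid points. The values are invented for illustration.
toy_members = [
    [45.0, 10.0, 38.0],
    [50.0, 12.0, 41.0],
    [42.0, 8.0, 35.0],
    [47.0, 15.0, 44.0],
    [39.0, 9.0, 30.0],
    [51.0, 11.0, 43.0],
    [44.0, 14.0, 36.0],
    [48.0, 7.0, 40.0],
    [41.0, 13.0, 33.0],
    [46.0, 10.0, 42.0],
]

# Assumed threshold of 40 dBZ as a stand-in for "storm present".
toy_probabilities = exceedance_probability(toy_members, threshold=40.0)
print(toy_probabilities)  # [0.9, 0.0, 0.5]
```

With 10 members, the probabilities can only take values in steps of 0.1, which is one reason larger ensembles give smoother guidance.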

In the meantime, the experiment remains the first to highlight the potential of storm-resolving ensemble forecasts on an operational level.

"We have seen many successful cases where thunderstorms were correctly forecasted with reasonable timing and location accuracy beyond 24 hours," Xue says. The experiment can also evaluate the effect of national-scale radar data assimilation on short-range storm prediction, he adds.

Radar data assimilation is a new feature of the experiment, added in 2008. Because the grids in current operational numerical models are too coarse to resolve storms, the researchers in the spring experiment have to interpolate those coarse grid analyses to the finer grids being used in Norman. But those analyses still contain no radar data, Droegemeier points out.

"This year, we [made] one extremely important change," Droegemeier continues. "We assimilated radar data in the model's initial conditions, the first time this has ever been done in such forecasting. Severe storms have never been predicted using fine grids and fine data from radar, and this was the first time in history where we did so over such large areas, and also using an ensemble framework. If storms exist at the time the model forecast begins but are not present in the initial conditions, the model may never generate them. That's the value of radar data."

T-Storms on the TeraGrid
The kind of modeling used to achieve the predictions Xue and Droegemeier describe requires a lot of computing power---Droegemeier notes that simply halving the spacing of the 3D grid leads to a 16-fold increase in computing time. Hence the involvement of PSC.
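The article does not spell out where the factor of 16 comes from. One standard explanation, assumed here, is that halving the spacing doubles the number of grid points in each of the three spatial dimensions (a factor of 8), and that numerical stability then requires a time step half as long, doubling the number of steps. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def relative_cost(refinement: int, spatial_dims: int = 3) -> int:
    """Relative computing cost when grid spacing is divided by `refinement`.

    Assumes the number of grid points grows by `refinement` in every spatial
    dimension and the number of time steps grows by `refinement` as well
    (time step shrinking in proportion to the grid spacing).
    """
    if refinement < 1:
        raise ValueError("refinement must be at least 1")
    grid_point_factor = refinement ** spatial_dims
    time_step_factor = refinement
    return grid_point_factor * time_step_factor


print(relative_cost(2))  # halving the spacing, e.g. 4 km -> 2 km: 16
print(relative_cost(4))  # quartering it, e.g. 4 km -> 1 km: 256
```

Under the same assumptions, moving from the 4-km ensemble grid to the 1-km spacing Droegemeier describes as ideal would cost roughly 256 times as much per forecast.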

PSC has made its piece of the National Science Foundation TeraGrid available for the spring experiment, which Droegemeier views as precisely the type of research envisioned by NSF for the TeraGrid: "They want these so-called grand challenge or hero applications that really push computing to its limits."

The collaboration between the spring experiment and PSC is made possible in part by a LambdaRail link between the two centers, set up during the 2007 experiment. LambdaRail, network infrastructure designed to connect the U.S. research community, allows the spring experiment to swap data with PSC. In particular, it allows for the transfer of high-frequency full-volume 3D data sets to the University of Oklahoma.

Such data sets are generated by the 2068-processor Cray XT3 system at PSC. The spring experiment uses approximately 1500 of the system's processors for about eight hours each night, running data analysis, forecasts, and post-processing. The theoretical peak speed for the combined 1500 processors is about 7.8 teraflop/s; the prediction model used for the 4-km and high-resolution 2-km forecasts achieves about 2 teraflop/s of sustained performance, Xue says. Still other experiment-related forecasts are done at the Indiana University Supercomputing Center. With so much data being thrown around, the experiment's success depends heavily on the reliability of the high-performance computing systems involved.
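The peak and sustained figures quoted above imply that the model runs at roughly a quarter of theoretical peak. The short calculation below works that out from the article's rounded numbers; the constant and function names are made up for the example, and the results are only as precise as the "about" figures they start from.

```python
# Figures quoted in the article (all approximate).
PROCESSORS_USED = 1500
PEAK_TFLOPS = 7.8        # theoretical peak for the 1500 processors
SUSTAINED_TFLOPS = 2.0   # sustained performance of the prediction model


def sustained_fraction(sustained: float, peak: float) -> float:
    """Fraction of theoretical peak actually sustained."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained / peak


def per_processor_gflops(total_tflops: float, processors: int) -> float:
    """Average rate per processor in gigaflop/s (1 teraflop/s = 1000 gigaflop/s)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return total_tflops * 1000.0 / processors


efficiency = sustained_fraction(SUSTAINED_TFLOPS, PEAK_TFLOPS)
peak_each = per_processor_gflops(PEAK_TFLOPS, PROCESSORS_USED)
sustained_each = per_processor_gflops(SUSTAINED_TFLOPS, PROCESSORS_USED)

print(f"sustained / peak: {efficiency:.0%}")
print(f"peak per processor: {peak_each:.1f} gigaflop/s")
print(f"sustained per processor: {sustained_each:.2f} gigaflop/s")
```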

"What's interesting about what we're doing is that it's a quality-of-service issue. If there's a computing glitch and it pushes the forecast back---well, the forecast is no good if it comes out after the weather happens," Droegemeier laughs.

Following the Weather
Droegemeier is quick to point out that computers make research possible that, in theory, shouldn't be possible at all. "There's a theory of predictability that's holding true for the large-scale atmosphere pretty well, but it kind of falls on its face when you look at the work we're doing," Droegemeier says. "We're able to go out and do these experiments without knowing how predictable the atmosphere is. Computers allow us to do things that may or may not be possible."

That computing power, of course, has to be used in conjunction with improved models and algorithms. The techniques involved in the spring experiment are not necessarily new, but are collectively very well evolved, Droegemeier says.

"The model we're using is quite sophisticated and uses very good but not unconventional techniques to solve the governing partial differential equations. The most important improvement concerns how we define the initial conditions of the model, and especially the fact that we're bringing in fine-scale radar observations."

The result: In some cases, thunderstorm regions can be predicted up to a day in advance, with a spatial accuracy of about 100 miles. Predicting individual storms isn't quite "on the radar" yet, but Droegemeier describes a 24-hour advance window for thunderstorm regions as a vast improvement over the current predictability of single storms.

"Classical predictability theory says the limits should be on the order of 30 minutes for a single thunderstorm," he says. In terms of thunderstorm regions, "we can get squall lines and other storm complexes reasonably well, but a given storm within them isn't really predictable" more than an hour or two in advance.

The lack of predictability also applies to the area covered by the experiment on a given day: The coverage area is not a fixed domain. Because the SPC is responsible for the lower 48 states, the researchers "follow the weather" during an experiment. (Such "on demand" forecasts are the ones done at the IU Supercomputing Center.)

"We will choose a domain each day where it looks like the severe weather threat is the greatest, or the scientific challenge is unusually large," Weiss explains.

Researchers and weather forecasters gather each year in Norman, Oklahoma, for NOAA's Hazardous Weather Testbed Spring Experiment. Kelvin Droegemeier, founder of the Center for Analysis and Prediction of Storms at the University of Oklahoma, defines the challenge: "getting numerical technology to the point where, maybe one to two hours ahead of time, we can predict when a storm might produce a tornado---and based on that forecast provide a warning."

Working on the Edge
"There's no roadmap in terms of where we should be going at this point," Weiss says. "But I think it works really well that we've sort of become a gathering ground over the last 12 years . . . where we're more interested in improving the models and finding out how we can use them to better formulate severe model forecasts, and not so much, 'this model is better than that model.'"

The modeling community alone cannot decide what constitutes "success" for a new forecast model, Weiss continues. "It's the operational forecasters who will make the decision as to whether something is useful or not."

Droegemeier agrees, citing one of the spring experiment's larger goals: "to get the futuristic forecast technology that might be five years down the road from actual implementation into the hands of forecasters right now, to start experimenting with it and learning from it. It has to be developed in conjunction with forecasters."

The fundamental goal of the spring experiment, he continues, requires more analysis.

"[We want] to say, can we predict these intense events and if we can, what does it mean?" he says. "What does it mean for the airline industry, for construction and surface transportation, and so on? A big challenge is conveying the information to the stakeholders in a way that's useful to them."

Weiss elaborates on some of the social science aspects of the work: "A large part of what needs to be taken into account, and I think it's being acted upon more and more in parts of the meteorological community, is how we communicate to the user community."

The question is not only how warnings are sent, but also to whom. "How do we package [warning] information," he asks, "and who do we send it to who then moves that information out further, such as the media? We have to make this much more of an interdisciplinary approach. The science is tough enough as it is; now we have to figure out how we tell people what we know, but also what we don't know---the uncertainty of it.

"How do we get people not to panic?" he continues. "And emergency response---what do they do with the information? If we don't put it in a form they can use, we're not as effective as forecasters. This is why there's more and more work being done bringing the meteorological community together with the social science community, communicating information about risks and hazards."

That such a diverse group---with representatives from industry, academia, and the government---could work so well together in a $65 million a year enterprise might be a shock to some, but Droegemeier takes it in stride.

"We don't try to make them agree with one another, we let them have their disparate views, and sometimes sparks fly---but they're creative sparks," he says. "We try to bring these cultures together in a way where they can leverage their diversity for the benefit of all."

Droegemeier is particularly proud of the experiment's success in light of the naysayers he's encountered.

"To me one of the most exciting things is to see the models do the things we said they could 20 years ago. And I actually saved a lot of the documentation where people said it wasn't possible," he says. "The nature of these centers is to be working out on the bleeding edge where you have substantial likelihood of failure. So to see these things working, and to see it giving rise to all sorts of new challenges, is just absolutely wonderful."

And for the future?

"We are on the leading edge of testing new concepts and new ideas, and the operational community is looking at the Norman community to see what is going to be coming in the next few years," Weiss says. "It will help make the determination of where operational modeling for small-scale convective weather might be going."