June 17, 2010—Over the last 20 years, scientists have collected vast amounts of data about climate change, much of it accessible on the Web.
Now the challenge is figuring out how to integrate all that information into coherent datasets for further analysis—and a deeper understanding of the Earth’s changing climate.
In 2000, more than 20 countries began deploying an array of drifting, robotic probes called Argo floats to measure the physical state of the upper ocean. The floats, which look a little like old-fashioned hospital oxygen tanks with antennas, are designed to sink nearly a mile below the surface.
After moving with the currents for about 10 days, they gradually ascend, measuring temperature, salinity, and pressure in the top 2,000 meters of the sea as they rise. At the surface, they transmit the data to a pair of NASA Jason satellites orbiting 860 miles above the equator, then sink again to repeat the process.
So far, more than 3,000 floats have been deployed around the world. The data they’ve collected has provided a big leap forward in understanding the upper levels of the Earth’s oceans, and their effect on global climate change, much as early weather balloons expanded understanding of the Earth’s atmosphere. What’s more, the data they collect is available in near real time to anyone interested, without restrictions, in a single data format.
Dr. Thomas Peterson, a scientist at the National Climatic Data Center of the U.S. National Oceanic and Atmospheric Administration (NOAA) in Asheville, North Carolina, has been with the data center since 1991. “Back then,” says Peterson, “people came to us for integrated climate information because it was so hard to find the large amounts of data they needed to derive the information themselves. With the Internet, people can just download the data from the Web.”
The Argo floats are funded by some 50 agencies around the world. The program is one of thousands of examples of how the Web is facilitating scientists’ understanding of global climate change. Without the Web, in fact, the float system would not exist.
Studying human-caused change
Humans have probably been studying weather at least since they began raising crops. But rigorous climatology (the study of weather patterns over decades, centuries, or even millennia) dates only from the late 1800s. The study of anthropogenic, or human-caused, climate change is much younger. Until the 1950s, few suspected the Earth’s climate might be changing as a result of human activity. And if a scientist in, say, Germany did suspect it, it would have been difficult indeed for that scientist to work with colleagues in England or China to explore the possibility.
By the 1980s, as evidence of rising levels of atmospheric carbon dioxide began accumulating, scientists who had been pursuing particular aspects of climate change independently began holding international conferences to exchange information. But not until the 1990s did the Web enable them to collaborate remotely, in real time. That collaboration, along with the enormous amounts of data collected using web technologies, has revolutionized the field.
Today, climate scientists conduct studies with colleagues on the other side of the world, hold marathon webinars, and co-author papers with dozens or even hundreds of collaborators, all via the Web. Scientists use the Web to access, monitor, and share everything from in situ data collected by such means as the Argo floats and a worldwide network of 100,000 weather stations, to remote data from radar and satellites, to paleoclimatologic indicators like tree rings and core samples from glaciers and ancient lake beds.
A staggering amount of data
The sheer volume of scientific data on climate, collected around the world by government agencies, the military, universities, and thousands of other institutions, is staggering.
NOAA stores about 3,000 terabytes of climate information, roughly equal to 43 Libraries of Congress. The agency has digitized weather records for the entire 20th century and scanned records older than that, including some kept by Thomas Jefferson and Benjamin Franklin. All of it is accessible on the Web. As part of its educational mission, NOAA has even established a presence in the Second Life virtual world, where members can watch 3D data visualizations of a glacier melting, a coral reef fading to white, and global weather patterns evolving.
The U.S. National Aeronautics and Space Administration (NASA) is an equally important player in climate research. Its Earth Observing System (EOS) of satellites collects data on land use, ocean productivity, and pollution and makes its findings available on the Web. There is even a NASA-sponsored program involving a network of beekeepers to collect data on the time of spring nectar flows, which appears to be getting earlier (http://honeybeenet.gsfc.nasa.gov).
The Group on Earth Observations (GEO) of the U.N.’s World Meteorological Organization, launched by the 2002 World Summit on Sustainable Development and the G8 leading industrialized countries, is developing a Global Earth Observation System of Systems, or GEOSS, both to link existing climatological observation systems and to support new ones.
Its intent is to promote common technical standards so that data collected in thousands of studies by thousands of instruments can be combined into coherent datasets. Users would access data, imagery, and analytical software through a single Internet access point called GEOPortal. The timetable is to have the system in place by 2015.
But—and this is a huge but—despite the wealth of information that’s been collected bearing on climate change, finding specific datasets among the thousands of formats and locations in which they’re stored can be daunting or even impossible.
How MIT’s DataSpace could help
Stuart Madnick, who is the John Norris Maguire Professor of Information Technology at MIT’s Sloan School of Management, believes a new MIT-developed approach called DataSpace could help. “Right now, papers on hundreds of subjects are published, but the data that backs them up often stays with the researcher,” says Madnick. “We want DataSpace to become the Google for multiple heterogeneous sets of data from a variety of distributed locations. It wouldn’t necessarily work the way Google does, but it would be as useful, scalable, and easy to use, and it would allow scientists to access, integrate, and re-use data across disciplines, including climate change.”
As a simple example of how DataSpace could work with respect to climatology, Madnick posits that a scientist wants to know the temperature and salinity of the water around Martha’s Vineyard, Massachusetts, over the past 20 years. Data that could answer the question could exist in all kinds of locations, from the nearby Woods Hole Oceanographic Institution, to NOAA, to international fishing fleets. But right now, there is little or no integration of that data. DataSpace could perform that integration, which can require adjustments ranging from such simple things as reconciling Centigrade data with Fahrenheit, to compensating for differences in the ways various instruments measure.
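The simplest of those adjustments, putting readings from different temperature scales on a common footing, can be sketched in a few lines of Python. This is only an illustration and not a description of how DataSpace works: the Reading record, the source labels, and the sample values are all invented for the example, and a real integration would also have to handle instrument differences, which this sketch ignores.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    source: str   # hypothetical provider label, not a real data feed
    year: int
    value: float
    unit: str     # "C" (Celsius/Centigrade) or "F" (Fahrenheit)


def to_celsius(value: float, unit: str) -> float:
    """Convert a temperature to Celsius; reject units we do not know."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unsupported unit: {unit!r}")


def yearly_mean_celsius(readings):
    """Group readings by year and average them after unit conversion."""
    by_year = {}
    for r in readings:
        by_year.setdefault(r.year, []).append(to_celsius(r.value, r.unit))
    return {year: mean(vals) for year, vals in sorted(by_year.items())}


# Made-up sample values, purely for illustration.
readings = [
    Reading("source_a", 2008, 10.0, "C"),
    Reading("source_b", 2008, 53.6, "F"),
    Reading("source_a", 2009, 11.0, "C"),
]
result = yearly_mean_celsius(readings)
print(result)
```

Raising an error on an unknown unit, rather than guessing, reflects the point of the passage: mixed-scale data that is silently combined produces averages that look plausible but are wrong.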
Semantic Web technologies
DataSpace would incorporate “reasoning systems” that would “understand” disparate data in a way that now requires human intervention. Often called Semantic Web technologies, such linked-data systems would collect unstructured data, make sense of data that is structured but not yet interpreted, and infer what the data means.
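One small piece of that idea, recognizing that differently named fields from different sources describe the same quantity, can be sketched as a lookup against a shared vocabulary. The sketch below is a toy under stated assumptions: the field names, the vocabulary terms, and the annotate function are hypothetical, and real Semantic Web systems rely on formal ontologies and reasoning rather than a hand-written table.

```python
# Hypothetical mapping from source-specific field names to a shared
# vocabulary term and the unit that field is assumed to use.
FIELD_MAP = {
    "temp_c": ("sea_water_temperature", "C"),
    "sst_f": ("sea_water_temperature", "F"),
    "sal_psu": ("sea_water_salinity", "PSU"),
    "salinity": ("sea_water_salinity", "PSU"),
}


def annotate(record: dict) -> dict:
    """Rewrite one raw record in shared terms; set unknown fields aside."""
    out = {"known": {}, "unmapped": {}}
    for key, value in record.items():
        if key in FIELD_MAP:
            term, unit = FIELD_MAP[key]
            out["known"][term] = {"value": value, "unit": unit}
        else:
            # Fields with no agreed meaning are kept, not discarded,
            # so a person can still review them.
            out["unmapped"][key] = value
    return out


# Two invented records that name the same quantity differently.
a = annotate({"temp_c": 10.0, "sal_psu": 32.1})
b = annotate({"sst_f": 50.0, "depth": 5})

# Terms both records share once they are expressed in the common vocabulary.
shared = set(a["known"]) & set(b["known"])
print(shared)
```

Once both records are expressed in the same terms, a program can tell that they are comparable without a person reading each source’s documentation, which is the kind of human intervention the passage describes.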
How would such Semantic Web technologies be used to study climate change? Madnick provides an example. “Microbes are the most abundant and widely distributed organisms on Earth. They account for half of the world’s biomass and have been integral to life on Earth for more than 3.5 billion years. Marine microbes affect climate and climate affects them. In fact, they remove so much carbon dioxide from the atmosphere that some scientists see them as a potential solution to global warming. Yet many of the feedbacks between marine biogeochemistry and climate are only poorly understood. The next major step in the field involves incorporating the information from environmental genomics, targeted process studies, and the systems observing the oceans into numerical models. That would help to predict the ocean’s response to environmental perturbations, including climate change.”
Madnick believes such integration of disparate data, including genetics, populations, and ecosystems, is the next great challenge of climatology, and that Semantic Web technologies will be needed to meet the challenge.
A Wikipedia for climate change?
Another approach being developed at MIT is the Climate Collaboratorium, part of MIT’s Center for Collective Intelligence. MIT Sloan School Professor Thomas Malone describes the Climate Collaboratorium, still in its formative stage, as “radically open computer modeling to bring the spirit of systems like Wikipedia and Linux to global climate change.” His hope is that thousands of people around the world, from scientists to business people to interested laypeople, will take part via the Web to discuss proposed solutions in an organized, moderated way and to vote on them.
Malone has written, “The spectacular emergence of the Internet and associated information technology has enabled unprecedented opportunities for such interactions. To date, however, these interactions have been incoherent and dispersed, contributions vary widely in quality, and there has been no clear way to converge on well-supported decisions concerning what actions—both grand and ground-level—humanity should take to solve its most pressing problem.” Malone says the Collaboratorium will not endorse positions but will be “an honest broker of the discussion.”
Asked to name the biggest challenge facing climate change scientists, NOAA’s Tom Peterson answers, “Communication. Too much is written by scientists for scientists, so it is often too dense for laypeople to understand. It’s rare for a scientist to take time out of trying to make progress on scientific questions to rigorously disprove some of the widely propagated errors about climate change.”
Still, what is truly remarkable about how the study of climate has changed over the past 20 years is the way the Web has given scientists from around the world, in disparate areas of research, a new way to collaborate. Thanks to the Web, many millions of people not only within but also beyond the scientific community now have access to an enormous tapestry of information. And new technologies like the Semantic Web will undoubtedly enrich that tapestry.