Two Penn State researchers are members of an interdisciplinary team of ecologists, statisticians, computer scientists and data scientists trying to determine how a changing climate is affecting lakes across the country, and what role the bodies of water play in the global nutrient cycle. Image: S.D. Welch

Researchers harness 'big data' to see the big picture on lakes, nutrient cycles

UNIVERSITY PARK, Pa. -- Two Penn State researchers will play a key role in a multi-university study of the contribution that U.S. lakes make to global nutrient cycles, funded by a $4.2 million grant from the National Science Foundation.

Tyler Wagner, adjunct professor of fisheries ecology in the College of Agricultural Sciences, and Ephraim Hanks, assistant professor of statistics in the Eberly College of Science, will collaborate with researchers at Michigan State University, the University of Wisconsin and the University of Missouri, using a new approach to expand traditional ecology to regional and continental scales.

This interdisciplinary team of ecologists, statisticians, computer scientists and data scientists works in the evolving field of "macrosystems ecology," which is expected to play a critical role in helping address many challenges facing the environment, such as a changing climate.

"Lake nutrients, such as nitrogen, phosphorus and carbon, are strongly controlled by characteristics of individual lakes, including size, depth and characteristics of the surrounding landscape, such as the amount of agricultural land use in the catchment," said Wagner, who is assistant leader of the Pennsylvania Cooperative Fish and Wildlife Research Unit at Penn State.

"Although we might have a good understanding of these controls for some individual lakes, it is increasingly important to study these controls and processes at large spatial extents -- especially if we want to answer questions related to how lakes, and the important ecological services they provide, will change within the context of global climate change."

Ecologists often are asked how lakes, rivers, forests and other natural systems of a country or region will change as climate changes, noted lead researcher Patricia Soranno, professor of fisheries and wildlife and a macrosystems ecology pioneer at Michigan State University. "But ecologists still struggle with the basic idea of extrapolation -- how to apply knowledge gained from individual ecosystems that ecologists typically study to all ecosystems within regions, across continents and, ultimately, around the world."

For example, ecologists have decades of lake data collected across the country, but nobody has combined them to understand the continental lake macrosystem, which consists of more than 130,000 lakes in the continental U.S. The data sources include many small, individual projects from university researchers, citizen-science efforts, governments and tribal agencies -- all of which need to be linked to terabytes of data collected from new or existing field sensors, observation networks and millions of high-definition satellite images.

Paired with this near-endless data deluge are easy access to supercomputers and many new computer science and statistical tools for analyzing such large data sets. Analyses that once took months or years to complete now can be conducted in hours or days. The challenge, though, is that most ecologists do not know how to put all these data together or how to use these new tools.

In this study, researchers will use huge data sets and the latest computer tools to harness and blend knowledge from individual studies and scale it up to show what is happening today, and what may happen in the future, across the entire United States.

"This research is a great opportunity for developing new statistical methods that will improve our ecological understanding of lakes, provide accurate predictions of lake nutrients and carbon, and identify novel ecosystems and the effect such novelty has on prediction accuracy," said Penn State's Hanks, who is one of two statisticians involved in the study.

"Not only will quantifying novelty be critical for understanding and quantifying prediction uncertainty in this research effort, but it also will help resource managers identify novel systems for which we lack adequate monitoring data."

In addition, the study will help to manage lake water quality better by providing the tools to predict the timing of excess-nutrient-driven algal blooms that can affect water supplies, close beaches and cause illnesses in people, pets and wildlife, the researchers said.