To deal with the challenge, the Ocean Sciences Meeting in Hawaii last month (23-28 February) heard proposals for creating a comprehensive, quality-controlled database.
This would help overcome limited scientific capacity and allow under-resourced researchers to better participate in efforts to understand pressing issues such as climate change, scientists said.
The complexities involved in transforming raw oceanographic data into a useable state mean researchers with limited data skills or financial resources are less able to meet the necessary standard, said Stephen Diggs, a data manager at the Scripps Institution of Oceanography, in the United States.
The explosion of ocean data being collected through modern methods exceeds scientists’ capacity to deal with it, he told SciDev.Net on the fringes of the conference.
“There definitely needs to be workforce development to get more data curators and managers that come out of their training ready to deal with this,” he said.
Interdisciplinary work, which is crucial for answering questions such as how climate change is affecting marine environments, is especially sensitive to reliability issues, he added.
This is because, when faced with data sets from various disciplines, researchers often lack the expertise to spot and correct all the biases created by the idiosyncrasies of different regions and collection methods, he said.
Providing researchers with rigorously quality-controlled data from outside their immediate field was the motivation behind a proposed initiative, which called for scientists’ support at a town hall meeting during the conference.
The International Quality-Controlled Ocean Database (IQuOD) would combine automated, large-scale data cleaning done by computers with finer-grained expert analysis. The goal would be to produce a definitive database of the data from the 13 million locations with sea surface temperature records — some dating back to the eighteenth century — scattered across various institutions.
According to Diggs, this would significantly improve on the oceanographic database hosted at the US National Oceanic and Atmospheric Administration’s (NOAA’s) National Oceanographic Data Center. Although NOAA conducts only limited automated quality control, its repository of marine datasets is the world’s most comprehensive.
Other variables, such as salinity and oxygen concentrations in the water, could be added to the temperature data as the IQuOD project progresses, said Diggs.
While a lack of data management capacity is felt globally, it is felt most acutely in regions where resources are stretched, said Diggs.
Quality control done at the level of individual researchers is a “huge drain on resources” and often whole data sets will be removed from analysis if the researcher involved lacks the necessary skills, said Rebecca Cowley, a data analyst with Australia’s Commonwealth Scientific and Industrial Research Organisation and an IQuOD team leader.
To build such a massive international database, funders must stop their piecemeal approach of supporting small, dispersed quality-control projects and focus on a concerted effort, she said.
Reliability issues are not limited to ocean data, with atmospheric scientists facing similar hurdles, according to Shawn Smith, a meteorologist at Florida State University, United States.
But he told SciDev.Net that these problems are “not insurmountable” and that advances in computing power have allowed more and more quality control to be completed automatically.