Tidal wave of ocean data leaves scientists swamped

Image credit: NOAA's National Ocean Service

Speed read

  • Too few scientists, especially in poorer nations, know how to clean raw data

  • This limits the types of questions that they can investigate

  • A proposed project aims to create a vast, quality database of ocean temperatures

A lack of data curators and managers capable of cleaning up observational measurements, particularly in less developed nations, is limiting the scale and scope of ocean research, researchers have said on the sidelines of an oceans science meeting.

To deal with the challenge, the Ocean Sciences Meeting in Hawaii last month (23-28 February) heard proposals for creating a comprehensive, quality-controlled database.

This would help overcome limited scientific capacity and allow under-resourced researchers to better participate in efforts to understand pressing issues such as climate change, scientists said.

The complexities involved in transforming raw oceanographic data into a useable state mean researchers with limited data skills or financial resources are less able to meet the necessary standard, said Stephen Diggs, a data manager at the Scripps Institution of Oceanography, in the United States.

The explosion of ocean data being collected through modern methods exceeds scientists’ capacity to deal with it, he told SciDev.Net on the fringes of the conference.

“There definitely needs to be workforce development to get more data curators and managers that come out of their training ready to deal with this,” he said.

Interdisciplinary work, which is crucial for answering questions such as what effect climate change is having on marine environments, is especially sensitive to data reliability issues, he added.

This is because, when faced with data sets from various disciplines, researchers often lack the expertise to spot and correct all the biases created by the idiosyncrasies of different regions and collection methods, he said.

Providing researchers with rigorously quality-controlled data from outside their immediate field is the motivation behind a proposed initiative, whose backers called for scientists’ support at a town hall meeting during the conference.

The International Quality-Controlled Ocean Database (IQuOD) would combine automated, large-scale data cleaning done by computers with finer-grained expert analysis. The goal would be to produce a definitive database of the data from the 13 million locations with sea surface temperature records — some dating back to the eighteenth century — scattered across various institutions.

According to Diggs, this would significantly improve on the oceanographic database hosted at the US National Oceanic and Atmospheric Administration’s (NOAA’s) National Oceanographic Data Center. Although NOAA conducts only limited automated quality control, its repository of marine datasets is the world’s most comprehensive.

Other variables, such as salinity and oxygen concentrations in the water, could be added to the temperature data as the IQuOD project progresses, said Diggs.

While a lack of data management capacity is felt globally, it is felt most acutely in regions where resources are stretched, said Diggs.

Quality control done at the level of individual researchers is a “huge drain on resources” and often whole data sets will be removed from analysis if the researcher involved lacks the necessary skills, said Rebecca Cowley, a data analyst with Australia’s Commonwealth Scientific and Industrial Research Organisation and an IQuOD team leader.

To build such a massive international database, funders must stop their piecemeal approach of supporting small, dispersed quality-control projects and focus on a concerted effort, she said.

Reliability issues are not limited to ocean data, with atmospheric scientists facing similar hurdles, according to Shawn Smith, a meteorologist at Florida State University, United States.

But he told SciDev.Net that these problems are “not insurmountable” and that advances in computing power have allowed more and more quality control to be completed automatically. 