19/12/11

Quality control challenges for citizen science

Crowdsourced environmental data could be used in measuring black carbon emissions from cook stoves in India Copyright: Flickr/ World Resources

Crowdsourced environmental data can be useful, for example in measuring black carbon emissions, but concerns remain about quality, says Yojana Sharma.

With mobile phones becoming ubiquitous in the developing world, harnessing citizen science from networks of phone users could provide valuable environmental data for scientific analysis, the executive director of the United Nations Environment Programme (UNEP) has said.

But data experts caution that data quality and reliability issues need to be resolved before “crowdsourced” data collection can become more widespread.

"The public and their cell phones could, if encouraged, become early warning systems of droughts and floods, as well as forest fires and wildlife poaching," Achim Steiner, UN under-secretary-general and UNEP executive director, told the Eye on Earth Summit in Abu Dhabi (12-15 December 2011).

"Managing, processing and making these volumes of data available in user-friendly ways and in service of sustainable development is one of the global challenges and one of the issues for Eye on Earth – and a key input to assisting Rio+20 [the UN conference on sustainable development] in June next year," he said. 

The Eye on Earth Summit, co-hosted by UNEP, looked at the importance of access to environmental and societal data.

Use of citizen-gathered data was discussed by the working group on technical infrastructure, which described providing the technical platforms and ensuring the reliability of crowdsourced data as an ambitious goal.

"Such crowdsourced data will be of greatest use in tracking and modelling environmental change at all scales, specifically for trends and instantaneous phenomena, when associated metadata adhere to standards that have emerged for widespread global use across governments, industry and science," a working group white paper prepared in advance of the meeting said.

Climate benefits

Crowdsourcing is already being used to gather important data. In India, Project Surya, linked to UNEP’s Atmospheric Brown Cloud initiative, is using special mobile phones in villages to measure levels of black carbon emitted by cooking stoves.

"The project is also linking to satellites with the aim of measuring how more efficient stoves are simultaneously improving public health while providing climate benefits in the atmosphere," Steiner said.

Also being introduced in many countries is the inputting of rain gauge data by farmers.

"It can be useful for data gathering, such as reporting on the weather. You can give farmers a rain gauge and they text in once a week – it is far cheaper than government officials keeping a rain gauge. It is data at a very low cost," said Lalanath de Silva, director of The Access Initiative at the World Resources Institute in Washington DC and a passionate advocate of open information and data in developing countries.

Citizen science, often dismissed in the past as uncontrolled and inaccurate, is beginning to be recognised as a valuable resource, with technical experts at the Abu Dhabi meeting discussing how it can be effectively used as new platforms for gathering crowdsourced information are developed.

Interviews of villagers in Rwanda helped a government boundary mapping project Copyright: Flickr/gconard

For example, Microsoft’s Eye on Earth platform is explicitly focussed on gathering environmental data.

Meanwhile, Google’s Earth-mapping applications allow citizen input from anywhere in the world.

"Android smartphones with Open Data Kit have been empowering local communities all over the world to collect their own data and participate in the mapping and monitoring," Rebecca Moore, founder of Google Earth Outreach, told the conference.

Interview villagers

"It has turned out to be a very good way to engage with local communities in mapping, monitoring and protecting their own landscape," she said, adding that Google had conducted training on forest monitoring in Tanzania, Colombia and elsewhere.

In Rwanda, after a government project mapping village boundaries, secondary school students were trained to interview villagers and upload information into a National Institute of Statistics of Rwanda database, using the ArcGIS server to structure the information.

The project is also helping verify government information with on-the-ground observation in advance of the country’s 2012 population and housing census.

"Crowdsourcing is bolstering accumulation of much richer observational archives for all parts of the world; archives that may be tapped by scientists globally to achieve better understanding of the dynamics of our environment," the working group document said.

"We anticipate that scientists, the public, and decision-makers will have increasing opportunities to take advantage of crowdsourcing," it added.

But use of citizen data for science is still hampered by reliability considerations and the need to ensure the quality of all data.

Falsify data

De Silva admitted that there is an element of suspicion about citizen-collected data for scientific purposes.

"Those who post data have an obligation to determine their accuracy," he said. "There will be a minor percentage of data that is inaccurate or malicious.

"Even if you have only trained scientists gathering data, you will have those who falsify data. The question is, how do you minimise that?"

John Calkins of the non-profit Environmental Systems Research Institute (ESRI) told SciDev.Net, "The amount of citizen reports is rising, but slowly. That’s because of the challenge of quality control. Some of the citizen data is going to be good and some of it bad. You will have to make sense of some of it so that if it appears absurd you will have to mark it as not likely. There is a whole analytical process that goes on to say ‘Is that good data?’"

And citizen data cannot be relied upon over time. Low-cost citizen science output, while agile and able to respond to changing needs, is almost entirely voluntary, and continued availability of data contributions may not be guaranteed, the working group said.

Moore said it was improbable that people would ever be completely comfortable with the notion of citizen science.

In addition, ensuring data input is not the only aim. Harlan Onsrud, of the Global Spatial Data Infrastructure Association and co-chair of the working group, told the conference, "A lot of this information won’t have value unless we have more efficient ways of documenting and recalling the context of the data being gathered."

Huge amounts of unused and unanalysed data already exist. "The citizen science brain is just a storage bin," said Peter Gilruth, director of UNEP’s early warning and assessment division.