This past summer, the EU launched an initiative to track migration in real time using big data ‒ the masses of machine-readable data each one of us leaves behind every time we use an electronic device.
 
“The language around this is that it will help refugees,” says Linnet Taylor, a professor and researcher working on data justice at Tilburg University. Yet, there’s also a threat.
 
“Being able to distinguish whether people are Ghanaian, Pakistani or Syrian, for instance, is likely to work against Ghanaians and Pakistanis who may have a perfectly valid claim to asylum, but will be shut out in a world of big data,” says Taylor, referring to the migration crisis that has unfolded in Europe in recent years.
 
The refugee crisis is now one of the biggest worries in rights-based discussions about big data, according to Alexa Koenig, executive director of the Human Rights Center at the University of California, Berkeley School of Law in the United States.
 
Koenig, who is part of the Bellagio Science for Development residency taking place throughout November, says the concern is about the potential of such data to endanger or violate privacy.

“There’s a lot of thinking going on, looking into how we make sure that the big data that is being collected around vulnerable groups like refugees are being used to help them, as opposed to potentially being used against them.”
 
 

On the cusp

 
Ever since big data entered the mainstream a few years ago, opinion has been divided – some see gold-dust for new analytical insights and social benefit, others see a human rights disaster waiting to happen.
 
It’s still early to see concrete benefits for global development. “The reason is that there is no way right now to access big data at scale,” says Emmanuel Letouze, director and co-founder of the Data-Pop Alliance, a global coalition that promotes the use of big data in global development. “So we're stuck in a low equilibrium where we do pilots, case studies, proofs of concept, little things here and there.”
 
For example, big data has been used in pilot programmes to track people in humanitarian situations, model the spread of disease, develop early warnings for extreme weather in Lake Victoria, and even improve bus routes in Côte d’Ivoire.
 
If concrete benefits are rare, so too are examples of harm. Yet, the Facebook data breach scandal aside, early signs of danger can be seen in how social media has been used to crack down on dissent in countries such as China and Pakistan.
 
Latching on to problems and blowing them out of proportion can also deprive society of the many benefits of technology, argues Jan Piotrowski, environment correspondent at the Economist and a Bellagio resident. “Technologies will be put to good use, but could also be used for ill purposes, and this has been true since man picked up a stick.”
 
 
The fast pace of technological change also adds a twist to the age-old challenge of societies struggling to keep up with scientific advances, immortalised exactly two hundred years ago in Mary Shelley’s novel Frankenstein.
 
Taylor sees the same issue in big data, which she says continues a development policy pattern going back to World War II, with Northern countries attempting to set up technology transfer and economic models in the South.
 
“What I see in this [EU] big data for migration statistics initiative … is that statistics agencies who are having to collaborate with start-ups on, for instance, analysing satellite data in combination with social media postings from people coming out of refugee camps on the Turkish border … they can't do any kind of meaningful analysis because they don't have the cultural and linguistic capacity to understand what they're seeing,” she explains.
 
There are also many cases of well-meaning but problematic development of apps that collect information about vulnerable people to help them access services, according to Koenig. She cites the example of electronic devices used to document sexual violence crimes without considering that some people or countries might find sharing certain images inappropriate.  
 
The imbalance extends to the development of big data analytics capacity, and some worry about a new digital divide. “I do think there is a risk of developing countries listening to what they are told … [and] making big investments in systems that are not built for their needs,” says Maria deArteaga, a researcher at Carnegie Mellon University in the United States. “What are the customizable parts? What are the embedded assumptions?”

It’s not for lack of expertise, according to Justin Arenstein, founder and chief executive of Code for Africa – African data scientists simply need a stronger voice to help shape policy. “There is a growing and vibrant data science ecosystem across Africa, with some really curious data 'explorers' … who are using open source data to ask important developmental questions.”
 
“Local context – culture, politics, infrastructure etc. – will always influence how digital tools and technologies are rolled out,” says Tariq Khokhar, managing director and senior data scientist at The Rockefeller Foundation. “The most important local context is the data itself.”
 

Reinventing the system

 
As long as the technological infrastructure in the global South remains weak, governments will keep shifting their data to clouds managed by international companies, adds Bright Simons, president of the mPedigree network in Ghana. And this puts corporations in the driving seat. “Ongoing Big Data initiatives in Africa are heavily dominated by the telecom industry. Orange, for instance, has experimented with multiple concepts in Francophone West Africa, especially in Senegal.”
 
Davis Adieno, regional director for Africa at the Global Partnership for Sustainable Development Data in Kenya, sees the same pattern. “They [organisations and agencies] simply can’t afford to invest in massive databases and expensive licenses.”
 
This kind of dynamic “really is transferring power to corporations,” says Taylor. “They can produce solutions and they can determine what gets done.”
 
But Letouze believes companies are a much better option than governments when it comes to safeguarding sensitive data. “Can you imagine Trump, can you imagine Erdogan, can you imagine Putin, having – by law! – control?” he says. “Most companies have tight security protocols in place to protect data. Every day any big company thwarts hundreds or more breaching attempts.”
 
It’s not just companies or states sweeping up the digital breadcrumbs we leave behind, according to Koenig – researchers and civil society are in on it too. “One of the really interesting things right now is that there are so many groups collecting information,” she says. The question is how to manage it over what is a long life cycle. Some argue for better citizen education, a public interest framework not entirely led by states or corporations, or carving out a ‘true digital commons’ that balances civic with state and commercial interests.
 
Koenig says moves to come up with a framework to manage big data responsibly are in the very early stages. The legal and regulatory framework is currently aimed at states, historically the custodians of data about their citizens – but the shift to corporations over the last decade means regulators are on unfamiliar ground. “How do we begin to build a new framework for the 21st century that acknowledges those realities?” 
 
Europe’s General Data Protection Regulation (GDPR) and Microsoft’s call for a ‘Digital Geneva Convention’ are signs of an evolving conversation, says Koenig. But efforts to understand and manage the role of corporations, beyond commerce, have only just begun.