But as well as interest, big data raises many questions — and the occasional puzzled look. Even the bravest admit confusion over how big data differs from, simply, data.
It’s easy to assume big data is about quantity, or to conflate the idea with calls for a data revolution. But big data is about more than size, and the revolution is broader: it’s about building the data infrastructures needed to track development goals effectively.
Big data certainly divides opinion. Some see huge potential to mine the hidden treasures in the world’s digital signals to improve decision-making. Others see a seductive new idea fraught with technical difficulties and serious risks for society.
A recent event organised by SciDev.Net touched on some of the challenges involved in making big data work for development. The articles we publish today unpack the debate’s many facets and offer some clarity on this powerful new idea — starting with the basic question: what exactly is big data?
Development or divide?
An overview article offers defining features and lists types of ‘big data’ — which is, incidentally, perhaps a misnomer, according to author Emmanuel Letouzé, our consultant for the project. Letouzé explains that much of what has come to be called big data is machine-readable data generated over the past decade from digital gadgets. These become databases that, unlike traditional surveys, are not designed specifically for statistical analysis.
It is telling that Letouzé takes several paragraphs to explain the term ‘big data’, dispelling any expectation of a simple definition, and shattering any illusion that it is simply about quantity.
He goes on to explain the appeal of big data, in particular the shortcomings of current decision-making tools and examples where big data has offered new insight. For poor countries, its appeal rests largely on two things: the lack of data, which leads to misleading statistics, and the potential for national statistical systems in those countries to ‘leapfrog’ this obstacle by embracing big data.
Yet the rosy picture is balanced by a sobering account of risks and challenges. These fall under three main categories: rights-based concerns, such as privacy and security; technical concerns, focusing on inaccurate and biased analyses; and concerns over a ‘new digital divide’.
The debate is reflected in features and opinion articles we publish today.
Devil in the detail?
In a podcast feature, Alex ‘Sandy’ Pentland, director of the Massachusetts Institute of Technology’s Human Dynamics Laboratory, argues that analyses using aggregated data (data grouped so that individuals cannot be recognised) offer benefits and, he says, don’t carry privacy concerns. But Patrick Ball, director of the Human Rights Data Analysis Group, voices deep scepticism about the accuracy of such analyses, regardless of aggregation.
The statistical downsides undermine arguments for big data’s usefulness. But our curiosity is such that humankind has rarely shied away from an untapped resource — and as Letouzé points out, big data will keep growing, tempting our ingenuity to make the best of it.
So if we take statistical pitfalls seriously, and yet accept that big data offers benefits, other questions arise — many of them pertinent to development.
A news feature by Jan Piotrowski explores the gaps in tools, knowledge and funding that are constraining developing countries’ ability to tap into big data. Piotrowski also highlights the risk that big data sets may not represent people in poor regions where power supplies are intermittent or non-existent. And when it comes to mobile phone records, a ubiquitous type of big data, he finds that using them relies on first getting companies to share them.
In an opinion article, Eurostat’s Michail Skaliotis and Ceri Thompson acknowledge the capacity deficits in resource-poor countries but argue that national statistical offices should see big data as an incentive for innovation. They reject the view that big data can replace traditional statistics, seeing it instead as a complementary opportunity for national statistical offices to raise their game.
But an opinion article by activist and citizen journalist Sanjana Hattotuwa strikes a more ominous tone. Without disregarding big data’s potential to help deliver services, Hattotuwa says the numbers need a human face — they should be seen as representing a collection of individuals, not as sets of de-personalised information. He argues compellingly that, in the wrong hands, big data could easily undermine democracy.
Real concerns temper hype
In a recent Financial Times article, economist and journalist Tim Harford offers a litany of analytical complications set to keep big data from offering valuable insight. Referring to one of the statistical challenges, he writes: “There must always be a question about who and what is missing, especially with a messy pile of found data.”
This rings true of the social side of big data as well as the statistical side. Data matters, but so does the political reality in which it exists. One question, for example, is the extent to which developing countries can be equal partners in big data projects.
Letouzé makes this explicit, saying that “a ‘true’ big data revolution should be one where data can be leveraged to change power structures and decision-making processes, not just create insights.” And in a podcast Philipp Schönrock, director of the Colombian development think-tank CEPEI, speaks about the need for standards and a legal framework as elements of building trust in managing data.
The perspectives offered in this collection suggest that big data, like many potentially powerful tools, has a duality that takes the debate beyond strictly technical questions. Even if sophisticated new methods do refine this rough gem, the gains are unlikely to automatically transform those parts of the world that are still developing more foundational analytical capacities. Initiatives devoted to improving technical capabilities must target users with both advanced and basic data analysis competencies.
Tempering excitement with a dose of reality doesn’t have to mean dismissing big data’s potential. To borrow from Khalil Gibran’s words on passion and reason, being more mindful of one over the other could mean losing the benefit of both.
Opinion & Special Features Editor, SciDev.Net
This article is part of the Spotlight on Data for development.