Big data: An opportunity to boost analytical capacities
- Official statistics can be slow and unresponsive, so big data is an opportunity
- But national statistics offices must evolve to stay relevant
- Companies, academics and international organisations stand ready to help
In recent years there has been much discussion, and even hype, about big data. Much of it centres on big data’s potential to cure many of the ills of official statistical production: slow responses by National Statistics Offices (NSOs) to new policy requests, high costs, poor timeliness, the burden on respondents and even inaccuracies.
But while many see big data as competing with official statistics, we argue that big data is part of greater statistics. Both are needed; the key question is how official statistics should evolve to stay relevant in the age of big data.
Slow to respond
For decades, official statistics have served policy effectively, in a credible and trusted manner. But they suffer from two major weaknesses: poor timeliness and unresponsiveness to emerging policy needs.
Annual statistics such as health care expenditures are often produced with a delay of several years, a problem which is particularly acute in developing countries. Quarterly or monthly estimates of measures such as GDP also experience serious delays.
Similarly, the time taken from designing a brand new survey that meets a policy need through to actually producing results and releasing them may be several years. In the meantime, policymakers resort to alternative data sources, often of lower quality, to meet their data needs. When the statistics finally become available, their relevance may be limited.
Another weakness of official statistics is that, where they rely on traditional sample surveys, they can be unreliable when estimating statistics for small groups, small areas or small enterprises. For example, estimates of employment rates or education levels among disabled people may not be accurate if the sample is too small.
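The effect of sample size on reliability can be illustrated with a rough calculation. The sketch below assumes a simple random sample and uses the standard normal approximation for a proportion; the 40 per cent employment rate and the sample sizes are hypothetical figures chosen only for illustration.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a proportion estimated
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical: the same estimated employment rate (40%) at three sample sizes.
for n in (50, 500, 5000):
    print(n, round(margin_of_error(0.40, n), 3))
```

With only 50 respondents in a subgroup, the estimate is uncertain by roughly 14 percentage points either way; with 5,000 the uncertainty falls to about one point. Real surveys use more complex designs, so actual margins will differ.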
In the near future, the traditional trade-off between timeliness and accuracy will become a thing of the past if NSOs embrace the emerging ‘big data ecosystem’, in which the prevailing type of data consists of digital footprints, electronically observed records, sensor logs and similar by-products of our daily interactions with digital devices.
NSOs can and should include big data analytics in their daily work. There will be methodological challenges, such as how to overcome biases in data that were not collected through formal surveys. Twitter users, for example, are not representative of the overall population, so an analysis of their messages may suffer from ‘selection bias’. But such challenges are hurdles rather than barriers.
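One common way of reducing selection bias is post-stratification: computing the estimate separately for each population group and then weighting the groups by their known share of the population (taken, for instance, from a census). The sketch below is a minimal illustration of that general technique, not a method described by any particular NSO; the group labels, responses and population shares are all hypothetical.

```python
def poststratified_mean(sample, population_shares):
    """Reweight group means by known population shares.

    sample: list of (group, value) pairs from the non-representative source.
    population_shares: dict mapping group -> share of the population (sums to 1).
    """
    sums, counts = {}, {}
    for group, value in sample:
        sums[group] = sums.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    missing = set(population_shares) - set(counts)
    if missing:
        # A group with no observations cannot be corrected by reweighting.
        raise ValueError(f"no observations for groups: {sorted(missing)}")
    return sum(share * sums[g] / counts[g] for g, share in population_shares.items())

# Hypothetical source in which young people are heavily over-represented.
sample = [("young", 1)] * 6 + [("young", 0)] * 2 + [("old", 1), ("old", 0)]
raw_mean = sum(value for _, value in sample) / len(sample)          # 0.7
adjusted = poststratified_mean(sample, {"young": 0.3, "old": 0.7})  # about 0.575
```

Reweighting only corrects for differences between the groups that are observed. It cannot remove bias within a group, for example if older people who use social media differ from older people who do not.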
Here to stay, and grow
It is important to recognise that big data is here to stay. There is high-level international support, too, notably from the UN, which called for a ‘data revolution’ to underpin the post-2015 development goals.
And there are promising examples that suggest big data could eventually be used for official statistics. For example, the company PriceStats uses ‘web scraping’ technologies to collect thousands of online prices of consumer products from shops around the world, and produces daily estimates of inflation for many countries. The estimates and price indices are similar to those produced with official statistics. And in time, they may replace official statistics such as monthly consumer price indices altogether.
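To give a sense of how a price index might be built from scraped prices, the sketch below computes a Jevons index, the geometric mean of price changes for products observed in both periods. This is a standard elementary index formula used here purely as an illustration; it is not a description of PriceStats’ actual methodology, and the products and prices are hypothetical.

```python
import math

def jevons_index(base_prices, current_prices):
    """Geometric mean of price relatives (current / base), scaled so base = 100.

    Both arguments map product -> price. Only products present in both
    periods are compared; unmatched products are ignored.
    """
    matched = [p for p in base_prices if p in current_prices]
    if not matched:
        raise ValueError("no products matched across the two periods")
    log_sum = sum(math.log(current_prices[p] / base_prices[p]) for p in matched)
    return 100 * math.exp(log_sum / len(matched))

# Hypothetical scraped prices on a base day and a later day.
base = {"milk": 1.00, "bread": 2.00, "rice": 4.00}
today = {"milk": 1.10, "bread": 2.00, "rice": 4.40, "tea": 3.00}
index = jevons_index(base, today)  # roughly 106.6: prices up about 6.6% on average
```

A production index would also need to weight product categories by their share of household spending and to handle products that appear or disappear, which is one reason official statistics remain useful as a benchmark.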
However, such a shift will not mean that NSOs are out of business. On the contrary, official statistics will continue to be important for benchmarking — to provide a value against which alternative estimates can be periodically compared — and to support the development of appropriate big data models.
With limited public resources available for official statistics, developing-country NSOs face great challenges in meeting statistical needs. But the potentially great benefit of new approaches offers an incentive for innovation.
And they have an advantage: limited statistical infrastructure means that poor countries are relatively unconstrained in moving towards new approaches. Further good news is that many have very high, and growing, levels of mobile phone use — a key technology in many big data scenarios.
How to get started
An NSO should start its big data exploration by identifying a clear question to answer. This could be about filling a data gap or making existing statistics more timely.
A strong link with national statistical priorities will be important to ensure that any big data project is developed to meet real needs, rather than being driven simply by new technologies.
This design phase is probably the most important step. The NSO should clearly articulate its needs and ask questions; find partners, stakeholders and technical expertise; and ensure it has access to the appropriate data sources. Finding partners will be crucial: these are likely to involve both public and private institutions, with strong involvement from universities.
Subsequent phases in the lifecycle of a project should include feasibility testing, proof of concept, prototyping and piloting. 
NSOs embarking on such a project can contact the UN Global Pulse initiative as well as leading academic initiatives such as the Massachusetts Institute of Technology’s Media Lab, innovative start-ups such as PriceStats and development actors like the World Economic Forum. All these potential partners have made clear to us their staunch support for this idea of helping NSOs build big data analytical capacity in developing countries.
Big data is an opportunity to embrace, rather than a threat. The potential benefit is enormous. Every NSO can start a big data project now.
Michail Skaliotis is head of Eurostat’s task force on Big Data. Ceri Thompson is team leader in statistical co-operation with developing countries, Eurostat. Skaliotis can be contacted on Twitter @sitoilaks. Thompson can be contacted on Twitter @cericurious.
The views expressed in this article are those of the authors and do not necessarily reflect the official views of the European Commission.
This article is part of the Spotlight on Data for development.
References
Martin Karlberg and Michail Skaliotis. Big data for official statistics: strategies and some initial European applications (UN Economic Commission for Europe’s Conference of European Statisticians, September 2013)
Michail Skaliotis. Timeliness and accuracy in official statistics 2.0 (Eurostat, 2010)
UN Global Pulse: Research (Accessed 25 March, 2014)