The Power Dynamics of Big and Open Data

At a recent CDI brown-bag discussion on data-intensive development, we hypothesised a mirror-image power dynamic between big data and open data.

[Figure: power flows between the few and the many under big data and open data]

Open data has an inherent tendency to redistribute power from the few (who originally hold the data) to the many (who can now access the data).  It supports sousveillance.  Big data has an inherent tendency in the opposite direction.  It gathers data about the many but only the few have the power to capture, store, process, interpret and use that big data.  It supports surveillance.

The extent to which these are inherent affordances of these data systems vs. the extent to which these tendencies are inscribed into those data systems is a matter for further debate.  But the hypothesis does suggest that big data per se is more reproductive than transformative of power inequalities within society.  Think of the way in which major users of big data – social media platforms, e-business multinationals, telecommunication companies – operate.  Their uses of big data reinforce inequality much more than they challenge it.

One way to address this is to reverse the power dynamic flow shown above: big data must become open data.  This could happen in various ways:

  • Big data as open data: big datasets are made openly available online in an accessible format (as in all cases, with due consideration for data privacy and security).
  • Big data as shared data: big datasets are made available to particular organisations (e.g. those of civil society).
  • Big data as small data: sub-sets of big datasets are shared with the sources of that data for their use (e.g. the particular communities or groups from which the big data derived).

Reversing Big Data Inequalities

But what will make a reversal happen?  To understand this, we need to study open data motivations: what causes organisations to open their datasets?  Reviewing our knowledge of open data, we could not find examples of intrinsic motivations driving adoption of open data.  Instead, drivers for the opening of big datasets seem likely to be extrinsic:

  • For public sector owners of big data, domestic political economy (e.g. local campaigns for access to data; economic benefits from creation of a local data economy) and external political economy (e.g. encouraging foreign investment through a reputation for openness).
  • For private sector owners of big data, government regulation to force opening of datasets, or shareholder/consumer pressure.

Without such extrinsic pressures and the openness that ensues, big data may not deliver its developmental potential.

 

A Research Agenda for Data-Intensive Development

In practice, there is a growing role for data within international development: what we can call “data-intensive development” (DID).  But what should be the research agenda for this emerging phenomenon?

On 12th July 2016, a group of 40 researchers and practitioners gathered in Manchester at the workshop on “Big and Open Data for Development”, organised by the Centre for Development Informatics.  Identifying a research agenda was a main purpose of the workshop, particularly looking for commonalities that avoid fragmenting our field by data type (big data vs. open data vs. real-time data vs. geo-located data, etc.), each in its own little silo.

A key challenge for data-intensive development research is locating the “window of relevance”.  Focus too far back on the curve of technical change – largely determined in the Western private sector – and you may fail to gain attention and interest in your research.  Focus too far forward and you may find there are no actual examples in developing countries that you can research.

In 2014 and 2015, we made two failed attempts to organise conference tracks on data-and-development, each generating just a couple of papers.  By contrast, the 2016 workshop received two dozen submissions: too many to accommodate, but suggesting that a critical mass of research is finally starting to appear.

It is still early days – the reports from practice still give a strong sense of data struggling to find development purposes, and of development purposes struggling to find data.  But the workshop provided enough foundational ideas, emergent issues, and reports-back from pilot initiatives to show we are putting the basic building blocks of a research domain in place.

But where next?  Through a day-long mix of Post-It notes placed on walls, presentation responses, and a set of group then plenary discussions[1], we identified a set of future research priorities, as shown below (and also available as a PDF).

[Figure: DID Research Agenda]

The agenda divided into four sub-domains:

  • Describing/Defining: working out the basic boundaries, contours and contents of the data-intensive development domain.
  • Practising: measuring and learning from the practice of data-intensive development.
  • Analysing: evaluating the impact of data-intensive development through various analytical lenses.
  • Resisting: guiding practical actions to challenge potential state and corporate data hegemony in developing countries.

Given the size and eclectic mix of the group, many different research interests were expressed.  But two came up much more than others.

First, power, politics and data-intensive development: analysing the power structures that shape DID initiatives, and that are inscribed into data systems; analysing the way in which DID produces and reproduces power; analysing what resistance to data hegemony would mean.

Second, justice, ethics, rights and data-intensive development: determining what a social justice perspective on DID would mean; analysing what DID can contribute to rights-based development; understanding how ethical principles would guide civil society interventions for better DID.

We hope, as a research community, to take these and other agenda items forward.  If you would like to join us, please sign up with the LinkedIn group on “Data-Intensive Development”.

 

[1] My thanks to Jaco Renken for collating these.