
Big Data and Electoral Politics in India

What happens when big data and big politics collide?  One answer arises from a recent study of big data in the electricity distribution sector in an Indian state: “Exploring Big Data for Development: An Electricity Sector Case Study from India”.


The state electricity corporation has introduced millions of online digital meters that measure electricity flow along the distribution network and down to the level of consumers.  Producing a large stream of real-time data, these innovations should have addressed a critical problem in India: theft / non-payment by consumers, which creates losses of up to one-third of all supplied power.  But they did not.  Why should that be?

Big data does reduce some losses: technical losses from electrical resistance and faults are down; payment losses from urban consumers are down.  But the big data era has seen an unprecedented expansion of rural electrification, and in rural areas, payment losses have risen to 50% or more.  In other words, the corporation receives less than half the revenue it should given the electricity it is supplying to rural areas.

The expansion in rural electrification has been mandated by politicians.  The high level of rural payment losses has been condoned by politicians, given a significant positive association between levels of electricity non-payment and likelihood of seat retention at an election.

Is this the silencing of big data in the face of big politics: the capability for accurate metering and billing of almost all consumers simply being overridden by electoral imperatives?  Not quite, because big data has been involved via an offsetting effect, and an epistemic effect.

  1. Offsetting Effect. Big data-driven technical and urban consumer loss reductions have allowed the State Government to “get away” with its political approach to rural electrification. The two effects of technical/urban loss reduction and political loss increase have roughly balanced one another out; a disappointing aggregate outcome but one that just falls under the threshold that would trigger some direct intervention by the regulators or by Central Government.
  2. Epistemic Effect. Big data creates a separate virtual model of phenomena: a so-called “data double”. This in turn can alter the “imaginaries” of those involved – the mental model and worldviews they have about the phenomena – and the wider discourse about the phenomena.
    This has happened in India.  Big data has created a new imaginary for electricity, particularly within the minds of politicians.  Before big data, the policy paradigm was one that saw electricity in terms of constraint: geographic constraint such that not all areas could be connected, and supply constraint such that “load-shedding” – regular blackouts and brownouts – was regarded as integral.
    After big data, the new paradigm is one of continuous, high-quality, universal electricity.  Plans and promises are now based on the idea that all districts – and all voters – can have 24 x 7 power.

In sum, one thing we know of digital systems is that they have unanticipated consequences.  This has been true of big data in this Indian state.  Far from reducing losses, the data-enabled growth in electricity connectivity has helped fuel a politically-enabled growth in free appropriation of electricity.

For further details, please refer to the working paper on this topic.

[1] Credit: Jorge Royan (Own work) CC-BY-SA-3.0, via Wikimedia Commons

The Affordances and Impacts of Data-Intensive Development

What is special about “data-intensive development”: the growing presence and application of data in the processes of international development?

We can identify three levels of understanding: qualities, affordances, and development impacts.

A. Data Qualities

Overused they may be but it still helps to recall the 3Vs.  Data-intensive development is based on a greater volume, velocity and variety of data than previously seen.  These are the core differentiating qualities of data from which affordances and impacts flow.

B. Data Affordances

The qualities are inherent functionalities of data.  From these qualities, combined with purposive use by individuals or organisations, the following affordances emerge[1]:

  • Datafication: an expansion of the phenomena about which data are held. A greater breadth: holding data about more things. A greater depth: holding more data about things.  And a greater granularity: holding more detailed data about things.  This is accelerated by the second affordance . . .
  • Digitisation: not just the conversion of analogue to digital data but the same conversion for all parts of the information value chain. Data processing and visualisation for development becomes digital; through growth of algorithms, development decision-making becomes digital; through growth of automation and smart technology, development action becomes digital.  Digitisation means dematerialisation of data (its separation from physical media) and liquification of data (its consequent fluidity of movement across media and networks), which underlie the third affordance . . .
  • Generativity: the use of data in ways not planned at the origination of the data. In particular, data’s reprogrammability (i.e. using data gathered for one purpose for a different purpose); and data’s recombinability (i.e. mashing up different sets of data to get additional, unplanned value from their intersection).

C. Data-Intensive Development Impacts

In turn, these affordances give rise to development impacts.  There are many ways in which these could be described, with much written about the (claimed) positive impacts.  Here I use a more critical eye to select four that can be connected to the concept of data (in)justice for development[2]:

i. (In)Visibility. The affordances of data create a far greater visibility for those development entities – people, organisations, processes, things, etc. – about which data is captured. They can more readily be part of development activity and decision making.  And they can also suffer loss of privacy and growth in surveillance from the state and private sector[3].

Conversely, those entities not represented in digital data suffer greater invisibility, as they are thrown further into shadow and exclusion from development decision-making.

Dematerialisation and generativity also make the whole information value chain increasingly invisible.  Data is gathered without leaving a physical trace.  Data is processed and decisions are made by algorithms whose code is not subject to external scrutiny.  The values, assumptions and biases inscribed into data, code and algorithms are unseen.

ii. Abstraction. A shift from primacy of the physical representation of development entities to their abstract representation: what Taylor & Broeders (2015) call the “data doubles” of entities, and the “shadow maps” of physical geographies. This abstraction typically represents a shift from qualitative to quantitative representation (and a shift in visibility from the physical to the abstract; from the real thing to its data imaginary).

iii. Determinism.  Often thought of in terms of solutionism: the growing use of data- and technology-driven approaches to development.  Alongside this growth in technological determinism of development, there is an epistemic determinism that sidelines one type of knowledge (messy, local, subjective) in favour of a different type of knowledge (remote, calculable and claiming-to-be-but-resolutely-not objective).  We could also identify the algorithmic determinism that increasingly shapes development decisions.

iv. (Dis)Empowerment. As the affordances of data change the information value chain, they facilitate change in the bases of power. Those who own and control the data, information, knowledge, decisions and actions of the new data-intensive value chains – including its code, visualisations, abstractions, algorithms, terminologies, capabilities, etc – are gaining in power.  Those who do not are losing power in relative terms.

D. Review

The idea of functionalities leading to affordances leading to impacts is too data-deterministic.  These impacts are not written, and they will vary through the different structural inscriptions imprinted into data systems, and through the space for agency that new technologies always permit in international development.  Equally, though, we should avoid social determinism.  The technology of data systems is altering the landscape of international development.  Just as ICT4D research and practice must embrace the affordances of its digital technologies, so data-intensive development must do likewise.

[1] Developed from: Lycett, M. (2013) ‘Datafication’: making sense of (big) data in a complex world. European Journal of Information Systems, 22(4), 381-386; Nambisan, S. (2016) Digital entrepreneurship: toward a digital technology perspective of entrepreneurship, Entrepreneurship Theory and Practice, advance online publication

[2] Developed from: Johnson, J.A. (2014) From open data to information justice. Ethics And Information Technology, 16(4), 263-274; Taylor, L. & Broeders, D. (2015) In the name of development: power, profit and the datafication of the global South. Geoforum, 64, 229-237; Sengupta, R., Heeks, R., Chattapadhyay, S. & Foster, C. (2017) Exploring Big Data for Development: An Electricity Sector Case Study from India, GDI Development Informatics Working Paper no.66, University of Manchester, UK; Shaw, J. & Graham, M. (2017) An informational right to the city? Code, content, control, and the urbanization of information. Antipode, advance online publication; Taylor, L. (2017) What Is Data Justice? The Case for Connecting Digital Rights and Freedoms on the Global Level, TILT, Tilburg University, Netherlands

[3] What Taylor & Broeders (2015) not entirely convincingly argue is a change from overt and consensual “legibility” to tacit and contentious “visibility” of citizens (who now morph into data subjects).


Data Justice for Development

13 October 2016

What would “data justice for development” mean?  This is a topic of increasing interest.  It sits at the intersection of greater use of justice in development theory, and greater use of data in development practice.  Until recently, very little had been written about it but this has been addressed via a recent Centre for Development Informatics working paper: “Data Justice For Development: What Would It Mean?” and linked presentation / podcast.

Why concern ourselves with data justice in development?  Primarily because there are data injustices that require a response: governments hacking data on political opponents; mobile phone records being released without consent; communities unable to access data on how development funds are being spent.

But to understand what data justice means, we have to return to foundational ideas on ethics, rights and justice.  These identify three different mainstream perspectives on data justice:

  • Instrumental data justice, meaning fair use of data. This argues there is no notion of justice inherent to data ownership or handling.  Instead what matters is the purposes for which data is used.
  • Procedural data justice, meaning fair handling of data. This argues that citizens must give consent to the way in which data about them is processed.
  • Distributive data justice, meaning fair distribution of data. This could directly relate to the issue of who has what data, or could be interpreted in terms of rights-based data justice, relating to rights of data privacy, access, control, and inclusion / representation.

We can use these perspectives to understand the way data is used in development.  But we also need to take account of two key criticisms of these mainstream views.  First, that they pay too little attention to agency and practice including individual differences and choices and the role of individuals as data users rather than just data producers.  Second, that they pay too little attention to social structure, when it is social structure that at least partly determines issues such as the maldistribution of data in the global South, and the fact that data systems in developing countries benefit some and not others.

To properly understand what data justice for development means, then, we need a theory of data justice that goes beyond the mainstream views to more clearly include both structure and agency.

The working paper proposes three possible approaches, each of which provides a pathway for future research on data-intensive development; albeit the current ideas are stronger on the “data justice” than the “for development” component:

  • Cosmopolitan ideas such as Iris Marion Young’s social connection model of justice could link data justice to the social position of individuals within networks of relations.
  • Critical data studies is a formative field that could readily be developed through structural models of the political economy of data (e.g. “data assemblages”) combined with a critical modernist sensitivity that incorporates a network view of power-in-practice.
  • Capability theory might be able to encompass all views on data justice within a single overarching framework.

Alongside this conceptual agenda could be an action agenda; perhaps a Data-Justice-for-Development Manifesto that would:

  1. Demand just and legal uses of development data.
  2. Demand data consent of citizens that is truly informed.
  3. Build upstream and downstream data-related capabilities among those who lack them in developing countries.
  4. Promote rights of data access, data privacy, data ownership and data representation.
  5. Support “small data” uses by individuals and communities in developing countries.
  6. Advocate sustainable use of data and data systems.
  7. Create a social movement for the “data subalterns” of the global South.
  8. Stimulate an alternative discourse around data-intensive development that places issues of justice at its heart.
  9. Develop new organisational forms such as data-intensive development cooperatives.
  10. Lobby for new data justice-based laws and policies in developing countries (including action on data monopolies).
  11. Open up, challenge and provide alternatives to the data-related technical structures (code, algorithms, standards, etc) that increasingly control international development.

Measuring Barriers to Big Data for Development

How can we measure the barriers to big data for development?  A research paper from Manchester’s Centre for Development Informatics suggests use of the design-reality gap model.

Big data holds much promise for development: to improve the speed, quality and consistency of a wide variety of development decisions[1].  At present, this is more potential than actuality because big data initiatives in developing countries face many barriers[2].

But so far there has been little sense of how these barriers can be systematically measured: work to date tends to be rather broad-brush or haphazard.  Seeking to improve this, we investigated use of an ICT4D framework already known for measurement of barriers: the design-reality gap model.

In its basic form the model is straightforward:

  • It records the gap between the design requirements or assumptions of big data vs. the current reality on the ground.
  • The gap is typically recorded on a scale from 0 (no gap: everything needed for big data is present) to 10 (radical gap: none of the requirements for big data is present).
  • The gap can be estimated via analysis of researchers, or derived directly from interviewees, or recorded from group discussions.
  • It is typically measured along seven “ITPOSMO” dimensions (see below).

As proof-of-concept, the model was applied to measure barriers to big data in the Colombian public sector, using evidence gathered from a mix of participant-observation at two IT summits, interviews, and secondary data analysis.

[Figure: design-reality gap scores for big data in the Colombian public sector, by ITPOSMO dimension]


As summarised in the figure above, the model showed serious barriers on all seven dimensions:

  • Information: some variety of data but limited volume, velocity and visibility (gap size 7).
  • Technology: good mobile, moderate internet and poor sensor availability with a strong digital divide (gap size 6).
  • Processes: few “information value chain” processes at work to put big data into action (gap size 7).
  • Objectives and values: basic data policies in place but lack of big data culture and drivers (gap size 7).
  • Skills and knowledge: foundational but not specialised big data capabilities (gap size 7).
  • Management systems and structures: general IT systems and structures in place but little specific to big data (gap size 7).
  • Other resources: some budgets earmarked for big data projects (gap size 5).
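
As a minimal sketch of how these scores might be tallied, the snippet below records the seven gap sizes listed above and summarises them. The scores come from the list; adding them into a single total out of 70 and reporting a mean are assumptions made for illustration, and the function name summarise_gaps is hypothetical rather than anything taken from the working paper.

```python
# Gap scores per ITPOSMO dimension, as reported in the list above
# (0 = no gap, 10 = radical gap).
GAP_SCORES = {
    "Information": 7,
    "Technology": 6,
    "Processes": 7,
    "Objectives and values": 7,
    "Skills and knowledge": 7,
    "Management systems and structures": 7,
    "Other resources": 5,
}

MAX_GAP_PER_DIMENSION = 10


def summarise_gaps(scores):
    """Return the total, maximum possible total, mean, and widest-gap dimensions.

    Summing and averaging the scores is an illustrative assumption,
    not a procedure stated in the source.
    """
    for name, score in scores.items():
        if not 0 <= score <= MAX_GAP_PER_DIMENSION:
            raise ValueError(f"{name}: score {score} is outside the 0-10 scale")
    total = sum(scores.values())
    max_total = MAX_GAP_PER_DIMENSION * len(scores)
    largest = max(scores.values())
    widest = [name for name, score in scores.items() if score == largest]
    return {
        "total": total,
        "max_total": max_total,
        "mean": total / len(scores),
        "widest": widest,
    }


summary = summarise_gaps(GAP_SCORES)
print(f"Total gap: {summary['total']} / {summary['max_total']}")
print(f"Mean gap per dimension: {summary['mean']:.1f}")
print("Widest gaps:", ", ".join(summary["widest"]))
```

Listing the widest dimensions alongside the total keeps the per-dimension picture in view, which matters because the proposed actions are derived from individual gaps rather than from the aggregate.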

A simple summary would be that Colombia’s public sector has a number of the foundations or precursors for big data in place, but very few of the specific components that make up a big data ecosystem.  One can turn around each of the gaps to propose actions to overcome barriers: greater use of existing datasets; investments in data-capture technologies; prioritisation of value-generation rather than data-generation processes; etc.

As the working paper notes:

“Beyond the specifics of the particular case, this research provides a proof-of-concept for use of the design-reality gap model in assessing barriers to big data for development. Rephrasing the focus for the exercise, the model could equally be used to measure readiness for big data; BD4D critical success and failure factors; and risks for specific big data initiatives. …

We hope other researchers and consultants will make use of the design-reality gap model for future assessments of big-data-for-development readiness, barriers and risks.”

For those interested in taking forward research and practice in this area, please sign up with the LinkedIn group on “Data-Intensive Development”.

[1] Hilbert, M. (2016) Big data for development, Development Policy Review, 34(1), 135-174

[2] Spratt, S. & Baker, J. (2015) Big Data and International Development: Impacts, Scenarios and Policy Options, Evidence Report no. 163, IDS, University of Sussex, Falmer, UK

The Power Dynamics of Big and Open Data

At a recent CDI brown-bag discussion on data-intensive development, we hypothesised a mirror-image power dynamic between big data and open data.

[Figure: power dynamics of big data and open data between the few and the many]

Open data has an inherent tendency to redistribute power from the few (who originally hold the data) to the many (who can now access the data).  It supports sousveillance.  Big data has an inherent tendency in the opposite direction.  It gathers data about the many but only the few have the power to capture, store, process, interpret and use that big data.  It supports surveillance.

The extent to which these are inherent affordances of these data systems vs. the extent to which these tendencies are inscribed into those data systems is a matter for further debate.  But what it does suggest is that big data per se is more reproductive than transformative of power inequalities within society.  Think of the way in which major users of big data – social media platforms, e-business multinationals, telecommunication companies – operate.  Their uses of big data reinforce inequality much more than they challenge it.

One way to address this is to reverse the power dynamic flow shown above: big data must become open data.  This could happen in various ways:

  • Big data as open data: big datasets are made openly available online in accessible format (as in all cases, with due consideration for data privacy and security).
  • Big data as shared data: big datasets are made available to particular organisations (e.g. those of civil society).
  • Big data as small data: sub-sets of big datasets are shared with the sources of that data for their use (e.g. the particular communities or groups from which the big data derived).

Reversing Big Data Inequalities

But what will make a reversal happen?  To understand this, we need to study open data motivations: what causes organisations to open their datasets?  Reviewing our knowledge of open data, we could not find examples of intrinsic motivations driving adoption of open data.  Instead, drivers to opening of big datasets seem likely to be extrinsic:

  • For public sector owners of big data, domestic political economy (e.g. local campaigns for access to data; economic benefits from creation of a local data economy) and external political economy (e.g. encouraging foreign investment through a reputation for openness).
  • For private sector owners of big data, government regulation to force opening of datasets, or shareholder/consumer pressure.

Without such extrinsic pressures and the openness that ensues, big data may not deliver its developmental potential.


A Research Agenda for Data-Intensive Development

18 July 2016

In practice, there is a growing role for data within international development: what we can call “data-intensive development”.  But what should be the research agenda for this emerging phenomenon?

On 12th July 2016, a group of 40 researchers and practitioners gathered in Manchester at the workshop on “Big and Open Data for Development”, organised by the Centre for Development Informatics.  Identifying a research agenda was a main purpose for the workshop; particularly looking for commonalities that avoid fractionating our field by data type: big data vs. open data vs. real-time data vs. geo-located data, etc; each in its own little silo.


A key challenge for data-intensive development research is locating the “window of relevance”.  Focus too far back on the curve of technical change – largely determined in the Western private sector – and you may fail to gain attention and interest in your research.  Focus too far forward and you may find there are no actual examples in developing countries that you can research.

In 2014 and 2015, we had two failed attempts to organise conference tracks on data-and-development; each generating just a couple of papers.  By contrast, the 2016 workshop received two dozen submissions; too many to accommodate but suggesting a critical mass of research is finally starting to appear.

It is still early days – the reports from practice still give a strong sense of data struggling to find development purposes; development purposes struggling to find data.  But the workshop provided enough foundational ideas, emergent issues, and reports-back from pilot initiatives to show we are putting the basic building blocks of a research domain in place.

But where next?  Through a mix of day-long placing of Post-It notes on walls, presentation responses, and a set of group then plenary discussions[1], we identified a set of future research priorities, as shown below and also here as PDF.

[Figure: DID Research Agenda]



The agenda divided into four sub-domains:

  • Describing/Defining: working out the basic boundaries, contours and contents of the data-intensive development domain.
  • Practising: measuring and learning from the practice of data-intensive development.
  • Analysing: evaluating the impact of data-intensive development through various analytical lenses.
  • Resisting: guiding practical actions to challenge potential state and corporate data hegemony in developing countries.

Given the size and eclectic mix of the group, many different research interests were expressed.  But two came up much more than others.

First, power, politics and data-intensive development: analysing the power structures that shape DID initiatives, and that are inscribed into data systems; analysing the way in which DID produces and reproduces power; analysing what resistance to data hegemony would mean.

Second, justice, ethics, rights and data-intensive development: determining what a social justice perspective on DID would mean; analysing what DID can contribute to rights-based development; understanding how ethical principles would guide civil society interventions for better DID.

We hope, as a research community, to take these and other agenda items forward.  If you would like to join us, please sign up with the LinkedIn group on “Data-Intensive Development”.


[1] My thanks to Jaco Renken for collating these.

Stakeholder Analysis of Open Government Data Initiatives

17 December 2015

Many different actors are involved in open government data (OGD) initiatives, and it can be hard to understand the different roles they play.

Stakeholder analysis can help, such as mapping onto a power-interest grid (see example below).  This analyses stakeholders according to their power to impact the development and implementation of open government data, and their level of interest in OGD.  The former is measured via a typical sources-of-power checklist: reward, coercive, legitimate, expert, personal, informational, affiliative.  The latter is measured via text analysis of stakeholder statements.

Primary stakeholders are “those who have formal, official, or contractual relationships and have a direct and necessary… impact” (Savage et al., 1991:62). Others who affect or are affected by OGD, but less formally, directly and essentially, can be categorised as secondary.
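
To make the mechanics of a power-interest grid concrete, here is a rough sketch. Everything in it is an assumption for illustration: the stakeholders are hypothetical placeholders (not the Chilean actors), power is scored as the share of the seven checklist sources a stakeholder holds, interest is a made-up 0-1 score standing in for the output of text analysis, and the 0.5 threshold between "high" and "low" is arbitrary.

```python
from dataclasses import dataclass

# The seven sources of power named in the checklist above.
POWER_SOURCES = (
    "reward", "coercive", "legitimate", "expert",
    "personal", "informational", "affiliative",
)


@dataclass(frozen=True)
class Stakeholder:
    name: str
    sources_held: frozenset  # which checklist sources apply (hypothetical)
    interest: float          # 0..1, hypothetical stand-in for text analysis
    primary: bool            # primary vs secondary stakeholder

    @property
    def power(self):
        """Share of checklist sources held: an assumed scoring rule."""
        unknown = self.sources_held - set(POWER_SOURCES)
        if unknown:
            raise ValueError(f"Unknown power sources: {sorted(unknown)}")
        return len(self.sources_held) / len(POWER_SOURCES)


def quadrant(stakeholder, threshold=0.5):
    """Place a stakeholder in one of the four cells of the grid."""
    power = "high power" if stakeholder.power >= threshold else "low power"
    interest = "high interest" if stakeholder.interest >= threshold else "low interest"
    return f"{power} / {interest}"


# Hypothetical stakeholders, for illustration only.
STAKEHOLDERS = [
    Stakeholder("Stakeholder A",
                frozenset({"legitimate", "coercive", "reward", "informational"}),
                0.8, True),
    Stakeholder("Stakeholder B", frozenset({"expert"}), 0.7, False),
    Stakeholder("Stakeholder C",
                frozenset({"reward", "legitimate", "affiliative", "personal"}),
                0.2, True),
    Stakeholder("Stakeholder D", frozenset(), 0.1, False),
]

grid = {s.name: quadrant(s) for s in STAKEHOLDERS}
for s in STAKEHOLDERS:
    kind = "primary" if s.primary else "secondary"
    print(f"{s.name} ({kind}): {grid[s.name]}")
```

Counting sources gives every source equal weight, which is a simplification; a real analysis might weight them or score each on a scale.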

Applying this to Chile’s open government data initiative produced the mapping shown in the figure.

[Figure: OGD Stakeholders]


We can draw two conclusions.  First, that OGD in Chile has been mostly determined from within government. Second, that it has otherwise been shaped rather more by international than national forces.

Three absent stakeholders can be noted:

  • The local private sector is not an active part of the ecosystem at present, restricting options to derive economic value from OGD.
  • Citizens are not active in discussion or use of open government data, restricting options to derive political value from OGD.
  • Multinational firms and investors are not directly involved, but have a tertiary role: they are an audience to whom the presence and progress of OGD is sometimes projected.

In sum, this is an “inwards and upwards” pattern of open government data which is shaping OGD’s trajectory in the country.  Government is the “sun” and other stakeholders merely “planets”, so that perspectives and agendas within government dominate. One agenda is to broadcast signals of democracy to the outside world.

In facing “upwards” to these external stakeholders, what matters most is an appearance of transparency. This can be satisfied by the presence of datasets, some empowerment and accountability rhetoric in pronouncements, and membership of the Open Government Partnership and adherence to its minimum standards. This is not to say that government stakeholders care nothing for delivery of results; simply that the external audience-related incentives are much stronger for appearance than fulfilment.

Stakeholder analysis should therefore be a fundamental tool for open government data researchers and practitioners; helping them to understand the identities, strengths and weaknesses of key OGD actors.

This research is reported in more detail in: Gonzalez-Zapata, F. & Heeks, R. (2015) The multiple meanings of open government data: understanding different stakeholders and their perspectives, Government Information Quarterly, 32(4), 441-452
