Archive

Archive for the ‘Data-for-Development’ Category

An Applied Data Justice Framework for Datafication and Development

Data is playing an ever-growing role in international development.  But what lens can we use to analyse the impact of data on development?

The emerging field of “data justice” offers some valuable ideas but they have not yet been put together into a systematic and comprehensive framework.  My open-access paper – Datafication, Development and Marginalised Urban Communities: An Applied Data Justice Framework, written with Satyarupa Shekhar – provides such a framework, as shown below.

The framework exposes five dimensions of data justice:

  • Procedural: fairness in the way in which data is handled.
  • Instrumental: fairness in the results of data being used.
  • Rights-based: adherence to basic data rights such as representation, privacy, access and ownership.
  • Structural: the degree to which the interests and power in wider society support fair outcomes in other forms of data justice.
  • Distributive: an overarching dimension relating to the (in)equality of data-related outcomes that can be applied to each of the other dimensions of data justice.

The dimensions can be used individually; for example, just to analyse data practices, or just to analyse the impact of context on new data systems in developing countries.  Or the model can be used holistically; for example, to understand the full development impact of a particular data initiative.

The Datafication, Development and Marginalised Urban Communities: An Applied Data Justice Framework paper takes the latter route.  It analyses “pro-equity data initiatives” that were implemented by data activists in four cities: Chennai, Nairobi, Pune and Surakarta.  These initiatives specifically sought to address the data injustices suffered by slum dwellers and other marginalised groups; particularly their invisibility to urban planners and other external agencies.

Using the data justice lens, this research finds that new data flows do have a positive impact in counteracting the injustice of invisibility, but they disproportionately serve those with the motivation and power to use that data.  Results in terms of service improvements and epistemic change are beneficial for slum communities and other marginalised citizens, and these initiatives can be justified on that basis.

However, though there can be no exact calibration from qualitative research, it is likely that these pro-equity initiatives actually increase relative inequalities.  Ordinary community members have seen some benefits, but those external actors who find that the data matches their agenda and capabilities benefit more.  It is the latter who are more empowered to access, use and control the new data.

If you would like to know more about this research’s findings, framework and recommendations for practice, then take a look at the paper: https://www.tandfonline.com/doi/full/10.1080/1369118X.2019.1599039

Measuring the Broadband Speed Divide using Crowdsourced Data

Digital applications and services increasingly require high-speed Internet connectivity. Yet a strong “broadband divide” exists between nations [1,2]. We try to understand how big data can be used to measure this divide. In particular, what new measurement opportunities can crowdsourced data offer?

The broadband divide has been widely measured using subscription rates. However, the broadband speed divide measured using observed speeds has been less explored due to the lack of data in the hands of regulators and statistical offices. This article focuses on measuring the fixed-network broadband speed divide between developed and developing countries, exploring the benefits and limitations of using new crowdsourced data.

To this aim we used measurements from the Speedtest Global Index, generated by Ookla using data volunteered by Internet users verifying the speed of their Internet connections [3]. These crowdsourced tests allow this firm to estimate monthly measurements of the average upload and download speeds at the country level.

The dataset used for this analysis comprised monthly data, from January to December 2018, for a total of 120 countries. Using the income and regional categorisations set by the World Bank we identified 64 developing countries and 54 developed countries in seven regions. Complete data for only two of the least developed countries were available so these were not included in the analysis.

The following table presents the download and upload speed averages on the fixed network, aggregated by region and level of development, and the totals for all the countries in our final sample (n=118), while the figure below shows the download and upload speeds aggregated by level of development.

Table 1. Average upload and download speed by region and development level, fixed network. January – December 2018 (Mbps)

Note: Unweighted averages
Source: Author calculations using data from Ookla’s Speedtest Global Index [3]

Figure 1. Average upload and download speed by level of development, fixed network. January – December 2018 (Mbps)

-Download speeds. We observe that the divide between developed and developing countries is pronounced, with average download speeds for the latter being around one-third of the former. However, the divide is also evident within regions: in the developed world, countries in North America have speeds three times higher than those in the Middle East. Within the developing countries, those in Europe & Central Asia have the highest download speeds and those in the Middle East & North Africa have the lowest. Overall, download speeds are much lower in the developing world, thus creating an important impediment to the use of data-intensive digital applications and services.

-Upload speeds. We identify an overall divide between developed and developing countries similar in magnitude to the one observed in download speeds. However, when looking at the group of developed countries we see that regional rankings differ from those identified using download speeds: the East Asia & Pacific region ranks first and North America ranks third – the latter with upload speeds that are two-thirds of their download speeds. Across regions, upload speeds are always slower in the developing world, and again the Middle East & North Africa region ranks at the bottom; but the gap between download and upload speeds is smaller in the developing world. Considering that faster upload speeds are also required in a data-intensive era, the majority of countries are far from the ideal of having faster networks with symmetric speeds.
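The unweighted averages and the developing-to-developed ratio discussed above can be sketched in a few lines of Python. This is only an illustration: the country records, field names and the `unweighted_average` helper are hypothetical, and the speeds are made-up values rather than Ookla's figures.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical country-level download speeds (Mbps); NOT Ookla's figures.
records = [
    {"country": "A", "group": "developed", "region": "Region 1", "download": 60.0},
    {"country": "B", "group": "developed", "region": "Region 1", "download": 40.0},
    {"country": "C", "group": "developing", "region": "Region 2", "download": 20.0},
    {"country": "D", "group": "developing", "region": "Region 2", "download": 10.0},
]


def unweighted_average(rows, key, value="download"):
    """Average country values within each group, each country counting once."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row[value])
    return {name: mean(values) for name, values in buckets.items()}


# Averages by level of development, then the ratio between the two groups.
by_group = unweighted_average(records, "group")
speed_ratio = by_group["developing"] / by_group["developed"]
```

Because the averages are unweighted, a small country counts as much as a large one; weighting by population or by number of tests would need data that, as noted under the limitations, this source does not provide.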

Some benefits and limitations are identified when measuring the broadband speed divide using this type of crowdsourced data.

-Benefits. First, the availability of these types of data allows us to measure the broadband speed divide between developed and developing countries using observed instead of theoretical speeds. Second, these measurements are openly available on a website that can be accessed by the general public at no cost. Third, the divide can be measured and tracked over time more frequently than when using survey or administrative data. Finally, this site reports both download and upload speeds which are important to measure in a data-intensive era.

-Limitations. Even if there are data available for a good number of countries there are no complete data about the least developed countries, leaving behind this group. Also, there might be some bias in the production of data as crowdsourced measurements might be coming from ICT-literate individuals in certain countries [4]. Finally, from this source it is not possible to access complete datasets with additional data points such as the number of observations, medians, and latencies for each country.

These findings derive from a broader research project that, overall, is researching use of big data for measurement of the digital divide.  Readers are welcome to contact the author for details of that broader project: luis.riveraillingworth@manchester.ac.uk

References

[1] ITU (2018). Measuring the Information Society Report 2018. Geneva, Switzerland: International Telecommunication Union.

[2] Broadband Commission (2018). The State of Broadband: Broadband catalyzing sustainable development. Geneva, Switzerland: Broadband Commission for Sustainable Development.

[3] Ookla (2018). Speedtest Global Index [Online]. Available: http://www.speedtest.net/global-index/about [Accessed 01/03/2019]

[4] Bauer, S., Clark, D. D. & Lehr, W. (2010). Understanding broadband speed measurements. In TPRC 2010. Available at SSRN: https://ssrn.com/abstract=1988332

The Puzzle of Digital Financial Inclusion: A Generation Game?

If we thought that financial inclusion and its digital variant are tightly correlated, we may be in for a surprise, judging from the Global Findex 2017 microdata released by the World Bank last month. Owning a bank account (financial inclusion) and owning a mobile money account (its digital variant) throw up a puzzling pattern. I plot the averages of bank account ownership and mobile money account ownership in 144 countries across groups of low- to high-income economies, showing a clear separating trend. The thought is borne out by the 25 low-income countries, where the two measures of financial inclusion are strongly correlated at 0.7. But as income level steps up (to the middle- and high-income levels) bank account shares increase while mobile money shares decrease. The final panel is flat at the bottom right: most of the 44 high-income countries have more than 80% bank account shares with less than 20% mobile money account shares. The correlation? –0.2. One explanation for this negative correlation can be discounted. The digital variant is not yet a substitute for a bank account: savers cannot yet use their mobile money account on its own, or in place of a bank account, to secure property or business investment. As countries move up the economic ladder, the puzzle of separation demands an explanation.
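The correlations quoted above are between two country-level shares within each income group. A minimal sketch of that calculation follows, assuming a plain Pearson correlation (the post does not say which coefficient was used). The `pearson` helper is illustrative and the ownership shares are invented, not Global Findex values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / (spread_x * spread_y)


# Hypothetical ownership shares (% of adults) per country; NOT Findex data.
low_income = {"bank": [10, 20, 30, 40], "mobile": [5, 15, 20, 35]}
high_income = {"bank": [85, 90, 95, 99], "mobile": [15, 10, 12, 5]}

r_low = pearson(low_income["bank"], low_income["mobile"])
r_high = pearson(high_income["bank"], high_income["mobile"])
```

With these made-up shares the low-income group gives a positive coefficient and the high-income group a negative one, which is the sign pattern described in the text.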

 

Figure 1. The puzzle of bank account ownership vs mobile money account ownership (number of countries in parentheses). Source: calculated from Global Findex 2017 microdata

I explore an alternative here. In high income economies financial inclusion is nearly universal among adults. Not so in low and middle income economies; on the demand side lower average incomes as well as lack of trust in banks coupled with, on the supply side, weak financial infrastructures combine to leave many adults financially excluded. But the costs of financial services, such as sending and receiving money, have been pared down thanks to mobile technology, especially in low income economies. In Uganda, transfers can be made cheaply and directly from the south west to the north east without recourse to Kampala in the centre.

First in this exploration I show a map of the uneven financial inclusion around the world (https://globalfindex.worldbank.org/ accessed 31 October 2018). Map 1 shows that financial inclusion varies with level of development. In the high-income economies of North America, Europe, Australia and New Zealand, the majority of adults have a bank account. Moreover, a financial inclusion gradient is discernible, with economies around the equator, where many low- and middle-income economies are located, reporting lower percentages of account ownership. In particular, available data from African economies in the Global Findex and on the map show how financial inclusion is still a minority story on the continent.

 

Map 1 Financial inclusion around the world 2017, source: Global Findex 2017 report

But has mobile technology made any difference to financial inclusion? Increasingly so. A map of ownership of mobile money accounts (those who own an account and use a mobile phone to access it) shows how things have improved (Map 2). Over the last three years, some economies in East Africa such as Uganda and Kenya have accumulated owners of mobile money accounts; West African economies are treading the same path. It remains the case, though, that in the majority of African economies most adults (60% or more) do not have a mobile money account.

 

Map 2 Digital financial inclusion in Sub-Saharan Africa, source: Global Findex 2017 report

To explore further I build a non-linear multilevel model for each type of financial inclusion: one model explains owning a bank account, the other owning a mobile money account. The model is non-linear because ownership is a binary indicator, and multilevel because the 154,472 adults are nested within 144 countries. The models account for country income groups, average national incomes, population, age, gender, education, employment, and personal incomes (quintiles). The most interesting findings relate to the associations with age and gender. I show marginal predictions of age and gender for financial inclusion below.
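For readers unfamiliar with this kind of model, one common way to write a two-level model for a binary outcome is a random-intercept logit. The form below is an assumption made for illustration; the post does not give the exact specification, link function or random-effects structure used. Here $y_{ij}$ indicates account ownership for adult $i$ in country $j$, $\mathbf{x}_{ij}$ holds individual characteristics (age, gender, education, employment, income quintile), $\mathbf{z}_j$ holds country characteristics (income group, average national income, population), and $u_j$ is a country-level random intercept.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1)
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij}
  + \boldsymbol{\gamma}^{\top}\mathbf{z}_{j} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Under this sketch, the "odds" discussed for each age and gender group correspond to exponentiated coefficients in $\boldsymbol{\beta}$, compared against a reference group.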

 

Figure 2. Marginal predictions of financial inclusion (own a bank account), calculated from the Global Findex 2017 microdata

Figure 2 shows the age gradient of financial inclusion that is consistent with the life cycle effects of incomes and wealth. With age comes accumulation of wealth from earnings that needs to be stored for investment and consumption. So for both genders higher age groups have higher odds of owning a bank account (compared to the youngest age group) in a step-wise manner. The youngest (hollow point ○) and the oldest (solid point ●) form bookends to the predictions; both for men (left) and for women (right). There is also a clear gender inequality, although by age 25 women (diamond ◊, right) already have higher odds than the youngest male group. Thus financial inclusion reflects the life cycle effects of earning and saving.

 

Figure 3. Marginal predictions of digital financial inclusion (own a mobile money account), calculated from the Global Findex 2017

 

 

But the marginal predictions for digital financial inclusion do not conform at all to the life cycle effect (figure 3). Digital financial inclusion does not move lock-step with age. In contrast with traditional financial inclusion, the two oldest age groups have lower odds of owning a mobile money account; instead the highest predicted marginals are attained by the mid-30s. The solid point (● oldest group) for instance is furthest below the hollow point (○ youngest group). Here the two oldest–youngest groups do not form bookends. The gender digital divide is also sharper. For similar levels of other characteristics, no female groups have higher odds of owning a mobile money account than the youngest male group.

The strong age reversion effect (inclusion does not move in lock-step with age but reverts after age 40) suggests a generation effect. This is also consistent with the fact that many of the low income economies are still young while many of the high income economies are already ageing.

The puzzle that digital financial inclusion parts ways with financial inclusion may be driven by the generation effect. But there is no reason to expect that the life cycle effect should disappear soon. Thus the need for financial accounts around the world is likely to grow as adults age, leading to some reconciliation in paths of financial inclusion.

Using Big Data to Learn from Positive Outliers

29 October 2018

Why do a few individuals, communities or organisations achieve significantly better results than their peers?  The positive deviance approach tries to answer this question.

The story began in 1990, when the Vietnamese government invited Save the Children (SCF) to help overcome the problem of child malnutrition.  Jerry Sternin, the SCF Programme Director, was asked to demonstrate impact within six months and decided to try the idea of positive deviance.  Building on past work[1] he undertook a village survey of child height and weight, looking for positive deviants: children from poor families, living among high malnutrition rates, who were nonetheless well-nourished.

In the pilot survey, he found six such families and began to study them intensively (see Figure 1).  By observing the food preparation, cooking and serving behaviours of these families, he found three consistent yet rare behaviours. Mothers of positive deviants:

  1. washed their children’s hands every time they came in contact with anything unclean;
  2. added to their children’s diet tiny shrimps from the rice paddies, and the greens from sweet potato tops; and
  3. fed their children less per meal but more often: four to five times per day compared to two times in non-positive deviant families.

Sternin and his team then scaled out those simple, affordable, community-inspired practices and, within two years, this had reduced malnutrition by 80% in 250 communities, rehabilitating an estimated 50,000 malnourished children[2].

Figure 1: Jerry Sternin speaking to mothers in a village in Vietnam

The simple power of the positive deviance (PD) approach has led to its successful application in more than 60 countries across the globe[3].  Yet PD still faces a number of challenges to its diffusion and implementation.  As a result, we decided to investigate whether big data might help address those challenges, via a systematic review, published in the Electronic Journal of Information Systems in Developing Countries.

A priori, big data provides opportunities in relation to two main PD challenges.

1. Time, Cost and Sample Size. Relying on in-depth primary data collection, the PD approach is time- and labour-intensive with costs proportional to sample size[4]. As a result, PD sample sizes are traditionally small.  Statistically and practically, this can make it hard to identify positive deviants, given their relative rarity (see Figure 2)[5].  By contrast, the cost of gathering big data tends to be very low since it often makes use of already existing “data exhaust” from digital processes.  With big data thus covering large – often very large – sample sizes, greater numbers of PDs can be identified, and generalisation to even-larger populations is easier.

Figure 2: Positive deviants in a normal distribution
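To make the idea of positive deviants as the upper tail of a distribution concrete, here is a minimal sketch that flags units whose outcome lies well above the mean. The two-standard-deviation cut-off, the `positive_deviants` helper and the outcome scores are assumptions for illustration; they are not the project's actual identification method or data.

```python
from statistics import mean, pstdev


def positive_deviants(scores: dict[str, float], threshold: float = 2.0) -> list[str]:
    """Return units scoring more than `threshold` standard deviations above the mean."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the observed units
    if sigma == 0:
        return []  # no variation, so nobody stands out
    return [name for name, x in scores.items() if (x - mu) / sigma > threshold]


# Hypothetical outcome scores for ten units; NOT data from the study.
outcomes = {
    "u01": 48, "u02": 49, "u03": 50, "u04": 50, "u05": 51,
    "u06": 52, "u07": 50, "u08": 49, "u09": 51, "u10": 90,
}

deviants = positive_deviants(outcomes)
```

In practice a positive deviance study would also control for context, for example comparing only households with similar resources, so a raw cut-off like this is a starting point rather than a full method.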

2. Domain and Geographic Scope. To date, most applications of PD have been highly concentrated. In a recent systematic literature review[6], 89% of applications in developing countries were in public health, 83% were in rural communities, and just four countries had hosted roughly half of all PD implementations.  A simultaneous review of big data in developing countries, on the other hand, showed datasets and demonstrated value across a much wider set of domains and locations.  As a result, big data could help positive deviance to break from its current path dependency.

To assess these and other benefits that big data may bring to the PD approach – relating to behaviour identification, methodological risk, and scalability – a “big data-based positive deviance” research project has been designed and is underway.  The project is currently identifying positive deviants from large-scale datasets in the education and agriculture domains, with results planned to emerge in 2019.

For further details on the challenges of positive deviance and the opportunities offered by big data, please refer to the review article.

REFERENCES

[1] Wishik, S. M. & Van Der Vynckt, S. (1976) The use of nutritional “positive deviants” to identify approaches for modification of dietary practices, American Journal of Public Health, 66(1), 38–42. Zeitlin, M. F. et al. (1990) Positive Deviance in Child Nutrition: With Emphasis on Psychosocial and Behavioural Aspects and Implications for Development. Tokyo: United Nations University.
[2] Sternin, J. (2002) Positive deviance: a new paradigm for addressing today’s problems today, The Journal of Corporate Citizenship, 57–63.
[3] Felt, L. J. (2011) Present Promise, Future Potential: Positive Deviance and Complementary Theory.  Lapping, K. et al. (2002) The positive deviance approach: challenges and opportunities for the future, Food and Nutrition Bulletin, 23(4 Suppl), 130–7.  Marsh, D. R., Schroeder, D. G., Dearden, K. A., Sternin, J. & Sternin, M. (2004) The power of positive deviance, BMJ, 329(7475), 1177–1179.
[4] Marsh et al. (ibid.).
[5] Springer, A., Nielsen, C. & Johansen, I. (2016) Positive Deviance by the Numbers, Positive Deviance Initiative. Available at: https://positivedeviance.org/background/.
[6] Albanna, B. & Heeks, R. (2018) Positive deviance, big data and development: a systematic literature review, Electronic Journal of Information Systems in Developing Countries.

Big Data and Healthcare in the Global South

The global healthcare landscape is changing. Healthcare services are becoming ever more digitised with the adoption of new technologies and electronic health records. This development typically generates enormous amounts of data which, if utilised effectively, have the potential to improve healthcare services and reduce costs.

The potential of big data in healthcare

Decision making in medicine relies heavily on data from different sources, such as research and clinical data, rather than only on individuals’ training and professional knowledge. Historically, healthcare organisations have often based their understanding of information on an incomplete grasp of reality on the ground, which could lead to poor health outcomes. This issue has recently become more manageable with the advent of big data technologies.

Big data comprises unstructured and structured data from clinical, financial and operational systems, and data from public health records and social media that goes beyond the health organisations’ walls. Big data, therefore, can support more insightful analysis and enable evidence-based medicine by making data transparent and usable at a much broader variety, much larger volumes and higher velocities than were ever available to healthcare organisations [1].

Using big data, healthcare providers can, for example, manage population health by identifying patients at high-risk during disease outbreaks and then take preventive actions. In one case, Google used data from user search histories to track the spread of influenza around the world in near real time (see figure below).

Google Flu Trends correlated with influenza outbreak[2]

Big data can also be used to identify procedures and treatments that are costly or deliver insignificant benefits. For example, one healthcare centre in the USA has been using clinical data to bring to light costly procedures and other treatments. This helped it to identify and reduce unnecessary procedures and duplicate tests. In essence, big data not only helped to improve standards of patient care but also helped to reduce the costs of healthcare [3].

Medical big data in the global south

The potential healthcare benefits of big data are exciting, and the most significant potential rewards may lie in developing countries. While healthcare globally faces challenges in improving health outcomes and reducing costs, these issues can be especially severe in developing countries.

Lack of sufficient resources, poor use of existing funds, poverty, and lack of managerial and related capabilities are the main differences between developing and developed countries. This means health inequality is more pronounced in the global south. Equally, mortality and birth rates are relatively high in developing countries as compared to developed countries, which have better-resourced facilities [4].

Given improvements in the quality and quantity of clinical data, the quality of care can be improved. In the global south in particular, where health is more a question of access to primary healthcare than a question of individual lifestyle, big data can play a prominent role in improving the use of scarce resources.

How is medical big data utilised in the global south?

To investigate this key question, I analysed the introduction of Electronic Health Records (EHR), known as SEPAS, in Iranian hospitals. SEPAS is a large-scale project which aims to build a nationally integrated system of EHR for Iranian citizens. Over the last decade, Iran has progressed from having no EHR to 82% EHR coverage for its citizens [5].

EHR is one of the most widespread applications of medical big data in healthcare. In effect, SEPAS is built with the aim to harness data and extract value from it and to make real-time and patient-centred information available to authorised users.

However, the analysis of SEPAS revealed that medical big data is not utilised to its full potential in the Iranian healthcare industry. If the big data system is to be successful, the harnessed data should inform decision-making processes and drive actionable results.

Currently, data is gathered effectively in Iranian public hospitals, meaning that the raw and unstructured data is mined and classified to create a clean set of data ready for analysis. This data is also transformed into summarised and digestible information and reports, confirming that real potential value can be extracted from the data.

In spite of this, the benefit of big data is not yet realised in guiding clinical decisions and actions in Iranian healthcare. SEPAS is only being used in hospitals by IT staff and health information managers who work with data and see the reports from the system. However, the reports and insights are not often sent to clinicians and little effort is made by management to extract lessons from some potentially important streams of big data.

Limited utilisation of medical big data in developing countries has also been reported in other studies. For example, a recent study in Saudi Arabia [6] reported a low number of e-health initiatives. This suggests the utilisation of big data faces more challenges in these countries.

Although this study cannot claim to have given a complete picture of the utilisation of medical big data in the global south, some light has been shed on the topic. While there is no doubt that medical big data could have a significant impact on the improvement of healthcare in the global south, there is still much work to be done. Healthcare policymakers in developing countries, and in Iran in particular, need to reinforce the importance of medical big data in hospitals and ensure that it is embedded in practice. To do this, the barriers to effective datafication should be first investigated in this context.

References

[1] Kuo, M.H., Sahama, T., Kushniruk, A.W., Borycki, E.M. and Grunwell, D.K. (2014). Health big data analytics: current perspectives, challenges and potential solutions. International Journal of Big Data Intelligence, 1(1-2), 114-126.

[2] Dugas, A.F., Hsieh, Y.H., Levin, S.R., Pines, J.M., Mareiniss, D.P., Mohareb, A., Gaydos, C.A., Perl, T.M. and Rothman, R.E. (2012). Google Flu Trends: correlation with emergency department influenza rates and crowding metrics. Clinical infectious diseases, 54(4), 463-469.

[3] Allouche G. (2013). Can Big Data Save Health Care? Available at: https://www.techopedia.com/2/29792/trends/big-data/can-big-data-save-health-care (Accessed: August 2018).

[4] Shah A. (2011). Healthcare around the World. Global Issues. Available at: http://www.globalissues.org/article/774/health-care-around-the-world (Accessed: August 2018).

[5] Financial Tribune (2017). E-Health File for 66m Iranians. Available at: https://financialtribune.com/articles/people/64502/e-health-files-for-66m-iranians (Accessed: August 2018).

[6] Alsulame K, Khalifa M, Househ M. (2016). E-Health Status in Saudi Arabia: A Review of Current Literature. Health Policy and Technology, 5(2), 204-210.

Measuring the Big Data Knowledge Divide Using Wikipedia

Big data is of increasing importance; yet – like all digital technologies – it is affected by a digital divide of multiple dimensions. We set out to understand one dimension: the big data ‘knowledge divide’; meaning the way in which different groups have different levels of knowledge about big data [1,2].

To do this, we analysed Wikipedia – as a global repository of knowledge – and asked: how does people’s knowledge of big data differ by language?

An exploratory analysis of Wikipedia to understand the knowledge divide looked at differences across ten languages in production and consumption of the specific Wikipedia article entitled ‘Big Data’ in each of the languages. The figure below shows initial results:

  • The Knowledge-Awareness Indicator (KAI) measures the total number of views of the ‘Big Data’ article divided by total number of views of all articles for each language (multiplied by 100,000 to produce an easier-to-grasp number). This relates specifically to the time period 1 February – 30 April 2018.
  • ‘Total Articles’ measures the overall number of articles on all topics that were available for each language at the end of April 2018, to give a sense of the volume of language-specific material available on Wikipedia.
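The KAI definition above is a simple ratio, and a short sketch makes the arithmetic explicit. The function name, the language codes and the view counts below are hypothetical and chosen only to illustrate the calculation; they are not the figures used in the study.

```python
def knowledge_awareness_indicator(
    article_views: int, total_views: int, scale: int = 100_000
) -> float:
    """Views of one article as a share of all views, scaled for readability."""
    if total_views <= 0:
        raise ValueError("total_views must be positive")
    return article_views / total_views * scale


# Hypothetical view counts for two unnamed languages; NOT the study's data.
example_views = {
    "xx": {"big_data": 1_500, "all_articles": 300_000_000},
    "yy": {"big_data": 4_000, "all_articles": 200_000_000},
}

kai_by_language = {
    lang: knowledge_awareness_indicator(v["big_data"], v["all_articles"])
    for lang, v in example_views.items()
}
```

Dividing by total views is what makes the indicator comparable across languages with very different Wikipedia readerships: in this made-up example "yy" has fewer total views than "xx" but a KAI four times higher.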

‘Big Data’ article knowledge-awareness, top-ten languages*

ko=Korean; zh=Chinese; fr=French; pt=Portuguese; es=Spanish; de=German; it=Italian; ru=Russian; en=English; ja=Japanese.
Note: Data analysed for 46 languages, 1 February to 30 April 2018.
* Figure shows the top-ten languages with the most views of the ‘Big Data’ article in this period.
Source: Author using data from the Wikimedia Toolforge team [3]

 

Production. Considering that Wikipedia is built as a collaborative project, the production of content and its evolution can be used as a proxy for knowledge. A divide relating to the creation of content for the ‘Big Data’ article can be measured using two indicators. First, article size in bytes: longer articles would tend to represent the curation of more knowledge. Second, number of edits: seen as representing the pace at which knowledge is changing. Larger article size and higher number of edits may allow readers to have greater and more current knowledge about big data. On this basis, we see English far ahead of other languages: articles are significantly longer and significantly more edited.

Consumption. The KAI provides a measure of the level of relative interest in accessing the ‘Big Data’ article which will also relate to level of awareness of big data. Where English was the production outlier, Korean and to a lesser extent Chinese are the consumption outliers: there appears to be significantly more relative accessing of the article on ‘Big Data’ in those languages than in others. This suggests a greater interest in and awareness of big data among readers using those languages. Assuming that accessed articles are read and understood, the KAI might also be a proxy for the readers’ level of knowledge about big data.

We can draw two types of conclusion from this work.

First, and addressing the specific research question, we see important differences between language groups; reflecting an important knowledge divide around big data. On the production side, much more is being written and updated in English about big data than in other languages; potentially hampering non-English speakers from engaging with big data; at least in relative terms. This suggests value in encouraging not just more non-English Wikipedia writing on big data, but also non-English research (and/or translation of English research) given research feeds Wikipedia writing. This value may be especially notable in relation to East Asian languages given that, on the consumption side, we found much greater relative interest and awareness of big data among Wikipedia readers.

Second, and methodologically, we can see the value of using Wikipedia to analyse knowledge divide questions. It provides a reliable source of openly-accessible, large-scale data that can be used to generate indicators that are replicable and stable over time.

This research project will continue exploring the use of Wikipedia at the country level to measure and understand the digital divide in the production and consumption of knowledge, focusing specifically on materials in Spanish.

References

[1] Andrejevic, M. (2014). ‘Big Data, Big Questions | The Big Data Divide.’ International Journal of Communication, 8.

[2] Michael, M., & Lupton, D. (2015). ‘Toward a Manifesto for the “Public Understanding of Big Data”.’ Public Understanding of Science, 25(1), 104–116. doi: 10.1177/0963662515609005

[3] Wikimedia Toolforge (2018). Available at: https://tools.wmflabs.org/

Big Data and Urban Transportation in India

12 February 2018

What effect are big data systems having on urban transportation?

To investigate this, the Centre for Internet and Society was commissioned by the Universities of Manchester and Sheffield, to conduct a study of the big data system recently implemented by the Bengaluru Metropolitan Transport Corporation (BMTC).  The “Intelligent Transport System” (ITS) took three years to reach initial operational status in 2016, and now covers the more than five million daily passenger journeys undertaken on BMTC’s 6,400 buses.

ITS (see figure below) processes many gigabytes of data per day via three main components: vehicle tracking units that continuously transmit bus locations using the mobile cell network; online electronic ticketing machines that capture details of all ticketing transactions; and a passenger information system with linked mobile app to provide details such as bus locations, routes and arrival times.

ITS Architecture (Mishra 2016)[1]

At the operational level the system is functioning moderately well: the data capture and transmission components mainly work though with some malfunctions; and the passenger-facing components are present but have data and functionality challenges that still need to be fully worked-through.  Higher-level use of big data for tactical and strategic decision-making – optimising routes, reducing staff numbers, increasing operational efficiency – is intended, but not yet evidenced.

Just over a year since full roll-out, this is not unexpected but it is a reminder that big data systems take many years to implement: in this case, at least four years to get the operational functions working, and years more to integrate big data into managerial decision-making.

Nonetheless some broader impacts can already be seen.  Big data has changed the mental model – the “imaginary” – that managers and politicians have of bus transport in Bengaluru.  Where daily operations of the bus fleet and bus crews were largely opaque to management prior to ITS, now they are increasingly visible.  Big data is thus changing the landscape of what is seen to be possible within the organisation, and has already resulted in plans for driver-only buses, and a restructuring that is removing middle management from the organisation: a layer no longer required when big data puts central management in direct contact with the operational front line.

Big data is also leading to shifts in power.  Some of these are tentative: a greater transparency of operations to the general public and civil society that may receive a step change once ITS data is openly shared.  Others are more concrete: big data is shifting power upwards in the organisation – away from front-line labour, and away from middle managers towards those in central management who have the capabilities to control and use the new data streams.

For further details of this study, see Development Informatics working paper no.72: “Big Data and Urban Transportation in India: A Bengaluru Bus Corporation Case Study”.

[1] Mishra, B. (2016) Intelligent Transport System (ITS), presentation at workshop on Smart Mobility for Bengaluru, Bengaluru, 10 Jun https://www.slideshare.net/EMBARQNetwork/bmtc-intelligent-transport-system
