ICTs and Precision Development: Towards Personalised Development

5 November 2019

Are ICTs about to deliver a new type of socio-economic development: personalised development?

ICTs can only have a significant development impact if they work at scale; touching the lives of thousands or better still millions of people.  Traditionally, this meant a uniform approach where everyone gets to use the same application in the same way.

Increasingly, though, ICTs are enabling “precision development”: interventions that are ever more precise in terms of who or what is targeted, what is known about the target, and the specificity of the associated development intervention.  The ultimate end-point would be “personalised development”: interventions customised to each individual.

Elements of digitally-enabled individualisation have already emerged: farmers navigating through web- or IVR-based systems to find the specific information they need; micro-entrepreneurs selecting the m-money savings and loan scheme and level that suits them.  But there are still rigidities and constraints within these systems.

Though we are far from its realisation, the potential for truly personalised development is now emerging.  For example:

  • Personalised Learning: “a methodology, according to which teaching and learning are focused on the needs and abilities of individual learners”[1]. ICTs are integral to personalised learning and technology-enabled personalisation has had a demonstrable positive impact on educational performance[2].
  • Precision Agriculture: though around as a concept for at least two decades, precision agriculture is only now starting to find implementations – often still at pilot stage – in the global South[3]. Combining data from on-ground sensors and remote sensing, precision agriculture provides targeted guidance in relation to “seeds, fertilizers, water, pesticides, and energy”.  The ultimate intention is that guidance will be customised to the very specific soil, micro-climate, etc. parameters of individual farms; even smallholder farms.
  • Personalised Healthcare: diagnosis and treatment may appear personalised but typically involve identifying which illness group a person belongs to, and then prescribing the generic treatment for that group. This is becoming more accurate with improvements in electronic health records that provide a more person-specific history and context[4].  Precision medicine prescribes even more narrowly for the individual; typically based on genetic analysis that requires strong digital capabilities.  Though at early stages, this is already being implemented in developing countries[5].

ICTs are thus leading us on a precision development track that will lead to personalised development.  The promise of this can be seen in the examples above: individualised information on learning level, farm status, or health status that then enables a much more effective development intervention.

It will be interesting to log other examples of “ICT4PD” as they emerge . . .

[1] Izmestiev, D. (2012). Personalized Learning: A New ICT-Enabled Education Approach, UNESCO Institute for Information Technologies in Education, Moscow.

[2] Kumar, A., & Mehra, A. (2018). Remedying Education with Personalized Learning: Evidence from a Randomized Field Experiment in India, ResearchGate.

[3] Say, S. M., Keskin, M., Sehri, M., & Sekerli, Y. E. (2018). Adoption of precision agriculture technologies in developed and developing countries. The Online Journal of Science and Technology, 8(1), 7-15.

[4] Haskew, J., Rø, G., Saito, K., Turner, K., Odhiambo, G., Wamae, A., … & Sugishita, T. (2015). Implementation of a cloud-based electronic medical record for maternal and child health in rural Kenya. International Journal of Medical Informatics, 84(5), 349-354.

[5] Mitropoulos, K., Cooper, D. N., Mitropoulou, C., Agathos, S., Reichardt, J. K., Al-Maskari, F., … & Lopez-Correa, C. (2017). Genomic medicine without borders: Which strategies should developing countries employ to invest in precision medicine? Omics: A Journal of Integrative Biology, 21(11), 647-657.

An Applied Data Justice Framework for Datafication and Development

Data is playing an ever-growing role in international development.  But what lens can we use to analyse the impact of data on development?

The emerging field of “data justice” offers some valuable ideas but they have not yet been put together into a systematic and comprehensive framework.  My open-access paper – Datafication, Development and Marginalised Urban Communities: An Applied Data Justice Framework, written with Satyarupa Shekhar – provides such a framework, as shown below.

The framework exposes five dimensions of data justice:

  • Procedural: fairness in the way in which data is handled.
  • Instrumental: fairness in the results of data being used.
  • Rights-based: adherence to basic data rights such as representation, privacy, access and ownership.
  • Structural: the degree to which the interests and power in wider society support fair outcomes in other forms of data justice.
  • Distributive: an overarching dimension relating to the (in)equality of data-related outcomes that can be applied to each of the other dimensions of data justice.

The dimensions can be used individually; for example, just to analyse data practices, or just to analyse the impact of context on new data systems in developing countries.  Or the model can be used holistically; for example, to understand the full development impact of a particular data initiative.

The Datafication, Development and Marginalised Urban Communities: An Applied Data Justice Framework paper takes the latter route.  It analyses “pro-equity data initiatives” that were implemented by data activists in four cities: Chennai, Nairobi, Pune and Surakarta.  These initiatives specifically sought to address the data injustices suffered by slum dwellers and other marginalised groups; particularly their invisibility to urban planners and other external agencies.

Using the data justice lens, this research finds that new data flows do have a positive impact in counteracting the injustice of invisibility, but they disproportionately serve those with the motivation and power to use that data.  Results in terms of service improvements and epistemic change are beneficial for slum communities and other marginalised citizens, and these initiatives can be justified on that basis.

However, though there can be no exact calibration from qualitative research, it is likely that these pro-equity initiatives actually increase relative inequalities.  Ordinary community members have seen some benefits, but external actors who find that the data matches their agenda and capabilities benefit more.  It is the latter who are more empowered to access, use and control the new data.

If you would like to know more about this research’s findings, framework and recommendations for practice, then take a look at the paper: https://www.tandfonline.com/doi/full/10.1080/1369118X.2019.1599039

Analysing the Perceptions of Digital Gig Workers: Mining Emotions from Job Reviews

In a previous post, we provided a discussion of how the analysis of user-generated content (e.g. comments/posts on social media and/or job review sites) can help in understanding perceptions of digital gig workers. The prevailing assumption is that generally, digital gig workers contend with non-standard working conditions, e.g. the lack of social security coverage, long working hours, lower salaries, and the lack of benefits. Nevertheless, it is believed that digital gig workers in the Global South in particular perceive their jobs as being better than local benchmarks (i.e. office-based work).

To test the above assumptions, we developed and employed automatic text analysis methods to answer the following research questions:

  • How do digital gig workers feel about their jobs?
  • Which topics pertaining to decent work standards do they frequently talk about?
  • Are there any differences—in terms of sentiments and topics—across different geographic locations, or across genders?

Here we present initial results of the analysis, which begin to answer the questions above.

Firstly, we collected online posts published by digital gig workers from Glassdoor, a web-based platform for sharing reviews of companies and their management. Focussing on reviews of the digital gig companies Upwork, Fiverr and Freelancer, we retrieved a total of 567 reviews, 297 of which include geographic metadata (i.e. the geographic location associated with the account/profile of the user posting a review). For our text analysis, we made use of the Pro and Con fields that each review came with.

A dictionary-based emotion detection method built on the NRC Emotion Lexicon (implemented in the R statistical package) was applied to the reviews, classifying them according to Robert Plutchik’s eight basic emotions: Joy, Trust, Fear, Surprise, Sadness, Anticipation, Anger, and Disgust. We then grouped the reviews as coming from either the Global North or the Global South based on the geographic metadata attached to them. Shown in the figure below are the 15 most frequent emotion-bearing words found within reviews, represented according to the emotions they express. Bars in amber correspond to words prevalent in reviews from the Global North (GN) while those in blue correspond to words prevalent in reviews from the Global South (GS).

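[Figure: the 15 most frequent emotion-bearing words for each emotion, in Global North (amber) and Global South (blue) reviews]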
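As a rough illustration of how a dictionary-based emotion count works, here is a minimal Python sketch. It is not the R implementation used in the study: the tiny `TOY_LEXICON`, the sample reviews and the function names are all hypothetical stand-ins, and the real NRC Emotion Lexicon is far larger.

```python
from collections import Counter
import re

# Hypothetical toy lexicon mapping words to Plutchik emotions.
# Illustrative only; these entries are NOT taken from the NRC Emotion Lexicon.
TOY_LEXICON = {
    "flexible": {"joy", "trust"},
    "reliable": {"trust"},
    "scam": {"anger", "disgust"},
    "unpaid": {"anger", "sadness"},
    "risk": {"fear"},
    "opportunity": {"anticipation", "joy"},
}

def emotion_counts(text, lexicon=TOY_LEXICON):
    """Count how many words in `text` are associated with each emotion."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", text.lower()):
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    return counts

def counts_by_group(reviews, lexicon=TOY_LEXICON):
    """Aggregate emotion counts per group label (e.g. 'GN' or 'GS')."""
    totals = {}
    for group, text in reviews:
        totals.setdefault(group, Counter()).update(emotion_counts(text, lexicon))
    return totals

# Made-up reviews, each tagged with a made-up region label.
reviews = [
    ("GS", "Flexible hours and a reliable opportunity to earn."),
    ("GN", "Unpaid work and a real risk of scam clients."),
]
result = counts_by_group(reviews)
```

Comparing `result["GS"]` with `result["GN"]` gives the kind of per-region emotion tallies discussed here, though a real analysis would also need to handle negation, word forms and differences in review length.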

It can be observed that there are more words within GS reviews containing emotions that are clearly positive. All of the 15 words associated with Trust were found more often in GS reviews. Furthermore, 10 and 8 words associated with Joy and Anticipation, respectively, were more frequent in GS reviews. These results support the belief that digital gig workers in the Global South (GS) do express positive feelings towards their jobs.

Meanwhile, our results show that digital gig workers from both GN and GS express negative emotions. On the one hand, GS reviews were the source of 11 and 10 words associated with Anger and Fear, respectively. On the other hand, 15 and 11 words associated with Sadness and Disgust, respectively, were contained in GN reviews. This suggests that generally speaking, digital gig workers do have to contend with less than ideal working conditions, which in turn trigger such negative emotions.

Finally, 10 words associated with Surprise came from GN, 5 from GS. It is worth noting though that this particular emotion can either be negative or positive depending on context.

These results are but “teasers” to the full results of our automated analysis. Further details, including the topics/themes towards which such emotions are targeted, as well as answers to the second and third research questions stated above, will be presented by Dr Victoria Ikoro at the upcoming 3rd Annual ICT4D North Workshop, to be held at the Management School of the University of Liverpool on 6 June 2019.

 

Measuring the Broadband Speed Divide using Crowdsourced Data

Digital applications and services increasingly require high-speed Internet connectivity. Yet a strong “broadband divide” exists between nations [1,2]. We try to understand how big data can be used to measure this divide. In particular, what new measurement opportunities can crowdsourced data offer?

The broadband divide has been widely measured using subscription rates. However, the broadband speed divide measured using observed speeds has been less explored due to the lack of data in the hands of regulators and statistical offices. This article focuses on measuring the fixed-network broadband speed divide between developed and developing countries, exploring the benefits and limitations of using new crowdsourced data.

To this end we used measurements from the Speedtest Global Index, generated by Ookla using data volunteered by Internet users verifying the speed of their Internet connections [3]. These crowdsourced tests allow the firm to produce monthly estimates of average upload and download speeds at the country level.

The dataset used for this analysis comprised monthly data, from January to December 2018, for a total of 120 countries. Using the income and regional categorisations set by the World Bank we identified 64 developing countries and 54 developed countries in seven regions. Complete data for only two of the least developed countries were available so these were not included in the analysis.

The following table presents the download and upload speed averages on the fixed network, aggregated by region and level of development, and the totals for all the countries in our final sample (n=118), while the figure below shows the download and upload speeds aggregated by level of development.

Table 1. Average upload and download speed by region and development level, fixed network. January – December 2018 (Mbps)

Note: Unweighted averages
Source: Author calculations using data from Ookla’s Speedtest Global Index [3]

Figure 1. Average upload and download speed by level of development, fixed network. January – December 2018 (Mbps)
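To make the "unweighted averages" note concrete, the following Python sketch shows one way such group averages could be computed. The records and values are invented for illustration, and the two-step averaging (per country over months, then across countries) is an assumption about the procedure, not a description of the author's actual calculation.

```python
from statistics import mean

# Hypothetical records (made-up values, not Ookla data):
# (country, development level, monthly download speeds in Mbps)
records = [
    ("A", "developed", [90.0, 110.0]),
    ("B", "developed", [50.0, 50.0]),
    ("C", "developing", [20.0, 30.0]),
    ("D", "developing", [25.0, 25.0]),
]

def unweighted_group_average(records):
    """Average each country's monthly values, then average countries per group.

    Every country counts equally, regardless of its population or of how
    many speed tests were run there.
    """
    by_group = {}
    for _country, level, monthly in records:
        by_group.setdefault(level, []).append(mean(monthly))
    return {level: mean(values) for level, values in by_group.items()}

averages = unweighted_group_average(records)
# Ratio of developing-country to developed-country average speed.
ratio = averages["developing"] / averages["developed"]
```

Because the averages are unweighted, a small country influences the group figure as much as a large one; a population- or test-weighted average could give a different picture of the divide.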

-Download speeds. We observe that the divide between developed and developing countries is pronounced, with average download speeds for the latter being around one-third of the former. However, the divide is also evident within regions: in the developed world, countries in North America have speeds three times higher than those in the Middle East. Within the developing countries those in Europe & Central Asia have the highest download speeds and those in the Middle East & North Africa have the lowest. Overall, download speeds are much lower in the developing world, thus creating an important impediment to the use of data-intensive digital applications and services.

-Upload speeds. Overall, the divide between developed and developing countries is similar in magnitude to the one observed for download speeds. However, within the group of developed countries the regional rankings differ from those for download speeds: the East Asia & Pacific region ranks first and North America ranks third – the latter with upload speeds that are two-thirds of its download speeds. Across regions, upload speeds are always slower in the developing world, and again the Middle East & North Africa region ranks at the bottom; but the gap between download and upload speeds is smaller in the developing world. Considering that faster upload speeds are also required in a data-intensive era, the majority of countries are far from the ideal of having faster networks with symmetrical speeds.

Some benefits and limitations are identified when measuring the broadband speed divide using this type of crowdsourced data.

-Benefits. First, the availability of these types of data allows us to measure the broadband speed divide between developed and developing countries using observed instead of theoretical speeds. Second, these measurements are openly available on a website that can be accessed by the general public at no cost. Third, the divide can be measured and tracked over time more frequently than when using survey or administrative data. Finally, this site reports both download and upload speeds which are important to measure in a data-intensive era.

-Limitations. Although data are available for a good number of countries, there are no complete data for the least developed countries, leaving this group behind. Also, there might be some bias in the production of data, as crowdsourced measurements might be coming from ICT-literate individuals in certain countries [4]. Finally, from this source it is not possible to access complete datasets with additional data points such as the number of observations, medians, and latencies for each country.

These findings derive from a broader research project that, overall, is researching use of big data for measurement of the digital divide.  Readers are welcome to contact the author for details of that broader project: luis.riveraillingworth@manchester.ac.uk

References

[1] ITU (2018). Measuring the Information Society Report 2018. Geneva, Switzerland: International Telecommunication Union.

[2] Broadband Commission (2018). The State of the Broadband: Broadband catalyzing sustainable development. Geneva, Switzerland: Broadband Commission for Sustainable Development.

[3] Ookla. (2018). Speed Test Global Index [Online]. Available: http://www.speedtest.net/global-index/about [Accessed 01/03/2019]

[4] Bauer, S., Clark, D. D. & Lehr, W. (2010). Understanding broadband speed measurements. In: TPRC 2010. Available at SSRN: https://ssrn.com/abstract=1988332

Data, Platforms and Power

19 February 2019

We know that digital platforms can be very powerful, but how does their use of data relate to power?

In three ways[1] that derive from the datafication and digitisation affordances of platforms:

  1. Addressing Information Failure. Platforms succeed in part by finding ways to overcome information failures in existing markets. These failures may be sources of power for incumbents. For example, estate agents (realtors) hold power in real estate markets due to information asymmetries; such as knowledge of house sale prices.  Real estate platforms put such data into the public domain, thus undermining the power of incumbents.  Information failures may also be a source of weakness in existing markets.  For example, riders with traditional taxi firms don’t know exactly when their cab will arrive.  Platforms provide such data and so, again, undermine incumbents.

  2. Mashing Up. As they deal with digitised data, platforms can gain power by integrating different data streams onto the platform. Real estate platforms integrate online information about neighbourhoods.  Ride-hailing platforms integrate online maps to show cab location and routes to riders and drivers.

  3. Controlling New Data. By digitising transactions and associated processes, platforms create, capture and control new data. This bolsters their power; typically by creating new information asymmetries: the platforms know things that others don’t.  Real estate platforms can monitor search behaviours of buyers to understand more about which features of house listings they value most.  Ride-hailing platforms understand spatio-temporal patterns of supply and demand alongside many other behavioural characteristics of riders and drivers.

 

This simple framework can usefully be applied in order to analyse the role of data in platforms, and its contribution to power.

 

[1] Categorisation and examples developed from Drouillard, M. (2017) Addressing voids: how digital start-ups in Kenya create market infrastructure. In: Digital Kenya, B. Ndemo and T. Weiss (eds). London: Palgrave Macmillan, 97–131

Using Big Data to Learn from Positive Outliers

29 October 2018

Why do a few individuals, communities or organisations achieve significantly better results than their peers?  The positive deviance approach tries to answer this question.

The story began in 1990, when the Vietnamese government invited Save the Children (SCF) to help overcome the problem of child malnutrition.  Jerry Sternin, the SCF Programme Director, was asked to demonstrate impact within six months and decided to try the idea of positive deviance.  Building on past work[1], he undertook a village survey of child height and weight, looking for positive deviants: children from poor families, living among high malnutrition rates, who were nonetheless well-nourished.

In the pilot survey, he found six such families and began to study them intensively (see Figure 1).  By observing the food preparation, cooking and serving behaviours of these families, he found three consistent yet rare behaviours. Mothers of positive deviants:

  1. washed their children’s hands every time they came in contact with anything unclean;
  2. added to their children’s diet tiny shrimps from the rice paddies, and the greens from sweet potato tops; and
  3. fed their children less per meal but more often: four to five times per day compared to two times in non-positive deviant families.

Sternin and his team then scaled out those simple, affordable, community-inspired practices and, within two years, this had reduced malnutrition by 80% in 250 communities, rehabilitating an estimated 50,000 malnourished children[2].

Figure 1: Jerry Sternin speaking to mothers in a village in Vietnam

The simple power of the positive deviance (PD) approach has led to its successful application in more than 60 countries across the globe[3].  Yet PD still faces a number of challenges to its diffusion and implementation.  As a result, we decided to investigate whether big data might help address those challenges, via a systematic review, published in the Electronic Journal of Information Systems in Developing Countries.

A priori, big data provides opportunities in relation to two main PD challenges.

1. Time, Cost and Sample Size. Relying on in-depth primary data collection, the PD approach is time- and labour-intensive, with costs proportional to sample size[4]. As a result, PD sample sizes are traditionally small.  Statistically and practically, this can make it hard to identify positive deviants, given their relative rarity (see Figure 2)[5].  By contrast, the cost of gathering big data tends to be very low since it often makes use of already existing “data exhaust” from digital processes.  With big data thus covering large – often very large – sample sizes, greater numbers of PDs can be identified, and generalisation to even-larger populations is easier.

Figure 2: Positive deviants in a normal distribution
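One simple way to think about locating positive deviants in the upper tail of a distribution is a z-score cut-off, sketched below in Python. This is an illustrative assumption only: the threshold of two standard deviations, the function name and the sample scores are hypothetical, and actual PD studies also control for context (for example, comparing only units facing similar constraints).

```python
from statistics import mean, pstdev

def positive_deviants(scores, threshold=2.0):
    """Return (name, z-score) pairs for units whose outcome lies more than
    `threshold` standard deviations above the mean of all units.

    The threshold of 2.0 is an illustrative assumption, not a PD standard.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # no variation, so no outliers
    deviants = []
    for name, value in scores.items():
        z = (value - mu) / sigma
        if z > threshold:
            deviants.append((name, z))
    return deviants

# Made-up outcome scores for ten hypothetical units (e.g. farm yields).
scores = {
    "unit_a": 48, "unit_b": 50, "unit_c": 52, "unit_d": 49, "unit_e": 51,
    "unit_f": 50, "unit_g": 47, "unit_h": 53, "unit_i": 50, "unit_j": 80,
}
found = positive_deviants(scores)
```

With so few units, only an extreme value can clear the cut-off, which illustrates the sample-size problem described above: the rarer positive deviants are, the larger the dataset needed to find them reliably.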

2. Domain and Geographic Scope. To date, most applications of PD have been highly concentrated. In a recent systematic literature review[6], 89% of applications in developing countries were in public health, 83% were in rural communities, and just four countries had hosted roughly half of all PD implementations.  A simultaneous review of big data in developing countries, on the other hand, showed datasets and demonstrated value across a much wider set of domains and locations.  As a result, big data could help positive deviance to break from its current path dependency.

To assess these and other benefits that big data may bring to the PD approach – relating to behaviour identification, methodological risk, and scalability – a “big data-based positive deviance” research project has been designed and is underway.  The project is currently identifying positive deviants from large-scale datasets in the education and agriculture domains, with results planned to emerge in 2019.

For further details on the challenges of positive deviance and the opportunities offered by big data, please refer to the review article.

REFERENCES

[1] Wishik, S. M. & Van Der Vynckt, S. (1976) The use of nutritional “positive deviants” to identify approaches for modification of dietary practices, American Journal of Public Health, 66(1), 38–42.  Zeitlin, M. F. et al. (1990) Positive Deviance in Child Nutrition: With Emphasis on Psychosocial and Behavioural Aspects and Implications for Development. Tokyo: United Nations University.
[2] Sternin, J. (2002) Positive deviance: a new paradigm for addressing today’s problems today, The Journal of Corporate Citizenship, 57–63.
[3] Felt, L. J. (2011) Present Promise, Future Potential: Positive Deviance and Complementary Theory.  Lapping, K. et al. (2002) The positive deviance approach: challenges and opportunities for the future, Food and Nutrition Bulletin, 23(4 Suppl), 130–7.  Marsh, D. R., Schroeder, D. G., Dearden, K. A., Sternin, J. & Sternin, M. (2004) The power of positive deviance, BMJ, 329(7475), 1177–1179.
[4] Marsh et al. (ibid.).
[5] Springer, A., Nielsen, C. & Johansen, I. (2016) Positive Deviance by the Numbers, Positive Deviance Initiative. Available at: https://positivedeviance.org/background/.
[6] Albanna, B. & Heeks, R. (2018) Positive deviance, big data and development: a systematic literature review, Electronic Journal of Information Systems in Developing Countries.

Big Data and Healthcare in the Global South

The global healthcare landscape is changing. Healthcare services are becoming ever more digitised with the adoption of new technologies and electronic health records. This development typically generates enormous amounts of data which, if utilised effectively, have the potential to improve healthcare services and reduce costs.

The potential of big data in healthcare

Decision making in medicine relies heavily on data from different sources, such as research and clinical data, rather than being based only on individuals’ training and professional knowledge. Historically, healthcare organisations have often based their understanding of information on an incomplete grasp of reality on the ground, which could lead to poor health outcomes. This issue has recently become more manageable with the advent of big data technologies.

Big data comprises unstructured and structured data from clinical, financial and operational systems, and data from public health records and social media that goes beyond the health organisations’ walls. Big data, therefore, can support more insightful analysis and enable evidence-based medicine by making data transparent and usable in much broader varieties, much larger volumes and at higher velocities than were ever available to healthcare organisations [1].

Using big data, healthcare providers can, for example, manage population health by identifying patients at high-risk during disease outbreaks and then take preventive actions. In one case, Google used data from user search histories to track the spread of influenza around the world in near real time (see figure below).

Google Flu Trends correlated with influenza outbreak[2]

Big data can also be used to identify procedures and treatments that are costly or deliver insignificant benefits. For example, one healthcare centre in the USA has been using clinical data to bring to light costly procedures and other treatments. This helped it to identify and reduce unnecessary procedures and duplicate tests. In essence, big data not only helped to maintain high standards of patient care but also helped to reduce the costs of healthcare [3].

Medical big data in the global south

The potential healthcare benefits of big data are exciting, and the most significant potential rewards may be in developing countries. While healthcare globally faces challenges in improving health outcomes and reducing costs, these issues can be especially severe in developing countries.

Lack of sufficient resources, poor use of existing funds, poverty, and lack of managerial and related capabilities are the main differences between developing and developed countries. This means health inequality is more pronounced in the global south. Equally, mortality and birth rates are relatively high in developing countries as compared to developed countries, which have better-resourced facilities [4].

Given improvements in the quality and quantity of clinical data, the quality of care can be improved. In the global south in particular, where health is more a question of access to primary healthcare than a question of individual lifestyle, big data can play a prominent role in improving the use of scarce resources.

How is medical big data utilised in the global south?

To investigate this key question, I analysed the introduction of Electronic Health Records (EHR), known as SEPAS, in Iranian hospitals. SEPAS is a large-scale project which aims to build a nationally integrated system of EHR for Iranian citizens. Over the last decade, Iran has progressed from having no EHR to 82% EHR coverage for its citizens [5].

EHR is one of the most widespread applications of medical big data in healthcare. In effect, SEPAS is built with the aim to harness data and extract value from it and to make real-time and patient-centred information available to authorised users.

However, the analysis of SEPAS revealed that medical big data is not utilised to its full potential in the Iranian healthcare industry. If the big data system is to be successful, the harnessed data should inform decision-making processes and drive actionable results.

Currently, data is gathered effectively in Iranian public hospitals, meaning that the raw and unstructured data is mined and classified to create a clean set of data ready for analysis. This data is also transferred into summarised and digestible information and reports, confirming that real potential value can be extracted from the data.

In spite of this, the benefit of big data is not yet realised in guiding clinical decisions and actions in Iranian healthcare. SEPAS is only being used in hospitals by IT staff and health information managers who work with data and see the reports from the system. However, the reports and insights are not often sent to clinicians and little effort is made by management to extract lessons from some potentially important streams of big data.

Limited utilisation of medical big data in developing countries has also been reported in other studies. For example, a recent study in Saudi Arabia [6] reported the low number of e-health initiatives. This suggests the utilisation of big data faces more challenges in these countries.

Although this study cannot claim to have given a complete picture of the utilisation of medical big data in the global south, some light has been shed on the topic. While there is no doubt that medical big data could have a significant impact on the improvement of healthcare in the global south, there is still much work to be done. Healthcare policymakers in developing countries, and in Iran in particular, need to reinforce the importance of medical big data in hospitals and ensure that it is embedded in practice. To do this, the barriers to effective datafication should be first investigated in this context.

References

[1] Kuo, M.H., Sahama, T., Kushniruk, A.W., Borycki, E.M. and Grunwell, D.K. (2014). Health big data analytics: current perspectives, challenges and potential solutions. International Journal of Big Data Intelligence, 1(1-2), 114-126.

[2] Dugas, A.F., Hsieh, Y.H., Levin, S.R., Pines, J.M., Mareiniss, D.P., Mohareb, A., Gaydos, C.A., Perl, T.M. and Rothman, R.E. (2012). Google Flu Trends: correlation with emergency department influenza rates and crowding metrics. Clinical infectious diseases, 54(4), 463-469.

[3] Allouche G. (2013). Can Big Data Save Health Care? Available at: https://www.techopedia.com/2/29792/trends/big-data/can-big-data-save-health-care (Accessed: August 2018).

[4] Shah A. (2011). Healthcare around the World. Global Issues. Available at: http://www.globalissues.org/article/774/health-care-around-the-world (Accessed: August 2018).

[5] Financial Tribune (2017). E-Health File for 66m Iranians. Available at: https://financialtribune.com/articles/people/64502/e-health-files-for-66m-iranians (Accessed: August 2018).

[6] Alsulame K, Khalifa M, Househ M. (2016). E-Health Status in Saudi Arabia: A Review of Current Literature. Health Policy and Technology, 5(2), 204-210.
