A perspective on the 2016 International Data Responsibility Conference in The Hague, The Netherlands
Reint-Jan Groot Nuelend
With input and feedback from Walle Bos
On 19 February, the International Data Responsibility Group (IDRG) hosted the second International Data Responsibility Conference in The Hague, The Netherlands. This annual meeting brings together experts and practitioners working with data for crisis-affected communities and the most at-risk populations worldwide. Through presentations by a variety of experts and six interactive workshops, of which the author attended two, participants explored the potential risks and harms that could be caused by using this data, and ways to prevent these from materializing. This year’s conference was forward-looking, centring on solutions to concerns about the use of data for doing good.
During the opening plenary panel discussion moderated by Constantijn van Oranje, the World Food Programme (WFP) showcased the potential of data use for increased efficiency of food programmes. In a pioneering project with Leiden University, WFP is exploring the potential of using data on the movement and mobility of credit cards in Lebanon to better understand the behaviour of its beneficiary populations and to detect anomalies in spending patterns. Knowledge of people's movements can add considerable value to the effective implementation of programmes within target areas. Besides presenting its own work with and on data, WFP explicitly intended to learn from the conference. To that end, WFP posed the following question: how can one avoid harm while effectively and sustainably using digital data to benefit the people who need it most?
Lacking data infrastructure
Also during the panel discussion, Muchiri Nyaggah (Local Development Research Institute) flagged that technology and data can make people ‘disappear’. For example, poverty trends based on new data streams run the risk of presenting a false picture, as data on the very poorest people is often unavailable. In many areas of the world where support is needed, large groups of underprivileged people do not have access to online platforms or financial services, which means they do not produce the same data as people who do have access. Using wrong or one-sided data creates the risk of putting people in danger by excluding them from support. The main question then becomes: before we use data for development, is the data produced by this infrastructure the right selection and representation of the population we aim to study? Many participants seemed to agree: often it is not, precisely in the areas where it is needed most.
Enabling the potential of data
On the same panel, Jelte van Wieren of the Dutch Ministry of Foreign Affairs started by quoting a friend: humanitarian organisations might be as conservative as the Catholic Church. They are not on top of the latest developments concerning data, and the humanitarian sector is not leveraging data as a new global resource. There are rapidly developing open and closed datasets that can be very useful for effective humanitarian assistance. For example, geospatial data can improve the information available beyond the frontline of a conflict, so that humanitarian support can be allocated better. Often, this data is held by the private sector, which complicates access. For corporations, restraint, hesitation and outright denial of access to datasets are often justifiable on several grounds. To access this data, we need to better understand the motives and incentives of corporate data holders, and design a trusted framework for corporate data sharing for responsible and laudable causes. Van Wieren: "Currently we are living in a ‘wild west’ of using data, while at the same time there is a high need of increasing the efficiency of aid."
Who is producing the data?
Another issue that can potentially cause harm: the people who produce data that can be helpful in humanitarian efforts are also the intended beneficiaries of this data. However, data is often collected by third parties based halfway across the world, who might not know the questions and worries of the users of the data at the micro level. Emmanuel Letouzé (Data-Pop Alliance), in the report-back of his workshop, noted that it is therefore important to introduce a new literacy among vulnerable populations in a digital age. Key in this is to learn from the past and not repeat it, as literacy programmes have been used to enslave populations: those who have the power to teach a language have a large amount of control over a population, for example by deciding who does and who does not learn it.
Even with responsible data literacy programmes, however, it seems unavoidable that a few natural monopolies in the sphere of data will arise. An important solution that was presented is 'data communities': clusters within countries where consumers and producers of data are brought together from local to national level. This is becoming the normative approach for bringing more inclusivity to data collection.
Is data a public good?
In its workshop during the first breakout round, UN Global Pulse underlined that there is a need for a responsibility mechanism for harnessing the international data revolution. Global Pulse functions as a network of innovation labs where research on Big Data for Development is conceived and coordinated. Their objectives: 1) achieve a critical mass of implemented innovations, 2) lower systemic barriers to adoption and scaling and 3) strengthen the big data innovation ecosystem. Global Pulse put forward the following question: is there always consent from people about whom data is collected? And how is consent even feasible in the age of big data? The participants in the Global Pulse workshop agreed: we lack a uniform international consensus on how to use data.
Global Pulse explained the guidelines they have devised to advise on responsible data use. In brief, they translate into four steps: 1) identify risks and harms, and those affected; 2) quantify the likelihood of occurrence and the magnitude, and identify factors that influence these; 3) identify positive effects and the targeted beneficiaries; 4) quantify the likelihood of positive effects. In going through these four steps, one needs to ask whether there are alternative uses of the data in question and make an effort to identify the harms of not using certain data, whilst overall assessing the proportionality of the risks and harms in relation to the positive effects.
A recurring issue during the conference was the fact that there is no clear definition of data and of what we are allowed to do with it under international law. There are no provisions that make the misuse of data in itself a war crime, for example. This is one of the reasons why it is important to ask how we could begin to fill those doctrinal gaps. Having international standards on data ethics would help, but there is no consensus on what such standards should contain or how to enforce them. Moreover, it is important to recognise that data is flexible, and regulation should be specific to the purpose for which data is used. There is a large difference, for example, between the admissibility of commercial use of data and its use for humanitarian action.
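To make the four steps a little more tangible, here is a minimal sketch in Python of how such an assessment could be recorded. It is not Global Pulse's actual tool: the `Effect` class, the 0 to 10 magnitude scale, the idea of multiplying likelihood by magnitude, and the simple comparison used for "proportionality" are all assumptions made for illustration only. A real assessment would rely on context and judgement rather than a single number.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Effect:
    """One identified harm or positive effect (steps 1 and 3)."""
    description: str
    affected: str        # who bears the harm or receives the benefit
    likelihood: float    # estimated probability, 0.0 to 1.0 (steps 2 and 4)
    magnitude: float     # hypothetical severity/size score, 0 to 10

    def expected(self) -> float:
        # Assumption: weigh magnitude by likelihood to get a rough score.
        return self.likelihood * self.magnitude


def assess(harms: List[Effect], benefits: List[Effect]) -> Dict[str, object]:
    """Summarise the weighted harms and benefits of one proposed data use."""
    expected_harm = sum(h.expected() for h in harms)
    expected_benefit = sum(b.expected() for b in benefits)
    return {
        "expected_harm": expected_harm,
        "expected_benefit": expected_benefit,
        # Assumption: 'proportionate' here simply means the weighted
        # benefits exceed the weighted harms.
        "proportionate": expected_benefit > expected_harm,
    }


if __name__ == "__main__":
    # Purely invented numbers, for illustration.
    harms = [
        Effect("re-identification of individuals", "beneficiaries", 0.2, 8),
        Effect("exclusion of people without data", "poorest households", 0.5, 4),
    ]
    benefits = [
        Effect("better targeted food assistance", "beneficiaries", 0.6, 5),
    ]
    print(assess(harms, benefits))
```

The same function could be run a second time for the option of not using the data, so that the harms of inaction are weighed alongside the harms of use.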
De-identification of data
In the Engine Room & Harvard Humanitarian Initiative breakout session, Danna Ingleton (The Engine Room) posited that data is people, and that data can often be used to identify a person directly or through combined datasets. Danna questioned whether it is ever possible to anonymize data and warned that data can be collected, processed and communicated in a manner that amplifies and deepens inequality and discrimination.
Nathaniel Raymond (Harvard Humanitarian Initiative) brought another crucial issue to the table: “You cannot prevent to do harm, if you don’t know what harm you can do.” The principle of ‘noli me tangere’ in medicine (you do not touch certain parts of the body if you do not know what harm you can do) could be useful here. Raymond gave the example that brain and heart surgery were long forbidden under this ethical principle. Similarly, he pleads for caution and restraint in collecting, sharing and using data, as the possible harms are not always clear yet.
The 2016 International Data Responsibility Conference drilled down on some of the core problems facing the community aiming to innovate international development and humanitarian response through new uses of digital data. The concluding session therefore ended up being a summary of problems we are facing now or in the near future.
Linnet Taylor (University of Amsterdam) closed the conference by presenting a recap and ideas for the agenda for 2016. Linnet noted that context is essential in discussing the potential harms and risks of using data. Without background information, one cannot make a well-founded statement on the potential negative impacts. She classified data responsibility as a ‘super-wicked problem’, due to the combination of four components:
A few main takeaways from the conference:
It seems that with this many big questions, the advice of Linnet Taylor could help this space by narrowing down to concepts that can be grasped more easily. Linnet advised the attendees to “move from 10.000 feet questions to 1.000 feet ones”, to start tackling problems one by one. I am very curious to see what next year’s International Data Responsibility Conference will bring... will we hear more concrete answers to these challenges?
One of the highlights of the day was the launch of the website http://thebiggerpicture.online/. This website combines the World Press Photo with big data. It aims to look at the bigger picture of the photo by trying to find the narrative behind the data on the one hand and the data behind the narrative on the other.
Levin, K., Cashore, B., Bernstein, S., & Auld, G. (2012). Overcoming the tragedy of super wicked problems: constraining our future selves to ameliorate global climate change. Policy Sciences, 45(2), 123-152.
Josje Spierings is head of the Secretariat of the International Data Responsibility Group, a collaboration between the Data & Society Research Institute, Data-Pop Alliance, the GovLab at NYU, UN Global Pulse, Signal Program - Harvard Humanitarian Initiative - Harvard University and Leiden University.