AI researchers detail obstacles to data sharing in Africa

AI researchers say data sharing is a key part of economic growth in Africa but that it faces a number of common obstacles, including the threat of data colonialism. The African data market is expected to grow steadily in the coming years, and the African Data Centre trade organization predicts that it will need hundreds of new datacenters to meet demand over the coming decade.

In a paper titled “Narratives and Counternarratives on Data Sharing in Africa,” the research team lays out structural problems including but not limited to financial or infrastructure problems. The coauthors argue that failure to consider ethical concerns associated with those obstacles could cause irreparable harm.

“Currently, a significant proportion of Africa’s digital infrastructure is controlled by Western technology powers, such as Amazon, Google, Facebook, and Uber,” the paper reads. “Traditional colonial powers pursued colonial invasion through justifications such as ‘educating the uneducated.’ Data accumulation processes are accompanied by similar colonial rhetoric, such as ‘liberating the bottom billion,’ ‘helping the unbanked,’ ‘connecting the unconnected,’ and using data to ‘leapfrog poverty.’”

Power imbalances, lack of investment in building trust, and disregard for local knowledge and context are identified as the three most common barriers to data sharing, as “entire heterogeneous geographies of people have their data accessed and shared, yet do not reap the same benefits as the data collectors and owners of data infrastructures,” according to the paper. The coauthors argue that dominant narratives around data sharing in Africa today focus on a lack of knowledge about the value of data and often suffer from what they call deficit narratives: stories that focus on subjects like poverty, unemployment, or illiteracy rates.

“In recent years, the African continent as a whole has been considered a frontier opportunity for building data collection infrastructures. The enthusiasm around data sharing, and especially in machine learning or data science for development/social good settings, has ranged from tempered discussions around new research avenues to proclamations that ‘the AI invasion is coming to Africa (and it’s a good thing)’. In this work, we echo previous discussions that this can lead to data colonialism and significant, irreparable harm to communities.”

Coauthors argue that responsible data sharing in Africa should reject practices that lead to data colonialism and focus on meeting the needs of individuals and local communities first. They say this requires awareness and examination of influencing issues like legacies of colonialism and slavery. They warn that this context can contribute to data policy or practices rooted in Western-centric extractive practices that are “ill-suited for the African context.”

The largest datacenter in Africa is reportedly under construction in South Africa. It’s part of a surge of investment in datacenters and African telecom companies that some have deemed a gold rush. Microsoft opened its first datacenter in Africa in 2019. AWS opened a South Africa region last year. Google is expected to complete construction on the Equiano subsea cable later this year, and Facebook is constructing a subsea cable that’s expected to be completed in two or three years. Nvidia is also ramping up operations in Africa.

An analysis of the rise of the African cloud by Xalam Analytics found that less than 1% of global public cloud revenue came from Africa in 2018.

Above: An illustration of stakeholders in the African data ecosystem in the paper “Narratives and Counternarratives on Data Sharing in Africa”

The paper reaches its conclusions through interviews with African data experts and insights from coauthors, a number of whom grew up in Africa or currently live on the continent. Rediet Abebe grew up in Ethiopia and cofounded Black in AI. Abebe is an assistant professor in UC Berkeley’s Electrical Engineering and Computer Sciences (EECS) department and the first Black faculty member in school history.

Abeba Birhane also grew up in Ethiopia. She is currently a Ph.D. student at the University of Dublin, and her writing about relational ethics received a Best Paper award at the Black in AI workshop at NeurIPS in 2018. Birhane has written at length about algorithmic colonization. Sekou Remy grew up in Trinidad and Tobago and currently works as a research scientist and technical lead at IBM Research Africa in Kenya. George Obaido and Kehinde Aruleba wrote the paper in association with the University of the Witwatersrand in South Africa.

“Data sharing practices which operate in the absence of knowledge of local norms and contexts contribute — albeit indirectly — to the erosion of trust among stakeholders in the data-sharing ecosystem,” the paper reads. “As machine learning and data science move to focus on the Global South and especially the African continent, the need to understand what challenges exist in data sharing, and how we can improve data practices become more pressing.”

Power plays a major role in data sharing in Africa. For example, research cited in the paper found that Africans are significantly underrepresented in the biomedical research community, even when the data comes from Africa.

“Power asymmetries, historically inherited from the colonial era, often get carried over into data practices and manifest themselves in various forms, from imbalanced authorship to uneven bargaining powers that come with funding,” the paper reads. The coauthors add that power imbalance is also a factor in relationships between project managers and data analysts; data analysts and data collectors; and data collectors and research participants.

The paper also encourages understanding attitudes about data among African researchers. Governments in places like Ghana and Kenya have opened data portals, but a survey of South African researchers found that only about one in five shares data with others, and a 2018 study involving life scientists in more than a dozen sub-Saharan African nations described a number of disincentives to data sharing. That same year, governments in nations like Botswana, Ethiopia, and South Africa developed national data strategies. To address common issues, the African Union formed an AI working group in 2019.

“Trust is the fundamental component of all relationships in a data sharing ecosystem,” the paper reads. “The future of open data management and data sharing and their contribution to the advancement of science and technology in Africa will continue to increase, despite the slow pace caused by the lack of funding, redundant policy frameworks, and limited infrastructures.”

The paper was accepted for publication at the ACM Conference on Fairness, Accountability, and Transparency (FAccT). The virtual conference begins next week. Other papers accepted for publication at FAccT include research that examines how language models handle word association and censorship, and a call for a culture change in machine learning by researchers from Google’s Ethical AI team and the University of Washington. The FAccT conference was cofounded by Timnit Gebru, the Ethical AI team lead Google fired in late 2020. The conference has a history of being sponsored by a number of Big Tech companies with poor records of hiring Black researchers, like Facebook AI Research (FAIR), Google’s DeepMind, and Google.
