As technology shifts to voice-enabled products, investment from three leading organizations will help make voice technology accessible in Kiswahili.
Over the next decade, speech is expected to become the primary way people interact with devices — from phones and laptops to digital assistants.
Today’s voice-enabled devices, however, fail to serve vast swaths of the planet’s languages, accents, and speech patterns. Currently, Amazon’s Alexa, Apple’s Siri, and Google Home do not support a single native African language.
To help ensure that people everywhere benefit from this massive technological shift, the Bill & Melinda Gates Foundation, the Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) GmbH (German Development Cooperation), and the UK’s Foreign, Commonwealth & Development Office (FCDO) are investing $3.4 million in Mozilla Common Voice as part of an ambitious, open-source initiative to build out voice data sets in Kiswahili, an East African language spoken by an estimated 100 million people in Kenya, Uganda, Tanzania, Rwanda, Burundi, and South Sudan.
Most of the voice data currently used to train machine learning algorithms are held by a handful of major companies.
This poses challenges for companies seeking to develop high-quality speech recognition technologies, while also exacerbating the voice recognition divide between English speakers and the rest of the world.
Launched in 2017, Common Voice aims to level the playing field while mitigating AI bias. It enables anyone to donate their voices to a free, publicly available database that startups, researchers, and developers can use to train voice-enabled apps, products, and services.
Today, it represents the world’s largest multi-language public domain voice data set, with more than 9,000 hours of voice data in 60 different languages, including widely spoken languages and regional languages like Welsh and Kabyle, a language spoken in Northern Algeria. More than 166,000 people worldwide have contributed to the project as of [date].
This $3.4 million investment will accelerate the growth of Common Voice’s Kiswahili data set, engage more communities and volunteers in East Africa, and facilitate the hiring of key new roles, including Chenai Chair, Special Advisor on Africa; Kathleen Siminyu, Machine Learning Fellow; and two Community Fellows (Rebecca Ryakitimbo Mwimbi and Britone Mwasaru).
A key goal of the project is to explore whether it is possible to develop voice recognition for the languages of underserved communities as a platform.
With this data available as a digital public good in the open-source domain, it could allow local innovation in emerging markets to develop products and services serving marginalized communities.
Common Voice will collaborate with African companies, startups, and universities to develop locally suitable, voice-enabled technology solutions that are relevant to the Sustainable Development Goals (SDGs).
Partners will be selected to implement voice functionality through specific financial services and agriculture use cases, such as an advisory chatbot through AgriFin Digital Farmer in Kenya.
This project builds on a collaboration between Mozilla and the German Ministry for Economic Cooperation and Development (BMZ) through its “FAIR Forward: Artificial Intelligence for All” initiative to create open data for Kinyarwanda, a widely spoken language in Rwanda with over 12 million speakers.
Following a kickoff hackathon in Kigali in 2019, the project led to the launch of Digital Umuganda, a Rwandan startup grounded in the idea of “Umuganda,” a concept of self-help and cooperation rooted in the Rwandan culture.
Kinyarwanda is now one of the fastest-growing languages on the Common Voice platform. As of May 2021, Digital Umuganda has collected over 1,700 hours of Kinyarwanda voice data from over 840 contributors.
Together with GIZ, Digital Umuganda is currently developing “Mbaza,” an AI-backed chatbot with speech-to-text and text-to-speech functionality, which will provide critical COVID-19 information in the Kinyarwanda language. This will be the first project to benefit from Common Voice’s work in Kinyarwanda.
“Language is a powerful part of who we are, and people, not profit-making companies, are the right guardians of how language appears in our digital lives,” said Chenai Chair, Special Advisor for Africa Innovation Mradi at the Mozilla Foundation.
“By making it easy to donate voice data in Kiswahili, Common Voice will support East Africans to play a direct role in creating technology that helps rather than harms their communities. We are thrilled to join with partners who share Mozilla’s vision for helping more people in more places to access voice technology.”
UK Minister for Africa James Duddridge said that technology is key to improving how we communicate with each other, do business, and share expertise.
“The UK is proud to support this tech initiative with Common Voice to help communities in East Africa benefit from livelihood support, access new markets, and narrow the digital divide.”
“Voice-enabled products have the unique opportunity to better reach millions of people who are traditionally excluded from digital services. But this requires the technology to understand the people and vice versa,” said Balthas Seibold from GIZ.
“Most importantly, for a true democratization of the foundations of AI it needs the perspectives of those voices who are not heard yet. Together with our partners on the ground, we want to help increase access to technology, unlock local expertise and innovation, and help drive adoption at scale by the population that would benefit most from support.”
“I’m excited about the initiative to build a sizable speech dataset for Kiswahili, particularly the economic and digital access possibilities that this could unlock for the local communities,” said Kathleen Siminyu, Mozilla’s Machine Learning Fellow.