Developing Voice-Activated Software Applications

Updated Crowdsourced Speech Dataset Now Available: 13,905 Hours of Speech in 76 Languages

Nvidia and Mozilla have updated Common Voice, Mozilla's crowdsourced speech dataset, making it one of the world's largest open speech datasets. The updated release, available through the Mozilla Common Voice project, contains 13,905 hours of speech in 76 languages from 182,000 unique voices.

The dataset includes demographic information such as age, gender, and accent, which makes it a valuable resource for developing voice-enabled services and AI models in many languages, including underrepresented ones. This release adds 16 new languages: Basaa, Slovak, Northern Kurdish, Bulgarian, Kazakh, Bashkir, Galician, Uyghur, Armenian, Belarusian, Urdu, Guarani, Serbian, Uzbek, Azerbaijani, and Hausa.
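
As a rough illustration of how the demographic fields might be used, the sketch below tallies clips per age group, gender, and accent from tab-separated metadata. The column names (`age`, `gender`, `accent`), the helper `count_by_field`, and the sample rows are all assumptions made for this example. Check the header of the metadata files in the release you download before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_field(tsv_text, field):
    """Count rows per value of `field`; blank or missing values become 'unspecified'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        value = (row.get(field) or "").strip() or "unspecified"
        counts[value] += 1
    return counts


# Hypothetical sample rows for illustration only, not real data from the dataset.
SAMPLE_TSV = (
    "path\tsentence\tage\tgender\taccent\n"
    "clip_0001.mp3\tHello world\ttwenties\tfemale\t\n"
    "clip_0002.mp3\tGood morning\tthirties\tmale\tscottish\n"
    "clip_0003.mp3\tThank you\ttwenties\t\t\n"
)

if __name__ == "__main__":
    # Assumed column names; adjust to match the actual metadata header.
    for field in ("age", "gender", "accent"):
        print(field, dict(count_by_field(SAMPLE_TSV, field)))
```

A tally like this can show how evenly a language's recordings are spread across speaker groups before a subset is chosen for training. Demographic fields are self-reported and may be blank, which is why empty values are counted separately here.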

The updated dataset can be accessed through the Mozilla Common Voice website or repository. It is released under an open license for both research and commercial use. Mozilla encourages contributors and developers to help expand and improve the dataset.

Mozilla also provides toolkits, such as those for transcribing audio with open-source Whisper models, to support working with the data securely and privately. For collaboration or enterprise use, Mozilla can provide details on licensing and data access.

Mozilla is also working on an initiative to create a data collective ("marketplace") to facilitate controlled sharing and licensing of curated datasets, which may include this speech dataset in the future.

Toolkits from Mozilla and EleutherAI, available on platforms like Mozilla.ai Blueprints, can also be used for accessing or building similar datasets.

  1. The updated Mozilla Common Voice dataset, with 13,905 hours of speech in 76 languages, supports the development of voice-enabled services and AI models.
  2. The dataset, which contains unique voices across many demographics and languages, is open for research use. Contributors are encouraged to enhance and curate it using Mozilla's toolkits, and it may become available through Mozilla's planned data collective ("marketplace").
