Common Voice wants to teach machines to speak human

Who supplies the speech recognition software for the Starship Enterprise? Admittedly, that question never interested me in the past. What fascinated me was simply that Kirk and Co. could "talk to the computer".

In the 90s I tried Dragon NaturallySpeaking, but I never got beyond training the software and creating a profile.

Today, people use Siri, Alexa and Hey Google as a matter of course. Will it be Apple, Amazon or Google that supplies the voice recognition software for the Starship Enterprise in the year 2200? I hope not, as these companies obviously pursue interests other than carrying the idea of the United Federation of Planets in their DNA.

Donate your voice

For some time now, I have therefore been contributing to the Common Voice project of the Mozilla Foundation. The project has a single goal: to counter the dominance of commercial speech recognition with an open source approach. That is more than commendable. After all, who wants a listening device from a commercial company in their home, on their website or on their smartphone?

What we learned from the Babelmonkeys: it takes an insane amount of time to organise and prepare relevant training sets for the algorithms. Mozilla is stepping into the breach and inviting people to contribute: "Donate your voice". This small request, which takes hardly any time out of the day, has been with us for weeks now.

"However, the vast majority of data used by large companies is not accessible to the majority of people. We believe this stifles innovation. That's why we've launched Common Voice, a project to help make voice recognition accessible to everyone." Source: Common Voice website

Mozilla makes it easy to get started. Simply open the website and, if your computer has one, speak into the built-in microphone. In addition to "donating" your own voice (reading sentences aloud), validating the recordings of other contributors is also important for the project. It is rare, but it does happen: jokers read out something else or just record noises. There are also simple misreadings, where the speaker's head came up with a different sentence construction than the one in front of them.

Now also with dialects

And because the project opens up another dimension of speech recognition, we invite all dialect speakers to participate. Whether Saxon, Franconian or Berlin dialect: with a sufficient number of speech samples, there is a chance that voice assistants will have no problems with dialects in the future. But that depends on your cooperation.

With DeepSpeech, Mozilla also provides matching software for developing voice assistants.

So go ahead. Join in.

DeepL is a deep learning company that develops AI systems for languages. The company, based in Cologne, Germany, was founded in 2009 as Linguee, and introduced the first internet search engine for translations. Linguee has answered over 10 billion queries from more than 1 billion users.

Stephan Luckow

Stephan is an open source evangelist and constantly curious about technologies. Thematically, his blog posts can best be summarised as "curiosity satisfied".