Startec

Meta’s new AI models can recognize and produce speech for more than 1,000 languages

May 22, 4:38 PM


Meta has built AI models that can recognize and produce speech for more than 1,000 languages—a tenfold increase on what’s currently available. It’s a significant step toward preserving languages that are at risk of disappearing, the company says.

Meta is releasing its models to the public via the code hosting service GitHub. It claims that making them open source will help developers working in different languages to build new speech applications—like messaging services that understand everyone, or virtual-reality systems that can be used in any language.

There are around 7,000 languages in the world, but existing speech recognition models cover only about 100 of them comprehensively. This is because these kinds of models tend to require huge amounts of labeled training data, which is available for only a small number of languages, including English, Spanish, and Chinese.

Meta researchers got around this problem by retraining an existing AI model developed by the company in 2020 that is able to learn speech patterns from audio without requiring large amounts of labeled data, such as transcripts. 

They trained it on two new data sets: one containing audio recordings of the New Testament in 1,107 languages, paired with the corresponding text taken from the internet, and another containing unlabeled New Testament audio recordings in 3,809 languages. The team processed the speech audio and the text data to improve their quality, then ran an algorithm designed to align the audio recordings with the accompanying text. They then repeated this process with a second algorithm trained on the newly aligned data. With this method, the researchers were able to teach the model to learn a new language more easily, even without the accompanying text.

“We can use what that model learned to then quickly build speech systems with very, very little data,” says Michael Auli, a research scientist at Meta who worked on the project.

“For English, we have lots and lots of good data sets, and we have that for a few more languages, but we just don’t have that for languages that are spoken by, say, 1,000 people.” 

The researchers say their models can recognize and produce speech in over 1,000 languages, and can identify which language is being spoken for more than 4,000.

They compared the models with those from rival companies, including OpenAI's Whisper, and claim theirs had half the error rate, despite covering 11 times more languages.
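To make "half the error rate" concrete, here is a minimal sketch of word error rate (WER), the metric most commonly used to score speech recognition. It assumes WER is the error rate being compared, which the article does not state. The function name and the sample sentences are invented for illustration and are not taken from Meta's evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts of the same utterance from two models.
reference = "the cat sat on the mat"
model_a = "the cat sit on mat"   # one substitution, one deletion
model_b = "the cat sat on mat"   # one deletion

print(word_error_rate(reference, model_a))
print(word_error_rate(reference, model_b))
```

In this toy case the second transcript makes one word-level mistake where the first makes two, so its WER is half as large. That is the kind of relationship the comparison above describes, though real evaluations average over many utterances and languages.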

However, the team warns the model is still at risk of mistranscribing certain words or phrases, which could result in inaccurate or potentially offensive labels. They also acknowledge that their speech recognition models yielded more biased words than other models, albeit only 0.7% more.

While the scope of the research is impressive, the use of religious texts to train AI models can be controversial, says Chris Emezue, a researcher at Masakhane, an organization working on natural-language processing for African languages, who was not involved in the project.

“The Bible has a lot of bias and misrepresentations,” he says.

