Meta announces Voicebox, an artificial intelligence model for voice

  • June 17, 2023

Meta has announced Voicebox, its latest generative AI model after ImageBind. It is designed to help creators with speech-generation tasks such as audio editing, sampling, and stylization, including tasks it was not specifically trained for, which it handles through in-context learning.

Meta says the new model could benefit many people around the world. Its examples include letting visually impaired people hear written messages from friends read aloud in the friends' own voices, and letting people speak a foreign language in their own voice.

The model can generate high-quality audio clips and edit pre-recorded audio, removing unwanted noise such as car horns while preserving the content and style of the original. It is also multilingual and can generate speech in six languages. Meta says future uses could include giving natural-sounding voices to virtual assistants and to non-player characters in the metaverse.

Meta also compared Voicebox with other AI speech models, naming VALL-E and YourTTS as competitors, and reports that Voicebox outperforms both on word error rate and style similarity.
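
Word error rate is a standard speech metric: the number of word substitutions, deletions, and insertions needed to turn a transcript of the generated speech into the reference text, divided by the number of reference words. The following is a minimal sketch of that calculation, not Meta's evaluation code. The function name and the lowercase, whitespace-based tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are compared case-insensitively after splitting on
    whitespace. Real evaluations usually apply more text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("meta announces voicebox", "meta announced voicebox"))
```

A lower value is better, and a perfect transcript scores 0. The value can exceed 1 when the hypothesis contains many extra words.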

Voicebox is built on Flow Matching, Meta's latest advance in non-autoregressive generative modeling, which can learn highly non-deterministic mappings between text and speech. This lets Voicebox learn from varied speech data without careful labeling, so the training data can be more diverse and much larger in scale.
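
For readers curious about what a flow matching objective looks like, here is a minimal sketch of one common formulation from the flow matching literature: a straight-line ("optimal transport") path from Gaussian noise to a data sample, with the model trained to predict the velocity along that path. The article does not say which variant Voicebox uses, so the path choice, the `sigma_min` value, and all function names below are assumptions. A list of floats stands in for real audio features.

```python
import random

SIGMA_MIN = 1e-5  # assumed small constant; not taken from the article


def ot_path_sample(x0, x1, t, sigma_min=SIGMA_MIN):
    """Return the point x_t on the noise-to-data path and its target velocity.

    x0: noise sample, x1: data sample, t: time in [0, 1].
    x_t = (1 - (1 - sigma_min) * t) * x0 + t * x1
    u_t = x1 - (1 - sigma_min) * x0
    """
    scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [scale * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]
    return x_t, u_t


def flow_matching_loss(predict_velocity, x1, rng):
    """Mean squared error between predicted and target velocity for one sample.

    predict_velocity is a hypothetical stand-in for a neural network that
    maps (x_t, t) to a velocity vector of the same length as x_t.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # draw Gaussian noise
    t = rng.random()                        # draw a time in [0, 1)
    x_t, u_t = ot_path_sample(x0, x1, t)
    v = predict_velocity(x_t, t)
    return sum((p - q) ** 2 for p, q in zip(v, u_t)) / len(x1)


if __name__ == "__main__":
    rng = random.Random(0)
    zero_model = lambda x_t, t: [0.0] * len(x_t)  # untrained placeholder
    print(flow_matching_loss(zero_model, [0.5, -1.0, 2.0], rng))
```

Because the whole output is produced by following a learned velocity field rather than predicting one token at a time, models of this kind are non-autoregressive, which matches the description above.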

Voicebox was trained on more than 50,000 hours of recorded speech and transcripts from public-domain audiobooks in English, French, Spanish, German, Polish and Portuguese. It was trained to predict a segment of speech from the surrounding speech and the transcript of that segment.

Finally, Meta says that while this technology could usher in a new era of generative AI for speech, it also carries the potential for misuse and unintended harm.

Meta’s research paper on Voicebox details how the company built a highly effective classifier that can distinguish real speech from audio generated with Voicebox. Meta will not make the model itself public and will not release its source code.

Source: Port Altele
