SeamlessM4T: Meta publishes a multimodal AI translation model for nearly 100 languages
August 23, 2023
Meta has taken the AI revolution seriously, very seriously, and SeamlessM4T is the latest example of this (and also one of the most interesting). It reminds me of earlier this year, when Yann LeCun said that ChatGPT was neither innovative nor revolutionary. Let’s recall that LeCun, in addition to being the head of AI at Meta, is an institution in the field of artificial intelligence. We could agree with his statement as far as the underlying technology of the service is concerned, but not so much regarding its conception and implementation.
Meta had a similar problem to Google earlier this year: both tech companies had been working in artificial intelligence for quite some time, but the visibility of their progress was relatively low. To give just one example, in 2011 Google created DistBelief, which in 2015 evolved into TensorFlow, an open source machine learning library that has been widely used ever since. And yes, in case you’re wondering, it’s no coincidence that Google’s SoCs, designed specifically for AI tasks, are called Tensor.
Be that as it may, and going back to Meta, if anything could be held against LeCun when he made that statement, it was that Meta was not as revolutionary as OpenAI at street level, and very soon afterwards Microsoft would start to be too. It now seems clear that he came to this conclusion himself, and since then Meta has begun to demonstrate its potential in the field of artificial intelligence. And yes, it is becoming a superpower in it.
So, barely a month after those statements, Meta presented and published LLaMA (Large Language Model Meta AI), a generative model trained on a large dataset of texts in 20 different languages which, judging by its specifications, was meant to compete head-to-head with GPT-3 (note that I’m comparing it with a model, not with ChatGPT, because what Meta presented is a model, not a chatbot based on one). Later came SAM, an AI capable of identifying and separating the different components of an image, and several other announcements have followed since then.
And that brings us to the present, a moment in which the company has already consolidated its position in this field but shows no sign of letting up. As we can read on its blog, Meta has introduced SeamlessM4T, a multimodal AI model capable of translating between nearly 100 languages. Multimodal in this context means that it accepts and produces both text and audio, although the number of languages it supports varies depending on the type of input and output used. These are its capabilities, as listed on the blog:
Automatic speech recognition for nearly 100 languages
Speech-to-text translation for nearly 100 input and output languages
Speech-to-speech translation, supporting nearly 100 input languages and 35 (+ English) output languages
Text-to-text translation for nearly 100 languages
Text-to-speech translation, supporting nearly 100 input languages and 35 (+ English) output languages
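As a rough illustration of how output coverage depends on the output modality, the list above can be written down as a small data structure. This is only a sketch of the figures quoted in this article: the `Task` class, the `TASKS` list and the `output_language_coverage` function are hypothetical names invented for the example and are not part of SeamlessM4T or any Meta API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical record type for this example only.
    name: str
    input_modality: str   # "speech" or "text"
    output_modality: str  # "speech" or "text"


# The five capabilities listed in the article.
TASKS = [
    Task("automatic speech recognition", "speech", "text"),
    Task("speech-to-text translation", "speech", "text"),
    Task("speech-to-speech translation", "speech", "speech"),
    Task("text-to-text translation", "text", "text"),
    Task("text-to-speech translation", "text", "speech"),
]


def output_language_coverage(task: Task) -> str:
    """Return the approximate output coverage quoted in the article."""
    # Per the list above, speech output is limited to 35 languages plus
    # English, while text output covers nearly 100 languages.
    if task.output_modality == "speech":
        return "35 languages plus English"
    return "nearly 100 languages"


for task in TASKS:
    print(f"{task.name}: {output_language_coverage(task)}")
```

In other words, according to the figures above, the narrower coverage applies only to the two tasks that produce speech.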
The number of languages supported makes SeamlessM4T huge in scope, but even more interesting is that it can detect language changes when the input is audio that alternates between two or more languages within the same sentence or conversation. Other systems of this type either accept input in a single language only or must be told explicitly that the language has changed; this model handles the switch on its own.
Another notable aspect of SeamlessM4T is that Meta is publishing it under the CC BY-NC 4.0 license. That is, it will be publicly accessible and free of charge to all types of entities, both public and private, provided they credit the source and use it for non-commercial purposes. However, as with LLaMA, Meta does not appear to have any plans to create an online translation service based on this model. But if you’d like to try it out, you can already do so on the SeamlessM4T page on Hugging Face.
Donald Salinas is an experienced automobile journalist and writer for Div Bracket. He brings his readers the latest news and developments from the world of automobiles, offering a unique and knowledgeable perspective on the latest trends and innovations in the automotive industry.