Google gives the Gemini 1.5 Pro ears
- April 10, 2024
Gemini 1.5 Pro can interpret sound. Google only makes the features available to users who have access to Vertex AI and AI Studio.
Google is taking a meaningful step in the further development of its AI model Gemini. Gemini 1.5 Pro can now interpret audio. The model no longer requires a written transcript of a conversation: you can upload the audio clip directly. Gemini 1.5 Pro can also process the audio track of videos.
The ability to listen to audio directly is an important addition to the capabilities of Google’s AI model. The company made a false start early in the AI hype with the rather painful launch of Gemini’s predecessor, Bard. Google now seems well on its way to matching the quality of the LLMs of its main competitor, OpenAI. The integration of audio is, in any case, a useful addition.
Users will soon be able to start using the new features, but only within Vertex AI and AI Studio. For now, the powerful Gemini 1.5 Pro model is not as freely available as the Gemini chatbot or other LLMs, but it seems inevitable that the general public will gain access to similar features in the future.
Source: IT Daily
As an experienced journalist and author, Mary has been reporting on the latest news and trends for over 5 years. With a passion for uncovering the stories behind the headlines, Mary has earned a reputation as a trusted voice in the world of journalism. Her writing style is insightful, engaging and thought-provoking, as she takes a deep dive into the most pressing issues of our time.