May 18, 2025

Google Gemini is not as good at reading comprehension as Google claims

  • July 1, 2024

Two scientific experiments tested Google Gemini’s ability to analyze long texts and images. The LLM failed both tests.

Google boasts that the Gemini Pro model can process up to two million tokens at a time. That’s roughly equivalent to 2 hours of video, 22 hours of audio, 60,000 lines of code, or 1.5 million words. Google is pushing hard on Gemini’s high token limit to differentiate the model from OpenAI’s GPT models.

But is Gemini as good at reading comprehension as Google claims? Two scientific studies put this to the test and reached a different conclusion. In the first, researchers from the Allen Institute and Princeton University had the Gemini models read a 260,000-word book and answer questions about it.
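To put the figures quoted above in proportion, here is a rough back-of-the-envelope sketch. It assumes the article's own equivalence (two million tokens for about 1.5 million words) holds as a flat ratio, which is only an approximation: real token counts depend on the tokenizer and the text. The function name `estimate_tokens` is made up for this illustration.

```python
# Figures quoted in the article
CONTEXT_TOKENS = 2_000_000   # advertised token limit
CONTEXT_WORDS = 1_500_000    # Google's stated word equivalent
BOOK_WORDS = 260_000         # length of the book used in the test

# Assumption: a flat words-per-token ratio derived from the two figures above.
words_per_token = CONTEXT_WORDS / CONTEXT_TOKENS


def estimate_tokens(words: int) -> int:
    """Rough token estimate for a text of the given word count."""
    return round(words / words_per_token)


book_tokens = estimate_tokens(BOOK_WORDS)
share = book_tokens / CONTEXT_TOKENS

print(f"Estimated tokens for the book: {book_tokens:,}")
print(f"Share of the advertised limit: {share:.0%}")
```

Under that assumption, the test book takes up well under a fifth of the advertised limit, so the poor scores cannot be explained by the text simply not fitting into the model's input.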

Gemini flunks

Gemini Pro scored 46 percent on the test, while Flash managed only 20 percent. The researchers found that Gemini was quite good at extracting information from individual, specific sentences, but the accuracy of its answers dropped when a question required reading larger portions of the text.

The second experiment, from an American university, tested the models’ ability to analyze images. The researchers created a dataset of images and asked the models questions about objects in them. To make the task more challenging, they added distracting images to the slideshow. Gemini Flash in particular, a model said to excel primarily in speed, scored poorly in this test, reaching only thirty percent on the most difficult sequences.

(Too) lofty promises

It is worth noting that the OpenAI and Anthropic models did not perform much better than Gemini: GPT-4o also only barely passed the reading comprehension test. Still, Google needs to be careful not to overpromise.

Source: IT Daily
