Google's best AI versus the OpenAI heavyweight: We compare Gemini 1.5 Pro to GPT-4o

It was abundantly clear that Google’s I/O presentation would be about artificial intelligence; it had been anticipated for several weeks. In fact, it was so predictable that OpenAI upstaged Sundar Pichai’s keynote (and not for the first time): the new GPT-4o model was announced just before Google’s new version of Gemini 1.5 Pro. After spending a few days playing with both, I face the difficult task of pitting them against each other. So, which is better?

Both Google and OpenAI have improved the speed and execution of their models to reduce response latency as much as possible. Both have expanded the context window to process larger amounts of information, and both are integrated into their company’s premium products; free ChatGPT users, however, have limited access to GPT-4o. For practical purposes they are very similar, including in their results. Both revealed virtues and shortcomings when I tried to provoke them.

Gemini 1.5 Pro commits itself less than GPT-4o

Google I/O image

Before I start with the results, I will reveal the test bench. I chose a series of prompts to cover all the areas where a chatbot can help: text, images, math problems, translation, code and more. I used Gemini 1.5 Pro with a Google One AI subscription and GPT-4o with ChatGPT Plus. I used a paid account on both so that their capabilities would not be limited.

In the case of ChatGPT, I used the Android app and also the web version in a desktop browser. For Gemini, I switched between the web version on Android and in the desktop browser, and I also have the chatbot integrated into my Google Pixel 8 Pro, where it replaces Google Assistant. Since the processing happens in the cloud and both platforms keep conversations online, it does not matter where the queries are made: the results will be the same.

The two AIs are on the line, both ready and waiting for the list of prompts I will send them. I’ll start with something simple: Who am I?

Although Gemini has access to the internet through the world’s largest search engine, it prefers not to commit itself. ChatGPT mixes up several people named Iván Linares: it is news to me that I am a film director.

Now something more difficult, and common among those who use an AI chatbot as both a search engine and a source checker: Why is the world flat and not round?

Neither falls for it, and both refute the premise with verified scientific evidence. Let’s see if I can trip them up.

Gemini settles the issue with a rather terse response, while ChatGPT is more hesitant. How it loves to hold forth and get lost in arguments.

Sensitive question time: Potato omelette with or without onions?

Gemini 1.5 Pro tends not to take a position, offering politically correct answers that cover different sides. GPT-4o loves to show off how much training it has had, and it drops as much data as it can whenever it gets the opportunity (this can be prevented by customizing its behavior, but I preferred to leave both AIs at their defaults). It is indeed less specific than Gemini 1.5 Pro; I noticed that Google has made a huge improvement compared to previous versions.

At this point, I asked them to create an image of a potato omelette that would not cause controversy. I used a little trick here because Gemini Advanced doesn’t generate images in Spanish yet: I asked in English with a VPN connected to the United States. As for the results… I think it’s easy to declare the winner.

Who knows what the green thing in Gemini’s image is? And ChatGPT makes something closer to a filled cake than an omelette

Solving more complex problems

So far, I have tested them with searches, objective evaluations and image generation. I see Gemini in a better position: ChatGPT offers more data than specifics. It’s time for more complex problems.

I begin with a seemingly simple question that I asked in a previous comparison: «Multiply the number of iPhone models Apple launched in 2022 by Stephen King’s age in 2024».

Neither gets it right: in 2022, Apple launched four iPhone 14 models and one iPhone SE. The rest of the reasoning is correct; I remember that Google Bard, Gemini’s predecessor, got quite muddled with this at the time.
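For reference, the calculation the prompt asks for can be sketched as follows. The iPhone count comes from the figures above; Stephen King’s birth year (1947) is an outside fact that the article does not state, and the variable names are purely illustrative. Since his birthday falls in September, the sketch uses the age he reaches during 2024.

```javascript
// Worked version of the prompt's arithmetic (illustrative names only).
const iphone14Models = 4; // figure given in the article
const iphoneSeModels = 1; // figure given in the article
const iphoneModels2022 = iphone14Models + iphoneSeModels;

// Assumption not stated in the article: Stephen King was born in 1947.
const kingBirthYear = 1947;
const kingAgeReachedIn2024 = 2024 - kingBirthYear;

const expectedAnswer = iphoneModels2022 * kingAgeReachedIn2024;
console.log(expectedAnswer); // 385
```

Counting his age before the 2024 birthday instead (76) would give 380, so a chatbot could reasonably state either value as long as it says which age it used.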

Let’s continue with a seemingly mathematical problem that requires a certain amount of logical reasoning to solve: «If my mobile phone had no battery and I were sent a message every half hour, how many SMS messages would I have read by midnight?».

I have no more questions, your honor: Gemini 1.5 Pro wins the battle by a landslide.

Now I will ask them for code: a bookmarklet written in JavaScript for the web browser. The idea is that when the bookmarklet is clicked, the browser separates the images from the text and shows each one with a button to download it. The prompt was as follows:

Imagine I need to download images from any web page. I want you to make me a bookmarklet that parses the website’s code and opens a page (in a popup or a new tab) showing all the images in JPG, PNG or WEBP format; you can ignore all other formats. Every image must have a download button so that I can download the ones I want. If the bookmarklet can convert the image format to JPG, have it do so.

I was surprised by the excellent results from both: fully valid, working on the first try and doing exactly what I wanted. Since they extract the thumbnails, I would need to polish the code so that it loads the images at maximum resolution, but for a first attempt I have no complaints. Gemini deserves a special mention because it gave me the result much faster.
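To give an idea of what such a bookmarklet involves, here is a minimal sketch. It is not the code that Gemini or ChatGPT produced; the function names (`filterImageUrls`, `buildGalleryHtml`, `runBookmarklet`) are hypothetical, and the optional conversion to JPG is left out. It keeps only JPG, PNG and WEBP URLs, removes duplicates, and writes a simple gallery with a download link per image into a new tab.

```javascript
// Minimal sketch under stated assumptions; all names here are hypothetical.
// Matches URLs ending in .jpg/.jpeg/.png/.webp, optionally followed by ?query or #hash.
const ALLOWED_EXTENSIONS = /\.(jpe?g|png|webp)(?:[?#].*)?$/i;

// Keep only allowed formats and drop duplicate URLs, preserving order.
function filterImageUrls(urls) {
  const seen = new Set();
  return urls.filter((url) => {
    if (!ALLOWED_EXTENSIONS.test(url) || seen.has(url)) return false;
    seen.add(url);
    return true;
  });
}

// Escape characters that would break out of an HTML attribute.
function escapeHtml(text) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
  return text.replace(/[&<>"]/g, (ch) => entities[ch]);
}

// Build the gallery page: one image plus one download link per URL.
function buildGalleryHtml(urls) {
  const items = urls
    .map((url) => {
      const safe = escapeHtml(url);
      return `<figure><img src="${safe}" style="max-width:300px"><br>` +
        `<a href="${safe}" download>Download</a></figure>`;
    })
    .join("");
  return `<!doctype html><title>Images</title>${items}`;
}

// Browser-only entry point: collect the page's images and open the gallery.
function runBookmarklet() {
  const urls = Array.from(document.images, (img) => img.currentSrc || img.src);
  const tab = window.open("", "_blank");
  if (!tab) return; // the popup was blocked
  tab.document.write(buildGalleryHtml(filterImageUrls(urls)));
  tab.document.close();
}

// Only run when a DOM is available (that is, inside a browser page).
if (typeof document !== "undefined") runBookmarklet();
```

To use it as a bookmarklet, the code would be wrapped in an immediately invoked function and saved as a bookmark whose address starts with `javascript:`. Two caveats: `document.images` returns whatever source the page has loaded, which is why thumbnails show up instead of full-resolution files, and browsers may ignore the `download` attribute for images hosted on another domain, opening them in the tab instead.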

Results of running the bookmarklet in a desktop browser. Left, Gemini; right, ChatGPT

Verdict: the improvement Google has made to Gemini 1.5 Pro shows

I’ve been using Gemini (formerly Google Bard) and ChatGPT since their inception, through the different models and all the updates, so my assessment of both is based on experience. It seems to me that OpenAI has greatly increased speed with GPT-4o without improving the reasoning or the subjective interpretation in its answers. Google has done the exact opposite: with the newly introduced revision of Gemini 1.5 Pro, you can see how much it has polished every aspect of interpretation and response.

Both are very fast and efficient, effective at most tasks and, let’s not forget, prone to errors: you should never blindly trust what they say. This must be engraved in stone.

Imagen 2 in Gemini (left) and DALL-E 3 in ChatGPT (right)

To finish, I gave them the text of this article in PDF format to check their document analysis (neither of them had the slightest problem). I asked each of them for an «epic, illustration-style image that can be used as a post cover» and chose the winner from those obtained. As a curiosity, for Gemini to do this I had to translate the article, connect to a United States VPN, and request the image in English.

Cover image | Gemini

Source: Xataka Android

John Wilkes is a seasoned journalist and author at Div Bracket. He specializes in covering trending news across a wide range of topics, from politics to entertainment and everything in between.