
New lightweight version of Google’s Gemini 1.5 Flash 8B on the market

  • October 4, 2024


The AI model Gemini 1.5 Flash has been released in a new version. Gemini 1.5 Flash-8B is more compact, faster and cheaper than its predecessor.

Gemini 1.5 Flash-8B is now available and is considered one of the most affordable lightweight LLMs on the market. The model is optimized for speed and efficiency and is primarily designed to run on devices such as smartphones and sensors, whose hardware cannot sustain heavy AI workloads.

Good performance for a compact model

Nevertheless, this lighter version is in no way inferior to its predecessors; according to benchmark tests, it even delivers comparable performance in some areas. According to SiliconANGLE, tasks such as chat, transcription and contextually accurate translation of long texts pose no problem.

[Chart: Benchmark tests. Source: Google Blog]

Gemini 1.5 Flash was announced at Google I/O in May 2024 and made available to paying customers a few weeks later. At launch, Google reported that its context window was 60 times larger than that of OpenAI’s GPT-3.5 Turbo. Flash-8B is twice as fast, and its rate limit has also doubled to 4,000 requests per minute compared to 1.5 Flash.

In terms of price, Gemini 1.5 Flash-8B undercuts comparable models from OpenAI and Anthropic PBC. Flash-8B costs $0.15 per million output tokens and only $0.01 per million cached (reused) input tokens. At OpenAI, GPT-4o mini is the cheapest model at $0.15 per million input tokens, which halves if you reuse prompt prefixes or work with batches. Anthropic’s Claude 3 Haiku, on the other hand, charges $0.25 per million input tokens and $0.03 per million cached input tokens.
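To make the comparison concrete, here is a minimal sketch that turns the per-million-token prices quoted above into dollar costs for a given workload. The model keys and the price table are illustrative constants taken from the figures in this article, not an official API; always verify current pricing with each provider before relying on these numbers.

```python
# Prices in USD per 1M tokens, as quoted in the article (assumed figures).
# "input_cached" covers reused/cached input tokens where the provider offers it.
PRICES = {
    "gemini-1.5-flash-8b": {"output": 0.15, "input_cached": 0.01},
    "gpt-4o-mini": {"input": 0.15, "input_cached": 0.075},  # halved via caching/batch
    "claude-3-haiku": {"input": 0.25, "input_cached": 0.03},
}

def cost_usd(model: str, kind: str, tokens: int) -> float:
    """Cost in USD for `tokens` tokens of a given kind ('input', 'input_cached', 'output')."""
    return PRICES[model][kind] * tokens / 1_000_000

# Example: 10 million cached input tokens on Flash-8B
print(cost_usd("gemini-1.5-flash-8b", "input_cached", 10_000_000))  # 0.1
```

At the quoted rates, 10 million cached input tokens cost $0.10 on Flash-8B versus $0.30 on Claude 3 Haiku, which is where the article’s “most affordable” claim comes from.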

Source: IT Daily
