DeepMind's AI lab has developed a family of Flamingo models that achieve more while requiring less costly and time-consuming training.
The model is designed to take interleaved text and image input and produce a text-only response.
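As an illustration of that interface, a few-shot prompt interleaves example images with captions and asks the model to continue with text. The snippet below is a purely hypothetical sketch of the prompt structure; the placeholder names and caption task are illustrative assumptions, not an actual Flamingo API.

```python
# Hypothetical few-shot prompt structure: interleaved images and text go in,
# and the model continues with text only. The string placeholders stand in
# for real image inputs.
example_image_1 = "<image: flamingo photo>"      # would be pixel data in practice
example_image_2 = "<image: panda photo>"
query_image = "<image: new photo to describe>"

prompt = [
    example_image_1, "Output: a flamingo standing in shallow water.",
    example_image_2, "Output: two pandas eating bamboo.",
    query_image, "Output:",  # the model completes this element with text
]
print(prompt)
```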
Flamingo was trained on a custom dataset created for multimodal machine learning research, consisting of 185 million images and 182 GB of text collected from the public Internet.
One of Flamingo's components is the pre-trained Chinchilla language model with 70 billion parameters. DeepMind "merged" it with visual learning elements, adding new architectural components in between that keep the pretrained weights isolated and frozen. The resulting Flamingo VLM has 80 billion parameters.
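A minimal PyTorch sketch of this idea follows: a frozen pretrained language-model block is wrapped with a new trainable cross-attention layer that lets text tokens attend to visual features. The class names, dimensions, and the zero-initialized tanh gate are illustrative assumptions based on the description above, not DeepMind's published implementation.

```python
import torch
import torch.nn as nn


class GatedCrossAttention(nn.Module):
    """Trainable layer inserted between frozen LM blocks (hypothetical sketch).

    Text tokens attend to visual features; a tanh gate initialized at zero
    means the model starts out behaving exactly like the unchanged frozen LM
    and learns to mix in visual information gradually.
    """

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))  # gate starts closed

    def forward(self, text: torch.Tensor, visual: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(query=text, key=visual, value=visual)
        return text + torch.tanh(self.gate) * attended  # gated residual


class FlamingoStyleBlock(nn.Module):
    """Wraps one frozen LM block with a new trainable cross-attention layer."""

    def __init__(self, lm_block: nn.Module, dim: int):
        super().__init__()
        self.xattn = GatedCrossAttention(dim)  # newly added, trainable
        self.lm_block = lm_block               # pretrained, frozen
        for p in self.lm_block.parameters():
            p.requires_grad = False            # keep pretrained weights isolated

    def forward(self, text: torch.Tensor, visual: torch.Tensor) -> torch.Tensor:
        text = self.xattn(text, visual)
        return self.lm_block(text)


# Toy usage: a generic transformer layer stands in for one LM block.
dim = 64
block = FlamingoStyleBlock(nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), dim)
text = torch.randn(2, 10, dim)    # (batch, text tokens, dim)
visual = torch.randn(2, 5, dim)   # (batch, visual tokens, dim)
out = block(text, visual)         # shape: (2, 10, 64)
```

Because the gate starts at zero, only the new connective layers receive gradient updates, which is one way such a design could keep training cheaper than retraining the full model.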
"A single Flamingo model can achieve state-of-the-art results on a wide variety of tasks, competing with approaches that require task-specific fine-tuning on far more examples," the developers said.
According to the organization, Flamingo outperforms previous few-shot learning approaches. The model also proved more efficient than fine-tuned algorithms that use far more task-specific data.
Going forward, Flamingo could reduce the amount of energy consumed in AI training and the need for high-performance hardware. However, the company did not reveal the details of how it obtained these results.
The developers stressed that Flamingo can be quickly adapted to resource-constrained settings and low-resource tasks, such as assessing AI models for bias.
Recall that in April DeepMind introduced the Chinchilla language model with 70 billion parameters.
In February, the British AI lab demonstrated its AlphaCode tool, which can write code on its own.
In December 2021, DeepMind unveiled Gopher, a massive language model containing 280 billion parameters.
Source: ForkLog