Midjourney turns the tables: from image to prompt

Image-generating AI models such as Midjourney, DALL-E 2, and Stable Diffusion have become extremely popular recently. Although they have been overshadowed in recent months by their chatbot "siblings", image generation has already become a daily tool for many professionals, and improvements to the models (it seems that DALL-E 2 has finally learned to draw human hands) mean these AIs produce increasingly reliable results that come closer to what each prompt asks for.

There is a problem at this point, though: writing a prompt is not as easy as it might seem at first glance. These models are trained to understand natural language, but even the best developments in this area have their limitations. This has led to the rise of websites where we can find prompts written by third parties, and also of commercial services where we describe the image we want and an expert in the field turns that description into the most suitable prompt. Yes, you read that right: an expert in "talking" to AI models, in what is called prompt engineering (I find this role very useful, but elevating it to the level of engineering seems a bit presumptuous, to be honest).

As I said, services like Midjourney fit a phrase that is very common in game descriptions and that you've probably read more than once: easy to play, hard to master. Those in charge are clearly very aware of this and are constantly trying to improve it... unfortunately for those who have found room for professional development in it.

🚨 New Midjourney feature 🚨

/describe

1️⃣Upload a picture
2️⃣ Get back 4 suggested text prompts
3️⃣ Try to generate new images based on the prompts

This is super powerful 💪 🔥

🧵 Quick thread on how to use it pic.twitter.com/f1e9xaC78y

— Linus (●ᴗ●) (@LinusEkenstam) April 3, 2023

The latest news in this regard was spotted by AI expert Linus Ekenstam, and it represents a giant leap in that direction: we can now upload an image to Midjourney and the model will generate several prompts that would allow us to create such an image. The potential of this new feature is huge, as it gives us the best tool yet for learning how to write prompts that match the logic Midjourney uses to create images.

On the other hand, it opens the door to something I've been thinking about for a while, and which may be even more revolutionary. You may recall that one of the main innovations of GPT-4 is that it is multimodal, that is, it allows a combination of text and images in its input prompts. Now that Midjourney has shown that it supports images as an input mode (although for now only to generate text output), the potential of combining images and text in a single prompt, and getting either text or an image as output, is spectacular.

Not everything is perfect, that's for sure. Image prompts can pose a huge threat to copyrighted content, an issue that already has many people up in arms. It may be possible to train models like Midjourney to recognize popular copyrighted content (from Mickey Mouse to the Coca-Cola logo) in image prompts, but that is more difficult for less recognizable content.

Source: Muy Computer

David

Donald Salinas is an experienced automobile journalist and writer for Div Bracket. He brings his readers the latest news and developments from the world of automobiles, offering a unique and knowledgeable perspective on the latest trends and innovations in the automotive industry.