AI image generators have been delighting (and haunting) us for some years now, thanks to Imagen, Adobe Firefly, OpenAI’s DALL-E 3, and more. As the technology advances, we keep getting more options to refine our results. Now, Google Labs has released Whisk, a tool that lets you upload images as guidance instead of relying only on text prompts.
Google Labs’ Whisk Generates Images—From Other Images
If you live in the US, you’ll now have access to Whisk from Google Labs, an “experiment in generative AI,” per Google’s blog. With Whisk, instead of relying solely on descriptive text prompts, you can add images as references. The platform will ask for three key characteristics: subject, scene, and style. The tool will then mix, or “whisk,” those ingredients together and generate a new image for you.
Whisk uses Imagen 3, Google’s latest image generation model.
Google hasn’t done away with text prompts altogether with Whisk. You still have the option to write generation prompts for each of the three categories, or add a general note. You can also refine an image after you see Whisk’s first attempt. For instance, let’s say you generate a vintage-style holiday card of a cat lying in the snow. Upon seeing the generation, you might become inspired to add snowflakes as a finishing touch.
Every time you add or generate an image in any of Whisk’s three categories, the platform will do the work of creating a detailed written description of it. Hence, if you want to add to or edit an existing image, you can just customize the text.
Lastly, if you’re feeling uninspired, you can randomize your visual components by selecting a die icon. For more complex generations, you can also add more than one subject, scene, or style reference.
Once you’re happy with your masterpiece, you can either save it on the platform or download it for local access.
Worth the Whisk?
With all the advanced AI image generation options out there to enhance your photos or produce “original” art, Google’s new tool might just seem like a gimmick. But the way that Whisk leverages visual references in its image generation is unique, and I can see how it’s valuable in creative and professional scenarios.
Say you’re working on a pitch deck and need images that look similar to a reference you already have. Instead of trying to reverse engineer that reference in words, you can simply upload the file, along with a brief text description of how you’d like your new image to differ.
To differentiate Whisk from other AI image software out there, Google has stressed that the platform is meant for exploration, not finesse. While other products might be a better fit for fine-tuned edits, Whisk is best for brainstorming:
“We built it for rapid visual exploration, not pixel-perfect edits. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love.”
Let’s be honest: sometimes, it’s hard to find the words. Trust me, I constantly find myself grasping at invisible straws in an effort to find the right descriptor. To me, this gives Whisk some serious potential for all those times when it’s easier just to say, “I want an image that looks like this one.”