This new Google AI tool lets you easily generate images from other photos – no prompt required
Composing the right prompt and description to create an AI-generated image can be challenging. Often, the resulting image misses the mark, forcing you to tweak your prompt repeatedly until you get the right result. Now, a new tool from Google aims to simplify the process by allowing you to create an image based on other images.
Initially available in the US, Whisk is the latest Google Labs experiment freely accessible to anyone with a Google account. It’s powered by Google’s Gemini AI and offers several ways to create an image from other images.
How to use Whisk to create images
To get started, sign in to the Whisk home page with your Google account. Choose one of three templates for generating your image. You can select a sticker, which creates a flat image similar to those found in messaging apps. An enamel pin adds a bit more depth to the image, while a plushie results in a three-dimensional image.
Whisk selects a style image automatically based on the template you choose. Next, pick the image you want to use for the subject. You can either select one of the images provided on the page or upload your own. Gemini analyzes the images for style and subject, then combines them to generate a new image. If you don’t like the result, swap in a different subject image and generate again.
The template flow is quick, but you can get more creative. For more control, select the option to start from scratch. Here, you can supply the subject, scene, and style yourself, either by uploading an image for each or by writing a traditional text prompt. If you’re unsure where to start, you can ask Whisk for inspiration, and it will generate a series of images for you.
Once you’re ready, tell Whisk to generate a new image based on the combined selections. In response, Whisk displays multiple images based on the mix. You can refine the results by adding or removing source images or editing the prompt.
All the images you generate are automatically saved to your Whisk library. From there, you can delete any unwanted images and download the ones you like. Downloads are saved as JPG files, allowing you to use them with other apps and services.
How does Google pull off this type of image generation?
Rather than duplicate your source images to create new ones, Whisk pulls out a few key elements.
“Behind the scenes, the Gemini model automatically writes a detailed caption of your images. It then feeds those descriptions into Google’s latest image generation model, Imagen 3,” Thomas Iljic, director of product management for Google, wrote in a blog post published Monday. “This process captures your subject’s essence, not an exact replica. That way, you can easily remix your subjects, scenes, and styles in novel ways.”
As a result, the generated images of a person may have a different height, weight, hairstyle, or skin tone than the original. Google also allows you to edit the underlying prompt if you want to guide the results in a specific direction.
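For readers who want a concrete picture of that caption-then-generate flow, here is a minimal Python sketch. It is illustrative only and is not Google's code: the `WhiskInputs` class, the `build_prompt` function, and the phrasing used to join the captions are all assumptions made for this example. The captions are hard-coded where Whisk would have Gemini write them, and the final call to an image model is left out because Google has not described it in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WhiskInputs:
    """Text captions for the three roles the article describes.

    Hypothetical structure: in Whisk, Gemini would write these captions
    from the uploaded images. Here they are supplied by hand.
    """
    subject: str
    scene: Optional[str] = None
    style: Optional[str] = None


def build_prompt(inputs: WhiskInputs, user_edit: Optional[str] = None) -> str:
    """Merge the per-image captions into a single text prompt.

    If the user has edited the prompt, that text replaces the merged
    captions (an assumption about how prompt editing might behave).
    """
    if user_edit:
        return user_edit.strip()
    parts = [inputs.subject]
    if inputs.scene:
        parts.append(f"set in {inputs.scene}")
    if inputs.style:
        parts.append(f"in the style of {inputs.style}")
    return ", ".join(parts)


inputs = WhiskInputs(
    subject="a corgi wearing a red scarf",
    scene="a snowy forest",
    style="an enamel pin",
)
print(build_prompt(inputs))
# prints: a corgi wearing a red scarf, set in a snowy forest, in the style of an enamel pin
```

Only the text reaches the image model in this sketch, which is why details a caption leaves out, such as exact height or hairstyle, can come back different in the result.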
“In our early testing with artists and creatives, people have been describing Whisk as a new type of creative tool — not a traditional image editor,” Iljic added. “We built it for rapid visual exploration, not pixel-perfect edits. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love.”