Google’s Whisk AI generator will ‘remix’ the pictures you plug in

techhig, December 16, 2024

Google has announced a new AI tool called Whisk that lets you generate images using other images as prompts instead of requiring a long text prompt.

With Whisk, you can offer images to suggest what you’d like as the subject, the scene, and the style of your AI-generated image, and you can prompt Whisk with multiple images for each of those three things. (If you want, you can fill in text prompts, too.) If you don’t have images on hand, you can click a dice icon to have Google fill in some images for the prompts (though those images also appear to be AI-generated). You can also enter some text into a text box at the end of the process if you want to add extra detail about the image you’re looking for, but it’s not required.

Whisk will then generate images and a text prompt for each image. You can favorite or download an image if you’re happy with the results, or you can refine it by entering more text into the text box or by clicking the image and editing its text prompt.

A screenshot of Whisk. I clicked the dice to generate a subject, scene, and style. I swapped out the auto-generated scene by entering a text prompt. Whisk created the first two images, which I iterated on by asking Whisk to add some steam around the subject (because it’s a fire being in water), resulting in the next two images.

Screenshot by Jay Peters / The Verge

In a blog post, Google stresses that Whisk is designed to be for “rapid visual exploration, not pixel-perfect edits.” The company also says that Whisk may “miss the mark,” which is why it lets you edit the underlying prompts.

In the few minutes I’ve used the tool while writing this story, it’s been entertaining to tinker with. Images take a few seconds to generate, which is annoying, and while the images have been a little strange, everything I’ve generated has been fun to iterate on.

Google says Whisk uses the “latest” iteration of its Imagen 3 image generation model, which it announced today. Google also introduced Veo 2, the next version of its video generation model, which the company says has an understanding of “the unique language of cinematography” and hallucinates things like extra fingers “less frequently” than other models (one of those other models might be OpenAI’s Sora). Veo 2 is coming first to Google’s VideoFX, which you can join the Google Labs waitlist for, and it will be expanded to YouTube Shorts and “other products” sometime next year.