Goodbye "Prompt Engineers"?
How to Generate The Perfect Image With Your AI Team
Designer Input
Welcome back to Designer Input,
This Week’s Topics:
SDXL + GPT4 Vision: Idea-to-Image
Midjourney Updates: Mobile App + 4k Upscale
3 AI Tools for Designer2.0’s
AI-Video of the Week: Animated Logo
Stories Worth Reading: Claude AI is now available in 95 countries.
Main Insight
Idea2Image = SDXL + GPT4 Vision
Idea2Img framework
“Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation”
This week, the Microsoft Azure research team published a new paper introducing Idea2Img.
The Idea2Img framework generates images from the initial idea prompt using Stable Diffusion XL.
GPT Vision then carefully reviews each image to ensure it matches the original idea.
Idea-to-Image Process:
Human Idea Input
First draft images generated with SDXL
GPT Vision checks the images and picks the best draft
GPT Vision crafts an improved prompt
New image generated with SDXL
This iterative process continues until the image precisely matches the user's idea.
In other words, the loop continues until GPT Vision gives the generated image a perfect 10/10 score.
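Since the official code is not out yet, here is a rough Python sketch of the loop described above, just to make the idea concrete. It is not the authors' implementation. The names `idea_to_image`, `Review`, `fake_generate`, and `fake_review` are hypothetical, the two "fake" functions are toy stand-ins for SDXL and GPT Vision, and the `max_rounds` cap is an added assumption so the loop always ends even if a 10/10 never arrives.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    """What the reviewer hands back each round (hypothetical structure)."""
    best_index: int      # which draft the reviewer prefers
    score: int           # 0-10 rating of that draft against the idea
    revised_prompt: str  # improved prompt for the next round


def idea_to_image(
    idea: str,
    generate: Callable[[str], List[str]],
    review: Callable[[str, str, List[str]], Review],
    max_rounds: int = 5,     # assumption: safety cap, not from the article
    target_score: int = 10,
) -> Tuple[str, int]:
    """Generate drafts, review them, refine the prompt, repeat."""
    prompt = idea
    best_image, best_score = "", -1
    for _ in range(max_rounds):
        drafts = generate(prompt)              # draft images for this prompt
        result = review(idea, prompt, drafts)  # pick the best draft and rate it
        if result.score > best_score:
            best_image, best_score = drafts[result.best_index], result.score
        if result.score >= target_score:       # stop at a perfect score
            break
        prompt = result.revised_prompt         # otherwise try the improved prompt
    return best_image, best_score


# Toy stand-ins so the sketch runs without any model. Images are just strings.
def fake_generate(prompt: str) -> List[str]:
    return [f"{prompt} [draft {i}]" for i in range(3)]


def fake_review(idea: str, prompt: str, drafts: List[str]) -> Review:
    # Pretend every added "detailed" makes the image two points better.
    score = min(10, 6 + 2 * prompt.count("detailed"))
    return Review(best_index=0, score=score, revised_prompt=prompt + ", detailed")


image, score = idea_to_image("a red bike", fake_generate, fake_review)
print(image, score)
```

In a real setup, `generate` would call an image model and `review` would send the drafts plus the original idea to a vision-capable language model. Keeping both as plain function arguments means either one can be swapped out without touching the loop.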
With Idea2Img, users can provide text, a base image to maintain key elements like a person or item, or a reference image to copy the visual style.
While the code has not yet been released and there is no public demo, one of the authors, Zhengyuan Yang, stated on the HuggingFace forum that they are preparing the code and plan to release it soon.
Idea2Img is another step toward creating fully automated AI agents that can communicate and focus on different parts of a workflow.
With this framework, users only need to provide an idea or a goal. The team of AI agents (in this case SDXL + GPT Vision) can then figure out the steps, perform quality checks, and iterate until it produces a satisfactory result.