
GPT Image 2 Prompt Writing Guide: 7 Rules for 90% Hit Rate
A practical GPT Image 2 prompt writing guide from 200+ generations. The 7 rules, structure, keywords, and anti-patterns for one-shot success.
If you've tried GPT Image 2 and felt like it ignores half your prompt, the issue is almost never the model; it's the way the prompt is written. Across 200+ generations compared in a hit-rate matrix, the same 7 rules account for the difference between "first-try success" and "five retries until I gave up."
This is a practical GPT Image 2 prompt writing guide. Every rule below is something you can apply to your next prompt in 30 seconds.
Why most GPT Image 2 prompts fail
Three patterns cause about 80% of prompt failures:
- Treating GPT Image 2 like Stable Diffusion: stuffing the prompt with `masterpiece, 8k, ultra detailed, high quality` keyword soup. These tokens are noise to GPT Image 2.
- Writing unstructured run-on sentences: one long English or Chinese sentence with everything jumbled together. GPT Image 2 responds to structure, so a jumbled prompt gives a jumbled image.
- Forgetting to quote text content: saying `the headline says limited offer` is far less reliable than saying `the headline says "Limited Offer"`. The quotes change everything.
If you fix only those three, your hit rate doubles. Below are the 7 rules in detail.
Rule 1: Structure your prompt — subject, scene, style, text, camera
A reliable GPT Image 2 prompt has 5 ordered components:
| Component | What goes here | Example |
|---|---|---|
| Subject | The main object or character | a white stainless steel water bottle |
| Scene | Background and environment | on a beige linen tablecloth, soft indoor light |
| Style | Visual mood and reference | editorial product photography, premium feel |
| Text | All on-image text in quotes | top-left red badge: "50% off" |
| Camera | Lens, angle, lighting | 45-degree side light, shallow depth of field |
Stitch them together with commas. A complete prompt looks like:
A white stainless steel water bottle, on a beige linen tablecloth,
soft indoor light, editorial product photography, premium feel,
top-left red badge "50% off", bottom black bold text
"Daily Commute Companion", 45-degree side light, shallow depth of field.
This structure works because GPT Image 2 is a language model: it follows narrative order. Random order = random output.
Rule 2: Quote every piece of on-image text
This is the single highest-leverage rule. The difference between:
❌ the headline says limited offer
✅ the headline reads "Limited Offer"
is a gap of 30-40 percentage points in text-rendering hit rate. Why? The quotes tell the model "this exact string is what you render," instead of "describe the concept of a limited offer."
The same applies to non-Latin text (here, Chinese for: the headline reads "Limited-time 50% off"):
❌ 标题写限时五折
✅ 标题写 "限时五折"
When you have multiple text elements:
Headline at top reads "2026 Spring Collection",
subhead reads "30% Off Sitewide",
bottom-left small text reads "Code: SPRING30",
right-side vertical text reads "Limited Time".
Each piece quoted, each location specified.
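When the on-image strings come from a spreadsheet or a CMS, it is easy to forget the quotes. A minimal sketch of a quoting helper follows; `text_clause` is a hypothetical name, and swapping inner double quotes for single quotes is an assumption made here to keep the outer quotes unambiguous, not documented model behavior.

```python
def text_clause(location: str, content: str) -> str:
    """Return a clause of the form: <location> reads "<exact string>"."""
    # Assumption: inner double quotes are replaced so the outer pair stays unambiguous.
    safe = content.replace('"', "'")
    return f'{location} reads "{safe}"'


elements = [
    ("Headline at top", "2026 Spring Collection"),
    ("subhead", "30% Off Sitewide"),
    ("bottom-left small text", "Code: SPRING30"),
    ("right-side vertical text", "Limited Time"),
]
text_block = ",\n".join(text_clause(loc, s) for loc, s in elements) + "."
print(text_block)
```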
Rule 3: Specify location for every element
GPT Image 2 understands spatial language well, but only if you supply it.
Vague: a logo and some text on the image
Precise: a circular logo in the top-left corner, three lines of text in the bottom-right corner
Spatial vocabulary that works reliably:
- `top-left` / `top-right` / `top-center` / `bottom-left` / `bottom-right` / `bottom-center`
- `centered` / `vertically centered` / `horizontally centered`
- `foreground` / `midground` / `background`
- `above the headline` / `below the subhead` / `next to the icon`
When you have 3+ elements, every element gets a location. No exceptions.
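The "every element gets a location" rule is easy to check mechanically before you submit. The sketch below is a rough heuristic, not a parser: `SPATIAL_TERMS` and `missing_location` are hypothetical names, and the term list is just the spatial vocabulary from this rule.

```python
SPATIAL_TERMS = (
    "top-left", "top-right", "top-center",
    "bottom-left", "bottom-right", "bottom-center",
    "centered", "foreground", "midground", "background",
    "above", "below", "next to",
)


def missing_location(elements: list[str]) -> list[str]:
    """Return the element descriptions that contain none of the spatial terms."""
    return [
        e for e in elements
        if not any(term in e.lower() for term in SPATIAL_TERMS)
    ]


elements = [
    "a circular logo in the top-left corner",
    "three lines of text in the bottom-right corner",
    "a small QR code",
]
print(missing_location(elements))
```

Anything the function returns is an element you described without saying where it goes.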
Rule 4: Constrain the negative — say what you DON'T want
Diffusion models had explicit "negative prompt" fields. GPT Image 2 doesn't, but it understands plain-language constraints:
... no text on the bottle itself,
no shadows on the background,
no other objects in frame,
no watermark.
Negative constraints are especially useful for:
- Removing watermarks (`no watermark, no logo overlay`)
- Cleaning busy backgrounds (`solid plain background, no decorations`)
- Avoiding extra hands or fingers (`hands clearly visible, anatomically correct`)
- Preventing over-decoration (`minimalist, no extra ornaments`)
About 1 in 5 retries can be eliminated by spending 10 seconds writing what you don't want.
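Since there is no separate negative-prompt field, the constraints simply go at the end of the prompt in plain language. A minimal sketch, assuming you keep your exclusions in a list; `with_constraints` is a hypothetical helper that only edits the string.

```python
def with_constraints(prompt: str, unwanted: list[str]) -> str:
    """Append plain-language 'no ...' constraints to the end of a prompt."""
    body = prompt.strip().rstrip(".,")
    negatives = ", ".join(f"no {item}" for item in unwanted)
    return f"{body}, {negatives}." if negatives else f"{body}."


print(with_constraints(
    "A white water bottle on a beige linen tablecloth.",
    ["text on the bottle itself", "other objects in frame", "watermark"],
))
```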
Rule 5: Anchor the style with a reference, not adjectives
"Beautiful," "stunning," and "amazing" tell the model nothing. Anchored references tell it everything.
Weak: a beautiful illustration of a girl
Strong: a Studio Ghibli style illustration of a girl, soft watercolor textures, warm color palette
High-leverage style anchors:
| Category | Anchor examples |
|---|---|
| Illustration | Studio Ghibli, Pixar, Cartoon Network 2010s, Adventure Time, Genshin Impact |
| Photography | Wes Anderson, Annie Leibovitz, National Geographic, Vogue editorial, Kodak Portra 400 |
| Painting | Monet impressionism, Van Gogh post-impressionism, Hopper realism, ukiyo-e |
| Modern | Y2K aesthetic, vaporwave, brutalist design, Memphis pattern, Bauhaus |
| Cinematic | Wong Kar-wai, Christopher Nolan, A24 film palette, Blade Runner 2049 |
The model knows these references. Use them.
Rule 6: Lock the camera and lighting in real photography terms
For photo-realistic outputs, the difference between amateur and pro is camera vocabulary.
Beginner: a realistic photo of a coffee cup on a desk
Pro:
A coffee cup on a wooden desk, shot on Sony A7R IV, 35mm f/2.8 lens,
shallow depth of field, soft natural window light from the left,
golden hour color temperature, slight film grain.
Camera terms that demonstrably improve realism:
- Lens: `35mm`, `50mm`, `85mm portrait lens`, `wide-angle 24mm`, `macro 100mm`
- Aperture: `f/1.4`, `f/2.8`, `shallow depth of field`, `deep focus`
- Body: `Sony A7R IV`, `Canon EOS R5`, `Leica M11`, `Hasselblad medium format`
- Light: `golden hour`, `blue hour`, `softbox studio lighting`, `Rembrandt lighting`, `rim light`
- Film: `Kodak Portra 400`, `Fujifilm Velvia`, `Ilford HP5 black and white`
These are not flowery — they are technical instructions the model knows how to interpret.
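If you reuse the same camera setup across a product line, it can help to keep it as data and render the clause on demand. The sketch below is one possible shape; `camera_clause` and its parameter names are hypothetical, and the wording of the output simply mirrors the Pro example above.

```python
from typing import Optional


def camera_clause(
    body: Optional[str] = None,
    lens: Optional[str] = None,
    aperture: Optional[str] = None,
    light: Optional[str] = None,
    film: Optional[str] = None,
) -> str:
    """Build a comma-separated camera and lighting clause from whichever fields are set."""
    parts = []
    if body:
        parts.append(f"shot on {body}")
    # Lens and aperture read most naturally as one phrase, e.g. "35mm f/2.8 lens".
    optics = " ".join(x for x in (lens, aperture) if x)
    if optics:
        parts.append(f"{optics} lens")
    if light:
        parts.append(light)
    if film:
        parts.append(film)
    return ", ".join(parts)


print(camera_clause(
    body="Sony A7R IV",
    lens="35mm",
    aperture="f/2.8",
    light="soft natural window light from the left",
))
```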
Rule 7: Iterate with directed edits, not full regenerations
This is where most users waste 70% of their API budget.
Bad workflow:
Generate → not perfect → tweak prompt → regenerate from scratch → composition
changes → cry → repeat 5 times.
Good workflow:
Generate → not perfect → "in this image, change [X] to [Y],
keep everything else identical" → done.
GPT Image 2 supports multi-turn directed editing that preserves the rest of the image. This is its single biggest cost-saver.
Examples of effective directed-edit prompts:
"Change the model's jacket from navy to beige. Keep face,
background, lighting, and pose unchanged."
"Replace the headline text with 'Spring Sale'. Keep all other
text, layout, and styling identical."
"Remove the watermark in the bottom-right corner. Keep
everything else exactly the same."
The phrase "keep everything else identical" is the magic incantation. Don't skip it.
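To make sure the "keep" clause never gets dropped, you can generate edit instructions from a template. This is a minimal sketch under stated assumptions: `directed_edit` is a hypothetical helper that only builds the instruction text. How you send it, along with the previous image, depends on the client you use and is not shown here.

```python
from typing import Optional, Sequence


def directed_edit(change: str, keep: Optional[Sequence[str]] = None) -> str:
    """Build a directed-edit instruction that always ends with a 'keep' clause."""
    change = change.strip().rstrip(".")
    if not keep:
        # Fall back to the catch-all phrase when nothing specific is listed.
        return f"{change}. Keep everything else identical."
    items = list(keep)
    if len(items) <= 2:
        kept = " and ".join(items)
    else:
        kept = ", ".join(items[:-1]) + ", and " + items[-1]
    return f"{change}. Keep {kept} unchanged."


print(directed_edit(
    "Change the model's jacket from navy to beige",
    keep=["face", "background", "lighting", "pose"],
))
print(directed_edit("Remove the watermark in the bottom-right corner"))
```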
Putting it all together: a complete real-world prompt
Here's a prompt that uses all 7 rules at once. This is for an e-commerce hero image:
A white stainless steel insulated water bottle, standing upright
on a beige linen tablecloth, with soft window light from the left
at 45 degrees, premium minimalist product photography style.
Top-left red rectangular badge reads "Limited 50% Off",
top-right gold circular badge reads "24h Hot/Cold",
below the bottle bold black headline reads "Daily Commute Companion",
bottom-center small text reads "Tap to Shop".
Shot on Sony A7R IV, 50mm f/2.8 lens, shallow depth of field,
clean composition, no other objects in frame, no watermarks,
1:1 aspect ratio.
This kind of prompt typically produces a usable result on the first or second try, instead of the 5-7 retries you'd need with a vague prompt.
Common GPT Image 2 prompt anti-patterns
A short list of things to stop doing immediately:
| Anti-pattern | Why it fails | What to do instead |
|---|---|---|
| `masterpiece, 8k, ultra detailed` keyword stuffing | Noise to GPT Image 2 | Use real style anchors (Rule 5) |
| Single run-on sentence with no commas | Hard for the model to parse structure | Use the 5-component structure (Rule 1) |
| Describing text in concept (`a sale headline`) | Won't render the right words | Always quote the exact string (Rule 2) |
| Prompts in mixed languages without intention | Model gets confused on which language to render | Stay in one language for instructions, quote the target language for on-image text |
| 50-line mega-prompts | Diminishing returns past ~15 specifications | Cap at 10-15 specs, use directed edits for refinements |
| No mention of aspect ratio | Model defaults vary | Always end with an explicit aspect ratio such as 1:1, 16:9, or 9:16 |
Quick checklist before hitting Generate
Before you submit any GPT Image 2 prompt, run through:
- Does it have all 5 components (subject, scene, style, text, camera)?
- Is every piece of on-image text in quotes?
- Does every element have a specified location?
- Have I excluded what I don't want?
- Is the style anchored to a real reference?
- Are camera and lighting specified (for photo)?
- Is the aspect ratio at the end?
If all 7 boxes are checked, your hit rate jumps to ~90%.
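Part of this checklist can be automated as a pre-flight lint. The sketch below covers only the items that are easy to detect with string matching (quoted text, a negative constraint, an aspect ratio, keyword soup). `lint_prompt` and its warning messages are hypothetical, and the checks are rough heuristics: a prompt that intentionally has no on-image text will still trigger the quote warning.

```python
import re

KEYWORD_SOUP = ("masterpiece", "8k", "ultra detailed", "high quality")
ASPECT_RATIO = re.compile(r"\b\d{1,2}:\d{1,2}\b")


def lint_prompt(prompt: str) -> list[str]:
    """Return a list of checklist warnings; an empty list means nothing was flagged."""
    warnings = []
    lowered = prompt.lower()
    if '"' not in prompt:
        warnings.append("no quoted on-image text")
    if not re.search(r"\bno \w+", lowered):
        warnings.append("no negative constraint")
    if not ASPECT_RATIO.search(prompt):
        warnings.append("no aspect ratio")
    soup = [k for k in KEYWORD_SOUP if k in lowered]
    if soup:
        warnings.append("keyword soup: " + ", ".join(soup))
    return warnings


good = 'A bottle, top-left badge reads "50% off", no watermark, 1:1 aspect ratio.'
bad = "masterpiece, 8k, a cup on a desk"
print(lint_prompt(good))
print(lint_prompt(bad))
```

The structure, location, and style-anchor checks still need a human eye; the lint only catches the omissions that are cheap to detect.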
Want to skip the writing entirely?
If you want pre-written GPT Image 2 prompts you can copy-paste directly, browse gpt-image2.art/explore — every example image has its source prompt visible, organized by use case (e-commerce, social media, character design, photography, infographics, posters).