AI art prompts, the craft
2026 state · 4 min
AI image generation in 2026 is in a strange place. The models are wildly more capable than they were three years ago — Midjourney v8, Sora's image side, Ideogram 4, ChatGPT 5.5's image-1.5 — but the craft of prompting them has moved in the opposite direction from the one you'd expect. The old paradigm was longer prompts: load every modifier you could think of, weight them, hope. The new paradigm is shorter prompts and more decisive style references. Understanding why is most of what's worth knowing.
Four moves that still work
Lead with the subject, not the style. "A pixel-art moka pot at sunset, El Segundo refinery in the background" beats "soft sand pastel illustration of a pixel-art moka pot at sunset with mid-century modern color palette." Models in 2026 lock onto the subject earlier in the token sequence; style words at the front compete with the subject for attention.
Constraint first, modifier second. If you need a specific aspect ratio, color palette, or lighting condition, name it as a constraint at the end of the prompt: 3:2 landscape, warm sand and ocean blue palette, late-afternoon sunlight. Don't mix constraints into the descriptive flow — the parsers in modern image models treat trailing constraint phrases as harder rules than embedded ones.
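The first two moves can be sketched as a tiny prompt builder. This is illustrative structure only — the function and its argument names are my own, not any model's API; the point is simply that the subject leads and constraints trail.

```python
def build_prompt(subject: str, modifiers: list[str], constraints: list[str]) -> str:
    """Lead with the subject, keep style modifiers in the middle,
    and append constraints as a trailing clause the parser can treat
    as hard rules."""
    prompt = ", ".join([subject, *modifiers])
    if constraints:
        prompt += ". " + ", ".join(constraints)
    return prompt

prompt = build_prompt(
    subject="A pixel-art moka pot at sunset, El Segundo refinery in the background",
    modifiers=["soft sand pastel palette"],
    constraints=["3:2 landscape", "late-afternoon sunlight"],
)
```

The payoff is that style words never crowd the front of the token sequence, and constraints always land at the end where modern parsers weight them hardest.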
One named reference is worth ten adjectives. "in the style of David Hockney's California pool paintings" gives the model more to work with than fifteen adjectives describing Hockney's palette. Same with directors (Wes Anderson, Carlos Reygadas), photographers (Saul Leiter, William Eggleston), and illustrators. Avoid living artists who have publicly opted out of training; the request is more honest if you reach for the dead and the public-domain.
Iterate with the model, not at it. First-prompt outputs in 2026 are usually 60% of where you want to land. The last 40% is asking for a specific change against the previous image: "the same scene, but shift the light to early morning," "same composition, less saturation in the sky." Don't keep rewriting the prompt from scratch. The good move is conversational.
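The conversational loop looks like this in sketch form. `generate` is a stand-in for whatever client you use — every model's actual API differs; what matters is that each step sends a short delta against the previous output instead of a rewritten prompt.

```python
def refine(generate, first_prompt: str, deltas: list[str]):
    """Run the first prompt, then feed each short change request
    back with the previous image as the reference."""
    image = generate(prompt=first_prompt, reference=None)
    for delta in deltas:
        # "same composition, less saturation in the sky" — a delta,
        # not a from-scratch rewrite
        image = generate(prompt=delta, reference=image)
    return image
```

The 60/40 split in practice: one generate call for the first 60%, two or three small deltas for the rest.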
When to use which model
- Midjourney v8 · atmospheric, painterly, cinematic. Best for moods.
- ChatGPT 5.5 (image-1.5) · object consistency, character coherence across panels, "draw me this exact thing." Best for products.
- Ideogram 4 · text inside images. Posters, packaging, signage. The only model that gets typography right consistently.
- Sora image · physics, motion-implied stills, dynamic compositions. Best for action.
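If you route prompts programmatically, the table above collapses to a lookup. The task keys here are my own shorthand; the model names follow the list.

```python
# Model routing per the list above; keys are illustrative shorthand.
MODEL_FOR_TASK = {
    "mood": "Midjourney v8",          # atmospheric, painterly, cinematic
    "product": "ChatGPT 5.5 (image-1.5)",  # object/character consistency
    "typography": "Ideogram 4",       # text inside images
    "action": "Sora image",           # physics, motion-implied stills
}

def pick_model(task: str) -> str:
    # Default to the atmospheric generalist when the task is unclear.
    return MODEL_FOR_TASK.get(task, "Midjourney v8")
```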
What to avoid
Negative prompts have mostly stopped working. The 2024 trick of listing things you didn't want ("--no blur, --no watermark") is either ignored or actively counterproductive in v8 and later. Just describe what you do want. Trust the model to omit the things you didn't ask for. This was a hard habit to drop — it ran on the intuition that more specificity always helps. Modern models reward the opposite.
Also: stop using parenthetical weights. "(coffee:1.2)" was a 2023 SD1.5-era hack that was already obsolete by 2024 in MJ and never worked at all in DALL-E. The current models read prose; speak in prose.
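If you're reusing old prompts, it's worth scrubbing the legacy syntax before sending them. A hedged sketch — the regexes below cover the two patterns named above, "(word:1.2)" weights and "--no" flags, and nothing more:

```python
import re

def modernize(prompt: str) -> str:
    """Strip 2023-era hacks: parenthetical weights and --no flags."""
    # "(coffee:1.2)" -> "coffee"
    prompt = re.sub(r"\(([^():]+):\d+(?:\.\d+)?\)", r"\1", prompt)
    # drop "--no blur, --no watermark" style negative flags
    prompt = re.sub(r"\s*--no\s+[^-]+", " ", prompt)
    # collapse the whitespace left behind
    return re.sub(r"\s{2,}", " ", prompt).strip()
```

What survives is plain prose, which is all the current models read anyway.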
For PointCast specifically, the lane is documented in the Manus image-gen runbook (in docs/briefs/): Midjourney for atmospheric reads-card headers, ChatGPT 5.5 for object-shaped consistency, Ideogram for anything with text. The prompts in there follow the four moves above. They take less than thirty seconds to write and produce more usable images than the prompts I wrote two years ago that took five minutes.