AI Prompting 101: Natural Language vs Danbooru Tags (plus Negative Prompts)

Prompting is just communicating intent to the model. Different model families respond better to different “dialects”:

Natural language prompting (plain English descriptions)
Danbooru-style prompting (comma-separated tags, common in anime/illustration models)

This guide is short and practical. For the natural-language section, the core structure is based on SergVidocq’s beginner guide. And if you want to see the kind of tagged images many anime models learn from, you can browse the Danbooru dataset here.

Part 1 — Natural language prompting (plain English)

The basic prompt “recipe” (layered, not magical)

Build your prompt like a stack of clear layers (adapted from the referenced article):

Main subject / scene
- “a young female wizard” (better than “girl”)
Action + context (what’s happening, where)
- “sitting on a rock, reading an ancient spellbook in an abandoned castle”
Descriptive adjectives (shape, mood, quality, vibe)
- “mysterious, detailed, cinematic”
Appearance details (clothes, pose, expression, unique traits)
- “purple hooded cloak, leather outfit with ornaments, focused expression”
Background / environment
- “floating magical particles, full moon behind ruined windows”
Style / medium
- “digital illustration”, “photo”, “pencil sketch”, “oil painting”, “3D render”
Lighting + color palette (optional but powerful)
- “dim moonlight, cool blue-purple tones”, “warm golden hour”
Logic + clarity
- Avoid contradictions and ambiguity (“red cat and blue dog”, not “cat and dog, red and blue”)

Tip from the referenced guide: English usually yields more predictable results, since a lot of training data and community prompting is English-first.

“Good vs bad” examples (natural language)

Too vague

girl, castle

Clear and structured

A young female wizard sitting on a rock, reading an ancient spellbook inside an abandoned castle. Purple hooded cloak, leather outfit with ornaments, calm focused expression. Floating magical particles, moonlight through broken windows. Digital illustration, high detail, cool blue-purple tones, soft rim lighting.

Small habits that improve results fast

Be specific about the subject (age, role, species, object type)
Put the “story” in one sentence (subject + action + place), then add details
Declare style explicitly if you care about it
Use lighting words when the mood matters (“candlelit”, “neon”, “overcast”, “studio lighting”)
Remove contradictions before adding more tokens

Part 2 — Danbooru-style prompting (tag prompts)

Danbooru-style prompts are typically:

comma-separated tags
often used by anime/illustration checkpoints trained on tag datasets
very “keyword-y” rather than sentence-y

What a tag prompt looks like

Example (anime-ish):

1girl, wizard, purple cloak, hood, spellbook, sitting, castle interior, moonlight, floating particles, detailed background, cinematic lighting, masterpiece

Common tag “categories”

You can think in buckets (you don’t need all of them):

Subject & count: 1girl, 1boy, 2girls, solo
Identity / archetype: wizard, knight, android
Appearance: purple cloak, hood, long hair, gloves
Pose / action: sitting, reading, looking at viewer
Environment: castle interior, night, moon
Lighting / mood: moonlight, rim light, dramatic lighting
Composition / camera: close-up, full body, wide shot, from above
Quality / style meta-tags (model-dependent): masterpiece, best quality, highly detailed

Tag prompting: practical rules

Use commas to separate concepts.
Prefer known tags (common words/tags the model likely learned).
Don’t overstuff: 15–40 strong tags often beats 200 noisy tags.
Order is less critical than it used to be, but you’ll still see people put “subject first, style/quality later.”

Mixing styles: hybrid prompting

Many modern workflows do:

a short natural-language core + a small set of tags Example:
a wizard reading in an abandoned castle, moonlight,
1girl, purple cloak, spellbook, castle interior, floating particles, cinematic lighting, detailed background

This is especially useful if:

you want the clarity of a sentence
but your model responds well to tags

Negative prompts (works for both styles)

A negative prompt says what you don’t want: artifacts, anatomy errors, unwanted styles, watermarks, etc.

What negatives are best for

Removing common junk: watermark, text, logo
Avoiding anatomy issues: extra fingers, extra limbs, bad hands
Preventing style drift: blurry, lowres, jpeg artifacts

Two good negative strategies

1) Minimal, focused negatives (recommended to start)

low quality, blurry, watermark, text, extra fingers, bad hands

Pros: less fighting with your positive prompt.

2) Model-specific “negative packs” Some communities use longer lists tuned to a checkpoint/style. Pros: can reduce common failure modes for that model. Cons: can overconstrain or cause “samey” outputs.

Negative prompt mistakes to avoid

Contradicting your positives If you want “film grain” but negative includes “grain”, you’ll get weird results.
Trying to fix everything with negatives If composition is wrong, use better positives, img2img, or ControlNet—negatives won’t invent structure.

Quick choosing guide: which prompt style should you use?

Use natural language when:
- you’re on SDXL / photoreal / generalist checkpoints
- you want clear scene descriptions
Use Danbooru tags when:
- you’re on anime/illustration checkpoints that “think in tags”
- you want consistent character/illustration conventions
Use hybrid when:
- you want the best of both worlds and the model supports it

Tiny templates you can copy

Natural language template

[Main subject], [action] in [setting]. [Appearance details]. [Background elements]. [Style/medium]. [Lighting]. [Color palette].

Danbooru template

[count/subject], [identity], [appearance], [pose/action], [environment], [lighting/mood], [composition], [quality/style tags]

Negative template (starter)

low quality, blurry, watermark, text, bad hands, extra fingers, extra limbs