AI Prompting 101: Natural Language vs Danbooru Tags (plus Negative Prompts)
Prompting is just communicating intent to the model. Different model families respond better to different “dialects”:
- Natural language prompting (plain English descriptions)
- Danbooru-style prompting (comma-separated tags, common in anime/illustration models)
This guide is short and practical. For the natural-language section, the core structure is based on SergVidocq’s beginner guide. To see the kind of tagged images many anime models learn from, browse the Danbooru dataset.
Part 1 — Natural language prompting (plain English)
The basic prompt “recipe” (layered, not magical)
Build your prompt like a stack of clear layers (adapted from the referenced article):
Main subject / scene
- “a young female wizard” (better than “girl”)
Action + context (what’s happening, where)
- “sitting on a rock, reading an ancient spellbook in an abandoned castle”
Descriptive adjectives (shape, mood, quality, vibe)
- “mysterious, detailed, cinematic”
Appearance details (clothes, pose, expression, unique traits)
- “purple hooded cloak, leather outfit with ornaments, focused expression”
Background / environment
- “floating magical particles, full moon behind ruined windows”
Style / medium
- “digital illustration”, “photo”, “pencil sketch”, “oil painting”, “3D render”
Lighting + color palette (optional but powerful)
- “dim moonlight, cool blue-purple tones”, “warm golden hour”
Logic + clarity
- Avoid contradictions and ambiguity (“red cat and blue dog”, not “cat and dog, red and blue”)
Tip from the referenced guide: English usually yields more predictable results, since a lot of training data and community prompting is English-first.
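The layered recipe above can be sketched as a small helper that joins one short phrase per layer into a single prompt. This is only an illustration: `build_natural_prompt` and the `layers` list are hypothetical names invented here, not part of any tool or of the referenced guide.

```python
def build_natural_prompt(layers):
    """Join non-empty layers into sentences, keeping the order given."""
    sentences = []
    for text in layers:
        text = text.strip().rstrip(".")
        if text:
            # Capitalize the first character and close the layer with a period.
            sentences.append(text[0].upper() + text[1:] + ".")
    return " ".join(sentences)


# One entry per layer of the recipe; an empty layer is simply skipped.
layers = [
    "a young female wizard sitting on a rock, reading an ancient spellbook in an abandoned castle",
    "purple hooded cloak, leather outfit with ornaments, focused expression",
    "floating magical particles, full moon behind ruined windows",
    "",
    "digital illustration, mysterious, detailed, cinematic",
    "dim moonlight, cool blue-purple tones",
]

prompt = build_natural_prompt(layers)
print(prompt)
```

Keeping each layer as its own entry makes it easy to swap the style or lighting without touching the subject.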
“Good vs bad” examples (natural language)
Too vague
girl, castle
Clear and structured
A young female wizard sitting on a rock, reading an ancient spellbook inside an abandoned castle. Purple hooded cloak, leather outfit with ornaments, calm focused expression. Floating magical particles, moonlight through broken windows. Digital illustration, high detail, cool blue-purple tones, soft rim lighting.
Small habits that improve results fast
- Be specific about the subject (age, role, species, object type)
- Put the “story” in one sentence (subject + action + place), then add details
- Declare style explicitly if you care about it
- Use lighting words when the mood matters (“candlelit”, “neon”, “overcast”, “studio lighting”)
- Remove contradictions before adding more tokens
Part 2 — Danbooru-style prompting (tag prompts)
Danbooru-style prompts are typically:
- comma-separated tags
- often used by anime/illustration checkpoints trained on tag datasets
- very “keyword-y” rather than sentence-y
What a tag prompt looks like
Example (anime-ish):
1girl, wizard, purple cloak, hood, spellbook, sitting, castle interior, moonlight, floating particles, detailed background, cinematic lighting, masterpiece
Common tag “categories”
You can think in buckets (you don’t need all of them):
- Subject & count: 1girl, 1boy, 2girls, solo
- Identity / archetype: wizard, knight, android
- Appearance: purple cloak, hood, long hair, gloves
- Pose / action: sitting, reading, looking at viewer
- Environment: castle interior, night, moon
- Lighting / mood: moonlight, rim light, dramatic lighting
- Composition / camera: close-up, full body, wide shot, from above
- Quality / style meta-tags (model-dependent): masterpiece, best quality, highly detailed
Tag prompting: practical rules
- Use commas to separate concepts.
- Prefer known tags (common words/tags the model likely learned).
- Don’t overstuff: 15–40 strong tags often beat 200 noisy ones.
- Order is less critical than it used to be, but you’ll still see people put “subject first, style/quality later.”
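A few of these rules are mechanical enough to automate. The sketch below tidies a raw tag string: it splits on commas, normalizes whitespace, drops empty entries and duplicates, and caps the list. `clean_tags` is a hypothetical helper. The default cap of 40 mirrors the upper end of the range suggested above, and lowercasing is an assumption that suits tag-trained models but may not suit every checkpoint.

```python
def clean_tags(raw, max_tags=40):
    """Return a tidy comma-separated tag string, first occurrence wins."""
    seen = set()
    tags = []
    for piece in raw.split(","):
        # Collapse inner whitespace and lowercase (assumed tag convention).
        tag = " ".join(piece.split()).lower()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return ", ".join(tags[:max_tags])


raw = "1girl,  wizard, purple cloak,, Wizard, spellbook , 1girl"
print(clean_tags(raw))  # 1girl, wizard, purple cloak, spellbook
```

The helper cannot tell a strong tag from a noisy one; it only removes obvious clutter, so choosing which tags to keep is still up to you.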
Mixing styles: hybrid prompting
Many modern workflows do:
- a short natural-language core + a small set of tags
Example:
a wizard reading in an abandoned castle, moonlight, 1girl, purple cloak, spellbook, castle interior, floating particles, cinematic lighting, detailed background
This is especially useful if:
- you want the clarity of a sentence
- but your model responds well to tags
Negative prompts (works for both styles)
A negative prompt says what you don’t want: artifacts, anatomy errors, unwanted styles, watermarks, etc.
What negatives are best for
- Removing common junk: watermark, text, logo
- Avoiding anatomy issues: extra fingers, extra limbs, bad hands
- Preventing style drift: blurry, lowres, jpeg artifacts
Two good negative strategies
1) Minimal, focused negatives (recommended to start)
low quality, blurry, watermark, text, extra fingers, bad hands
Pros: less fighting with your positive prompt.
2) Model-specific “negative packs”
Some communities use longer lists tuned to a checkpoint/style.
Pros: can reduce common failure modes for that model.
Cons: can overconstrain or cause “samey” outputs.
Negative prompt mistakes to avoid
- Contradicting your positives: if you want “film grain” but your negative includes “grain”, you’ll get weird results.
- Trying to fix everything with negatives: if composition is wrong, use better positives, img2img, or ControlNet. Negatives won’t invent structure.
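The first mistake is easy to catch before you generate anything. The sketch below flags a negative term when all of its words appear inside one positive term, which catches the “film grain” vs “grain” case. `split_terms` and `find_conflicts` are hypothetical helpers, and the whole-word match is a deliberately crude assumption: it will miss synonyms and does not know what any given model actually responds to.

```python
def split_terms(prompt):
    """Split a comma-separated prompt into normalized, lowercase terms."""
    return [" ".join(p.split()).lower() for p in prompt.split(",") if p.strip()]


def find_conflicts(positive, negative):
    """Return (positive_term, negative_term) pairs that appear to clash."""
    pos_terms = split_terms(positive)
    conflicts = []
    for neg in split_terms(negative):
        neg_words = set(neg.split())
        for pos in pos_terms:
            # Clash: every word of the negative term occurs in the positive term.
            if neg_words <= set(pos.split()):
                conflicts.append((pos, neg))
    return conflicts


positive = "portrait of a sailor, film grain, overcast light"
negative = "grain, watermark, text"
print(find_conflicts(positive, negative))  # [('film grain', 'grain')]
```

Matching on whole words rather than substrings avoids false alarms such as “text” clashing with “texture”.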
Quick choosing guide: which prompt style should you use?
Use natural language when:
- you’re on SDXL / photoreal / generalist checkpoints
- you want clear scene descriptions
Use Danbooru tags when:
- you’re on anime/illustration checkpoints that “think in tags”
- you want consistent character/illustration conventions
Use hybrid when:
- you want the best of both worlds and the model supports it
Tiny templates you can copy
Natural language template
[Main subject], [action] in [setting]. [Appearance details]. [Background elements]. [Style/medium]. [Lighting]. [Color palette].
Danbooru template
[count/subject], [identity], [appearance], [pose/action], [environment], [lighting/mood], [composition], [quality/style tags]
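If you keep tags in buckets, the Danbooru template can be filled programmatically. This is a minimal sketch: `BUCKET_ORDER` and `fill_tag_template` are hypothetical names, and the bucket keys are shortened versions of the template slots above. Nothing here validates that a tag is one your model actually knows.

```python
# Bucket names in the same order as the template slots.
BUCKET_ORDER = [
    "count", "identity", "appearance", "pose",
    "environment", "lighting", "composition", "quality",
]


def fill_tag_template(buckets):
    """Emit tags in template order; missing buckets are skipped."""
    tags = []
    for name in BUCKET_ORDER:
        tags.extend(buckets.get(name, []))
    return ", ".join(tags)


# The order in which buckets are written here does not matter.
buckets = {
    "identity": ["wizard"],
    "count": ["1girl", "solo"],
    "appearance": ["purple cloak", "hood"],
    "pose": ["sitting", "reading"],
    "environment": ["castle interior", "night"],
    "lighting": ["moonlight"],
    "quality": ["masterpiece"],
}

tag_prompt = fill_tag_template(buckets)
print(tag_prompt)
```

Because the output order comes from `BUCKET_ORDER`, you get “subject first, style/quality later” regardless of how you fill in the dictionary.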
Negative template (starter)
low quality, blurry, watermark, text, bad hands, extra fingers, extra limbs