How to Build Great Skills for Claude Code


Most Claude Code skills fail silently: they never trigger, produce generic output, or break the moment something unexpected happens. Here’s how to build ones that actually work.


What is a Skill?

A skill is a set of instructions that teaches Claude Code how to handle a specific type of task. Think of it as a recipe — it tells Claude what steps to follow, what to look for in the codebase, and how to produce consistent, high-quality output.

A skill lives as a SKILL.md file in your project (usually under .claude/skills/), and Claude reads it whenever it detects a relevant request. The file contains markdown instructions, and optionally references additional files for detailed guidance.

This article shares practical patterns for building skills that trigger reliably, research code thoroughly, handle failures gracefully, and produce output developers actually want. The examples come from real skills we built and open-sourced at github.com/axitdev/skills — feel free to reference them, but the patterns here apply to any skill you’re building.


The Architecture That Works

After iterating on several skills, we found a consistent three-layer structure works best:

skill-name/
├── SKILL.md                    (under 500 lines — main instructions)
├── references/
│   └── detailed-reference.md   (loaded on demand — syntax, patterns, etc.)
└── .skill-config.yaml          (optional — user-customizable settings)

Why this structure matters for quality

The key insight is progressive disclosure. Claude loads skill metadata (name + description) for every message to decide whether to activate a skill. The SKILL.md body loads when the skill triggers. Reference files load only when needed during execution.

This isn’t just about organization — it directly affects output quality. Every token in a bloated SKILL.md competes with the actual code research and generation for space in Claude’s context window. A 700-line SKILL.md stuffed with syntax examples means less room for Claude to think about your actual codebase. Extract reference material, and Claude has more headroom for the work that matters.

What goes where

SKILL.md (under 500 lines) — the brain of the skill:

  • YAML frontmatter with name and description (the trigger mechanism)
  • References section pointing to additional files
  • Configuration section with defaults
  • Step-by-step workflow
  • Edge cases and failure handling
  • Principles section

Reference files — the knowledge base, loaded on demand:

  • Syntax examples, cheat sheets, common patterns, prompt templates
  • The SKILL.md tells Claude when to read them with an explicit instruction, e.g.:

## References

- `references/syntax-guide.md` — Syntax examples for all supported formats.
  **Read this before generating output.** Do not generate from memory.

The bold instruction is important — it prevents Claude from “winging it” with training data that might be outdated.

Config files — user-customizable settings:

  • A .yaml file letting users tune behavior without editing SKILL.md
  • Every field optional, sensible defaults for everything
  • Rule: if the config file doesn’t exist, use defaults and don’t ask the user to create one. The skill should work immediately without any setup.
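
As a concrete sketch, a zero-setup config for the diagrammer example in this article might look like the following. The field names are illustrative, not a required schema:

```yaml
# .skill-config.yaml (optional; if this file is absent, the skill uses
# these defaults silently and never asks the user to create it)

# Where generated output is saved. Illustrative default.
output_dir: docs/diagrams

# Detail level:
#   overview  — high-level, no method names
#   standard  — class/method names, happy + key error paths
#   detailed  — every call, all error paths, middleware, events
detail: standard
```

Note that every line documents itself: a user who opens this file learns what they can tune without reading SKILL.md.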

Before and after

Here’s what a skill looks like when you don’t follow this structure vs. when you do.

Before (everything in SKILL.md, ~700 lines):

---
name: my-diagrammer
description: Creates diagrams.
---

# Diagrammer

## Workflow
1. Ask what to diagram
2. Generate diagram

## Mermaid Syntax Reference
### Sequence diagrams
(50 lines of syntax examples)
### Class diagrams
(40 lines of syntax examples)
### ERD
(40 lines of syntax examples)
### State diagrams
(30 lines)
### Flowcharts
(30 lines)
... (8 more diagram types, 200+ more lines)

## Common Patterns
(100 lines of reusable patterns)

Problems: description is too vague to trigger reliably, syntax reference bloats the file, no failure handling, no config, no principles.

After (structured, ~300 lines SKILL.md + reference files):

---
name: my-diagrammer
description: >
  Generate diagrams from codebases by researching actual project code. Use this
  skill whenever the user asks to create, generate, or update diagrams. Trigger
  on phrases like "diagram the flow", "visualize the schema", "show me how X
  works". Also trigger when the user says "how does X connect to Y" — this
  implies a visual would help.
---

# Diagrammer

## References
- `references/syntax.md` — **Read before generating.** Don't use from memory.

## Configuration
(config table with defaults)

## Workflow
### Step 0: Load config
### Step 1: Check for existing diagrams
### Step 2: Ask what to diagram (skip if already specified)
### Step 3: Research the code (the most important step)
### Step 4: Generate
### Step 5: Save and confirm

## Handling Failures
(what to do when code isn't found, MCP fails, etc.)

## Principles
1. Accuracy over aesthetics
2. Research thoroughly
3. Never lose work

The second version triggers more reliably, researches code before generating, handles failures, and keeps the heavy syntax reference out of the main file.


Writing a Description That Actually Triggers

The description is the most important part of your skill. It’s the only thing Claude sees when deciding whether to activate it. A bad description means the skill never triggers — or triggers for the wrong things.

Be pushy. The skill-creator documentation explicitly recommends this because Claude tends to “under-trigger” — to not use skills when they’d be helpful.

A good description follows this pattern:

description: >
  {What the skill does in one sentence}. Use this skill whenever {explicit 
  triggers}. Trigger on phrases like "{keyword 1}", "{keyword 2}", "{keyword 3}".
  Also trigger when the user {describes the intent without using the keywords} — 
  e.g. "{natural language example 1}", "{natural language example 2}".
  Even if the user just {edge case trigger} — use this skill.

The four layers:

  1. What it is — one-sentence summary
  2. Explicit triggers — direct keywords and phrases
  3. Implicit triggers — descriptions of intent without specific keywords
  4. Catch-all — edge cases that should still trigger the skill

Don’t be shy about listing trigger phrases. Better to over-trigger and handle gracefully than to never trigger at all.
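
Filled in for a hypothetical changelog skill, the pattern might read (the skill and its trigger phrases are invented for illustration):

```yaml
description: >
  Generate changelog entries from recent commits and diffs. Use this skill
  whenever the user asks to write, update, or draft a changelog or release
  notes. Trigger on phrases like "update the changelog", "write release
  notes", "what changed". Also trigger when the user describes the intent
  without these keywords, e.g. "summarize what we shipped this sprint".
  Even if the user just asks "what's new since v1.2?", use this skill.
```

All four layers are present: the one-sentence summary, explicit keywords, intent-based triggers, and a catch-all.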


The Interactive Workflow Pattern

Most good skills follow the same general workflow:

  1. Load configuration — read config, set defaults
  2. Check existing work — don’t duplicate what’s already done
  3. Understand the request — ask at most one clarifying question
  4. Research the code — the most important step
  5. Produce the output — using real names from the code
  6. Confirm and offer follow-ups — let the user iterate

The golden rule: ask the user at every decision point, but don’t over-ask. If the user already specified what they want in their initial message, skip the questions.

Check Existing Work First

Before creating anything, check if it already exists. If your skill produces files, maintain an index and search it before generating:

> I found an existing {thing} that might be what you're looking for:
> - [{Name}](./path/to/file.md) — short description
>
> Would you like to:
> 1. View this existing one
> 2. Update/regenerate it
> 3. Create a new, separate one

This prevents the frustrating experience of Claude creating a second version of something that already exists. One of our skills initially lacked this check, and Claude would happily create duplicate output for the same request.

Research the Code Thoroughly

This is where code-aware skills add their real value. If your skill operates on a codebase, tell Claude to research deeply:

  1. Start broad — look at the project structure, find relevant directories
  2. Go deep — read actual files, follow call chains, note method signatures and relationships
  3. Surface hidden behavior — middleware, event listeners, observers, cron jobs
  4. Ask if unclear — ambiguous code? Ask the user rather than guessing

The key instruction that makes this work:

This is the most important step. Thoroughly research the actual codebase 
before producing output. Spend more time reading code than writing output.

Without this emphasis, Claude tends to rush to generation based on partial understanding.

Adapt to the environment

If your skill targets a specific language or ecosystem, consider adding environment detection so it adapts automatically. For example, a PHP skill might detect which framework is in use and adjust its research paths accordingly — looking for controllers in app/Http/Controllers/ for one framework but src/Controller/ for another. A Python skill might detect Django vs. Flask vs. FastAPI.

The pattern is: auto-detect by default, let the user override via config for edge cases.
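
In SKILL.md, the detection step can be written as a short lookup that Claude follows before researching. A sketch for the PHP case above (the evidence and paths are examples, not an exhaustive list):

```markdown
### Detect the framework
Check `composer.json` before researching:

- Requires `laravel/framework` → Laravel: look in `app/Http/Controllers/`, `routes/`
- Requires `symfony/framework-bundle` → Symfony: look in `src/Controller/`, `config/`
- Neither → ask the user which framework is in use

If `framework:` is set in `.skill-config.yaml`, skip detection and use that value.
```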


Handle Failures Gracefully

Things will go wrong — code doesn’t match expectations, MCP servers disconnect, searches find nothing. Plan for every failure mode explicitly.

When the code doesn’t exist

If your skill researches code and the requested feature/flow isn’t there, don’t invent or guess:

> I searched the codebase but couldn't find an implementation for "{request}".
> Here's what I found related to this area: {list what you did find}.
>
> Would you like me to:
> 1. Look again — point me to the right files
> 2. Create a draft showing a proposed design (marked as DRAFT)
> 3. Try something else instead

The first time we didn’t handle this, Claude invented a plausible-looking but completely fictional flow diagram. Adding explicit “if nothing found” instructions fixed it completely.

When external services fail

If your skill depends on an MCP server or external tool, always have a fallback:

If the external service fails mid-creation (after research is done), don't 
lose the work. Immediately save the output in a local format. Then offer 
to retry the external service.

Principle: never lose completed research. The user’s time waiting for code analysis is valuable. If the final rendering step fails, save what you have.

When MCP tools change

For any skill that depends on MCP servers, don’t hardcode tool names:

Before doing anything, list the available MCP tools to discover what the 
server provides. Tool names, parameters, and capabilities change between 
versions. **Never assume a fixed API.** Always discover first.

We learned this the hard way — an MCP skill that assumed a specific tool name broke when the server updated. Making it discovery-first (introspect tools → adapt) made it resilient to any API change.


Index Files: Track What Exists

Every skill that produces files should maintain an index — a markdown file listing everything it has created:

# Generated Output

> Index. Last updated: 2025-12-15

### {Category}
- [{Title}](./{category}/{name}.md) — {short description}

Build the index dynamically — only include section headers for categories that actually have entries. Our first version had a static template listing all possible categories, most of them empty. Much better to generate sections on the fly.

Rules every index should follow:

  • Only add, never remove (unless the user explicitly asks)
  • Keep entries sorted alphabetically within each section
  • Use relative paths
  • Mark drafts: [DRAFT] {Title}
  • Update the Last updated date
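
One way to phrase these rules as SKILL.md instructions (a sketch, not the only possible wording):

```markdown
### Updating the index
1. Read the index file. If it doesn't exist, create it with only the title
   and "Last updated" lines.
2. Find the section for this entry's category. Create the section only if
   this entry needs it; never pre-create empty sections.
3. Insert the entry in alphabetical order. Never remove existing entries
   unless the user explicitly asks.
4. Update the "Last updated" date.
```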

Config Design Principles

After building several config files, these patterns emerged:

  1. Every field is optional. The skill must work with zero configuration.
  2. Defaults are sensible. Cover the 80% case out of the box.
  3. Comments as documentation. The config file itself teaches the user what’s available:
# Detail level:
#   overview  — high-level, no method names
#   standard  — class/method names, happy + key error paths
#   detailed  — every call, all error paths, middleware, events
detail: standard
  4. Don’t duplicate the SKILL.md. The config table in SKILL.md is a quick reference. The .yaml file has the full documentation.
  5. No config file prompt. If it doesn’t exist, use defaults silently. Never tell the user “you should create a config file.”

Principles Section: The Skill’s Values

Every skill should end with a principles section — these are the tiebreakers when instructions are ambiguous. Order them by priority.

The best principles we’ve found across multiple skills:

  • Accuracy over aesthetics — correct but plain beats pretty but wrong
  • Ask, don’t assume — when in doubt, ask the user
  • Research thoroughly — more time reading code than producing output
  • Never lose work — if something fails, save what you have
  • Discover, don’t assume — for MCP skills, introspect tools before using them
  • Real names always — use actual class/method/table names from the code
  • Graceful degradation — always have a fallback path

These aren’t just nice-to-haves. Each principle exists because we hit a real problem that it solves. “Never lose work” came from an MCP failure that discarded 5 minutes of code analysis. “Real names always” came from Claude generating output with generic labels instead of actual class names.


How to Test a Skill

Don’t just read your skill and imagine how it would work. Run it. The gap between “looks correct when read” and “works correctly when used” is always wider than you expect.

Write 3-5 realistic test prompts — the kind of thing a real user would actually say. Not “test the diagram feature” but “diagram the payment flow in this project” or “explain how user authentication works.” Run each prompt with the skill installed and evaluate:

  1. Did the skill trigger? If not, your description needs more trigger phrases.
  2. Did it research before producing? If it jumped straight to output, your research step isn’t emphatic enough.
  3. Are the names real? Check that it uses actual class/method names from your code, not generic placeholders.
  4. Did it handle edge cases? Try a prompt for something that doesn’t exist in the codebase. Does it invent, or does it tell you it found nothing?
  5. Is the output the right depth? If you asked for an overview and got 2000 words, or asked for a deep dive and got 3 paragraphs, the depth instructions need tuning.

Common Mistakes (So You Don’t Have To)

1. Putting reference content in SKILL.md. Syntax examples, cheat sheets, and pattern libraries belong in references/ files. Your SKILL.md should be the workflow, not the encyclopedia. This also wastes context window — Claude has less room to think about your code when the SKILL.md is bloated.

2. Static index templates. Listing all possible sections in the index template creates bloat. Build it dynamically — only sections with actual entries.

3. Assuming one framework/environment. If your skill targets a language with multiple frameworks, detect the environment and provide parallel paths. Let the user override via config.

4. No failure handling for “nothing found.” Without explicit instructions, Claude will invent plausible-looking but fictional output. Always handle the “I found nothing” case.

5. Hardcoded MCP tool names. MCP servers evolve. Tools get renamed, parameters change. Discover tools at runtime, don’t assume they’ll stay the same.

6. No duplicate checking. Without checking existing work first, Claude happily creates a second version of something that already exists. Always search before creating.

7. Over-asking questions. If the user already specified what they want in their initial message, don’t ask them again. Parse their intent from what they said and start working. One clarifying question max.


Checklist: Before You Ship a Skill

  • SKILL.md is under 500 lines
  • Description is pushy with explicit trigger phrases
  • Heavy reference content is extracted to references/ files
  • Config has sensible defaults, every field is optional
  • Workflow checks for existing work before creating
  • “Nothing found” is handled explicitly
  • External service failures have fallback paths
  • Index (if applicable) is built dynamically, not from a static template
  • Principles section captures the skill’s values
  • Tested with 3-5 real prompts, not just reviewed as text

The patterns in this post come from building and iterating on real skills. You can see full working examples at github.com/axitdev/skills — use them as reference or as a starting point for your own.