Skill design principles
Foundational principles for designing composable, predictable, and maintainable skill files, including single responsibility and idempotency.
On this page
- Single responsibility: one job, done well
- Why this matters for agents
- The test: can you describe it without “and”?
- Composability over completeness
- The power of composition
- Designing skill files for composition
- How this looks in code
- Idempotency and side-effect management
- Build “ensure” patterns into your steps
- The side-effect spectrum
- Progressive disclosure of complexity
- Layer your inputs
- Don’t require knowledge the agent might not have
- When to split vs. merge skill files
- Split when
- Merge when
- The sweet spot
- Naming conventions
- How tools fit in
- Putting it together
Building a single skill file that works is one challenge. Building a set of skill files that work well together is a different challenge entirely. Individual skills need to be correct, but a skill ecosystem needs to be coherent. Each skill file should have a clear role, compose naturally with others, and avoid surprising side effects.
This guide covers the design principles that govern how skill files relate to each other and how they behave when agents chain multiple skills together to accomplish complex tasks.
Single responsibility: one job, done well
The single responsibility principle, borrowed from software engineering, is the most important constraint in skill design. A skill file should do one thing and do it completely.
Why this matters for agents
When an agent picks a skill, it reads the description section to decide whether the skill matches the current task. If a skill file does three things, the agent has to reason about whether it needs all three, some of them, or just one. This overhead leads to misuse: the agent either skips the skill because it seems too heavy, or picks it for the wrong reason.
Consider a skill file called manage-user.md:
# Skill: manage user
## Description
Create, update, delete, or look up users in the application database.
Handles all user lifecycle operations.
## Steps
### 1. Determine the operation
Figure out whether the user wants to create, update, delete, or look
up a user account.
### 2. If creating...
...
### 3. If updating...
...
### 4. If deleting...
...
### 5. If looking up...
...
This forces the agent to navigate a branching structure where different steps apply depending on the operation. The skill description says it does four different things, so the agent has to figure out which branch it actually needs. And if the agent wants to compose this with another skill, it’s unclear what the output looks like because it depends on which branch ran.
Compare this with four focused skill files:
create-user.md
# Skill: create user
## Description
Create a new user account. Use when the user asks to register,
sign up, or add a new team member. Fails if the email is already
registered.
## Steps
### 1. Collect the user details
Get the user's full name, email address, and role (admin, member,
or viewer) from the conversation context.
### 2. Check for duplicates
Query the database to see if an account with this email already
exists. If it does, report the conflict and stop.
### 3. Create the account
Insert the new user record and return the new user ID and profile.
get-user.md
# Skill: get user
## Description
Look up a user by ID or email. Use to check if a user exists or
to retrieve their profile before modifying the account.
## Steps
### 1. Query by identifier
Search the database for a matching user ID or email address.
### 2. Return the profile
Return the full user profile, or a clear "not found" message.
Each skill file has a focused description, steps that always apply (no branching), and a predictable output. The agent never has to guess which path through the skill is relevant.
The test: can you describe it without “and”?
If your skill file’s description naturally includes “and” connecting two unrelated capabilities, you probably need two skill files:
- “Search for files and replace content” -> split into `search-files.md` and `replace-in-file.md`
- “Read a config file and validate its schema” -> split into `read-config.md` and `validate-config.md`
- “Query the database and export results to CSV” -> split into `query-database.md` and `export-to-csv.md`
The exception is when the “and” connects two parts of a single atomic operation, like “compress and upload a file” where doing one without the other would leave the system in a broken state.
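When the “and” really is atomic, the implementation should enforce that atomicity. Here is a minimal TypeScript sketch (the `compress` and `upload` helpers are hypothetical stand-ins for real compression and storage APIs) where a failed upload cannot leave a stray archive behind:

```typescript
import { promises as fs } from "node:fs";
import { gzipSync } from "node:zlib";

// Minimal stand-in for a real compression step: write `<path>.gz` next
// to the original file and return the archive path.
async function compress(path: string): Promise<string> {
  const archivePath = `${path}.gz`;
  const data = await fs.readFile(path);
  await fs.writeFile(archivePath, gzipSync(data));
  return archivePath;
}

// Hypothetical upload -- swap in your real storage client here.
async function upload(archivePath: string): Promise<string> {
  return `https://storage.example.com/${archivePath}`;
}

// One atomic operation: the local archive never outlives the attempt,
// so a failed upload can't leave the system in a half-done state.
async function compressAndUpload(path: string): Promise<string> {
  const archivePath = await compress(path);
  try {
    return await upload(archivePath);
  } finally {
    await fs.rm(archivePath, { force: true });
  }
}
```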
Composability over completeness
It’s tempting to build skill files that handle entire workflows end-to-end. Resist this. Small, composable skills that agents can chain together are far more powerful and flexible than monolithic ones.
The power of composition
Imagine you need a workflow that:
- Finds all TypeScript files modified today
- Runs a linter on each file
- Summarizes the linting results
Monolithic approach, one skill that does all three:
# Skill: lint recent TypeScript
## Description
Find recently modified TypeScript files and lint them, then
produce a summary of all linting issues.
This is inflexible. What if you want to lint a specific file? What if you want to find files but not lint them? What if you want to lint Python instead of TypeScript?
Composable approach, three skill files the agent chains:
- `search-files.md` finds files matching criteria
- `lint-file.md` runs a linter on a given file
- `summarize-lint-results.md` formats linting output into a report
The composable approach lets the agent:
- Search for Python files and lint them (reuses `search-files.md` + `lint-file.md`)
- Lint a single known file (uses `lint-file.md` directly)
- Search for files modified in the last week (uses `search-files.md` with different instructions)
- Summarize results from any source, not just linting (uses `summarize-lint-results.md`)
Each skill is useful on its own and becomes more valuable when combined with others.
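To make the chaining concrete, here is a hedged TypeScript sketch of the pipeline an agent might assemble from those three skill files. The `searchFiles`, `lintFile`, and `summarizeLintResults` functions are hypothetical stubs standing in for the tools each skill would call:

```typescript
// Hypothetical stubs for the tools each skill file would call.
type LintResult = { file: string; issues: string[] };

async function searchFiles(pattern: string): Promise<string[]> {
  return ["src/app.ts", "src/db.ts"]; // stub: a real tool would glob the repo
}

async function lintFile(file: string): Promise<LintResult> {
  return { file, issues: [] }; // stub: a real tool would invoke the linter
}

function summarizeLintResults(results: LintResult[]): string {
  const total = results.reduce((n, r) => n + r.issues.length, 0);
  return `${results.length} files checked, ${total} issues found`;
}

// The chain the agent assembles: search -> lint each file -> summarize.
async function lintRecentTypeScript(): Promise<string> {
  const files = await searchFiles("**/*.ts");
  const results = await Promise.all(files.map(lintFile));
  return summarizeLintResults(results);
}
```

Because each step is a separate skill, the agent can rearrange the same pieces for a different task without touching any of them.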
Designing skill files for composition
Skills compose well when they follow these patterns:
Accept simple inputs, not tangled prerequisites. A skill’s input section should ask for a file path, a user ID, or a search term. It should not require a complex object that only comes from one specific other skill. Keep skills loosely coupled.
Produce structured output, not rendered prose. A skill that returns JSON or a clearly structured format can feed into other skills. A skill that returns a beautifully formatted paragraph is a dead end. Unless formatting is the skill’s explicit purpose, prefer structured output.
Stay stateless when possible. A skill that depends on previous invocations is harder to compose because the agent has to manage invocation order. If state is unavoidable, make it explicit in the input section.
Good:
# Skill: get order details
## Input
- An order ID (provided by the user or from a previous step)
## Steps
### 1. Look up the order
Query the orders table by the provided order ID...
Bad:
# Skill: get customer orders
## Description
Get orders for the currently selected customer.
You must run the "select customer" skill first.
The second skill can’t be used independently. It’s coupled to another skill’s side effect.
How this looks in code
When you implement the tools that a skill calls, the same principles apply. A tool that accepts a simple identifier is easier for any skill to use than one that depends on hidden state:
// Good: stateless, accepts an identifier
async function getOrder(orderId: string) {
return await db.orders.findById(orderId);
}
// Bad: depends on prior state
async function getCustomerOrders() {
// relies on a "selected customer" set by a previous call
return await db.orders.findByCustomer(state.selectedCustomerId);
}
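The structured-output principle looks the same at the tool level. A hedged sketch, reusing the hypothetical `db` client from above:

```typescript
// Good: structured output that any downstream skill can consume.
// (`db` is the same hypothetical database client used above.)
async function getUser(userId: string) {
  const user = await db.users.findById(userId);
  return { id: user.id, name: user.name, email: user.email, role: user.role };
}

// Bad: rendered prose -- a compositional dead end
async function describeUser(userId: string) {
  const user = await db.users.findById(userId);
  return `User ${user.name} (${user.email}) has the "${user.role}" role.`;
}
```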
Idempotency and side-effect management
An idempotent operation produces the same result whether you execute it once or ten times. This matters for skill files because agents frequently retry operations, sometimes because an error occurred, sometimes because they lost track of what they already did.
Build “ensure” patterns into your steps
Skill files that create or modify things should include checking logic. Instead of blindly creating, check first and handle the “already exists” case:
# Skill: ensure project setup
## Steps
### 1. Check if the project directory exists
Look for the project directory at the expected path.
If it already exists with the expected structure, skip to step 3.
### 2. Create the project directory
Create the directory and initialize the project scaffold...
### 3. Verify the setup
Confirm the directory exists and contains the expected files.
Report whether a new project was created or an existing one was found.
This skill is safe to run twice. The second run detects the existing project and skips the creation step. The output tells the agent whether anything actually changed.
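The same ensure pattern applies to the tools a skill calls. A minimal sketch, again assuming the hypothetical `db` client: check first, create only if missing, and report what actually changed:

```typescript
// Idempotent "ensure" tool: safe to call once or ten times.
// (`db.users.findByEmail` and `db.users.insert` are assumed methods
// on the hypothetical database client used earlier.)
async function ensureUser(email: string, name: string) {
  const existing = await db.users.findByEmail(email);
  if (existing) {
    // Second run: nothing to create; report that nothing changed.
    return { user: existing, created: false };
  }
  const user = await db.users.insert({ email, name });
  return { user, created: true };
}
```

The `created` flag mirrors the skill's final step: the agent always learns whether the run changed anything.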
The side-effect spectrum
Not all side effects are equal. Your skill file’s description should communicate where it falls on this spectrum:
| Category | Description | Agent impact | Example skills |
|---|---|---|---|
| Pure read | No side effects at all | Safe to call freely | search-files.md, get-user.md |
| Observable | Reads data but logs access | Safe but leaves traces | query-database.md (with audit log) |
| Reversible | Modifies data but can be undone | Agent should confirm before running | update-config.md, rename-file.md |
| Irreversible | Cannot be undone | Agent should strongly confirm | delete-user.md, send-email.md |
For irreversible skills, build a preview-and-confirm step into the skill file itself, so the agent shows what will happen before anything happens:
# Skill: send notification email
## Steps
### 1. Compose the email
Build the email with the recipient, subject, and body...
### 2. Preview before sending
Show the composed email to the user and ask for confirmation.
Display the recipient, subject line, and full body text.
Do NOT proceed until the user explicitly confirms.
### 3. Send the email
Only after confirmation, call the email API...
The confirmation step is part of the skill’s instructions. The agent follows the steps and naturally pauses for confirmation before the irreversible action.
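At the tool level, the equivalent safety net is an explicit dry-run flag: the agent previews first and only sends after the confirmation step passes. A sketch, with a hypothetical `emailClient`:

```typescript
type Email = { to: string; subject: string; body: string };

// Hypothetical email API -- stand-in for your real client.
declare const emailClient: { send(e: Email): Promise<void> };

// Irreversible tool with a built-in dry-run mode. With dryRun: true it
// only previews; nothing goes out until the user confirms and the
// agent calls it again with dryRun: false.
async function sendNotificationEmail(email: Email, dryRun: boolean) {
  if (dryRun) {
    return { sent: false, preview: email };
  }
  await emailClient.send(email);
  return { sent: true, preview: email };
}
```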
Progressive disclosure of complexity
A skill file should handle the common case simply and reveal complexity only when needed. The simplest invocation should cover the most common scenario, with optional inputs available for fine-tuning.
Layer your inputs
# Skill: search codebase
## Input
Required:
- A search term or regex pattern
Optional (for narrowing results):
- A directory path to search within (defaults to project root)
- A file type filter like "js", "py", or "rust"
Optional (for advanced tuning):
- Maximum number of results (default 50, max 500)
- Whether to match case-sensitively (default: no)
- Whether to include hidden directories (default: no)
The agent can invoke this skill with just a search term for the common case, or provide the optional inputs when it needs precision. The skill works well at every level of detail.
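A tool backing this skill can mirror the same layering: one required parameter, everything else optional with the documented defaults. A sketch with hypothetical names:

```typescript
// The optional layers mirror the skill's input section; everything
// beyond the search term defaults to the common case.
interface SearchOptions {
  directory?: string;      // defaults to project root
  fileType?: string;       // e.g. "js", "py", "rust"
  maxResults?: number;     // default 50, capped at 500
  caseSensitive?: boolean; // default false
  includeHidden?: boolean; // default false
}

async function searchCodebase(pattern: string, opts: SearchOptions = {}) {
  const settings = {
    pattern,
    directory: opts.directory ?? ".",
    fileType: opts.fileType,
    maxResults: Math.min(opts.maxResults ?? 50, 500),
    caseSensitive: opts.caseSensitive ?? false,
    includeHidden: opts.includeHidden ?? false,
  };
  // A real implementation would run the search here; the point is that
  // a bare searchCodebase("TODO") call covers the common case.
  return settings;
}
```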
Don’t require knowledge the agent might not have
If an input requires domain-specific knowledge, either make it optional with a good default, or explain the options in the skill file:
Good:
## Input
- The Elasticsearch index to query. Available indexes:
- `logs-app` for application logs (this is the default)
- `logs-system` for system/infrastructure logs
- `metrics-*` for time-series metrics data
Bad:
## Input
- The Elasticsearch index to query
The first version gives the agent everything it needs to make a good choice. The second forces the agent to guess or ask the user.
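In code, the same idea becomes an enumerated parameter with a sensible default, so the common case requires no Elasticsearch knowledge at all. A hedged sketch using the index names from the example:

```typescript
// Enumerate the valid indexes and default to the common one, so the
// agent never has to guess a valid name.
type LogIndex = "logs-app" | "logs-system" | "metrics-*";

async function queryLogs(query: string, index: LogIndex = "logs-app") {
  // A real implementation would call the Elasticsearch client here.
  return { index, query };
}
```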
When to split vs. merge skill files
This is the question that comes up most often in practice. Here are concrete heuristics.
Split when
- The skill has conditional branches. Different steps apply depending on a mode or operation type. That’s two skills sharing one file.
- The skill has unrelated failure modes. Errors from one path confuse recovery for another.
- The description requires “or.” “Search files or search file contents” should be two skill files.
- Different use cases need different risk levels. A skill that reads data and also deletes data shouldn’t combine those in one file. Reading is safe to run freely; deleting needs confirmation.
Merge when
- The operations are always done together. Compressing and uploading is one logical action. Make it one skill file.
- Splitting would create chatty round-trips. If the agent would always run skill A then immediately run skill B with A’s output, consider merging them into one file.
- The operations share expensive setup. If both need to authenticate with an external API or gather the same context, merging avoids duplicated work.
- The total skill count is getting unmanageable. If an agent has access to 200 skill files, it struggles to select the right one. Sometimes merging related skills improves selection accuracy.
The sweet spot
Most well-designed skill libraries land between 10 and 40 skill files. Fewer than 10 usually means each skill is too broad. More than 50 usually means skills are too granular and the agent spends too much time choosing.
A practical guideline: if you’re building a new skill, ask whether an existing skill file could handle the use case with an additional optional input or an extra step. If the addition fits naturally, extend the existing skill. If it feels forced, create a new one.
Naming conventions
Consistent naming helps agents (and humans) select skills quickly. Use a verb-noun pattern for file names and stick with it across your entire skill library:
- `review-pr.md` (not `pr-review.md` or `reviewPR.md`)
- `generate-changelog.md` (not `changelog.md` or `make-changelog.md`)
- `create-user.md` (not `add-user.md` or `new-user.md`)
- `search-files.md` (not `file-search.md` or `find-files.md`)
- `run-migration.md` (not `execute-migration.md` or `do-migration.md`)
Pick one verb for each operation type and use it everywhere:
| Operation | Verb | Example file names |
|---|---|---|
| Read one | get | get-user.md, get-order.md, get-config.md |
| Read many | list or search | list-users.md, search-files.md |
| Create | create | create-note.md, create-project.md |
| Update | update | update-config.md, update-profile.md |
| Delete | delete | delete-record.md, delete-branch.md |
| Execute | run | run-migration.md, run-tests.md |
| Generate | generate | generate-changelog.md, generate-docs.md |
| Review | review | review-pr.md, review-security.md |
The title inside the skill file (the `# Skill: ...` heading) should match the file name: `review-pr.md` contains `# Skill: PR review`, and `generate-changelog.md` contains `# Skill: changelog generator`. No ambiguity.
For real examples of focused, well-named skill files, see the PR review skill and the changelog generator skill in the skill library.
How tools fit in
Skills and tools are different things. A skill is a markdown file that describes a recipe: what to do, step by step, in human-readable terms. A tool is a function or API the agent can call to actually do something. Skills tell the agent what to do. Tools are what skills use to do it.
When you’re applying design principles, start with the skill file. Get the responsibility, composition, and naming right at the skill level first. Then implement the tools the skill needs. If your skill file is well-designed (focused, composable, with clear inputs and outputs), the underlying tool implementations tend to fall into place naturally.
// The "review-pr" skill might use tools like these:
const getDiff = { name: "get_pr_diff" /* ... */ };
const readFile = { name: "read_file" /* ... */ };
const postComment = { name: "post_review_comment" /* ... */ };
Each tool does one mechanical thing. The skill file orchestrates them into a meaningful workflow. This separation is what makes skill-based systems flexible: you can swap out tools without rewriting the skill, or rewrite the skill’s approach without changing the tools.
Putting it together
These principles reinforce each other. Single responsibility makes each skill file easy to understand. Composability makes small skills more powerful than big ones. Idempotency makes skills safe to retry. Progressive disclosure keeps simple cases simple. Thoughtful splitting and merging gives you enough granularity for precision without so much that selection becomes noisy.
For a deeper look at how to structure the sections within a single skill file, see How to design AI agent skills. To see these principles applied in practice, browse the skill library examples like the PR review skill, the changelog generator, the onboarding checklist generator, and the release notes generator. And when you’re ready to verify that your skills behave correctly, head to Testing and Debugging Skills to learn how to test across the full range of inputs an agent might encounter.
Related articles
How to design AI agent skills
A deep dive into the four pillars of skill design: clear descriptions, well-typed parameters, error handling, and predictable output.
The cost of bad tool descriptions
Your agent keeps picking the wrong tool. The problem isn't the code, it's the description. Real examples of tool descriptions that fail and how to fix them.
Prompt engineering vs skill design
They sound similar but they solve different problems. When to write a better prompt and when to build a skill instead.