Skill design principles
Foundational principles for designing composable, predictable, and maintainable skill files, including single responsibility and idempotency.
On this page
- Single responsibility: one job, done well
- Why this matters for agents
- The test: can you describe it without “and”?
- Composability over completeness
- The power of composition
- Designing skill files for composition
- How this looks in code
- Idempotency and side-effect management
- Build “ensure” patterns into your steps
- The side-effect spectrum
- Progressive disclosure of complexity
- Layer your inputs
- Don’t require knowledge the agent might not have
- When to split vs. merge skill files
- Split when
- Merge when
- The sweet spot
- Naming conventions
- How tools fit in
- Putting it together
Building a single skill file that works is one challenge. Building a set of skill files that work well together is a different challenge entirely. Individual skills need to be correct, but a skill ecosystem needs to be coherent. Each skill file should have a clear role, compose naturally with others, and avoid surprising side effects.
This guide covers the design principles that govern how skill files relate to each other and how they behave when agents chain multiple skills together to accomplish complex tasks.
Single responsibility: one job, done well
The single responsibility principle, borrowed from software engineering, is the most important constraint in skill design. A skill file should do one thing and do it completely.
Why this matters for agents
When an agent picks a skill, it reads the description section to decide whether the skill matches the current task. If a skill file does three things, the agent has to reason about whether it needs all three, some of them, or just one. This overhead leads to misuse: the agent either skips the skill because it seems too heavy, or picks it for the wrong reason.
Consider a skill file called manage-user.md:
# Skill: manage user
## Description
Create, update, delete, or look up users in the application database.
Handles all user lifecycle operations.
## Steps
### 1. Determine the operation
Figure out whether the user wants to create, update, delete, or look
up a user account.
### 2. If creating...
...
### 3. If updating...
...
### 4. If deleting...
...
### 5. If looking up...
...
This forces the agent to navigate a branching structure where different steps apply depending on the operation. The skill description says it does four different things, so the agent has to figure out which branch it actually needs. And if the agent wants to compose this with another skill, it’s unclear what the output looks like because it depends on which branch ran.
Compare this with four focused skill files:
create-user.md
# Skill: create user
## Description
Create a new user account. Use when the user asks to register,
sign up, or add a new team member. Fails if the email is already
registered.
## Steps
### 1. Collect the user details
Get the user's full name, email address, and role (admin, member,
or viewer) from the conversation context.
### 2. Check for duplicates
Query the database to see if an account with this email already
exists. If it does, report the conflict and stop.
### 3. Create the account
Insert the new user record and return the new user ID and profile.
get-user.md
# Skill: get user
## Description
Look up a user by ID or email. Use to check if a user exists or
to retrieve their profile before modifying the account.
## Steps
### 1. Query by identifier
Search the database for a matching user ID or email address.
### 2. Return the profile
Return the full user profile, or a clear "not found" message.
Each skill file has a focused description, steps that always apply (no branching), and a predictable output. The agent never has to guess which path through the skill is relevant.
The test: can you describe it without “and”?
If your skill file’s description naturally includes “and” connecting two unrelated capabilities, you probably need two skill files:
- “Search for files and replace content” -> split into `search-files.md` and `replace-in-file.md`
- “Read a config file and validate its schema” -> split into `read-config.md` and `validate-config.md`
- “Query the database and export results to CSV” -> split into `query-database.md` and `export-to-csv.md`
The exception is when the “and” connects two parts of a single atomic operation, like “compress and upload a file” where doing one without the other would leave the system in a broken state.
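When the “and” really is atomic, the implementation should enforce that atomicity. Here is a minimal TypeScript sketch (the `compress` and `upload` helpers are hypothetical stand-ins for real compression and storage APIs) where a failed upload cannot leave a stray archive behind:

```typescript
import { promises as fs } from "node:fs";
import { gzipSync } from "node:zlib";

// Minimal stand-in for a real compression step: write `<path>.gz` next
// to the original file and return the archive path.
async function compress(path: string): Promise<string> {
  const archivePath = `${path}.gz`;
  const data = await fs.readFile(path);
  await fs.writeFile(archivePath, gzipSync(data));
  return archivePath;
}

// Hypothetical upload -- swap in your real storage client here.
async function upload(archivePath: string): Promise<string> {
  return `https://storage.example.com/${archivePath}`;
}

// One atomic operation: the local archive never outlives the attempt,
// so a failed upload can't leave the system in a half-done state.
async function compressAndUpload(path: string): Promise<string> {
  const archivePath = await compress(path);
  try {
    return await upload(archivePath);
  } finally {
    await fs.rm(archivePath, { force: true });
  }
}
```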
Composability over completeness
It’s tempting to build skill files that handle entire workflows end-to-end. Resist this. Small, composable skills that agents can chain together are far more powerful and flexible than monolithic ones.
The power of composition
Imagine you need a workflow that:
- Finds all TypeScript files modified today
- Runs a linter on each file
- Summarizes the linting results
Monolithic approach, one skill that does all three:
# Skill: lint recent TypeScript
## Description
Find recently modified TypeScript files and lint them, then
produce a summary of all linting issues.
This is inflexible. What if you want to lint a specific file? What if you want to find files but not lint them? What if you want to lint Python instead of TypeScript?
Composable approach, three skill files the agent chains:
- `search-files.md` finds files matching criteria
- `lint-file.md` runs a linter on a given file
- `summarize-lint-results.md` formats linting output into a report
The composable approach lets the agent:
- Search for Python files and lint them (reuses `search-files.md` + `lint-file.md`)
- Lint a single known file (uses `lint-file.md` directly)
- Search for files modified in the last week (uses `search-files.md` with different instructions)
- Summarize results from any source, not just linting (uses `summarize-lint-results.md`)
Each skill is useful on its own and becomes more valuable when combined with others.
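To make the chaining concrete, here is a hedged TypeScript sketch of the pipeline an agent might assemble from those three skill files. The `searchFiles`, `lintFile`, and `summarizeLintResults` functions are hypothetical stubs standing in for the tools each skill would call:

```typescript
// Hypothetical stubs for the tools each skill file would call.
type LintResult = { file: string; issues: string[] };

async function searchFiles(pattern: string): Promise<string[]> {
  return ["src/app.ts", "src/db.ts"]; // stub: a real tool would glob the repo
}

async function lintFile(file: string): Promise<LintResult> {
  return { file, issues: [] }; // stub: a real tool would invoke the linter
}

function summarizeLintResults(results: LintResult[]): string {
  const total = results.reduce((n, r) => n + r.issues.length, 0);
  return `${results.length} files checked, ${total} issues found`;
}

// The chain the agent assembles: search -> lint each file -> summarize.
async function lintRecentTypeScript(): Promise<string> {
  const files = await searchFiles("**/*.ts");
  const results = await Promise.all(files.map(lintFile));
  return summarizeLintResults(results);
}
```

Because each step is a separate skill, the agent can rearrange the same pieces for a different task without touching any of them.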
Designing skill files for composition
Skills compose well when they follow these patterns:
Accept simple inputs, not tangled prerequisites. A skill’s input section should ask for a file path, a user ID, or a search term. It should not require a complex object that only comes from one specific other skill. Keep skills loosely coupled.
Produce structured output, not rendered prose. A skill that returns JSON or a clearly structured format can feed into other skills. A skill that returns a beautifully formatted paragraph is a dead end. Unless formatting is the skill’s explicit purpose, prefer structured output.
Stay stateless when possible. A skill that depends on previous invocations is harder to compose because the agent has to manage invocation order. If state is unavoidable, make it explicit in the input section.
Good:
# Skill: get order details
## Input
- An order ID (provided by the user or from a previous step)
## Steps
### 1. Look up the order
Query the orders table by the provided order ID...
Bad:
# Skill: get customer orders
## Description
Get orders for the currently selected customer.
You must run the "select customer" skill first.
The second skill can’t be used independently. It’s coupled to another skill’s side effect.
How this looks in code
When you implement the tools that a skill calls, the same principles apply. A tool that accepts a simple identifier is easier for any skill to use than one that depends on hidden state:
// Good: stateless, accepts an identifier
async function getOrder(orderId: string) {
return await db.orders.findById(orderId);
}
// Bad: depends on prior state
async function getCustomerOrders() {
// relies on a "selected customer" set by a previous call
return await db.orders.findByCustomer(state.selectedCustomerId);
}
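The structured-output principle looks the same at the tool level. A hedged sketch, reusing the hypothetical `db` client from above:

```typescript
// Good: structured output that any downstream skill can consume.
// (`db` is the same hypothetical database client used above.)
async function getUser(userId: string) {
  const user = await db.users.findById(userId);
  return { id: user.id, name: user.name, email: user.email, role: user.role };
}

// Bad: rendered prose -- a compositional dead end
async function describeUser(userId: string) {
  const user = await db.users.findById(userId);
  return `User ${user.name} (${user.email}) has the "${user.role}" role.`;
}
```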
Idempotency and side-effect management
An idempotent operation produces the same result whether you execute it once or ten times. This matters for skill files because agents frequently retry operations, sometimes because an error occurred, sometimes because they lost track of what they already did.
Build “ensure” patterns into your steps
Skill files that create or modify things should include checking logic. Instead of blindly creating, check first and handle the “already exists” case:
# Skill: ensure project setup
## Steps
### 1. Check if the project directory exists
Look for the project directory at the expected path.
If it already exists with the expected structure, skip to step 3.
### 2. Create the project directory
Create the directory and initialize the project scaffold...
### 3. Verify the setup
Confirm the directory exists and contains the expected files.
Report whether a new project was created or an existing one was found.
This skill is safe to run twice. The second run detects the existing project and skips the creation step. The output tells the agent whether anything actually changed.
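The same ensure pattern applies to the tools a skill calls. A minimal sketch, again assuming the hypothetical `db` client: check first, create only if missing, and report what actually changed:

```typescript
// Idempotent "ensure" tool: safe to call once or ten times.
// (`db.users.findByEmail` and `db.users.insert` are assumed methods
// on the hypothetical database client used earlier.)
async function ensureUser(email: string, name: string) {
  const existing = await db.users.findByEmail(email);
  if (existing) {
    // Second run: nothing to create; report that nothing changed.
    return { user: existing, created: false };
  }
  const user = await db.users.insert({ email, name });
  return { user, created: true };
}
```

The `created` flag mirrors the skill's final step: the agent always learns whether the run changed anything.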
The side-effect spectrum
Not all side effects are equal. Your skill file’s description should communicate where it falls on this spectrum:
| Category | Description | Agent impact | Example skills |
|---|---|---|---|
| Pure read | No side effects at all | Safe to call freely | search-files.md, get-user.md |
| Observable | Reads data but logs access | Safe but leaves traces | query-database.md (with audit log) |
| Reversible | Modifies data but can be undone | Agent should confirm before running | update-config.md, rename-file.md |
| Irreversible | Cannot be undone | Agent should strongly confirm | delete-user.md, send-email.md |
For irreversible skills, build a preview-and-confirm step into the skill file itself, so the agent shows what will happen before anything happens:
# Skill: send notification email
## Steps
### 1. Compose the email
Build the email with the recipient, subject, and body...
### 2. Preview before sending
Show the composed email to the user and ask for confirmation.
Display the recipient, subject line, and full body text.
Do NOT proceed until the user explicitly confirms.
### 3. Send the email
Only after confirmation, call the email API...
The confirmation step is part of the skill’s instructions. The agent follows the steps and naturally pauses for confirmation before the irreversible action.
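At the tool level, the equivalent safety net is an explicit dry-run flag: the agent previews first and only sends after the confirmation step passes. A sketch, with a hypothetical `emailClient`:

```typescript
type Email = { to: string; subject: string; body: string };

// Hypothetical email API -- stand-in for your real client.
declare const emailClient: { send(e: Email): Promise<void> };

// Irreversible tool with a built-in dry-run mode. With dryRun: true it
// only previews; nothing goes out until the user confirms and the
// agent calls it again with dryRun: false.
async function sendNotificationEmail(email: Email, dryRun: boolean) {
  if (dryRun) {
    return { sent: false, preview: email };
  }
  await emailClient.send(email);
  return { sent: true, preview: email };
}
```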
Progressive disclosure of complexity
A skill file should handle the common case simply and reveal complexity only when needed. The simplest invocation should cover the most common scenario, with optional inputs available for fine-tuning.
Layer your inputs
# Skill: search codebase
## Input
Required:
- A search term or regex pattern
Optional (for narrowing results):
- A directory path to search within (defaults to project root)
- A file type filter like "js", "py", or "rust"
Optional (for advanced tuning):
- Maximum number of results (default 50, max 500)
- Whether to match case-sensitively (default: no)
- Whether to include hidden directories (default: no)
The agent can invoke this skill with just a search term for the common case, or provide the optional inputs when it needs precision. The skill works well at every level of detail.
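A tool backing this skill can mirror the same layering: one required parameter, everything else optional with the documented defaults. A sketch with hypothetical names:

```typescript
// The optional layers mirror the skill's input section; everything
// beyond the search term defaults to the common case.
interface SearchOptions {
  directory?: string;      // defaults to project root
  fileType?: string;       // e.g. "js", "py", "rust"
  maxResults?: number;     // default 50, capped at 500
  caseSensitive?: boolean; // default false
  includeHidden?: boolean; // default false
}

async function searchCodebase(pattern: string, opts: SearchOptions = {}) {
  const settings = {
    pattern,
    directory: opts.directory ?? ".",
    fileType: opts.fileType,
    maxResults: Math.min(opts.maxResults ?? 50, 500),
    caseSensitive: opts.caseSensitive ?? false,
    includeHidden: opts.includeHidden ?? false,
  };
  // A real implementation would run the search here; the point is that
  // a bare searchCodebase("TODO") call covers the common case.
  return settings;
}
```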
Don’t require knowledge the agent might not have
If an input requires domain-specific knowledge, either make it optional with a good default, or explain the options in the skill file:
Good:
## Input
- The Elasticsearch index to query. Available indexes:
- `logs-app` for application logs (this is the default)
- `logs-system` for system/infrastructure logs
- `metrics-*` for time-series metrics data
Bad:
## Input
- The Elasticsearch index to query
The first version gives the agent everything it needs to make a good choice. The second forces the agent to guess or ask the user.
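In code, the same idea becomes an enumerated parameter with a sensible default, so the common case requires no Elasticsearch knowledge at all. A hedged sketch using the index names from the example:

```typescript
// Enumerate the valid indexes and default to the common one, so the
// agent never has to guess a valid name.
type LogIndex = "logs-app" | "logs-system" | "metrics-*";

async function queryLogs(query: string, index: LogIndex = "logs-app") {
  // A real implementation would call the Elasticsearch client here.
  return { index, query };
}
```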
When to split vs. merge skill files
This is the question that comes up most often in practice. Here are concrete heuristics.
Split when
- The skill has conditional branches. Different steps apply depending on a mode or operation type. That’s two skills sharing one file.
- The skill has unrelated failure modes. Errors from one path confuse recovery for another.
- The description requires “or.” “Search files or search file contents” should be two skill files.
- Different use cases need different risk levels. A skill that reads data and also deletes data shouldn’t combine those in one file. Reading is safe to run freely; deleting needs confirmation.
Merge when
- The operations are always done together. Compressing and uploading is one logical action. Make it one skill file.
- Splitting would create chatty round-trips. If the agent would always run skill A then immediately run skill B with A’s output, consider merging them into one file.
- The operations share expensive setup. If both need to authenticate with an external API or gather the same context, merging avoids duplicated work.
- The total skill count is getting unmanageable. If an agent has access to 200 skill files, it struggles to select the right one. Sometimes merging related skills improves selection accuracy.
The sweet spot
Most well-designed skill libraries land between 10 and 40 skill files. Fewer than 10 usually means each skill is too broad. More than 50 usually means skills are too granular and the agent spends too much time choosing.
A practical guideline: if you’re building a new skill, ask whether an existing skill file could handle the use case with an additional optional input or an extra step. If the addition fits naturally, extend the existing skill. If it feels forced, create a new one.
Naming conventions
Consistent naming helps agents (and humans) select skills quickly. Use a verb-noun pattern for file names and stick with it across your entire skill library:
- `review-pr.md` (not `pr-review.md` or `reviewPR.md`)
- `generate-changelog.md` (not `changelog.md` or `make-changelog.md`)
- `create-user.md` (not `add-user.md` or `new-user.md`)
- `search-files.md` (not `file-search.md` or `find-files.md`)
- `run-migration.md` (not `execute-migration.md` or `do-migration.md`)
Pick one verb for each operation type and use it everywhere:
| Operation | Verb | Example file names |
|---|---|---|
| Read one | get | get-user.md, get-order.md, get-config.md |
| Read many | list or search | list-users.md, search-files.md |
| Create | create | create-note.md, create-project.md |
| Update | update | update-config.md, update-profile.md |
| Delete | delete | delete-record.md, delete-branch.md |
| Execute | run | run-migration.md, run-tests.md |
| Generate | generate | generate-changelog.md, generate-docs.md |
| Review | review | review-pr.md, review-security.md |
The title inside the skill file (the `# Skill: ...` heading) should match the file name: `review-pr.md` contains `# Skill: PR review`, and `generate-changelog.md` contains `# Skill: changelog generator`. No ambiguity.
For real examples of focused, well-named skill files, see the PR review skill and the changelog generator skill in the skill library.
How tools fit in
Skills and tools are different things. A skill is a markdown file that describes a recipe: what to do, step by step, in human-readable terms. A tool is a function or API the agent can call to actually do something. Skills tell the agent what to do. Tools are what skills use to do it.
When you’re applying design principles, start with the skill file. Get the responsibility, composition, and naming right at the skill level first. Then implement the tools the skill needs. If your skill file is well-designed (focused, composable, with clear inputs and outputs), the underlying tool implementations tend to fall into place naturally.
// The "review-pr" skill might use tools like these:
const getDiff = { name: "get_pr_diff" /* ... */ };
const readFile = { name: "read_file" /* ... */ };
const postComment = { name: "post_review_comment" /* ... */ };
Each tool does one mechanical thing. The skill file orchestrates them into a meaningful workflow. This separation is what makes skill-based systems flexible: you can swap out tools without rewriting the skill, or rewrite the skill’s approach without changing the tools.
Putting it together
These principles reinforce each other. Single responsibility makes each skill file easy to understand. Composability makes small skills more powerful than big ones. Idempotency makes skills safe to retry. Progressive disclosure keeps simple cases simple. Thoughtful splitting and merging gives you enough granularity for precision without so much that selection becomes noisy.
For a deeper look at how to structure the sections within a single skill file, see How to design AI agent skills. To see these principles applied in practice, browse the skill library examples like the PR review skill, the changelog generator, the onboarding checklist generator, and the release notes generator. And when you’re ready to verify that your skills behave correctly, head to Testing and Debugging Skills to learn how to test across the full range of inputs an agent might encounter.
Related articles
How to design AI agent skills
A deep dive into the four pillars of skill design: clear descriptions, well-typed parameters, error handling, and predictable output.
The cost of bad tool descriptions
Your agent keeps picking the wrong tool. The problem isn't the code, it's the description. Real examples of tool descriptions that fail and how to fix them.
Prompt engineering vs skill design
They sound similar but they solve different problems. When to write a better prompt and when to build a skill instead.