Prompt engineering vs skill design
They sound similar, but they solve different problems. Here's when to write a better prompt and when to build a skill instead.
Someone tells you the agent isn’t working right. Do you fix the prompt or redesign the skill? Most people guess wrong. I’ve watched teams spend weeks rewriting system prompts when the real problem was a missing tool, and I’ve seen developers build entirely new MCP servers when all they needed was a two-sentence change to an instruction block.
The confusion makes sense. Both prompt engineering and skill design shape how an AI agent behaves. They overlap in places. But they solve different problems, and knowing which one to reach for will save you a lot of wasted effort.
What prompt engineering actually is
Prompt engineering is about how you talk to the model. It’s the system prompt, the user message structure, the examples you include, the formatting instructions you give. When you tell the model “respond in bullet points” or “use a formal tone” or “think step by step before answering,” that’s prompt engineering.
The scope is the conversation itself. You’re shaping what the model does with the information it already has access to. If the model understands the task and has the right data but produces bad output, prompt engineering is your fix.
Here’s a concrete example. You have an agent that summarizes support tickets. It has access to the ticket data through a tool. It pulls the right tickets. But the summaries are too long, miss the key action items, and use overly technical language for a non-technical audience. This is a prompt engineering problem. The model has everything it needs; it just needs better instructions on what to do with it.
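The fix in a case like this is usually a tighter instruction block, not a new capability. A sketch of what the revised system-prompt section might look like (the wording is illustrative, not a canonical template):

```text
Summarize each ticket in 3 bullet points or fewer.
List action items first, prefixed with "ACTION:".
Write for a non-technical audience: no stack traces, no internal
jargon. If a technical term is unavoidable, add a one-line gloss.
```

Nothing about the agent's tools changes; only the instructions for what to do with the data do.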
What skill design actually is
Skill design is about writing the skill file: the markdown document that defines what an agent should do, step by step, including when to trigger it, which tools to use, and what the output should look like. A skill file is the plan that ties together the agent’s available tools (searching a database, sending an email, reading a file, calling an API) into a coherent workflow.
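As a rough sketch, a minimal skill file might look like this (the section headings and the `search_tickets` tool name are illustrative, not a required schema):

```markdown
# Summarize support tickets

## When to use
The user asks for a status summary of recent support tickets.

## Steps
1. Call `search_tickets` with the user's project and date range.
2. For each ticket, extract the status and the latest update.
3. Summarize in 3 bullets or fewer, action items first.

## Output
A short plain-language summary suitable for a non-technical reader.
```

The prose instructions inside the steps are prompt engineering; deciding which tools exist and how they chain together is skill design.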
The scope of skill design extends beyond the conversation. You’re changing what the model can do, not just what it says. If the model doesn’t follow the right process, picks the wrong tool for a task, or misses important steps, your skill file needs work. And if the tools themselves are missing or return bad data, that’s a tool design problem that the skill file alone can’t solve, but the skill file is where you define which tools to call and how to use their output.
Same support ticket example, different failure mode. The agent produces great summaries when it has the data, but it keeps pulling tickets from the wrong project. Or it can only search by ticket ID when it really needs to search by date range. Or the tool returns the full ticket history when the agent only needs the latest update, flooding the context window. These are skill design problems. No amount of prompt rewriting will give the model a search-by-date capability that doesn’t exist.
When prompt engineering is the answer
Reach for prompt engineering when the model has the right information but does the wrong thing with it.
Formatting is off. The model returns a wall of text when you wanted a table. It gives you JSON when you wanted natural language. It uses markdown headers when you wanted a flat list. Add formatting instructions to your prompt.
Tone is wrong. The response is too casual for a business context, or too stiff for a chat interface. A sentence or two in the system prompt about voice and audience fixes this.
The model skips steps. It jumps to an answer without showing its reasoning. It forgets to check edge cases. It doesn’t ask clarifying questions when the input is ambiguous. Add explicit instructions about the process you want it to follow.
Output is inconsistent. Sometimes the model includes citations, sometimes it doesn’t. Sometimes it translates units, sometimes it doesn’t. Consistency problems are almost always prompt problems. Be explicit about what should always (or never) appear in the output.
The model misunderstands the task. Not because it lacks tools, but because the task description is vague. “Analyze this data” could mean a dozen things. “Calculate the month-over-month percentage change for each metric and flag any that changed by more than 10%” is a prompt fix.
When skill design is the answer
Reach for skill design when the model lacks the ability to do what’s needed.
Missing capability. The agent needs to send a Slack message but only has email tools. It needs to query a database but only has file-reading tools. You can’t prompt your way into a capability that doesn’t exist. Build the tool.
Wrong tool selection. The agent consistently picks search_documents when it should pick query_database. This is usually a description problem, which I’ll get to in a moment, but it’s still skill design territory. The tool descriptions are the interface the model uses to make selection decisions.
Bad tool output. The tool works but returns data in a format the model struggles with. Maybe it returns a massive XML blob when a clean JSON object would be better. Maybe it returns 500 results when the model only needs the top 10. Fix the tool’s output, not the prompt.
Missing parameters. The search tool only accepts a keyword but the model needs to filter by date. The email tool can send but can’t attach files. These are capability gaps in the tool itself.
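Closing that kind of gap means changing the tool's input schema, not the prompt. A sketch, assuming a JSON-Schema-style parameter block (the field names are hypothetical):

```json
{
  "name": "search_tickets",
  "parameters": {
    "type": "object",
    "properties": {
      "keyword": { "type": "string" },
      "date_from": { "type": "string", "format": "date" },
      "date_to": { "type": "string", "format": "date" }
    },
    "required": ["keyword"]
  }
}
```

Once `date_from` and `date_to` exist, a one-line note in the tool description about when to use them is all the prompting the model should need.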
The overlap zone, where it gets interesting
Here’s the thing that trips people up: tool descriptions are prompts. The description you write for a tool is prompt engineering applied to tool selection.
Consider this tool description:

```yaml
name: search_tickets
description: Searches tickets
```

Versus this one:

```yaml
name: search_tickets
description: >
  Search support tickets by keyword, status, or date range.
  Use this when the user asks about specific support issues, ticket
  history, or wants to find tickets matching certain criteria. Do NOT
  use this for general product questions.
```
The second description is doing prompt engineering work. It tells the model when to use the tool, when not to use it, and what kinds of inputs make sense. If your agent keeps picking the wrong tool, you might not need a new tool or a new system prompt. You might just need a better tool description.
This overlap zone is where I see the most confusion. Someone notices the agent picking the wrong tool and thinks “I need to redesign my skill architecture.” But really they need to spend 15 minutes rewriting their tool descriptions. That’s prompt engineering applied to skill design. For more on this, see the cost of bad descriptions.
A decision framework
When something goes wrong with your agent, walk through these questions in order:
Is the model picking the right tool? Look at the tool calls in your logs. If it’s choosing search_users when it should choose search_orders, that’s a skill design problem. Start with the tool descriptions. Make them clearer about when each tool should be used. If descriptions are already good, you might have too many similar tools and need to consolidate or rename them.
Is the tool returning useful data? Check what comes back from the tool call. Is the data there? Is it in a format the model can work with? Is there too much of it? If the tool returns garbage, fix the tool. If it returns too much, add filtering or pagination. If the format is confusing, restructure the output. This is skill design.
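The "too much data" fix can be as small as a shaping step inside the tool itself. A minimal sketch in Python (the field names and the 10-result cap are assumptions, not from any specific framework):

```python
def trim_ticket_results(tickets, limit=10):
    """Keep only the newest `limit` tickets, and only the fields the
    model actually needs, so the tool's output stays small."""
    newest = sorted(tickets, key=lambda t: t["updated_at"], reverse=True)
    return [
        {"id": t["id"], "status": t["status"], "updated_at": t["updated_at"]}
        for t in newest[:limit]
    ]
```

The model never sees the full ticket history, so the context window stays free for reasoning instead of raw data.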
Is the model using the data well? The right tool was called, good data came back, but the final response is still wrong. Now you’re in prompt engineering territory. The model needs better instructions about how to interpret the data, what to include in its response, and how to format the output.
Is the model following the right process? Maybe it calls one tool when it should call three in sequence. Maybe it doesn’t validate data before acting on it. This could go either way. If the process requires tools that don’t exist, that’s skill design. If the tools exist but the model doesn’t know the right order, that’s prompt engineering. You might add a step-by-step instruction in the system prompt, or you might build a multi-step workflow that encodes the process into the skill itself.
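The four questions above can be collapsed into a tiny triage helper. This is only a sketch of the decision order; the flag names are made up for illustration:

```python
def triage(right_tool, output_usable, good_response, right_process):
    """Walk the four diagnostic questions in order and name the
    layer to fix first. Each flag is the answer to one question."""
    if not right_tool:
        return "skill design: clarify tool descriptions"
    if not output_usable:
        return "skill design: fix the tool's output"
    if not good_response:
        return "prompt engineering: improve response instructions"
    if not right_process:
        return "either: instruct the process, or encode it in the skill file"
    return "no fix needed"
```

The point is the ordering: rule out the skill-layer failures first, because no prompt change can compensate for a wrong tool call or unusable data.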
Common mistakes I keep seeing
Prompting your way out of a missing skill. “Be sure to check the user’s order history before responding.” Great instruction, but if there’s no order history tool, the model will either hallucinate an answer or apologize that it can’t help. Neither is what you want. Check your tool use patterns and make sure the right capabilities exist.
Building a new skill when you need a better description. Someone notices the agent ignoring their carefully built tool and immediately starts building a second one. But the first tool’s description said “Performs data operations” which tells the model almost nothing. Rewrite the description first. Read more about this in writing skill instructions.
Overloading the system prompt to compensate for bad tools. I’ve seen system prompts that are thousands of tokens long, full of instructions like “When the user asks about X, call tool Y with parameter Z set to ‘abc’.” At that point you’re essentially hardcoding tool selection in the prompt, which defeats the purpose of having an agent that reasons about tool use. If you need that level of control, your tools need better descriptions or your tool set needs restructuring.
Ignoring the feedback loop. Prompt engineering and skill design aren’t one-time activities. You ship, you watch the logs, you see what goes wrong, and you iterate. The best agent developers I know spend as much time reading tool call logs as they do writing code. That observation loop is where you learn whether your next fix should be a prompt change or a skill change.
Start with the right question
The next time your agent misbehaves, resist the urge to immediately open the system prompt or start coding a new tool. Instead, ask: does the model have what it needs? If yes, it’s a prompt problem. If no, it’s a skill problem. That single question will point you in the right direction more often than not.
When you “design a skill,” the concrete output of that work is a skill file: a markdown document that defines what the agent does, when it does it, and how it reports results. Prompt engineering shapes how the agent communicates. Skill design produces the files that define what the agent can do. The next time you’re deciding which to invest in, ask whether the problem is in how the agent talks or in what the agent does. That tells you whether to edit a prompt or write a skill file.
For a deeper look at how tools and agents interact, see the tool use patterns guide. And if you’re designing skills from scratch, the anatomy of a skill article walks through the structure of a well-built one. For real examples of skill files that put these ideas into practice, see the PR review skill and test writer skill.
Related articles
Writing effective skill instructions
The words you use to describe a skill matter more than the code behind it. How to write instructions that agents actually follow.
How skills use tools
How to write skills that call the right tools with the right inputs and handle what comes back. Skills encode judgment; tools execute.
How to design AI agent skills
A deep dive into the four pillars of skill design: clear descriptions, well-typed parameters, error handling, and predictable output.