AI agent skill anti-patterns to avoid
Common mistakes in agent skill design: the god tool, leaky abstractions, over-parameterization, and patterns that lead to unreliable agents.
Learning what to do is important. Learning what not to do can save you weeks of debugging. This article catalogs the most common mistakes in agent skill design: patterns that seem reasonable at first but lead to unreliable, unmaintainable, or confusing agent behavior.
Each anti-pattern includes a description of the problem, a concrete example, the consequences, and the fix.
The god tool
The problem: A single tool that tries to do everything. It accepts dozens of parameters, handles multiple unrelated use cases, and has a description so long that the agent can’t figure out when to use it.
What it looks like:
{
name: "manage_project",
description: "Create, read, update, delete, search, analyze, deploy, test, " +
"lint, format, document, and monitor projects and their files, " +
"dependencies, configurations, and environments.",
parameters: {
action: { type: "string", enum: ["create", "read", "update", "delete",
"search", "analyze", "deploy", "test", "lint", "format", "doc", "monitor"] },
target: { type: "string" },
path: { type: "string", optional: true },
config: { type: "object", optional: true },
options: { type: "object", optional: true },
filters: { type: "object", optional: true },
output_format: { type: "string", optional: true },
recursive: { type: "boolean", optional: true },
dry_run: { type: "boolean", optional: true },
verbose: { type: "boolean", optional: true },
// ... 15 more parameters
}
}
Why it fails:
- The agent can’t reliably pick the right `action` value for a given task
- The parameters are a grab bag where most are irrelevant for any given invocation
- Error messages are generic because the tool doesn’t know which specific operation failed
- Testing requires covering a combinatorial explosion of action-parameter pairs
- A bug in the “deploy” path can break the “search” path because they share code
The fix: Split it into focused tools, each with a clear purpose and minimal parameters. Skill Composition shows how to combine focused tools into complex capabilities without losing the power of the monolithic approach.
// Instead of one god tool, build focused tools:
{ name: "search_files", parameters: { pattern: "string", path: "string" } }
{ name: "run_linter", parameters: { path: "string", fix: "boolean" } }
{ name: "deploy_project", parameters: { environment: "string", version: "string" } }
{ name: "run_tests", parameters: { path: "string", filter: "string" } }
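Each focused tool can then be implemented and tested in isolation. A hypothetical Python sketch of one of them, with a single job, two parameters, and an error message specific to that job (the implementation details are illustrative):

```python
import fnmatch
import os

def search_files(pattern: str, path: str) -> list[str]:
    """Return paths under `path` whose file name matches the glob `pattern`."""
    if not os.path.isdir(path):
        # A focused tool can raise a focused error
        raise NotADirectoryError(f"search path does not exist: {path}")
    matches = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Because the tool does one thing, its tests cover one behavior, and a bug here cannot break deployment or linting.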
Leaky abstractions
The problem: A tool that exposes internal implementation details in its interface, forcing the agent to understand the underlying system rather than working through a clean abstraction.
What it looks like:
from __future__ import annotations
# Leaky: requires knowledge of internal SQL schema
async def query_users(
table: str = "auth_users_v2",
join_clause: str = "",
where_clause: str = "",
select_columns: str = "*",
) -> list[dict]:
"""Query the user database using SQL fragments."""
query = f"SELECT {select_columns} FROM {table} {join_clause} WHERE {where_clause}"
return await db.execute(query)
The agent now needs to know the table is called auth_users_v2, how to write SQL join clauses, and what columns exist. The implementation is leaking straight through the interface.
What it should look like:
# Clean abstraction: domain-level interface
async def find_users(
name: str | None = None,
email: str | None = None,
role: str | None = None,
active: bool | None = None,
limit: int = 50,
) -> list[dict]:
"""Find users matching the given criteria.
Returns user records with id, name, email, role, and status fields.
All filters are optional and combined with AND logic.
"""
query = build_user_query(name=name, email=email, role=role, active=active)
return await db.execute(query, limit=limit)
Why leaky abstractions fail:
- The agent has to learn your internal data model, which eats context and introduces errors
- Implementation changes (renaming a table, changing a schema) break the agent’s learned patterns
- SQL injection and other security problems become the agent’s responsibility to avoid
- The tool can’t be reused because it’s tied to one specific database schema
The fix: Design tool interfaces at the domain level, not the implementation level. Ask “what does the user want to accomplish?” rather than “what does the system need to execute?” The parameter schema section of Tool Use Patterns covers this idea in more detail.
Over-parameterization
The problem: A tool with too many parameters, most of which rarely get used. The agent has to decide which parameters to set on every invocation, which increases errors and wastes context on parameter descriptions.
What it looks like:
{
name: "read_file",
parameters: {
path: { type: "string", description: "File path to read" },
encoding: { type: "string", description: "File encoding", default: "utf-8" },
start_line: { type: "number", description: "Starting line number" },
end_line: { type: "number", description: "Ending line number" },
max_lines: { type: "number", description: "Maximum lines to return" },
include_line_numbers: { type: "boolean", description: "Add line numbers to output" },
strip_comments: { type: "boolean", description: "Remove comment lines" },
collapse_whitespace: { type: "boolean", description: "Collapse multiple blank lines" },
language_hint: { type: "string", description: "Programming language for syntax detection" },
follow_imports: { type: "boolean", description: "Also read imported files" },
include_metadata: { type: "boolean", description: "Include file metadata in response" },
format: { type: "string", enum: ["raw", "markdown", "json"], description: "Output format" },
}
}
Twelve parameters for reading a file. The agent will sometimes set parameters it shouldn’t, forget ones it should, or construct invalid combinations.
The fix: Start with the minimum viable parameter set and create separate tools for specialized behavior. This is progressive disclosure: keep the common case simple, and let dedicated tools carry the extra options.
// Core tool: simple and focused
{
name: "read_file",
parameters: {
type: "object",
properties: {
path: { type: "string", description: "File path to read" },
offset: { type: "number", description: "Start at this line number" },
limit: { type: "number", description: "Maximum lines to return" },
},
required: ["path"], // offset and limit are optional by omission
}
}
// Specialized tool for when you need metadata
{
name: "file_info",
parameters: {
path: { type: "string", description: "File path to inspect" },
}
}
A reasonable guideline: if a tool has more than 5 parameters, take a hard look at whether it’s doing too much. Most well-designed tools need 2-4 parameters.
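The guideline is easy to enforce in a test suite. A minimal sketch, assuming tool schemas follow the JSON Schema `properties` shape used in the examples above:

```python
def flag_overparameterized(tools: list[dict], max_params: int = 5) -> list[str]:
    """Flag any tool whose parameter count exceeds the threshold."""
    flagged = []
    for tool in tools:
        props = tool.get("parameters", {}).get("properties", {})
        if len(props) > max_params:
            flagged.append(f"{tool['name']}: {len(props)} parameters")
    return flagged
```

Running this over your tool registry in CI turns "take a hard look" into a failing test.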
Ignoring error cases
The problem: Tools that only handle the happy path and return cryptic errors (or crash) when anything goes wrong.
What it looks like:
async def deploy(environment: str, version: str) -> dict:
"""Deploy the application."""
image = f"registry.example.com/app:{version}"
await docker_pull(image)
await docker_stop("app-container")
await docker_run(image, name="app-container")
return {"status": "deployed"}
What happens when the image doesn’t exist? When the container won’t stop? When the port is already in use? When the registry is unreachable? This tool treats deployment as an atomic operation that always succeeds. In reality, each step can fail in different ways.
What it should look like:
async def deploy(environment: str, version: str) -> dict:
"""Deploy the application to the specified environment."""
image = f"registry.example.com/app:{version}"
# Step 1: Verify image exists
try:
await docker_pull(image)
except ImageNotFoundError:
return {
"status": "failed",
"error": f"Image {image} not found in registry",
"suggestion": (
f"Check that version '{version}' has been built and pushed. "
"Run the CI pipeline to build the image first."
),
}
except ConnectionError:
return {
"status": "failed",
"error": "Cannot reach container registry",
"suggestion": "Check network connectivity and verify registry URL.",
"retryable": True,
}
# Step 2: Stop existing container (may not exist)
try:
await docker_stop("app-container")
except ContainerNotFoundError:
pass # No container running: that's fine
# Step 3: Start new container
try:
await docker_run(image, name="app-container")
except PortInUseError as e:
return {
"status": "failed",
"error": f"Port {e.port} is already in use",
"suggestion": f"Stop the process using port {e.port}, or use a different port.",
"partial": {"image_pulled": True, "old_container_stopped": True},
}
return {
"status": "deployed",
"image": image,
"environment": environment,
}
Every step has error handling. Every error message tells the agent what went wrong and what to try next. Partial progress is reported so the agent knows what state the system is in. For a thorough treatment of these patterns, see Error Handling Patterns.
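Structured errors like these give the calling side something to act on programmatically. A minimal sketch of a caller that honors the `retryable` flag, assuming the `deploy()` return shape shown above (the backoff values and `deploy_with_retry` name are illustrative):

```python
import asyncio

async def deploy_with_retry(deploy, environment: str, version: str,
                            max_attempts: int = 3,
                            base_delay: float = 1.0) -> dict:
    """Retry deploy() only on failures the tool marks as retryable."""
    result: dict = {}
    for attempt in range(1, max_attempts + 1):
        result = await deploy(environment, version)
        if result.get("status") == "deployed":
            return result
        if not result.get("retryable") or attempt == max_attempts:
            # Non-retryable failure: surface the error and suggestion as-is
            return result
        await asyncio.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return result
```

Transient registry outages get retried; a missing image is returned immediately with its suggestion intact.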
Tightly coupled skill chains
The problem: Tools that assume they’ll always be called in a specific sequence, with each tool depending on side effects of the previous one rather than on explicit inputs.
What it looks like:
// Step 1: writes to a global temp file
async function prepareData() {
const data = await fetchData();
await writeFile("/tmp/agent_data.json", JSON.stringify(data));
return { status: "prepared" };
}
// Step 2: reads from the same global temp file
async function analyzeData() {
const raw = await readFile("/tmp/agent_data.json");
const data = JSON.parse(raw);
const analysis = performAnalysis(data);
await writeFile("/tmp/agent_analysis.json", JSON.stringify(analysis));
return { status: "analyzed" };
}
// Step 3: reads from both global temp files
async function generateReport() {
const data = JSON.parse(await readFile("/tmp/agent_data.json"));
const analysis = JSON.parse(await readFile("/tmp/agent_analysis.json"));
return createReport(data, analysis);
}
Why this fails:
- If `prepareData` runs twice, the temp file is silently overwritten
- If `analyzeData` runs without `prepareData`, it fails with a confusing “file not found” error
- If two agent sessions run at the same time, they clobber each other’s temp files
- The tools can’t be reused anywhere else because they depend on hardcoded file paths
- Testing requires filesystem setup and teardown
The fix: Pass data explicitly through parameters. Each skill should be self-contained.
async function prepareData(): Promise<{ data: Record<string, unknown> }> {
const data = await fetchData();
return { data };
}
async function analyzeData(
data: Record<string, unknown>,
): Promise<{ analysis: AnalysisResult }> {
const analysis = performAnalysis(data);
return { analysis };
}
async function generateReport(
data: Record<string, unknown>,
analysis: AnalysisResult,
): Promise<{ report: string }> {
return { report: createReport(data, analysis) };
}
Now each tool is independent, testable, and composable. The orchestration layer (the agent or a workflow) passes data between them explicitly.
Vague or misleading descriptions
The problem: Tool descriptions that are too vague for the agent to know when to use them, or that mislead the agent about what the tool actually does.
Examples of bad descriptions:
| Description | Problem |
|---|---|
| “Handles files” | What does “handle” mean? Read? Write? Delete? Search? |
| “Database operations” | Which operations? On which database? |
| “Helper utility” | For what? When should it be used? |
| “Process data” | What kind of data? What kind of processing? |
| “Smart search” | What makes it “smart”? How is it different from regular search? |
The fix: Be specific about what the tool does, when to use it, and when not to use it. The Cost of Bad Descriptions catalogs the most common description mistakes with before-and-after fixes. Tool Use Patterns covers the broader design principles, but the short version: include the action, the target, example use cases, and explicit guidance on when to prefer a different tool.
Good: "Search for files by name or extension using glob patterns.
Use when you need to find files (e.g., all .tsx components, config files).
Do NOT use for searching file contents: use grep_search instead."
Hidden side effects
The problem: Tools that do more than their name and description suggest. A “read” tool that also logs access. A “search” tool that caches results to disk. An “analyze” tool that sends telemetry.
Hidden side effects break the agent’s ability to reason about what has happened. If the agent calls read_file and doesn’t expect any state to change, but the tool quietly modifies a cache or writes a log, the agent’s model of the system drifts from reality.
The fix:
- If a tool has side effects, say so. “Reads the file and records the access in the audit log.”
- If a side effect is optional (like caching), make it a parameter. “Set `cache: true` to cache the result for future reads.”
- Prefer pure tools (same input always produces same output, no external state changes) whenever possible. Side effects should be deliberate, not accidental.
Every one of these anti-patterns looks reasonable when you’re building fast and trying to get something working. The god tool feels productive because you only have one thing to maintain. The leaky abstraction feels efficient because you skip the wrapper layer. The tightly coupled chain feels clean because the data “just works” through temp files. But each one plants a failure mode that shows up later, usually at the worst possible time. The cheapest fix is to recognize these patterns early and restructure before they harden into architecture.
Related articles
Context management for AI agents
Strategies for working within context window limits: summarization, selective loading, and memory patterns for agent skills.
Error handling patterns for AI agents
How to build agent skills that handle failures gracefully: retry strategies, fallbacks, partial completion, and informative error responses.
Human-in-the-loop patterns for AI agents
Patterns for involving humans in agent workflows: approval gates, progressive autonomy, and knowing when to escalate.