Skills Are Prompts, But Not All Prompts Are Equal
What's new about a Skill isn't any single property. It is that the properties live together in one named, loadable package.
A Skill is a prompt. Anything that enters an LLM's context shapes what comes out, and a SKILL.md file enters the context like any user instruction or system prompt. Mechanically, the skeptic who says "Skills are just prompts" is right.
Each individual property of a Skill can already be had some other way. A narrowly written system prompt can be task-scoped. A shared template can be reused across people. Retrieval can load context only when it is relevant. A folder can package a prompt with examples and helper scripts. None of these capabilities is new on its own.
What is new is the combination, as a single package.
A Skill is the prompt-shaped package where four properties live together. It is task-scoped rather than project-scoped or session-scoped. The system loads it selectively when the task calls for it, instead of the user pasting it in. It can be reused across people, rather than living inside one person's session.¹ And it bundles its full payload, instructions plus examples plus scripts plus references, as one named unit that can be versioned, installed, and uninstalled.
| | Handwritten prompt | System prompt | Custom instruction | CLAUDE.md | Skill |
|---|---|---|---|---|---|
| Task-scoped | ● | if narrow | ○ | ○ | ● |
| Selectively loaded | ○ | ○ | ○ | ○ | ● |
| Reusable across people | ○ | if shared | ○ | ● | ● |
| Bundled payload | ○ | ○ | ○ | partial | ● |
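To make the last column concrete, here is a sketch of what such a package might look like on disk. The frontmatter fields of SKILL.md match the pptx example shown later; the directory name and file names here are hypothetical.

```
contract-review/               # the named, installable unit
├── SKILL.md                   # instructions; frontmatter drives selective loading
├── references/
│   └── checklist.md           # reference material the instructions point to
└── scripts/
    └── redline.py             # helper script shipped with the payload
```

The directory name makes the package addressable, the description in SKILL.md's frontmatter is what gets matched against the task at hand, and the rest of the payload stays out of context until the match fires.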
Each property, on its own, is reachable with what you already have. The combination is what's harder to assemble, because it requires the package to be a first-class unit the system loads, addresses, and unloads on its own. You can build that yourself, with a routing layer and a prompt library and some glue. Skills are the form in which the combination ships as a default, with the system handling the loading instead of the user.
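For a sense of what that assembly involves, here is a minimal hypothetical Python sketch of the do-it-yourself version: a prompt library, a keyword-based routing layer, and the glue that loads a matching package into context. Every name in it is illustrative, not any real API.

```python
from dataclasses import dataclass

@dataclass
class PromptPackage:
    """A named, reusable bundle: instructions plus any extra payload."""
    name: str
    triggers: list[str]          # crude stand-in for selective loading
    instructions: str

LIBRARY = [
    PromptPackage(
        name="contract-review",
        triggers=["contract", "indemnity", "liability"],
        instructions="Check termination, indemnity, and liability caps...",
    ),
    PromptPackage(
        name="slide-deck",
        triggers=["deck", "slides", "pptx"],
        instructions="Pick a content-informed palette; avoid accent lines...",
    ),
]

def route(user_message: str) -> str:
    """The glue: prepend matching packages to the message, skip the rest."""
    loaded = [
        pkg.instructions
        for pkg in LIBRARY
        if any(t in user_message.lower() for t in pkg.triggers)
    ]
    return "\n\n".join(loaded + [user_message])

print(route("Make a deck about quarterly results"))
```

Naming, matching, and selective loading all have to be written and maintained by hand here, and real routing would need something better than keyword matching. That assembly work is exactly what Skills ship as a default.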
The claim is not that this is unprecedented. The claim is that it is now the default form, and the default form is what determines whether most users will ever benefit from it. What unification enables is why that matters.
What Unification Makes Possible
Two consequences follow from unification. Both depend on the package being named, selectively loaded, and reusable, and each solves a different problem.
The first thing it enables is that expert knowledge becomes transferable across domains. A handwritten prompt encoding what a senior contracts lawyer would check is useful only to the person who wrote it, and only in the sessions where they remember to paste it. The same instructions in a Skill, named and addressable, can be loaded by a founder who has never read a contract carefully.
The lawyer's working knowledge becomes a passable substitute for the lawyer, accessible to anyone whose task calls for it. Before Skills, this transfer required the recipient to know enough to copy the right prompt at the right time. Now the system handles the loading.
The second thing it enables is that detailed refusals become viable. The model tends to produce something close to the average response in its training distribution. For factual queries, the average is usually correct. For creative or aesthetic work, the average is what people recognize as AI slop: generic, bland, machine-made.
Pushing the model off the average requires telling it specifically what to avoid, not just what to produce. The list of patterns to refuse is often long and domain-specific. A system prompt is the wrong place for it because most conversations don't need the list. A custom instruction is the wrong place because the scope is global.
A Skill loads only when the relevant task is at hand, so it can carry the full refusal list without paying a cost on every other interaction.
An Example That Shows Both at Once
Anthropic's pptx Skill is built for slide decks, and it shows both consequences working in a single package.² Asked to make a deck about quarterly results, a model with no Skill produces what people now recognize as AI slop: title plus three bullets per slide, generic styling, the same look every time.
The Skill changes the result. It tells the model to pick a content-informed color palette and reject default blue, to commit to a single visual motif and repeat it across the deck, to treat text-only slides as failures, and to vary layouts. It also names specific AI-generated tells and refuses them: accent lines under titles, full-width colored bars, cream backgrounds.
```
---
name: pptx
description: "Use this skill any time a .pptx file is involved..."
---

## Design Ideas

### Before Starting
- Pick a bold, content-informed color palette: the palette should feel designed for THIS topic.
- Dominance over equality: one color should dominate (60-70% visual weight).
- Commit to a visual motif: pick ONE distinctive element and repeat it across every slide.

### Avoid (Common Mistakes)
- Don't default to blue - pick colors that reflect the specific topic.
- Don't create text-only slides - add images, icons, charts, or visual elements.
- NEVER use accent lines under titles - these are a hallmark of AI-generated slides.
- Don't add full-width colored bars - they read as AI slop unless explicitly requested.
- Don't default to cream/beige backgrounds - use white or the user's brand palette.

## QA (Required)
Convert slides to images, then run a verification loop: overflow, low contrast, overlapping elements, leftover placeholder text. Fix and re-verify.
```
Two things came together in that one package. The user who asked for the deck did not need to know what makes a deck good. That knowledge came from a designer, packaged once, addressable from then on.
The deck also did not come back as the slop the model would have produced by default. The specific refusals came from someone who had watched the model fail enough times to know exactly which patterns to name.
A skeptic could write a system prompt that captures part of this, and live with the cost: it gets loaded on every call where it might be relevant, paid for in tokens and carried even on calls where it is not. The selective loading isn't a UX detail. It is what makes a long, domain-specific refusal list practical to maintain over time.
What Unification Does Not Solve
Skills can be bad. The format guarantees nothing about the contents. A Skill written by someone without taste in a domain produces tasteless results, and one written before a major change in tools or conventions can keep telling the model to follow rules that no longer apply.
Length and structure do not protect against shallow judgment. A long bad Skill can be worse than no Skill, because users may treat it as authority. Unification is a property of the package, not of what the package contains.
The Markdown Is Simple. What's in the Package Is Not.
Skills are prompts. That is the starting point.
What is new is not the markdown. It is that the markdown is now part of a named, selectively loaded, reusable package that carries its full payload as a unit. What ships in a Skill is a compact way to do two jobs at once: move expert knowledge across domains, and refuse the model's defaults at a level of detail most general-purpose prompts can't afford to carry.
The practical upshot is a different question to ask before reaching for a Skill. The question is no longer "is this prompt good." The question is "does this task call for a package the user could not have assembled themselves." When the answer is yes, the markdown stops being markdown and starts being infrastructure.
The markdown is simple. What's in the package is not.
1. Anthropic's documentation on Agent Skills covers the package format and the progressive-disclosure loading model in more detail. See platform.claude.com/docs/en/agents-and-tools/agent-skills/overview.
2. The full instruction file for the pptx Skill, including the specific patterns shown above, is published in Anthropic's open repository at github.com/anthropics/skills/tree/main/skills/pptx.