How Agents Find the Right Skill: Semantic Discovery for SKILL.md

Folders of SKILL.md files do not scale. How AgentPrizm lets agents discover skills by intent — semantic search over a governed registry, via MCP or REST.

Gene Avakyan · Founder, AgentPrizm · 7 min read

A SKILL.md file is a small, beautiful idea. You write down how to do one thing — fill out a quarterly tax form, reconcile a Stripe payout, draft a SOC 2 evidence request — as a self-contained markdown document the agent can load and follow. One file, one capability. It reads like a recipe, and it works.

It works at three skills. It works at ten. Then a team adopts the pattern, every agent run that solves something new leaves a skill behind, and a few months later you have a folder with two hundred markdown files in it. The idea was never wrong. The way most teams organize it is what breaks.

The folder is a filing cabinet, not an index

Filename- and path-based organization quietly assumes the agent already knows what it is looking for. That assumption fails in three predictable ways as a fleet grows.

The agent cannot know what exists. An agent mid-task does not have your directory tree memorized. To pick a skill, it either has to be told the exact path up front or has to list a folder and guess from filenames. At ten files that is fine. At two hundred, the list of names alone is more context than you want to spend, and most of it is irrelevant to the task at hand. The agent ends up choosing from the handful of skills someone remembered to mention, which means the other hundred and ninety might as well not exist.

Lookup is by name, not by intent. Say the relevant skill lives at skills/irs-1040-schedule-c.md. The agent's task is "help this freelancer report their business income." A keyword or path match needs the words to line up. "Report business income" does not contain "1040" or "Schedule C," so a literal lookup misses the one file that would have solved the problem. The knowledge is right there on disk and the agent walks past it, because the filing system indexes by label and the agent is searching by purpose.
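A minimal sketch of that miss, assuming a deliberately naive word-overlap lookup (the tokenizer and matching rule here are illustrative, not how any particular tool resolves paths):

```python
import re

def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens; a naive stand-in for keyword lookup."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_match(query: str, path: str) -> set:
    """Return the words the query and the file path have in common."""
    return tokens(query) & tokens(path)

path = "skills/irs-1040-schedule-c.md"
task = "help this freelancer report their business income"

# The task shares no words with the path, so a literal lookup finds nothing.
print(keyword_match(task, path))

# A match only appears when the caller already knows the label.
print(keyword_match("schedule c", path))
```

The second call succeeds only because the query already contains the filename's vocabulary, which is exactly the knowledge the agent lacks mid-task.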

Duplication compounds the first two. When an agent cannot find the skill that already exists, it does the reasonable thing: it writes a new one. Now you have tax-form-helper.md and irs-1040-schedule-c.md doing nearly the same job under different names, drifting apart over time. The folder does not get more useful as it grows. Past a certain size it gets less useful, because the cost of finding the right skill rises faster than the value of having more of them.

Notice that none of these are storage problems. The files are fine. The retrieval is broken — and retrieval is the hard part, the same way it is for agent memory. Writing things down is easy. Finding the right one at the right moment is the engineering.

Discover by what you want to do

The fix is to let an agent find a skill by intent rather than by name. Instead of "give me the file at this path," the agent asks "what do I have that helps me report business income?" and gets back the skills that match what it is trying to accomplish — regardless of what they happen to be called.

AgentPrizm's AgentSkills does this with semantic search. When a skill is published, the system embeds its discovery text: the skill's name and description, not the full body. That distinction matters. The description is written to express what the skill is for: the goal it serves, the situation it applies to. Embedding intent-shaped text means a query like "fill out a tax form" lands on the right skill even when the filename says irs-1040-schedule-c and the word "tax" never appears in that filename. The match is on meaning, not on string overlap.

Keeping the embedded text to name plus description is deliberate, not a shortcut. Discovery is a different job from execution. The agent first needs to find which skill, cheaply and accurately, across a large set; only then does it load the full instructions for the one it picked. Embedding short, intent-dense discovery text keeps that first step fast and keeps the match focused on purpose rather than on incidental phrasing buried in a long document.
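The discovery step can be sketched as follows. This is not AgentPrizm's implementation: `toy_embed` and its `CONCEPTS` lexicon are hypothetical stand-ins for a real embedding model, used only so the example runs without one, and the `Skill` fields are assumed for illustration. The point it shows is that only name plus description is embedded, while the body stays out of the index.

```python
import math
import re
from dataclasses import dataclass

@dataclass
class Skill:
    id: str
    name: str
    description: str
    body: str  # full instructions; loaded after discovery, never embedded

# Hypothetical concept lexicon standing in for a learned embedding model.
CONCEPTS = {
    "tax": 0, "taxes": 0, "irs": 0,
    "income": 1, "revenue": 1,
    "freelance": 2, "freelancer": 2, "proprietor": 2,
    "stripe": 3, "payout": 3, "reconcile": 3,
}

def toy_embed(text: str) -> list:
    """Count concept hits per dimension (a toy, not a real embedding)."""
    vec = [0.0] * 4
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in CONCEPTS:
            vec[CONCEPTS[word]] += 1.0
    return vec

def cosine(a: list, b: list) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def discovery_text(skill: Skill) -> str:
    """Only the name and description are embedded for discovery."""
    return f"{skill.name}. {skill.description}"

def search(query: str, skills: list, embed=toy_embed, k: int = 3) -> list:
    """Rank skills by similarity between the query and their discovery text."""
    q = embed(query)
    scored = [(s.id, cosine(q, embed(discovery_text(s)))) for s in skills]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

skills = [
    Skill("sk-1", "irs-1040-schedule-c",
          "Report sole-proprietor business income and expenses to the IRS.", "..."),
    Skill("sk-2", "stripe-payout-reconcile",
          "Reconcile a Stripe payout against ledger revenue.", "..."),
]

results = search("fill out a tax form for freelance income", skills)
print(results[0][0])  # id of the closest match
```

With a real model, `embed` would be swapped for a call to that model; the rest of the flow stays the same.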

In practice the flow looks like this. An agent hits a step it does not already know how to do, searches the registry by describing the goal, gets back the closest-matching skills, and pulls the full body of the best one — all mid-task, without a human wiring up paths in advance:

# Find a skill by intent, then load its full instructions
skill_search  query="report freelance business income on taxes"
skill_get     id="<the best match from the search>"

That is runtime discovery. The agent is not limited to the skills someone thought to mention when the run started. It can reach into the whole registry the moment it needs to, which is exactly when filename-based lookup leaves it stranded.
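As a sketch of how an agent loop might wrap those two calls: the tool names `skill_search` and `skill_get` come from the snippet above, but the argument and return shapes, the `min_score` cutoff, and the `StubClient` are assumptions made up for this example so it can run without a server.

```python
from typing import Optional, Protocol

class SkillClient(Protocol):
    # Hypothetical shapes; the real tool results may differ.
    def skill_search(self, query: str) -> list: ...
    def skill_get(self, id: str) -> dict: ...

def load_skill_for(goal: str, client: SkillClient,
                   min_score: float = 0.5) -> Optional[str]:
    """Search by intent, then fetch the full body of the best match."""
    hits = client.skill_search(query=goal)
    if not hits:
        return None
    best = max(hits, key=lambda h: h["score"])
    if best["score"] < min_score:
        return None  # nothing close enough; the agent proceeds without a skill
    return client.skill_get(id=best["id"])["body"]

class StubClient:
    """In-memory stand-in for the registry connection."""
    def skill_search(self, query: str) -> list:
        return [{"id": "sk-1", "score": 0.91}, {"id": "sk-2", "score": 0.12}]

    def skill_get(self, id: str) -> dict:
        return {"id": id, "body": f"# instructions for {id}"}

body = load_skill_for("report freelance business income on taxes", StubClient())
print(body)
```

Only the winning skill's body is loaded, so the context cost of discovery stays small no matter how large the registry is.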

Discovery is more than nearest-neighbor

It is tempting to look at this and conclude you just need a vector database — embed the skills, do a similarity search, done. Nearest-neighbor is the easy 80 percent. The part that makes a skill registry actually usable in a fleet is everything a raw vector index leaves out.

Scoping. A search should return the skills you are allowed to use, not every skill that has ever been embedded. AgentSkills scopes discovery to your containers — your private or team skills — or, when you want it, across the public marketplace. The same intent query gives different, correct answers depending on whose skills are in scope. A bare vector index has no notion of "yours" versus "everyone's"; it just returns whatever is nearest, which in a shared store is a privacy and relevance problem at once.
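One way to picture scoping is as a filter applied before ranking, so out-of-scope skills can neither leak into results nor crowd out in-scope ones. The sketch below assumes each indexed entry carries a container label and a precomputed similarity score; the `Entry` fields and the `"public"` label are hypothetical, not AgentSkills' schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    skill_id: str
    container: str  # hypothetical: owning container id, or "public"
    score: float    # similarity to the query, precomputed for brevity

def scoped_search(entries, allowed_containers, include_public=False, k=3):
    """Filter to the caller's scope first, then take the top k by score."""
    in_scope = [
        e for e in entries
        if e.container in allowed_containers
        or (include_public and e.container == "public")
    ]
    return sorted(in_scope, key=lambda e: e.score, reverse=True)[:k]

entries = [
    Entry("sk-a1", "team-a", 0.80),
    Entry("sk-b1", "team-b", 0.95),  # nearest overall, but not team-a's
    Entry("sk-p1", "public", 0.88),
]

own = [e.skill_id for e in scoped_search(entries, {"team-a"})]
wide = [e.skill_id for e in scoped_search(entries, {"team-a"}, include_public=True)]
print(own)
print(wide)
```

The same query yields different result sets for different callers, and team-b's skill never appears for team-a even though it is the nearest vector.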

Governance. Skills are not anonymous vectors. Each one has an owner and a provenance. You can install a skill — take a private copy you control — or fork it, which produces a public, copyleft descendant whose lineage stays attached. Those operations are the difference between "I found some text that looked relevant" and "I adopted a known skill from a known source on known terms." A vector store does not model ownership or derivation; a registry has to.
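The install-versus-fork distinction can be sketched as a small data model. Everything below is an illustrative assumption rather than AgentPrizm's schema: the field names, the `"copyleft"` placeholder license string, and the choice to record `derived_from` on installed copies as well as forks.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SkillRecord:
    id: str
    owner: str
    visibility: str                     # "private" or "public"
    license: str
    derived_from: Optional[str] = None  # lineage pointer to the source skill

def install(source: SkillRecord, new_id: str, new_owner: str) -> SkillRecord:
    """Take a private copy the new owner controls."""
    return SkillRecord(new_id, new_owner, "private", source.license,
                       derived_from=source.id)

def fork(source: SkillRecord, new_id: str, new_owner: str) -> SkillRecord:
    """Produce a public, copyleft descendant with its lineage attached."""
    return SkillRecord(new_id, new_owner, "public", "copyleft",
                       derived_from=source.id)

def lineage(record: SkillRecord, registry: dict) -> list:
    """Walk derived_from pointers back to the original skill."""
    chain = [record.id]
    while record.derived_from is not None:
        record = registry[record.derived_from]
        chain.append(record.id)
    return chain

original = SkillRecord("sk-1", "alice", "public", "copyleft")
installed = install(original, "sk-2", "bob")
forked = fork(original, "sk-3", "carol")
registry = {r.id: r for r in (original, installed, forked)}

print(installed.visibility, forked.visibility)
print(lineage(forked, registry))
```

None of this fits in a bare vector row, which is why the registry has to carry it alongside the embedding.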

Versioning. A skill changes over time. Discovery has to keep pointing at the right version, and adoption has to be a deliberate act rather than a silent overwrite the next time someone re-embeds the folder. Treating skills as governed records — not as rows that get clobbered — is what lets a fleet trust the registry instead of re-verifying every result by hand.
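A minimal sketch of that idea, assuming append-only versions and an explicit pin per adopter. The class names and methods are hypothetical and only illustrate the "deliberate act, not silent overwrite" property.

```python
from dataclasses import dataclass, field

@dataclass
class VersionedSkill:
    skill_id: str
    versions: list = field(default_factory=list)  # append-only; index + 1 is the version

    def publish(self, body: str) -> int:
        """Add a new version without touching earlier ones."""
        self.versions.append(body)
        return len(self.versions)

    def get(self, version: int) -> str:
        return self.versions[version - 1]

@dataclass
class Adoption:
    skill_id: str
    pinned: int  # the version this adopter has chosen

    def upgrade(self, skill: VersionedSkill, to_version: int) -> None:
        """Move the pin only when the adopter asks for it."""
        if not 1 <= to_version <= len(skill.versions):
            raise ValueError(f"unknown version {to_version}")
        self.pinned = to_version

skill = VersionedSkill("sk-1")
v1 = skill.publish("steps for the 2023 form")
adoption = Adoption("sk-1", pinned=v1)

v2 = skill.publish("steps for the 2024 form")
before = skill.get(adoption.pinned)  # still the version that was adopted

adoption.upgrade(skill, v2)
after = skill.get(adoption.pinned)
print(before, "->", after)
```

Publishing a new version changes nothing for existing adopters until they upgrade, so results stay stable between deliberate changes.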

Strip those three away and you are back to a folder of files, just with cosine similarity bolted on. The embedding is the entry point. Scoping, governance, and versioning are what make the answer trustworthy — which is the whole reason an agent should rely on discovery in the first place.

Private skills and a public marketplace

The same machinery runs at two scopes. Inside your containers, AgentSkills is a private registry your agents discover against — the skills your team has accumulated, searchable by intent, scoped to you. Across the public marketplace, the same intent search reaches a shared pool of skills that anyone can browse at /skills and that agents can search through the marketplace endpoint.

The marketplace is free today. We expect to add monetization for skill authors in the future, but that does not exist yet — there is nothing to buy or sell right now, and publishing or forking costs nothing. We would rather say that plainly than imply a market that is not there.

Everything described here is driven the way agents already work: through MCP tools (skill_search, skill_get, and the rest) or over REST at /api/v1/agent/skills/search for your own scope and /api/v1/agent/marketplace/search for the public pool. An agent that already speaks to AgentPrizm for memory reaches skills through the same connection — no new integration, no local subprocess to babysit.
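For the REST route, a request could be assembled along these lines. Only the two paths come from this post; the base URL, the POST method, the JSON body with a `query` field, and the bearer-token header are assumptions for illustration, so check the docs for the actual contract. The request is built but not sent, which keeps the sketch runnable offline.

```python
import json
import urllib.request

# Paths as given in the post; everything else here is assumed.
SEARCH_PATHS = {
    "own": "/api/v1/agent/skills/search",
    "marketplace": "/api/v1/agent/marketplace/search",
}

def build_search_request(base_url: str, api_key: str, query: str,
                         scope: str = "own") -> urllib.request.Request:
    """Assemble a search request for the caller's scope or the public pool."""
    payload = json.dumps({"query": query}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        url=base_url.rstrip("/") + SEARCH_PATHS[scope],
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",  # assumed method
    )

req = build_search_request(
    "https://agentprizm.example",  # hypothetical base URL
    "YOUR_API_KEY",
    "report freelance business income on taxes",
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```

Switching `scope` to `"marketplace"` targets the public pool with the same query.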

The takeaway

A SKILL.md is the right unit. The folder is the wrong index. As soon as you have more skills than an agent can hold in its head, the question stops being "where is the file" and becomes "what do I have that helps me do this" — and that is a semantic question, not a path lookup. Embedding intent-shaped discovery text answers it; scoping, governance, and versioning make the answer one you can build on.

If you want to see it, browse the public skills at /skills, or read the developer-facing detail — the search endpoints, container scoping, and install-versus-fork semantics — in the docs.

