Setting Up a Persistent Memory MCP Server in 10 Minutes

A practical guide to connecting an MCP memory server to Claude Code, Cursor, or Claude Desktop so your AI agent remembers context across sessions.

Gene Avakyan · Founder, AgentPrizm · 7 min read

Most AI tools forget everything the moment you close the tab. You explain your codebase, your preferences, the three architectural decisions you already argued through last week — and the next session starts from zero. That is not a model problem. It is a memory problem, and it has a fairly boring fix: give the agent a place to write things down, and a standard way to reach it.

The standard part now exists. It is called MCP, and connecting a memory server to it takes about ten minutes. This post explains what MCP is, why memory is the use case worth caring about, and then walks through the actual setup.

What MCP is, in plain terms

MCP stands for the Model Context Protocol. It is an open standard, introduced by Anthropic in November 2024, for connecting AI applications to outside systems — data sources, tools, and workflows (Anthropic, "Introducing the Model Context Protocol").

A "server," in MCP terms, is just a program that exposes some capability — reading files, querying a database, storing memories — in a format the AI can understand. A "tool" is one specific action that server offers, like search or create. The AI application (Claude Code, Cursor, Claude Desktop, ChatGPT) is the "client." It speaks MCP; the server speaks MCP; they connect.
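
To make the client/server/tool vocabulary concrete, here is a minimal sketch of the two messages a client sends most often. MCP messages travel as JSON-RPC 2.0, and `tools/list` and `tools/call` are the protocol's method names. The `content` argument key is an assumption for illustration; the real argument names come from the schema the server advertises.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids; any unique value per request works


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request, the envelope MCP messages travel in."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name. The "content" key is hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {
        "name": "memory_create",
        "arguments": {"content": "This project uses MongoDB, not Postgres"},
    },
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice the client application builds and sends these for you; the sketch only shows what "speaking MCP" looks like on the wire.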

The official docs use a hardware analogy that lands well: MCP is "like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect electronic devices, MCP provides a standardized way to connect AI applications to external systems" (modelcontextprotocol.io).

Why does a standard matter here? Before MCP, every tool needed its own bespoke integration with every AI app — an N-times-M mess. Anthropic's framing was that even capable models are "constrained by their isolation from data — trapped behind information silos." MCP's answer: "Instead of maintaining separate connectors for each data source, developers can now build against a standard protocol" (Anthropic).

It is not a single-vendor idea anymore, which is the part execs should note. In March 2025, OpenAI announced it would support MCP; CEO Sam Altman wrote, "People love MCP and we are excited to add support across our products" (TechCrunch, March 26, 2025). When two direct competitors agree on a wire format, you are looking at infrastructure, not a fad. Build against it once and your integration should carry over to any client that supports the protocol.

Why memory is the killer use case

You can plug a lot of things into MCP. Calendars, Figma, databases, a 3D printer. But memory is the one that changes the economics of an AI product, and it is worth being concrete about why.

An agent without memory is a new hire on their first day, every single day. Pleasant, capable, and completely unaware that you have met before. For a developer, that means re-explaining the project on every session. For a business running support or sales agents, it is worse: the customer told you their account tier, their last ticket, their preferences — and the agent acts like none of it happened. That is repeated work, and customers notice.

Memory flips the default. The agent writes down what it learns and reads it back next time. Decisions persist. Preferences persist. Customer history persists. The work compounds instead of resetting.

There is a strategic angle too. A model is a commodity — you can swap Claude for GPT for Gemini behind the same interface. The accumulated, structured record of your customers and your codebase is not a commodity. It is the thing a competitor cannot copy by renting the same model. Memory is where the durable advantage lives, and MCP is the cleanest way to wire it in.

That is the case for connecting a memory server. Here is how.

The 10-minute setup

We will use AgentPrizm, our hosted memory layer, as the server. Full disclosure: this is our product. We are using it because it runs over remote MCP-over-HTTP, which means there is nothing to install — no subprocess, no clone, no local runtime. You add a config block and you are done. If you would rather run something else, the shape of the setup is similar; the config is what differs.

Step 1 — Get an API key (about 2 minutes)

Sign up at agentprizm.com, then open the Agents page in the dashboard and create a key. Keys are prefixed ap_. Copy it somewhere safe — treat it like a password, because anyone holding it can read and write your memories.
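
One way to treat the key like a password is to keep it out of files you might commit and read it from the environment instead. The sketch below assumes an environment variable named `AGENTPRIZM_API_KEY`; that name is made up for this example. Only the `ap_` prefix comes from the dashboard.

```python
import os

KEY_PREFIX = "ap_"              # prefix described in this post
ENV_VAR = "AGENTPRIZM_API_KEY"  # hypothetical variable name


def load_api_key(env=os.environ) -> str:
    """Read the key from the environment and sanity-check its prefix."""
    key = env.get(ENV_VAR, "").strip()
    if not key.startswith(KEY_PREFIX) or len(key) == len(KEY_PREFIX):
        raise ValueError(f"{ENV_VAR} is missing or does not start with {KEY_PREFIX}")
    return key


def redact(key: str) -> str:
    """Return a shortened form that is safe to print in logs."""
    return key[: len(KEY_PREFIX) + 2] + "..."


if __name__ == "__main__":
    try:
        print("Loaded key", redact(load_api_key()))
    except ValueError as error:
        print(error)
```

The prefix check only catches paste mistakes; it says nothing about whether the key is valid.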

The free Hobby tier is enough to test all of this end to end. Details are on the pricing page.

Step 2 — Add the MCP server config (about 3 minutes)

AgentPrizm exposes a remote MCP endpoint over HTTP. You point your client at the URL and pass the key as a Bearer token. That is the whole connection.

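The config has the same JSON shape across the clients covered here; mainly the file location changes. In Claude Code that is typically a .mcp.json file in the project root, in Cursor .cursor/mcp.json, and in Claude Desktop claude_desktop_config.json. Some clients and versions set up remote servers differently (through a settings screen, for example), so check your client's MCP docs if the server does not show up. Add this to your MCP servers config: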

{
  "mcpServers": {
    "agentprizm-memory": {
      "type": "http",
      "url": "https://agentprizm.com/api/mcp",
      "headers": {
        "Authorization": "Bearer ap_YOUR_KEY"
      }
    }
  }
}

Replace ap_YOUR_KEY with the key from Step 1. Save, then restart the client so it picks up the new server. Because this is HTTP, there is no local process to babysit — the client talks directly to https://agentprizm.com/api/mcp and authenticates with the header.
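
If your config file already lists other servers, hand-editing JSON is an easy place to drop a comma. A small sketch of merging the block above into an existing config without touching the other entries follows. The "filesystem" entry is a made-up example of a server you might already have.

```python
import json


def add_memory_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the memory server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["agentprizm-memory"] = {
        "type": "http",
        "url": "https://agentprizm.com/api/mcp",
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {
    "mcpServers": {
        "filesystem": {"command": "npx", "args": ["some-other-server"]},
    }
}

updated = add_memory_server(existing, "ap_YOUR_KEY")
print(json.dumps(updated, indent=2))
```

Write the result back to whichever file your client reads, and keep that file out of version control once it holds a real key.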

Step 3 — Confirm the tools loaded (about 1 minute)

Once the client restarts, it should discover the memory tools automatically. In Claude Code you can ask it to list available tools, or just say something like "remember that this project uses MongoDB, not Postgres" and watch it call memory_create. If the tools show up, you are connected.
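
If you want a stricter check than eyeballing the list, you can compare what the client discovered against the tool names listed in Step 4. The sketch below assumes you have the `result` object of a `tools/list` response in hand, which in MCP carries a `tools` array of objects with a `name` field. How you obtain that object depends on your client.

```python
EXPECTED_TOOLS = {
    "memory_bootstrap",
    "memory_recall",
    "memory_create",
    "memory_forget",
    "memory_ingest",
    "memory_ingest_url",
    "memory_context",
    "memory_profile",
}


def missing_tools(tools_list_result: dict) -> set:
    """Return expected tool names absent from a tools/list result."""
    discovered = {tool["name"] for tool in tools_list_result.get("tools", [])}
    return EXPECTED_TOOLS - discovered


# Made-up partial response, as if only two tools had loaded.
sample = {"tools": [{"name": "memory_create"}, {"name": "memory_recall"}]}
print(sorted(missing_tools(sample)))
```

An empty set means every tool from the table is available.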

Step 4 — Use it (the rest of your 10 minutes)

There are eight tools. You will mostly use two:

  • memory_create — store something: a fact, a decision, a preference.
  • memory_recall — semantically search what you have stored.

The full set:

| Tool | What it does |
|---|---|
| memory_bootstrap | Loads owner info, directives, and lessons at the start of a session |
| memory_recall | Semantic search across your memories |
| memory_create | Store a new memory |
| memory_forget | Remove or supersede a memory, with a reason logged |
| memory_ingest | Extract memories from a conversation transcript |
| memory_ingest_url | Fetch a web page, chunk it, and store it as searchable content |
| memory_context | Return a token-budgeted context block ready to drop into a prompt |
| memory_profile | Generate an executive summary of a set of memories |

A good habit: call memory_bootstrap at the start of a session and memory_create whenever a real decision gets made. The agent stops asking you to repeat yourself.
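
That habit can be sketched as a thin wrapper: bootstrap once when the session opens, then write a memory each time a decision lands. The `call_tool` callable stands in for whatever your client or agent framework uses to invoke MCP tools. The argument keys (`content`, `type`) are assumptions, not the documented schema.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict], Any]  # (tool name, arguments) -> result


class MemorySession:
    """Bootstrap on open, then record decisions as they are made."""

    def __init__(self, call_tool: ToolCaller):
        self._call = call_tool
        # Load standing context before any work starts.
        self.bootstrap = self._call("memory_bootstrap", {})

    def record_decision(self, text: str) -> Any:
        # Hypothetical argument names; check the server's tool schema.
        return self._call("memory_create", {"content": text, "type": "fact"})


# A fake transport that records calls, so the sketch runs offline.
calls = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    calls.append((name, arguments))
    return {"ok": True}


session = MemorySession(fake_call_tool)
session.record_decision("Use MongoDB, not Postgres")
print([name for name, _ in calls])  # prints ['memory_bootstrap', 'memory_create']
```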

Two concepts that make memory usable

A pile of notes with no organization is its own kind of useless. Two ideas keep it sane.

Containers scope memories to a project or context. Tag a memory with a container like acme-backend, and recalls filtered to that container will not drag in unrelated notes from a different client. It is routing, not a folder hierarchy — lighter weight, and it keeps one agent's work from bleeding into another's.
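
As a mental model only, container scoping behaves like a tag filter applied before recall. The toy in-memory version below is not how the service stores anything; the second container name is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    text: str
    container: str  # for example "acme-backend"


def recall_in(memories, container):
    """Keep only memories tagged with the given container."""
    return [m for m in memories if m.container == container]


store = [
    Memory("API uses MongoDB", "acme-backend"),
    Memory("Brand color is teal", "globex-site"),  # hypothetical second client
    Memory("Deploys run on Fridays", "acme-backend"),
]

for memory in recall_in(store, "acme-backend"):
    print(memory.text)
```

Nothing from globex-site comes back, which is the point: one client's notes never show up in another client's recall.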

Memory types keep the categories small on purpose. There are six, and only six:

  • fact — something true about the world or the project
  • lesson — something learned, usually the hard way
  • directive — a rule the agent should follow
  • preference — how you like things done
  • contact — a person and why they matter
  • bookmark — a link worth keeping

Typing a memory is what lets recall prioritize sensibly — a directive should outrank a bookmark when the agent is deciding what to act on.
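
A closed set of types is easy to model as an enum, and a priority table is one simple way to let type influence ordering. The six type names come from the list above. The priority order is an assumption for illustration; the only ordering the text states is that a directive outranks a bookmark.

```python
from enum import Enum


class MemoryType(str, Enum):
    FACT = "fact"
    LESSON = "lesson"
    DIRECTIVE = "directive"
    PREFERENCE = "preference"
    CONTACT = "contact"
    BOOKMARK = "bookmark"


# Assumed ordering; a lower number ranks first.
PRIORITY = {
    MemoryType.DIRECTIVE: 0,
    MemoryType.LESSON: 1,
    MemoryType.PREFERENCE: 2,
    MemoryType.FACT: 3,
    MemoryType.CONTACT: 4,
    MemoryType.BOOKMARK: 5,
}


def rank(memories):
    """Sort (type, text) pairs by type priority; ties keep input order."""
    return sorted(memories, key=lambda pair: PRIORITY[MemoryType(pair[0])])


recalled = [
    ("bookmark", "https://modelcontextprotocol.io"),
    ("fact", "This project uses MongoDB"),
    ("directive", "Never commit API keys"),
]

for kind, text in rank(recalled):
    print(f"{kind}: {text}")
```

Passing a type outside the six (say, "todo") raises a ValueError, which is the practical benefit of keeping the set closed.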

What this is and is not

Being straight about the boundaries: AgentPrizm is a hosted service you reach over a REST API or this MCP endpoint. It uses hybrid recall — semantic plus keyword — to find relevant memories, scoped by containers and filterable by type, with an audit trail behind forgets. That is the product.
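
For readers new to the term, "hybrid" generally means blending a semantic similarity score with a keyword match score. The sketch below shows one common way to do that, a weighted sum. It is an illustration of the general idea, not AgentPrizm's scoring: the two-element vectors are toy stand-ins for embeddings, and the `alpha` weight is an assumed parameter.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of distinct query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_score(query, query_vec, text, text_vec, alpha=0.5):
    """Blend semantic and keyword signals; alpha weights the semantic part."""
    semantic = cosine(query_vec, text_vec)
    keyword = keyword_overlap(query, text)
    return alpha * semantic + (1 - alpha) * keyword


score = hybrid_score(
    "mongodb database", [1.0, 0.0],  # toy query embedding
    "we use mongodb", [1.0, 0.0],    # toy memory embedding
)
print(score)  # prints 0.75
```

The keyword half catches exact identifiers that an embedding might blur, while the semantic half catches paraphrases that share no words with the query.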

It is not a self-hosted package or an SDK you embed, and the recall is hybrid rather than anything more exotic. If your requirements are heavier than "a hosted memory layer my agents reach over MCP," check the docs first to see whether the model fits before you wire it in.

For ten minutes of setup, though, the payoff is real: an agent that remembers what you told it last time. Once you have worked with one, the goldfish version is hard to go back to.


Want the deeper reference? The docs cover the full tool surface and the REST API. Pricing, including the free tier, is on the pricing page.
