This page is a work in progress and will be updated soon.

MCP Server

Connect any AI agent to your project's living memory via MCP

Overview

Keystone exposes its project intelligence as a remote MCP (Model Context Protocol) server. Any MCP-compatible client (Claude Code, Cursor, Zed, Claude Desktop) can query your project's memory mid-reasoning, before writing or reviewing code.

No local installation required. Clients connect with a single API key and a URL.

How it works

AI Agent
Claude Code · Cursor · Zed · Claude Desktop, or any MCP-compatible client
  ↓
Keystone MCP Server
  • Validate API key (SHA-256)
  • Resolve user + organization
  • Route JSON-RPC method
  ↓
tools/call
  • list_projects: DB query
  • ask: search + LLM + GitHub
  • search_memory: semantic search
  • get_project_status: DB query

The agent calls list_projects to discover slugs, then calls ask with one or more slugs and a question. Keystone searches its vector embeddings (commits, PRs, issues, READMEs), optionally reads live files from GitHub, and returns a Markdown answer with cited sources. The agent uses this context to make better decisions before touching any code.
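The discover-then-ask flow described above can be sketched as two JSON-RPC requests. This is a minimal illustration that only builds the request bodies; the helper name tool_call and the id counter are assumptions made for the example, not part of Keystone.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value per request works


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 tools/call request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: discover the slugs visible to this API key.
discover = tool_call("list_projects", {})

# Step 2: ask a question across the slugs returned by step 1.
question = tool_call(
    "ask",
    {
        "projectSlugs": ["my-api", "my-frontend"],
        "question": "What architectural pattern is used for state management?",
    },
)

print(json.dumps(question, indent=2))
```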

Quick start

Generate an API key

Open Settings → API Keys in the dashboard, click New key, give it a name (e.g. Claude Code – MacBook), and copy the token. The full token is shown only once.

Add Keystone to your IDE

Pick the snippet for your IDE below and paste it into the relevant config file, replacing ks_live_<your-token> with the key you just generated.

Restart and verify

Restart your IDE so it picks up the new server. Your agent should now show four Keystone tools available: list_projects, ask, search_memory, get_project_status.
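If the tools do not appear, one way to check is to compare a tools/list response against the four expected names. The sketch below assumes the standard MCP response shape (a result.tools array whose entries carry a name); the function name missing_tools and the sample response are illustrative only.

```python
EXPECTED_TOOLS = {"list_projects", "ask", "search_memory", "get_project_status"}


def missing_tools(tools_list_response: dict) -> set[str]:
    """Return the expected Keystone tools absent from a tools/list response."""
    tools = tools_list_response.get("result", {}).get("tools", [])
    advertised = {tool["name"] for tool in tools}
    return EXPECTED_TOOLS - advertised


# Sample response trimmed to the one field this check reads (made-up data).
sample = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "tools": [{"name": n} for n in ("list_projects", "ask", "search_memory")]
    },
}

print(missing_tools(sample))  # {'get_project_status'}
```

An empty set means all four tools were advertised.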

Available tools

list_projects

Lists every project in your organization with its current sync and synthesis status. Agents typically call this first to discover the project slugs they can use in other tools.

Inputs: none

Output:

[
  {
    "slug": "my-api",
    "name": "My API",
    "githubFullName": "org/my-api",
    "synthesisStatus": "completed",
    "lastSyncedAt": "2026-04-29T10:00:00.000Z"
  }
]
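An agent might filter this list before querying. The sketch below assumes the array has already been parsed from the tool result; the second project and its "pending" status are made up for illustration, since only "completed" appears in the example above.

```python
def ready_slugs(projects: list[dict]) -> list[str]:
    """Return slugs of projects whose synthesis has completed."""
    return [p["slug"] for p in projects if p.get("synthesisStatus") == "completed"]


projects = [
    {
        "slug": "my-api",
        "name": "My API",
        "githubFullName": "org/my-api",
        "synthesisStatus": "completed",
        "lastSyncedAt": "2026-04-29T10:00:00.000Z",
    },
    {
        # Hypothetical entry: the status value "pending" is an assumption.
        "slug": "my-frontend",
        "name": "My Frontend",
        "githubFullName": "org/my-frontend",
        "synthesisStatus": "pending",
        "lastSyncedAt": None,
    },
]

print(ready_slugs(projects))  # ['my-api']
```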

ask

The core tool. Asks a question about one or more projects using Keystone's full intelligence pipeline:

  1. Semantic search over pgvector embeddings of PRs, commits, issues, and READMEs (Mistral codestral-embed-2505)
  2. Live file access via the GitHub API; the agent can browse the repo tree and read specific files
  3. LLM synthesis with Mistral devstral-small-latest, constrained to cite sources and respond in Markdown
  4. Usage logged to ChatUsageLog with source: 'mcp'

Unlike the web chat, ask is non-streaming; it returns the complete answer once finished. This is intentional: MCP tool calls are synchronous from the agent's perspective.

Inputs:

{
  "projectSlugs": ["my-api", "my-frontend"],
  "question": "What architectural pattern is used for state management?"
}

Output: [{ "type": "text", "text": "<Markdown answer with citations>" }]
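A client that wants the answer as a single string can join the text blocks of the returned content array. This is a small sketch under the assumption that the content array has already been extracted from the JSON-RPC result; answer_text is a hypothetical helper name and the sample answer is invented.

```python
def answer_text(content: list[dict]) -> str:
    """Concatenate the text blocks of a tool result's content array."""
    return "\n".join(
        block["text"] for block in content if block.get("type") == "text"
    )


# Invented sample in the documented output shape.
content = [{"type": "text", "text": "## State management\nThe app uses ..."}]

print(answer_text(content))
```

Blocks of any other type are skipped rather than raising, so the helper keeps working if a result ever carries non-text content.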

search_memory

Returns raw matching chunks from the knowledge base, with no LLM involved. Useful when your own model wants to interpret the results, or when you want to inspect the context Keystone would feed into ask.

Inputs:

{
  "projectSlugs": ["my-api"],
  "query": "how authentication tokens are issued"
}

Output: an array of matches, each with repo, sourceId, content, metadata, and similarity (0–1). The server returns at most the top 12 matches whose similarity exceeds 0.3.
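Because the matches come back raw, the caller can apply a stricter cut than the server does. The sketch below narrows a result list on the client side; the function name strongest, its default thresholds, and the sample matches are all assumptions for illustration.

```python
def strongest(
    matches: list[dict], min_similarity: float = 0.5, limit: int = 3
) -> list[dict]:
    """Keep matches at or above min_similarity, best first, capped at limit."""
    kept = [m for m in matches if m["similarity"] >= min_similarity]
    return sorted(kept, key=lambda m: m["similarity"], reverse=True)[:limit]


# Invented sample matches using the documented field names.
matches = [
    {"repo": "org/my-api", "sourceId": "pr-1", "content": "Adds JWT issuance",
     "metadata": {}, "similarity": 0.41},
    {"repo": "org/my-api", "sourceId": "commit-2", "content": "Rotate signing keys",
     "metadata": {}, "similarity": 0.78},
    {"repo": "org/my-api", "sourceId": "issue-3", "content": "Token expiry bug",
     "metadata": {}, "similarity": 0.63},
]

for match in strongest(matches):
    print(match["sourceId"], match["similarity"])
```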

get_project_status

Returns the memory score and pipeline status for a single project. Useful for the agent to verify that a project has been ingested and synthesized before querying it.

Inputs:

{ "projectSlug": "my-api" }

Output:

{
  "slug": "my-api",
  "githubFullName": "org/my-api",
  "memoryScore": 100,
  "breakdown": {
    "total": 100,
    "ingestion": 25,
    "synthesis": 25,
    "coverage": 25,
    "freshness": 15,
    "keystoneFolder": 10
  },
  "synthesisStatus": "completed",
  "synthesisProgress": 100,
  "lastSyncedAt": "2026-04-29T10:00:00.000Z",
  "ingestionCompleted": true,
  "embeddingCount": 318,
  "keystoneFolder": { "detected": true, "fileCount": 6 }
}

Memory score breakdown (max 100):

  • Ingestion (25): at least one SyncLog with status: "COMPLETED".
  • Synthesis (25): synthesisStatus === "completed".
  • Coverage (25): based on the number of ProjectEmbedding rows (0 → 0, 1–49 → 8, 50–199 → 17, 200+ → 25).
  • Freshness (15): age of lastSyncedAt (≤7d → 15, ≤30d → 10, ≤90d → 5, older or null → 0).
  • .keystone folder (10): detected in the repo tree.
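The scoring rules above can be expressed as a short function. This is a sketch written from the listed thresholds, not Keystone's actual code; the function and parameter names are hypothetical, and treating each age boundary as inclusive is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def coverage_points(embedding_count: int) -> int:
    """Coverage tier from the number of ProjectEmbedding rows."""
    if embedding_count >= 200:
        return 25
    if embedding_count >= 50:
        return 17
    if embedding_count >= 1:
        return 8
    return 0


def freshness_points(last_synced_at: Optional[datetime], now: datetime) -> int:
    """Freshness tier from the age of the last sync (None scores 0)."""
    if last_synced_at is None:
        return 0
    age_days = (now - last_synced_at).total_seconds() / 86400
    if age_days <= 7:
        return 15
    if age_days <= 30:
        return 10
    if age_days <= 90:
        return 5
    return 0


def memory_score(ingested: bool, synthesized: bool, embedding_count: int,
                 last_synced_at: Optional[datetime], keystone_folder: bool,
                 now: datetime) -> dict:
    """Return the per-component breakdown plus the total (max 100)."""
    breakdown = {
        "ingestion": 25 if ingested else 0,
        "synthesis": 25 if synthesized else 0,
        "coverage": coverage_points(embedding_count),
        "freshness": freshness_points(last_synced_at, now),
        "keystoneFolder": 10 if keystone_folder else 0,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


now = datetime(2026, 5, 1, tzinfo=timezone.utc)
synced = datetime(2026, 4, 29, 10, 0, tzinfo=timezone.utc)
print(memory_score(True, True, 318, synced, True, now)["total"])  # 100
```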

Protocol

The server implements the MCP Streamable HTTP transport over a single endpoint:

POST /api/mcp

All requests and responses use JSON-RPC 2.0 (protocol version 2025-03-26). The server handles four method types:

| Method | Description |
| --- | --- |
| initialize | Handshake; client sends capabilities, server replies with its own |
| tools/list | Returns the list of available tools with their JSON schemas |
| tools/call | Executes a tool and returns the result |
| notifications/initialized | Client acknowledgement after init; server replies 202, no body |
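To make the routing concrete, here is a toy dispatcher for the four methods. It is an illustrative sketch, not Keystone's server code: the function name route, the stub results, and returning HTTP 200 for unknown methods are assumptions. The error code -32601 is the standard JSON-RPC 2.0 "Method not found" code.

```python
from typing import Optional

SERVER_INFO = {"name": "keystone", "version": "1.0.0"}


def route(request: dict) -> tuple[int, Optional[dict]]:
    """Map one JSON-RPC message to (HTTP status, response body)."""
    method = request.get("method")

    # Notifications carry no id and get no JSON-RPC response body.
    if method == "notifications/initialized":
        return 202, None

    if method == "initialize":
        result = {
            "protocolVersion": "2025-03-26",
            "capabilities": {"tools": {}},
            "serverInfo": SERVER_INFO,
        }
    elif method == "tools/list":
        result = {"tools": []}  # tool schemas omitted in this sketch
    elif method == "tools/call":
        result = {"content": [{"type": "text", "text": "stub result"}]}
    else:
        error = {"code": -32601, "message": f"Method not found: {method}"}
        return 200, {"jsonrpc": "2.0", "id": request.get("id"), "error": error}

    return 200, {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


status, body = route({"jsonrpc": "2.0", "method": "notifications/initialized"})
print(status, body)  # 202 None
```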

Example: initialize

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": { "name": "claude-code", "version": "1.0" }
  }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-03-26",
    "capabilities": { "tools": {} },
    "serverInfo": { "name": "keystone", "version": "1.0.0" }
  }
}

Example: tools/call

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "ask",
    "arguments": {
      "projectSlugs": ["my-api"],
      "question": "How is authentication implemented?"
    }
  }
}
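Outside an IDE, the same request can be prepared with Python's standard library. The sketch below only builds the request object and does not send it. The URL is the one from the IDE setup snippet; the Accept header follows the MCP Streamable HTTP specification, and whether Keystone enforces it is an assumption. build_request is a hypothetical helper name.

```python
import json
import urllib.request

MCP_URL = "https://app.keystone.dev/api/mcp"


def build_request(token: str, body: dict) -> urllib.request.Request:
    """Prepare a POST to the MCP endpoint carrying one JSON-RPC message."""
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


body = {
    "jsonrpc": "2.0",
    "id": 3,
    "method": "tools/call",
    "params": {
        "name": "ask",
        "arguments": {
            "projectSlugs": ["my-api"],
            "question": "How is authentication implemented?",
        },
    },
}

request = build_request("ks_live_<your-token>", body)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), not called in this sketch.
```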

Authentication

MCP clients authenticate using long-lived API keys with the prefix ks_live_. Keys are generated and revoked from Settings → API Keys.

Token format:

ks_live_<64 hex characters>

Security model:

  • The full token is shown once at creation time and is never stored in plain text
  • The database stores a SHA-256 hash plus a short prefix for display (e.g. ks_live_Ab3x...)
  • On each request, the server hashes the incoming token and looks it up, with no reversible storage
  • lastUsedAt is updated on every successful request

The MCP endpoint also accepts standard Supabase Bearer tokens, so you can call it during local development without generating an API key. The server detects the auth type from the token prefix.
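The hash-and-lookup model can be illustrated in a few lines. This is a sketch of the idea rather than Keystone's implementation: the field names hash and prefix, the 12-character display prefix, accepting both hex cases, and the auth type labels are all assumptions.

```python
import hashlib
import re

# ks_live_ followed by 64 hex characters (either case accepted here).
API_KEY_PATTERN = re.compile(r"ks_live_[0-9a-fA-F]{64}")


def auth_type(token: str) -> str:
    """Pick the auth path from the token prefix."""
    return "api_key" if token.startswith("ks_live_") else "supabase"


def stored_fields(token: str) -> dict:
    """Derive what would be persisted for a key: a hash and a display prefix."""
    if not API_KEY_PATTERN.fullmatch(token):
        raise ValueError("not a well-formed ks_live_ token")
    return {
        "hash": hashlib.sha256(token.encode("utf-8")).hexdigest(),
        "prefix": token[:12] + "...",
    }


token = "ks_live_" + "ab" * 32  # dummy token, 64 hex characters
fields = stored_fields(token)
print(fields["prefix"])  # ks_live_abab...

# Verifying a request repeats the hash and compares it to the stored value.
assert hashlib.sha256(token.encode("utf-8")).hexdigest() == fields["hash"]
```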

IDE setup

{
  "mcpServers": {
    "keystone": {
      "type": "http",
      "url": "https://app.keystone.dev/api/mcp",
      "headers": {
        "Authorization": "Bearer ks_live_<your-token>"
      }
    }
  }
}

| IDE | Config file |
| --- | --- |
| Claude Code | ~/.claude/settings.json (global) or .claude/settings.json (per-project) |
| Cursor | .cursor/mcp.json |
| Claude Desktop | claude_desktop_config.json |
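If the config file already lists other MCP servers, the Keystone entry should be merged in rather than pasted over them. A minimal sketch of that merge follows, operating on an already-parsed config; add_keystone and the other-server entry are hypothetical.

```python
import json


def add_keystone(config: dict, token: str) -> dict:
    """Return a copy of config with a keystone entry under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    servers["keystone"] = {
        "type": "http",
        "url": "https://app.keystone.dev/api/mcp",
        "headers": {"Authorization": f"Bearer {token}"},
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server.
existing = {
    "mcpServers": {
        "other-server": {"type": "http", "url": "https://example.com/mcp"}
    }
}

merged = add_keystone(existing, "ks_live_<your-token>")
print(json.dumps(merged, indent=2))
```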

Usage and limits

Every ask call is logged to ChatUsageLog with source: 'mcp' and counts toward the same weekly token budget as the web chat and CLI (WEEKLY_TOKEN_LIMIT, default 500_000 per user per organization, resets every Monday at 00:00 UTC).

search_memory, list_projects, and get_project_status do not consume the token budget; only ask invokes the LLM.
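A client that tracks its own usage can estimate when the budget resets. The sketch below computes the next Monday at 00:00 UTC and the remaining allowance; the helper names are hypothetical, and the used-token figure would have to come from the caller's own accounting.

```python
from datetime import datetime, timedelta, timezone

WEEKLY_TOKEN_LIMIT = 500_000  # documented default


def next_reset(now: datetime) -> datetime:
    """Next Monday 00:00 UTC strictly after now (now must be timezone-aware)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # weekday() is 0 on Monday; a Monday maps to 7 days ahead, not 0.
    days_ahead = (7 - now.weekday()) % 7 or 7
    return midnight + timedelta(days=days_ahead)


def remaining_tokens(used: int, limit: int = WEEKLY_TOKEN_LIMIT) -> int:
    """Tokens left in the current week, never negative."""
    return max(0, limit - used)


now = datetime(2026, 4, 29, 10, 0, tzinfo=timezone.utc)  # a Wednesday
print(next_reset(now).isoformat())  # 2026-05-04T00:00:00+00:00
print(remaining_tokens(123_456))    # 376544
```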

Rate limiting

Each API key has a per-minute cap on tools/call requests (MCP_RATE_LIMIT_PER_MINUTE, default 60). initialize and tools/list are exempt. When the limit is exceeded, the server returns HTTP 429 with Retry-After, X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers so clients can back off cleanly.
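Backing off on a 429 might look like the sketch below. It assumes Retry-After carries a number of seconds (HTTP also allows a date form, which this sketch does not parse) and falls back to a default wait otherwise; backoff_seconds is a hypothetical helper.

```python
def backoff_seconds(status: int, headers: dict, default: float = 1.0) -> float:
    """Seconds to wait before retrying; 0 when the request was not limited."""
    if status != 429:
        return 0.0
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after")))
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return default


print(backoff_seconds(429, {"Retry-After": "12"}))  # 12.0
print(backoff_seconds(200, {}))                     # 0.0
```

A caller would sleep for the returned duration and then retry the same tools/call request.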

Have your agent call list_projects once per session to discover slugs, then pass one or more slugs to ask to query across repositories simultaneously. Cross-repo questions ("how does the API authenticate against the worker?") work best when both projects are in the slug list.