MCP Server Configuration

QMD exposes a Model Context Protocol (MCP) server via `qmd mcp`, so AI clients such as Claude Code and Claude Desktop can call QMD's search and retrieval tools directly, without going through shell commands.

Prerequisites

Install QMD and generate embeddings for at least one collection before configuring any MCP client:

npm install -g @tobilu/qmd
qmd collection add ~/path/to/markdown --name myknowledge
qmd embed

Client Configuration

All clients launch the same underlying command, `qmd mcp`, but each reads its JSON config from a different file:

| Client | Config File |
| --- | --- |
| Claude Code | `~/.claude/settings.json` |
| Claude Desktop | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| OpenClaw | `~/.openclaw/openclaw.json` |

The JSON entry for Claude Code and Claude Desktop:

{
  "mcpServers": {
    "qmd": { "command": "qmd", "args": ["mcp"] }
  }
}

OpenClaw uses a slightly different key structure: `mcp.servers` instead of `mcpServers`.
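When a config file already contains other settings or servers, the entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of such a merge, not part of QMD itself. The helper names `add_qmd_server` and `update_config_file` are hypothetical, and reading OpenClaw's `mcp.servers` as a nested `mcp` object containing `servers` is an assumption about that file's layout.

```python
import copy
import json

# The server entry from the JSON snippet above; it is the same for every client.
QMD_ENTRY = {"command": "qmd", "args": ["mcp"]}


def add_qmd_server(config, key_path=("mcpServers",), name="qmd"):
    """Return a copy of `config` with the qmd entry added under `key_path`.

    Existing servers and unrelated settings are left untouched.
    """
    merged = copy.deepcopy(config)
    node = merged
    for key in key_path:
        # Create intermediate objects only when they are missing.
        node = node.setdefault(key, {})
    node[name] = copy.deepcopy(QMD_ENTRY)
    return merged


def update_config_file(path, key_path=("mcpServers",)):
    """Hypothetical helper: read a JSON config file, merge, and write it back."""
    try:
        with open(path, encoding="utf-8") as fh:
            config = json.load(fh)
    except FileNotFoundError:
        config = {}  # start from an empty config if the file does not exist yet
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(add_qmd_server(config, key_path), fh, indent=2)
        fh.write("\n")
```

For Claude Code or Claude Desktop the default `key_path` applies. For OpenClaw, passing `key_path=("mcp", "servers")` would match the assumed nested layout; check the actual file before relying on it.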

For Claude Code, the recommended path is the plugin marketplace:

claude plugin marketplace add tobi/qmd
claude plugin install qmd@qmd

Transport: stdio vs. HTTP

By default, the MCP server runs over stdio: each client launches its own `qmd mcp` subprocess. Models are therefore loaded fresh in every session, so the first use in each session pays the cost of loading about 3 GB of models.
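To make the stdio transport concrete, here is a rough sketch of what a client does when it launches the server as a subprocess. It follows the general MCP stdio convention of one JSON-RPC message per line. The protocol version string, the client name, and the assumption that the first line on stdout is the `initialize` response are illustrative choices, not details confirmed for QMD.

```python
import json
import subprocess


def encode_message(message):
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def initialize_request(request_id=1):
    """Build the MCP `initialize` request that opens a session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumption: a version the server accepts
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }


def start_session(command=("qmd", "mcp")):
    """Spawn the server and return (process, first reply parsed as a dict)."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    proc.stdin.write(encode_message(initialize_request()))
    proc.stdin.flush()
    # Blocks until the server writes one line to stdout.
    reply = json.loads(proc.stdout.readline())
    return proc, reply
```

Because every client calls something like `start_session()` on its own, each one gets a separate process with its own copy of the models, which is the overhead the HTTP transport avoids.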

For a shared, long-lived server that keeps models in VRAM across requests, use HTTP transport:

qmd mcp --http # localhost:8181, foreground
qmd mcp --http --port 8080 # custom port
qmd mcp --http --daemon # background (PID at ~/.cache/qmd/mcp.pid)
qmd mcp stop # stop daemon

HTTP endpoints:

POST /mcp — MCP Streamable HTTP (JSON, stateless)
GET /health — liveness check with uptime

Point any MCP client at http://localhost:8181/mcp to connect (adjust the port if you passed `--port`). Embedding and reranking contexts are disposed of after 5 minutes of idle time, but the LLM models stay loaded in VRAM.
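The two endpoints can also be exercised without a full MCP client. The sketch below builds a JSON-RPC `tools/call` request for `POST /mcp` and probes `GET /health`, using only the Python standard library. The function names are hypothetical, and two things are assumptions: that the stateless server accepts a `tools/call` without a prior `initialize` exchange, and that `/health` answers with HTTP 200 when the server is up.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8181"  # default address from the commands above


def tool_call_request(name, arguments=None, request_id=1, base_url=BASE_URL):
    """Build (but do not send) a POST /mcp request that calls one tool."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return urllib.request.Request(
        base_url + "/mcp",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise both response formats.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def is_alive(base_url=BASE_URL, timeout=2.0):
    """Return True if GET /health answers with HTTP 200 (assumed success code)."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, or an HTTP error status
        return False


def call_tool(name, arguments=None, timeout=30.0):
    """Send the request and return the raw response body as text."""
    request = tool_call_request(name, arguments)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the daemon running, `call_tool("status")` is a reasonable first call, since `status` takes no parameters. The body is returned as raw text because the exact response shape is not specified here.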

Exposed Tools

Once configured, these tools are available to the AI client:

| Tool | Purpose |
| --- | --- |
| `structured_search` | Search with pre-expanded queries; accepts `lex` (BM25), `vec` (vector), and `hyde` (hypothetical answer) sub-queries |
| `get` | Retrieve a document by file path or `#docid`, with optional `full` content and `lineNumbers` |
| `multi_get` | Batch retrieve by glob pattern or comma-separated list; `maxBytes` skips large files (default 10 KB) |
| `status` | Index health and collection info; no parameters |

See the full MCP setup reference for the complete `structured_search` schema and parameter details.

Troubleshooting

| Symptom | Fix |
| --- | --- |
| Server not starting | Run `which qmd` and `qmd mcp` manually to surface errors |
| No results | Run `qmd collection list` and `qmd embed` |
| Slow first search | Normal: ~3 GB of models are loading; subsequent searches are fast |
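The first two checks in the table can be scripted. The sketch below mirrors `which qmd` and `qmd collection list` from Python. The `diagnose` helper is hypothetical, and treating empty output from `qmd collection list` as "no collections" is an assumption about the CLI's output format.

```python
import shutil
import subprocess


def diagnose(which=shutil.which, run=subprocess.run):
    """Return a list of human-readable findings about the local qmd install."""
    findings = []
    path = which("qmd")  # same lookup as `which qmd`
    if path is None:
        findings.append("qmd is not on PATH; MCP clients cannot launch it")
        return findings
    findings.append(f"qmd found at {path}")

    result = run([path, "collection", "list"], capture_output=True, text=True)
    if result.returncode != 0:
        findings.append(f"`qmd collection list` failed: {result.stderr.strip()}")
    elif not result.stdout.strip():
        # Assumption: no output means no collections are configured.
        findings.append("no collections listed; add one and run `qmd embed`")
    else:
        findings.append("collections are configured")
    return findings
```

The `which` and `run` parameters exist only so the function can be exercised with stand-ins; calling `diagnose()` with no arguments uses the real PATH lookup and subprocess call.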