# MCP Server Configuration
QMD exposes a Model Context Protocol (MCP) server via `qmd mcp`, enabling AI clients like Claude Code and Claude Desktop to call QMD's search and retrieval tools directly, without shell commands.
## Prerequisites
Install QMD and generate embeddings for at least one collection before configuring any MCP client:
```bash
npm install -g @tobilu/qmd
qmd collection add ~/path/to/markdown --name myknowledge
qmd embed
```
## Client Configuration
All clients use the same underlying command, `qmd mcp`, but place the JSON config in different files:
| Client | Config File |
|---|---|
| Claude Code | ~/.claude/settings.json |
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json |
| OpenClaw | ~/.openclaw/openclaw.json |
The JSON entry for Claude Code and Claude Desktop:
```json
{
  "mcpServers": {
    "qmd": { "command": "qmd", "args": ["mcp"] }
  }
}
```
OpenClaw uses a slightly different key structure: `mcp.servers` instead of `mcpServers`.
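When a config file already lists other MCP servers, merging the `qmd` entry is safer than overwriting the file. The following is a minimal Python sketch, assuming the config is plain JSON. The helper names `add_qmd_server` and `update_config_file` are hypothetical, and the nested `mcp` → `servers` layout used for OpenClaw is inferred from the key name above rather than confirmed, so check it against OpenClaw's own documentation.

```python
import copy
import json
from pathlib import Path

# The server entry shown in the JSON snippet above.
QMD_ENTRY = {"command": "qmd", "args": ["mcp"]}


def add_qmd_server(config: dict, key_path: tuple = ("mcpServers",)) -> dict:
    """Return a copy of `config` with the qmd entry added under `key_path`.

    Other servers and unrelated settings are left untouched.
    """
    merged = copy.deepcopy(config)
    node = merged
    for key in key_path:
        # Create intermediate objects only when they are missing.
        node = node.setdefault(key, {})
    node["qmd"] = copy.deepcopy(QMD_ENTRY)
    return merged


def update_config_file(path: Path, key_path: tuple = ("mcpServers",)) -> None:
    """Read a JSON config (or start empty), merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_qmd_server(config, key_path), indent=2) + "\n")
```

For Claude Code this might be called as `update_config_file(Path.home() / ".claude" / "settings.json")`. For OpenClaw, passing `key_path=("mcp", "servers")` reflects the assumed nesting.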
For Claude Code, the recommended path is the plugin marketplace:
```bash
claude plugin marketplace add tobi/qmd
claude plugin install qmd@qmd
```
## Transport: stdio vs. HTTP
By default, the MCP server runs over stdio, launched as a subprocess by each client. Models are therefore loaded fresh in every session, which adds roughly 3 GB of model loading on first use.
For a shared, long-lived server that keeps models in VRAM across requests, use HTTP transport:
```bash
qmd mcp --http              # localhost:8181, foreground
qmd mcp --http --port 8080  # custom port
qmd mcp --http --daemon     # background (PID at ~/.cache/qmd/mcp.pid)
qmd mcp stop                # stop daemon
```
HTTP endpoints:
- `POST /mcp`: MCP Streamable HTTP (JSON, stateless)
- `GET /health`: liveness check with uptime
Point any MCP client at `http://localhost:8181/mcp` to connect. Embedding and reranking contexts are disposed after 5 minutes of idle time, but LLM models stay loaded in VRAM.
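Before pointing a client at the HTTP server, a quick check can confirm that it is up. The sketch below polls `GET /health` using only the Python standard library. It assumes the endpoint answers with HTTP 200 when the server is alive. The response body is not parsed, since its format is not described here, and the function name `is_healthy` is hypothetical.

```python
import http.client
import urllib.request


def is_healthy(base_url: str = "http://localhost:8181", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP error statuses all surface
        # here (urllib.error.URLError is a subclass of OSError).
        return False
```

A startup script might loop on `is_healthy()` with a short sleep after running `qmd mcp --http --daemon`, and only then launch the client.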
## Exposed Tools
Once configured, these tools are available to the AI client:
| Tool | Purpose |
|---|---|
| `structured_search` | Search with pre-expanded queries; accepts `lex` (BM25), `vec` (vector), and `hyde` (hypothetical answer) sub-queries |
| `get` | Retrieve a document by file path or `#docid`, with optional full content and `lineNumbers` |
| `multi_get` | Batch retrieve by glob pattern or comma-separated list; `maxBytes` skips large files (default 10 KB) |
| `status` | Index health and collection info; no parameters |
See the full MCP setup reference for the complete structured_search schema and parameter details.
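On the wire, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request body in Python. The helper name `tool_call_request` is hypothetical, and the example uses only `status` because it takes no parameters; argument names for the other tools should come from the schema reference rather than from this sketch.

```python
import json
from itertools import count
from typing import Optional

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def tool_call_request(name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    # Request body asking the server for index health and collection info.
    print(json.dumps(tool_call_request("status"), indent=2))
```

With HTTP transport, a body like this would be sent to `POST /mcp`. Whether the server expects an MCP `initialize` exchange first is not covered here, so treat the payload as illustrative.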
## Troubleshooting
| Symptom | Fix |
|---|---|
| Server not starting | Run `which qmd` and `qmd mcp` manually to surface errors |
| No results | Run `qmd collection list` and `qmd embed` |
| Slow first search | Normal — ~3 GB of models are loading; subsequent searches are fast |
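The first two checks in the table can be scripted. This is a rough Python sketch, assuming `qmd collection list` exits with a non-zero status on failure (not confirmed here). The function name `check_qmd` is hypothetical.

```python
import shutil
import subprocess
from typing import Callable, List, Optional


def check_qmd(which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return a list of problems found; an empty list means the basics look fine."""
    path = which("qmd")
    if path is None:
        # Equivalent to `which qmd` printing nothing.
        return ["qmd is not on PATH, so an MCP client cannot launch `qmd mcp`"]

    problems = []
    result = subprocess.run(
        [path, "collection", "list"], capture_output=True, text=True
    )
    if result.returncode != 0:  # assumed failure signal
        problems.append(f"`qmd collection list` failed: {result.stderr.strip()}")
    return problems
```

If the list comes back empty but searches still return nothing, re-running `qmd embed` as suggested in the table is the next step.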
## Related Files
- `skills/qmd/references/mcp-setup.md`: concise MCP setup reference (added in PR #64)
- `README.md` MCP section: full HTTP transport docs and tool list