From 288ad4cb69089b20214e72c9e0aa8f6515719ba6 Mon Sep 17 00:00:00 2001
From: Kaysser Kayyali
Date: Thu, 18 Jun 2026 14:32:22 -0400
Subject: [PATCH] docs(slice 11): plan doc + ADR 0010 + CONTEXT cross-ref

* docs/plan/11-slice-mcp-http-docker.md: slice plan with AC + test
  plan, mirroring the 11 existing plan docs.
* docs/adr/0010-streamable-http-transport.md: captures Streamable
  HTTP vs legacy HTTP+SSE, Starlette vs FastAPI/mcp-lib, stateless
  v1 (no Mcp-Session-Id), one-shot JSON as common case, append to
  requirements vs split, baked graph + mount override.
* CONTEXT.md: new 'Transports' section under Operations documenting
  the transport-agnostic dispatcher + stdio + Streamable HTTP.
---
 CONTEXT.md                                 |  21 ++++
 docs/adr/0010-streamable-http-transport.md | 119 +++++++++++++++++++++
 docs/plan/11-slice-mcp-http-docker.md      | 103 ++++++++++++++++++
 3 files changed, 243 insertions(+)
 create mode 100644 docs/adr/0010-streamable-http-transport.md
 create mode 100644 docs/plan/11-slice-mcp-http-docker.md

diff --git a/CONTEXT.md b/CONTEXT.md
index b024c78..a96434a 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -66,6 +66,27 @@ A record of one execution of the consistency engine (started_at, rules_run, viol
 **Cognee**:
 The substrate — the MIT-licensed framework that owns extraction, embedding, and the `remember`/`recall`/`forget` API. Pinned at v1.1.2 (ADR 0006). Graph backend pinned to Neo4j, overriding Cognee's Kuzu default (ADR 0008).
 
+### Transports
+
+**MCP (Model Context Protocol)**:
+The wire protocol clients use to talk to the server. We implement
+MCP 2024-11-05 over two transports:
+
+- **stdio** (slice 2.6) — JSON-RPC 2.0 over stdin/stdout, stdlib only.
+  Entry: `scripts/05_mcp_server.py`. Used by Claude Desktop, Continue,
+  Cline, and any stdio-native MCP client.
+- **Streamable HTTP** (slice 11, ADR 0010) — single `POST /mcp`
+  endpoint. The response body is either `application/json` (one-shot,
+  the common case) or `text/event-stream` (SSE-upgraded via the
+  client's `Accept` header). Entry: `scripts/06_mcp_http_server.py`.
+  Container: `lore-engine-mcp` (Dockerfile + docker-compose.yml).
+
+The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
+**transport-agnostic** — both transports are thin adapters over
+`handle_message(msg) -> Optional[dict]`. The 36-tool registry
+(`TOOL_REGISTRY` in `mcp_tools.py`) is shared verbatim between
+them.
+
 ## Avoid list (global)
 
 - **world_id** — use `setting_id`.
diff --git a/docs/adr/0010-streamable-http-transport.md b/docs/adr/0010-streamable-http-transport.md
new file mode 100644
index 0000000..6722f1e
--- /dev/null
+++ b/docs/adr/0010-streamable-http-transport.md
@@ -0,0 +1,119 @@
+# Streamable HTTP transport + Starlette over the stdlib stdio server
+
+**Status:** accepted. Slice 11 (2026-06-18).
+
+## Context
+
+The MCP server in `lore_engine-poc/scripts/05_mcp_server.py`
+(slice 2.6) is stdlib-only and speaks JSON-RPC 2.0 over
+stdin/stdout. That works for stdio-native clients (Claude Desktop,
+Continue, Cline) but excludes any HTTP-based consumer: Open WebUI,
+plain `curl`, a browser, a sidecar, a webhook. We need an HTTP
+transport on the same dispatcher.
+
+## Decisions
+
+### 1. Streamable HTTP, not legacy HTTP+SSE
+
+The MCP 2024-11-05 transport split endpoints: `GET /sse` for the
+server-to-client event stream, `POST /messages` for client-to-server
+requests. The 2025-03-26 spec deprecated that in favor of
+**Streamable HTTP** (retained in 2025-06-18): a single `POST /mcp`
+endpoint whose response body is either `application/json` (one-shot) or
+`text/event-stream` (SSE-upgraded), per the client's `Accept` header.
+
+We adopt Streamable HTTP because:
+
+- It matches the request/response shape of every tool in the
+  registry. None of the 36 tools are long-lived or push server-side
+  state; forcing them into an SSE envelope would be ceremony.
+- It's the path the MCP spec authors are steering clients toward.
+  Adopting it now means a single transport serves both current and
+  near-future clients.
+- A single endpoint is dramatically easier to operate, secure, and
+  reverse-proxy. Two-endpoint transports invite deployment mistakes
+  (CORS on one, not the other; load-balancer routing on `/messages`
+  but not `/sse`; etc.).
+
+### 2. Starlette + uvicorn, not FastAPI or the `mcp` PyPI lib
+
+The HTTP layer is a thin transport adapter over the existing
+dispatcher. Starlette gives us an ASGI app + `JSONResponse` +
+`StreamingResponse` and is the smallest reasonable dep set.
+uvicorn is the standard ASGI server. We do **not** need:
+
+- FastAPI's pydantic models — our request body is an untyped JSON
+  dict the dispatcher validates. Adding pydantic per tool schema
+  would be a second source of truth.
+- The `mcp` / `fastmcp` PyPI packages — the `scripts/05_mcp_server.py`
+  docstring explicitly says "we deliberately avoid `mcp` /
+  `fastmcp` pip dependencies" because the stdlib dispatch is
+  smaller, more auditable, and decoupled from the spec library's
+  release cadence. Mirror that here: the HTTP path is also a
+  thin hand-rolled adapter.
+
+### 3. Stateless in v1 — no `Mcp-Session-Id`
+
+The Streamable HTTP spec allows a stateful session identified by
+an `Mcp-Session-Id` header. We don't need it because:
+
+- The dispatcher is stateless across requests.
+- The graph is loaded once at startup and never re-read.
+- No tool mutates per-session state (write tools mutate the graph,
+  but that's process-global, not session-scoped).
+
+Adding session bookkeeping would buy us nothing and complicate
+the curl example. Re-evaluate when we add server-pushed events
+or per-client cursors.
+
+### 4. One-shot JSON is the common case; SSE is for protocol compliance
+
+Every lore tool is synchronous and finishes in <1s. JSON is the
+default. SSE is opt-in via `Accept: text/event-stream` and serves
+two purposes: (a) protocol compliance, so that future clients which
+expect SSE frames still work, and (b) a migration path to
+server-initiated events if we ever add them (notifications,
+progress, long-running tool status).
+
+### 5. Append to `requirements.txt`, not a separate file
+
+The HTTP deps (`starlette`, `uvicorn`, `httpx`) are added to
+the existing manifest. The stdio entry script doesn't import
+them; the only consumers are `mcp_http.py` and the new tests.
+A separate `requirements-http.txt` would force operators to
+remember which deps go with which transport; the unified
+manifest is honest about what's installed.
+
+A split is the right move if/when we add a multi-stage Docker
+build or a serverless deployment. Not yet.
+
+### 6. Bake `.graph.pkl` (165 KB) into the image; mount is the override
+
+The cached graph is small enough (165 KB today) that baking it
+into the image is faster than a bind mount and removes a runtime
+dependency. As the codex grows toward ~10 MB, switch to a
+mandatory bind mount with a documented path.
+
+## What this means
+
+- The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
+  **transport-agnostic**. Stdio (slice 2.6) and HTTP (slice 11)
+  are both thin adapters over `handle_message(msg) -> Optional[dict]`.
+- The HTTP path is a single `POST /mcp` endpoint. No GET listener,
+  no `/sse` legacy path, no `Mcp-Session-Id`.
+- Single-process uvicorn. Multi-worker is intentionally not
+  exposed. The graph is in-memory per-process; multi-worker
+  would diverge silently.
+- Write tools (`add_entity`, `add_relation`, `add_lore_source`,
+  slice 10's `update_entity`, `delete_entity`, `set_alias`, ...)
+  mutate the in-memory graph only. Restart to reset. Document
+  this in the README — same caveat as stdio.
+
+## Deferred (explicitly)
+
+- CORS — out of scope until a browser client appears. Open WebUI
+  proxies server-side.
+- OpenAI / Open WebUI tool-call bridge — slice 12. The HTTP
+  transport itself is consumable from any HTTP MCP client today.
+- In-container Cognee ingest — the container serves a pre-built
+  graph. Ingest-in-container is a later slice.
diff --git a/docs/plan/11-slice-mcp-http-docker.md b/docs/plan/11-slice-mcp-http-docker.md
new file mode 100644
index 0000000..27c740b
--- /dev/null
+++ b/docs/plan/11-slice-mcp-http-docker.md
@@ -0,0 +1,103 @@
+# Slice 11 — Streamable HTTP transport + container
+
+**Status:** ✅ shipped 2026-06-18. Commits `0088360`, `1e239d9`,
+`225253b`, `4513255`, `ca5b197` (impl repo
+`git.homelab.local/kaykayyali/lore-engine-poc-v3`). The suite was
+529/529 at slice 11.0 and 554/554 at slice 11.5. ADR `0010`
+captures the spec + framework decisions.
+
+## Goal
+
+Add a Streamable HTTP transport (MCP 2025-06-18) on top of the
+existing transport-agnostic `MCPServer` dispatcher, and ship a
+Docker image for the HTTP path. The stdio path stays byte-identical
+to slice 2.6.
+
+## What shipped
+
+| Component | Where | Notes |
+|---|---|---|
+| ASGI app | `lore_engine_poc/mcp_http.py` | Starlette adapter over `MCPServer.handle_message`. One endpoint `POST /mcp`. |
+| Entry script | `scripts/06_mcp_http_server.py` | Eager-load graph, `uvicorn.run(...)`. Env: `LORE_GRAPH_PATH`, `LORE_HTTP_HOST`, `LORE_HTTP_PORT`. |
+| Docker image | `Dockerfile` | `python:3.12-slim`, baked `.graph.pkl`, `HEALTHCHECK` against `/mcp`. |
+| Compose | `docker-compose.yml` | One service, port 8765, bind-mount override (commented). |
+| Deps | `requirements.txt` | `starlette>=0.40`, `uvicorn>=0.30`, `httpx>=0.27`. |
+| Tests | `tests/test_mcp/test_mcp_http_module.py` (14), `test_scripts_06.py` (7), `test_dockerfile.py` (4) | In-process, subprocess, docker. |
+
+## What was NOT shipped (deferred)
+
+- **Session management** (`Mcp-Session-Id`). Stateless in v1; re-evaluate when we add server-initiated events or per-session state.
+- **CORS**. Out of scope until a browser client appears. Open WebUI proxies server-side.
+- **Write-back to disk from write tools.** Same caveat as stdio: write tools mutate the in-memory graph only; restart to reset.
+- **Multi-worker uvicorn.** Intentionally not exposed. The graph is in-memory per-process; multi-worker would diverge silently.
+- **In-container Cognee ingest.** The container serves a pre-built graph. Ingest-in-container is a later slice.
+- **OpenAI / Open WebUI bridge.** The HTTP transport is consumable from any HTTP MCP client (Claude Desktop via `docker`, curl, httpx). A bridge for Open WebUI's OpenAI-compatible tool protocol is slice 12.
+
+## Acceptance criteria
+
+| # | Criterion | Test |
+|---|---|---|
+| 11.1 | `POST /mcp initialize` returns `protocolVersion == "2024-11-05"` + `serverInfo.name == "lore-engine-poc"` | `test_mcp_http_module.py:1`, `test_scripts_06.py:1` |
+| 11.2 | `POST /mcp tools/list` returns 36 tools | `test_mcp_http_module.py:2`, `test_scripts_06.py:2` |
+| 11.3 | `POST /mcp tools/call was_true_at` returns `was_true: true` for the Roland Raventhorne / House Raventhorne / `3rd_age.year_345` query | `test_mcp_http_module.py:4`, `test_scripts_06.py:3` |
+| 11.4 | Malformed JSON → HTTP 400 + JSON-RPC `-32700` | `test_mcp_http_module.py:5` |
+| 11.5 | `notifications/initialized` → HTTP 202 with empty body | `test_mcp_http_module.py:7`, `test_scripts_06.py:5` |
+| 11.6 | Unknown tool name → JSON-RPC `-32602` | `test_mcp_http_module.py:8` |
+| 11.7 | Tool-body exception → `isError: True` in result envelope (HTTP 200) | `test_mcp_http_module.py:9` |
+| 11.8 | `Accept: text/event-stream` → `Content-Type: text/event-stream` + body matches `^event: message\ndata: \{.*\}\n\n$` | `test_mcp_http_module.py:11`, `test_scripts_06.py:4` |
+| 11.9 | `Accept: application/json` (or absent) → JSON body | `test_mcp_http_module.py:10` |
+| 11.10 | `docker build -t lore-engine-mcp:test .` exits 0 | `test_dockerfile.py:1` |
+| 11.11 | `docker run` exposes a healthy `/mcp` endpoint | `test_dockerfile.py:2`, `:4` |
+| 11.12 | `docker compose up` brings the service up; round-trip works; `LORE_HTTP_PORT` env var overrides the host port | `test_dockerfile.py:3` |
+| 11.13 | The 529 existing tests still pass; stdio path is unchanged | `python3 -m pytest tests/ -q` |
+
+## Test plan
+
+- **In-process** (`tests/test_mcp/test_mcp_http_module.py`, 14 tests) — `httpx.AsyncClient` + `ASGITransport`, no real socket. Reuses the trivial test-double registry from `test_server.py` for dispatcher-shape tests and the real `TOOL_REGISTRY` + cached graph for the end-to-end `was_true_at` test. Each async test uses `asyncio.run` (no `pytest-asyncio` dep).
+- **Subprocess** (`tests/test_mcp/test_scripts_06.py`, 7 tests) — boots `scripts/06_mcp_http_server.py` on `LORE_HTTP_PORT=0` (OS-assigned), parses the bound port from the "Uvicorn running on" line, then exercises the server over a real socket via `httpx`. Mirrors `test_scripts_05.py`.
+- **Docker** (`tests/test_mcp/test_dockerfile.py`, 4 tests) — gated on `shutil.which("docker")`. Build, run + round-trip, compose up + round-trip, healthcheck reaches `healthy`. The compose test uses `COMPOSE_PROJECT_NAME` + `LORE_HTTP_PORT` to allow parallel CI runs.
+
+## Architectural decisions
+
+- **Streamable HTTP over legacy HTTP+SSE.** See ADR 0010. The 2024-11-05 transport split traffic across a long-lived `GET /sse` stream and a separate `POST /messages` endpoint, which doesn't match a request/response tool API. Streamable HTTP is the transport in the 2025-06-18 spec, matches `httpx`/`curl` ergonomics, and is the path the spec authors themselves are steering clients toward.
+- **Starlette + uvicorn over FastAPI or the `mcp` PyPI lib.** Mirrors the stdlib-only discipline of `scripts/05_mcp_server.py`. The HTTP layer is ~50 lines of glue around the existing dispatcher; FastAPI's pydantic models and OpenAPI docs would be overhead we don't need.
+- **Stateless in v1.** No `Mcp-Session-Id` round-trip. The dispatcher is already stateless across requests; the graph is loaded once at startup; no tool mutates per-session state.
+- **One-shot JSON as the common case.** Every lore tool is synchronous and finishes in <1s. SSE is for protocol compliance and future server-push, not for the current tool set.
+- **Bake `.graph.pkl` (165 KB) in the image, mount as documented override.** Reassess at ~10 MB.
+- **Single-process uvicorn.** Documented limitation. Multi-worker would diverge silently because the graph is in-memory per-process.
+
+## Files
+
+- **new:** `lore_engine_poc/mcp_http.py`
+- **new:** `scripts/06_mcp_http_server.py`
+- **new:** `tests/test_mcp/test_mcp_http_module.py`
+- **new:** `tests/test_mcp/test_scripts_06.py`
+- **new:** `tests/test_mcp/test_dockerfile.py`
+- **new:** `Dockerfile`, `.dockerignore`, `docker-compose.yml`
+- **modified:** `requirements.txt` (added `starlette`, `uvicorn`, `httpx`)
+- **modified:** `lore_engine-poc/README.md` (new "MCP transports" + "Docker" sections; pull from the impl repo)
+- **modified:** `CONTEXT.md` (new transport + container section; design repo)
+- **new:** `docs/adr/0010-streamable-http-transport.md` (design repo)
+
+## How to use
+
+```bash
+# Local (stdio, unchanged)
+python3 scripts/05_mcp_server.py
+
+# Local (HTTP)
+python3 scripts/06_mcp_http_server.py --host 127.0.0.1 --port 8765
+
+# Container
+docker build -t lore-engine-mcp .
+docker run --rm -p 8765:8765 lore-engine-mcp
+
+# Compose
+docker compose up
+
+# Round-trip
+curl -s http://localhost:8765/mcp \
+  -H 'Content-Type: application/json' \
+  -H 'Accept: application/json' \
+  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
+```
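
Reviewer note: the response shapes in acceptance criteria 11.4, 11.5, 11.8 and 11.9 can be sketched without any framework. The block below is only an illustration of that mapping under stated assumptions, not the contents of `lore_engine_poc/mcp_http.py`. The names `handle_http_body` and `fake_dispatch` are hypothetical, and the stand-in dispatcher only mimics the `handle_message(msg) -> Optional[dict]` contract described in the ADR.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical names for illustration; the shipped adapter is built on Starlette.
Dispatcher = Callable[[dict], Optional[dict]]
HttpReply = Tuple[int, str, str]  # (status, content type, body)


def handle_http_body(raw: bytes, accept: str, dispatch: Dispatcher) -> HttpReply:
    """Map one POST /mcp body to a status, content type, and response body."""
    try:
        msg = json.loads(raw)
    except ValueError:
        # Malformed JSON: HTTP 400 carrying a JSON-RPC parse error (-32700).
        err = {"jsonrpc": "2.0", "id": None,
               "error": {"code": -32700, "message": "Parse error"}}
        return 400, "application/json", json.dumps(err)

    reply = dispatch(msg)
    if reply is None:
        # Notifications produce no response: 202 with an empty body.
        return 202, "", ""

    payload = json.dumps(reply)  # json.dumps escapes newlines, so one data line
    if "text/event-stream" in accept:
        # SSE upgrade: a single "message" event terminated by a blank line.
        return 200, "text/event-stream", f"event: message\ndata: {payload}\n\n"
    return 200, "application/json", payload


def fake_dispatch(msg: dict) -> Optional[dict]:
    """Assumed stand-in for MCPServer.handle_message (not the real dispatcher)."""
    if "id" not in msg:
        return None  # JSON-RPC notification
    return {"jsonrpc": "2.0", "id": msg["id"], "result": {"echo": msg["method"]}}


if __name__ == "__main__":
    body = b'{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
    print(handle_http_body(body, "application/json", fake_dispatch))
    print(handle_http_body(body, "text/event-stream", fake_dispatch))
```

Keeping the negotiation in one pure function like this would make it easy to unit-test the 202, 400, JSON and SSE branches without a socket, which is the same idea the in-process test tier relies on.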