docs(slice 11): plan doc + ADR 0010 + CONTEXT cross-ref

* docs/plan/11-slice-mcp-http-docker.md: slice plan with AC + test plan, mirroring the 11 existing plan docs.
* docs/adr/0010-streamable-http-transport.md: captures Streamable HTTP vs legacy HTTP+SSE, Starlette vs FastAPI/mcp-lib, stateless v1 (no Mcp-Session-Id), one-shot JSON as common case, append to requirements vs split, baked graph + mount override.
* CONTEXT.md: new 'Transports' section under Operations documenting the transport-agnostic dispatcher + stdio + Streamable HTTP.
2026-06-18 14:32:22 -04:00
parent 119b01684b
commit 288ad4cb69
3 changed files with 243 additions and 0 deletions
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -66,6 +66,27 @@ A record of one execution of the consistency engine (started_at, rules_run, viol
 **Cognee**:
 The substrate — the MIT-licensed framework that owns extraction, embedding, and the `remember`/`recall`/`forget` API. Pinned at v1.1.2 (ADR 0006). Graph backend pinned to Neo4j, overriding Cognee's Kuzu default (ADR 0008).

+### Transports
+
+**MCP (Model Context Protocol)**:
+The wire protocol clients use to talk to the server. We implement
+MCP 2024-11-05 over two transports:
+
+- **stdio** (slice 2.6) — JSON-RPC 2.0 over stdin/stdout, stdlib only.
+  Entry: `scripts/05_mcp_server.py`. Used by Claude Desktop, Continue,
+  Cline, and any stdio-native MCP client.
+- **Streamable HTTP** (slice 11, ADR 0010) — single `POST /mcp`
+  endpoint. The response body is either `application/json` (one-shot,
+  the common case) or `text/event-stream` (SSE-upgraded via the
+  client's `Accept` header). Entry: `scripts/06_mcp_http_server.py`.
+  Container: `lore-engine-mcp` (Dockerfile + docker-compose.yml).
+
+The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
+**transport-agnostic** — both transports are thin adapters over
+`handle_message(msg) -> Optional[dict]`. The 36-tool registry
+(`TOOL_REGISTRY` in `mcp_tools.py`) is shared verbatim between
+them.
+
 ## Avoid list (global)

 - **world_id** — use `setting_id`.
--- a/docs/adr/0010-streamable-http-transport.md
+++ b/docs/adr/0010-streamable-http-transport.md
@@ -0,0 +1,119 @@
+# Streamable HTTP transport + Starlette over the stdlib stdio server
+
+**Status:** accepted. Slice 11 (2026-06-18).
+
+## Context
+
+The MCP server in `lore_engine-poc/scripts/05_mcp_server.py`
+(slice 2.6) is stdlib-only and speaks JSON-RPC 2.0 over
+stdin/stdout. That works for stdio-native clients (Claude Desktop,
+Continue, Cline) but excludes any HTTP-based consumer: Open WebUI,
+plain `curl`, a browser, a sidecar, a webhook. We need an HTTP
+transport on the same dispatcher.
+
+## Decisions
+
+### 1. Streamable HTTP, not legacy HTTP+SSE
+
+The MCP 2024-11-05 HTTP+SSE transport split endpoints: `GET /sse`
+for the server-to-client event stream, `POST /messages` for
+client-to-server requests. The 2025-03-26 spec revision deprecated
+that in favor of a **Streamable HTTP** transport, which the
+2025-06-18 revision retains: a single `POST /mcp` endpoint whose
+response body is either `application/json` (one-shot) or
+`text/event-stream` (SSE-upgraded), per the client's `Accept` header.
+
+We adopt Streamable HTTP because:
+
+- It matches the request/response shape of every tool in the
+  registry. None of the 36 tools are long-lived or push server-side
+  state; forcing them into an SSE envelope would be ceremony.
+- It's the path the MCP spec authors are steering clients toward.
+  Adopting it now means a single transport serves both current and
+  near-future clients.
+- A single endpoint is easier to operate, secure, and
+  reverse-proxy. Two-endpoint transports invite deployment mistakes
+  (CORS on one, not the other; load-balancer routing on `/messages`
+  but not `/sse`; etc.).
+
+### 2. Starlette + uvicorn, not FastAPI or the `mcp` PyPI lib
+
+The HTTP layer is a thin transport adapter over the existing
+dispatcher. Starlette gives us an ASGI app + `JSONResponse` +
+`StreamingResponse` and is the smallest reasonable dep set.
+uvicorn is the standard ASGI server. We do **not** need:
+
+- FastAPI's pydantic models — our request body is an untyped JSON
+  dict the dispatcher validates. Adding pydantic per tool schema
+  would be a second source of truth.
+- The `mcp` / `fastmcp` PyPI packages — the `scripts/05_mcp_server.py`
+  docstring explicitly says "we deliberately avoid `mcp` /
+  `fastmcp` pip dependencies" because the stdlib dispatch is
+  smaller, more auditable, and decoupled from the spec library's
+  release cadence. Mirror that here: the HTTP path is also a
+  thin hand-rolled adapter.
+
+### 3. Stateless in v1 — no `Mcp-Session-Id`
+
+The Streamable HTTP spec allows a stateful session identified by
+an `Mcp-Session-Id` header. We don't need it because:
+
+- The dispatcher is stateless across requests.
+- The graph is loaded once at startup and never re-read.
+- No tool mutates per-session state (write tools mutate the graph,
+  but that's process-global, not session-scoped).
+
+Adding session bookkeeping would buy us nothing and complicate
+the curl example. Re-evaluate when we add server-pushed events
+or per-client cursors.
+
+### 4. One-shot JSON is the common case; SSE is for protocol compliance
+
+Every lore tool is synchronous and finishes in <1s. JSON is the
+default. SSE is opt-in via `Accept: text/event-stream` and serves
+two purposes: (a) protocol compliance so future clients that
+expect SSE frames work, (b) a future migration path to
+server-initiated events if we ever add them (notifications,
+progress, long-running tool status).
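+
+A minimal sketch of that negotiation, assuming nothing about the
+real `mcp_http.py` internals: `wants_sse` and `render` are
+hypothetical helper names, and treating any `Accept` header that
+lists `text/event-stream` as an SSE request is an assumption (the
+source does not say how a header listing both media types is
+handled). The frame layout follows the slice plan's criterion 11.8.
+
+```python
+import json
+import re
+from typing import Optional, Tuple
+
+
+def wants_sse(accept_header: Optional[str]) -> bool:
+    """True only when the client explicitly lists text/event-stream."""
+    if not accept_header:
+        return False
+    media_types = [
+        part.split(";")[0].strip().lower()
+        for part in accept_header.split(",")
+    ]
+    return "text/event-stream" in media_types
+
+
+def render(response: dict, accept_header: Optional[str]) -> Tuple[str, str]:
+    """Return (content_type, body) for one JSON-RPC response."""
+    # json.dumps never emits raw newlines, so the payload fits one data line.
+    payload = json.dumps(response, separators=(",", ":"))
+    if wants_sse(accept_header):
+        # One SSE frame: event name, one data line, blank-line terminator.
+        return "text/event-stream", f"event: message\ndata: {payload}\n\n"
+    return "application/json", payload
+```
+
+Under these assumptions an absent or JSON-only `Accept` header
+falls through to the one-shot JSON branch, which keeps the common
+case free of SSE parsing on the client side.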
+
+### 5. Append to `requirements.txt`, not a separate file
+
+The HTTP deps (`starlette`, `uvicorn`, `httpx`) are added to
+the existing manifest. The stdio entry script doesn't import
+them; the only consumers are `mcp_http.py` and the new tests.
+A separate `requirements-http.txt` would force operators to
+remember which deps go with which transport; the unified
+manifest is honest about what's installed.
+
+A split is the right move if/when we add a multi-stage Docker
+build or a serverless deployment. Not yet.
+
+### 6. Bake `.graph.pkl` (165 KB) into the image; mount is the override
+
+The cached graph is small enough (165 KB today) that baking it
+into the image is faster than a bind mount and removes a runtime
+dependency. As the codex grows toward ~10 MB, switch to a
+mandatory bind mount with a documented path.
+
+## What this means
+
+- The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
+  **transport-agnostic**. Stdio (slice 2.6) and HTTP (slice 11)
+  are both thin adapters over `handle_message(msg) -> Optional[dict]`.
+- The HTTP path is a single `POST /mcp` endpoint. No GET listener,
+  no `/sse` legacy path, no `Mcp-Session-Id`.
+- Single-process uvicorn. Multi-worker is intentionally not
+  exposed. The graph is in-memory per-process; multi-worker
+  would diverge silently.
+- Write tools (`add_entity`, `add_relation`, `add_lore_source`,
+  slice 10's `update_entity`, `delete_entity`, `set_alias`, ...)
+  mutate the in-memory graph only. Restart to reset. Document
+  this in the README — same caveat as stdio.
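+
+The adapter relationship in the first bullet can be sketched as
+follows. This is an illustration, not the shipped code:
+`ToyDispatcher`, `serve_stdio`, and `serve_http_body` are
+hypothetical names, and the only thing taken from the source is
+the `handle_message(msg) -> Optional[dict]` contract.
+
+```python
+import io
+import json
+from typing import Optional
+
+
+class ToyDispatcher:
+    """Hypothetical stand-in for MCPServer: no transport knowledge."""
+
+    def handle_message(self, msg: dict) -> Optional[dict]:
+        if "id" not in msg:
+            # JSON-RPC notifications carry no id and get no response.
+            return None
+        if msg.get("method") == "ping":
+            return {"jsonrpc": "2.0", "id": msg["id"], "result": {}}
+        return {
+            "jsonrpc": "2.0",
+            "id": msg["id"],
+            "error": {"code": -32601, "message": "Method not found"},
+        }
+
+
+def serve_stdio(dispatcher, stdin, stdout) -> None:
+    """Stdio adapter: one JSON-RPC message per input line."""
+    for line in stdin:
+        line = line.strip()
+        if not line:
+            continue
+        response = dispatcher.handle_message(json.loads(line))
+        if response is not None:
+            stdout.write(json.dumps(response) + "\n")
+            stdout.flush()
+
+
+def serve_http_body(dispatcher, body: bytes) -> Optional[bytes]:
+    """HTTP adapter core: one message per request body."""
+    response = dispatcher.handle_message(json.loads(body))
+    return None if response is None else json.dumps(response).encode()
+```
+
+Both functions only parse, call `handle_message`, and serialize;
+everything tool-related stays behind the dispatcher, which is why
+a second transport does not touch the tool registry.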
+
+## Deferred (explicitly)
+
+- CORS — out of scope until a browser client appears. Open WebUI
+  proxies server-side.
+- OpenAI / Open WebUI tool-call bridge — slice 12. The HTTP
+  transport itself is consumable from any HTTP MCP client today.
+- In-container Cognee ingest — the container serves a pre-built
+  graph. Ingest-in-container is a later slice.
--- a/docs/plan/11-slice-mcp-http-docker.md
+++ b/docs/plan/11-slice-mcp-http-docker.md
@@ -0,0 +1,103 @@
+# Slice 11 — Streamable HTTP transport + container
+
+**Status:** ✅ shipped 2026-06-18. Commits `0088360`, `1e239d9`,
+`225253b`, `4513255`, `ca5b197` (impl repo
+`git.homelab.local/kaykayyali/lore-engine-poc-v3`). Test suite:
+529/529 passing at slice 11.0; 554/554 at slice 11.5 (the 25 new
+tests listed below). ADR `0010` captures the spec + framework
+decisions.
+
+## Goal
+
+Add a Streamable HTTP transport (MCP 2025-06-18) on top of the
+existing transport-agnostic `MCPServer` dispatcher, and ship a
+Docker image for the HTTP path. The stdio path stays byte-identical
+to slice 2.6.
+
+## What shipped
+
+| Component | Where | Notes |
+|---|---|---|
+| ASGI app | `lore_engine_poc/mcp_http.py` | Starlette adapter over `MCPServer.handle_message`. One endpoint `POST /mcp`. |
+| Entry script | `scripts/06_mcp_http_server.py` | Eager-load graph, `uvicorn.run(...)`. Env: `LORE_GRAPH_PATH`, `LORE_HTTP_HOST`, `LORE_HTTP_PORT`. |
+| Docker image | `Dockerfile` | `python:3.12-slim`, baked `.graph.pkl`, `HEALTHCHECK` against `/mcp`. |
+| Compose | `docker-compose.yml` | One service, port 8765, bind-mount override (commented). |
+| Deps | `requirements.txt` | `starlette>=0.40`, `uvicorn>=0.30`, `httpx>=0.27`. |
+| Tests | `tests/test_mcp/test_mcp_http_module.py` (14), `test_scripts_06.py` (7), `test_dockerfile.py` (4) | In-process, subprocess, docker. |
+
+## What was NOT shipped (deferred)
+
+- **Session management** (`Mcp-Session-Id`). Stateless in v1; re-evaluate when we add server-initiated events or per-session state.
+- **CORS**. Out of scope until a browser client appears. Open WebUI proxies server-side.
+- **Write-back to disk from write tools.** Same caveat as stdio: write tools mutate the in-memory graph only; restart to reset.
+- **Multi-worker uvicorn.** Intentionally not exposed. The graph is in-memory per-process; multi-worker would diverge silently.
+- **In-container Cognee ingest.** The container serves a pre-built graph. Ingest-in-container is a later slice.
+- **OpenAI / Open WebUI bridge.** The HTTP transport is consumable from any HTTP MCP client (Claude Desktop via `docker`, curl, httpx). A bridge for Open WebUI's OpenAI-compatible tool protocol is slice 12.
+
+## Acceptance criteria
+
+| # | Criterion | Test |
+|---|---|---|
+| 11.1 | `POST /mcp initialize` returns `protocolVersion == "2024-11-05"` + `serverInfo.name == "lore-engine-poc"` | `test_mcp_http_module.py:1`, `test_scripts_06.py:1` |
+| 11.2 | `POST /mcp tools/list` returns 36 tools | `test_mcp_http_module.py:2`, `test_scripts_06.py:2` |
+| 11.3 | `POST /mcp tools/call was_true_at` returns `was_true: true` for the Roland Raventhorne / House Raventhorne / `3rd_age.year_345` query | `test_mcp_http_module.py:4`, `test_scripts_06.py:3` |
+| 11.4 | Malformed JSON → HTTP 400 + JSON-RPC `-32700` | `test_mcp_http_module.py:5` |
+| 11.5 | `notifications/initialized` → HTTP 202 with empty body | `test_mcp_http_module.py:7`, `test_scripts_06.py:5` |
+| 11.6 | Unknown tool name → JSON-RPC `-32602` | `test_mcp_http_module.py:8` |
+| 11.7 | Tool-body exception → `isError: True` in result envelope (HTTP 200) | `test_mcp_http_module.py:9` |
+| 11.8 | `Accept: text/event-stream` → `Content-Type: text/event-stream` + body matches `^event: message\ndata: \{.*\}\n\n$` | `test_mcp_http_module.py:11`, `test_scripts_06.py:4` |
+| 11.9 | `Accept: application/json` (or absent) → JSON body | `test_mcp_http_module.py:10` |
+| 11.10 | `docker build -t lore-engine-mcp:test .` exits 0 | `test_dockerfile.py:1` |
+| 11.11 | `docker run` exposes a healthy `/mcp` endpoint | `test_dockerfile.py:2`, `:4` |
+| 11.12 | `docker compose up` brings the service up; round-trip works; `LORE_HTTP_PORT` env var overrides the host port | `test_dockerfile.py:3` |
+| 11.13 | The 529 existing tests still pass; stdio path is unchanged | `python3 -m pytest tests/ -q` |
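+
+Criteria 11.4, 11.5, and 11.7 imply a small status-code mapping, sketched below under assumptions. `http_outcome` is a hypothetical helper, not a function from `mcp_http.py`; returning HTTP 200 for every dispatcher response (including the `-32602` case in 11.6, whose HTTP status the table does not state) is an assumption.
+
+```python
+import json
+from typing import Callable, Optional, Tuple
+
+
+def http_outcome(
+    raw: bytes,
+    handle_message: Callable[[dict], Optional[dict]],
+) -> Tuple[int, Optional[dict]]:
+    """Map one request body to (http_status, json_payload_or_None)."""
+    try:
+        msg = json.loads(raw)
+    except ValueError:
+        # Covers invalid JSON and undecodable bytes: JSON-RPC parse error.
+        error = {"code": -32700, "message": "Parse error"}
+        return 400, {"jsonrpc": "2.0", "id": None, "error": error}
+    response = handle_message(msg)
+    if response is None:
+        # Notifications produce no response: accepted, empty body.
+        return 202, None
+    # Tool failures stay inside the result envelope, so HTTP remains 200.
+    return 200, response
+```
+
+The sketch does not validate that the decoded body is a JSON object; it leaves that to the dispatcher, as the ADR describes for request validation.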
+
+## Test plan
+
+- **In-process** (`tests/test_mcp/test_mcp_http_module.py`, 14 tests) — `httpx.AsyncClient` + `ASGITransport`, no real socket. Reuses the trivial test-double registry from `test_server.py` for dispatcher-shape tests and the real `TOOL_REGISTRY` + cached graph for the end-to-end `was_true_at` test. Each async test uses `asyncio.run` (no `pytest-asyncio` dep).
+- **Subprocess** (`tests/test_mcp/test_scripts_06.py`, 7 tests) — boots `scripts/06_mcp_http_server.py` on `LORE_HTTP_PORT=0` (OS-assigned), parses the bound port from the "Uvicorn running on" line, then exercises the server over a real socket via `httpx`. Mirrors `test_scripts_05.py`.
+- **Docker** (`tests/test_mcp/test_dockerfile.py`, 4 tests) — gated on `shutil.which("docker")`. Build, run + round-trip, compose up + round-trip, healthcheck reaches `healthy`. The compose test uses `COMPOSE_PROJECT_NAME` + `LORE_HTTP_PORT` to allow parallel CI runs.
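+
+The port-parsing step of the subprocess tests might look like the sketch below. `parse_bound_port` is a hypothetical helper name, and the sample log line in the usage note is an assumption about uvicorn's startup message format rather than output captured from this project.
+
+```python
+import re
+from typing import Optional
+
+# Host is either a bracketed IPv6 literal or anything up to the next colon.
+_RUNNING_RE = re.compile(
+    r"Uvicorn running on https?://(?:\[[^\]]+\]|[^:\s]+):(?P<port>\d+)"
+)
+
+
+def parse_bound_port(log_line: str) -> Optional[int]:
+    """Extract the bound port from a 'Uvicorn running on' line, if present."""
+    match = _RUNNING_RE.search(log_line)
+    return int(match.group("port")) if match else None
+```
+
+A test harness would feed each stderr line of the child process to `parse_bound_port` until it returns a port, for example from a line such as `INFO:     Uvicorn running on http://127.0.0.1:54321 (Press CTRL+C to quit)`.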
+
+## Architectural decisions
+
+- **Streamable HTTP over legacy HTTP+SSE.** See ADR 0010. The 2024-11-05 HTTP+SSE transport delivered every response over an SSE stream held open by a long-lived `GET`, which doesn't match a request/response tool API. Streamable HTTP is the transport in the 2025-06-18 spec, matches `httpx`/`curl` ergonomics, and is the path the spec authors are steering clients toward.
+- **Starlette + uvicorn over FastAPI or the `mcp` PyPI lib.** Mirrors the stdlib-only discipline of `scripts/05_mcp_server.py`. The HTTP layer is ~50 lines of glue around the existing dispatcher; FastAPI's pydantic models and OpenAPI docs would be overhead we don't need.
+- **Stateless in v1.** No `Mcp-Session-Id` round-trip. The dispatcher is already stateless across requests; the graph is loaded once at startup; no tool mutates per-session state.
+- **One-shot JSON as the common case.** Every lore tool is synchronous and finishes in <1s. SSE is for protocol compliance and future server-push, not for the current tool set.
+- **Bake `.graph.pkl` (165 KB) in the image, mount as documented override.** Reassess at ~10 MB.
+- **Single-process uvicorn.** Documented limitation. Multi-worker would diverge silently because the graph is in-memory per-process.
+
+## Files
+
+- **new:** `lore_engine_poc/mcp_http.py`
+- **new:** `scripts/06_mcp_http_server.py`
+- **new:** `tests/test_mcp/test_mcp_http_module.py`
+- **new:** `tests/test_mcp/test_scripts_06.py`
+- **new:** `tests/test_mcp/test_dockerfile.py`
+- **new:** `Dockerfile`, `.dockerignore`, `docker-compose.yml`
+- **modified:** `requirements.txt` (added `starlette`, `uvicorn`, `httpx`)
+- **modified:** `lore_engine-poc/README.md` (new "MCP transports" + "Docker" sections; pull from the impl repo)
+- **modified:** `CONTEXT.md` (new "Transports" section; design repo)
+- **new:** `docs/adr/0010-streamable-http-transport.md` (design repo)
+
+## How to use
+
+```bash
+# Local (stdio, unchanged)
+python3 scripts/05_mcp_server.py
+
+# Local (HTTP)
+python3 scripts/06_mcp_http_server.py --host 127.0.0.1 --port 8765
+
+# Container
+docker build -t lore-engine-mcp .
+docker run --rm -p 8765:8765 lore-engine-mcp
+
+# Compose
+docker compose up
+
+# Round-trip
+curl -s http://localhost:8765/mcp \
+  -H 'Content-Type: application/json' \
+  -H 'Accept: application/json' \
+  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
+```
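+
+The same round-trip from Python, as a stdlib-only sketch. `build_initialize_request` and `initialize` are hypothetical helper names; the URL, headers, and payload mirror the `curl` call above, and the 5-second timeout is an arbitrary choice.
+
+```python
+import json
+import urllib.request
+
+
+def build_initialize_request(
+    base_url: str = "http://localhost:8765",
+) -> urllib.request.Request:
+    """Build the POST /mcp initialize request without sending it."""
+    payload = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
+    return urllib.request.Request(
+        f"{base_url}/mcp",
+        data=json.dumps(payload).encode("utf-8"),
+        headers={
+            "Content-Type": "application/json",
+            "Accept": "application/json",
+        },
+        method="POST",
+    )
+
+
+def initialize(base_url: str = "http://localhost:8765") -> dict:
+    """Send the request to a running server and decode the JSON reply."""
+    request = build_initialize_request(base_url)
+    with urllib.request.urlopen(request, timeout=5) as resp:
+        return json.load(resp)
+```
+
+With the server running, `initialize()["result"]["protocolVersion"]` is the value that criterion 11.1 checks.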