docs(slice 11): plan doc + ADR 0010 + CONTEXT cross-ref

* docs/plan/11-slice-mcp-http-docker.md: slice plan with AC + test
  plan, mirroring the 11 existing plan docs.
* docs/adr/0010-streamable-http-transport.md: captures Streamable
  HTTP vs legacy HTTP+SSE, Starlette vs FastAPI/mcp-lib, stateless
  v1 (no Mcp-Session-Id), one-shot JSON as common case, append to
  requirements vs split, baked graph + mount override.
* CONTEXT.md: new 'Transports' section under Operations documenting
  the transport-agnostic dispatcher + stdio + Streamable HTTP.
2026-06-18 14:32:22 -04:00
parent 119b01684b
commit 288ad4cb69
3 changed files with 243 additions and 0 deletions


@@ -66,6 +66,27 @@ A record of one execution of the consistency engine (started_at, rules_run, viol
**Cognee**:
The substrate — the MIT-licensed framework that owns extraction, embedding, and the `remember`/`recall`/`forget` API. Pinned at v1.1.2 (ADR 0006). Graph backend pinned to Neo4j, overriding Cognee's Kuzu default (ADR 0008).
### Transports
**MCP (Model Context Protocol)**:
The wire protocol clients use to talk to the server. We implement
MCP 2024-11-05 over two transports:
- **stdio** (slice 2.6) — JSON-RPC 2.0 over stdin/stdout, stdlib only.
Entry: `scripts/05_mcp_server.py`. Used by Claude Desktop, Continue,
Cline, and any stdio-native MCP client.
- **Streamable HTTP** (slice 11, ADR 0010) — single `POST /mcp`
endpoint. The response body is either `application/json` (one-shot,
the common case) or `text/event-stream` (SSE-upgraded via the
client's `Accept` header). Entry: `scripts/06_mcp_http_server.py`.
Container: `lore-engine-mcp` (Dockerfile + docker-compose.yml).
The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
**transport-agnostic** — both transports are thin adapters over
`handle_message(msg) -> Optional[dict]`. The 36-tool registry
(`TOOL_REGISTRY` in `mcp_tools.py`) is shared verbatim between
them.
## Avoid list (global)
- **world_id** — use `setting_id`.


@@ -0,0 +1,119 @@
# Streamable HTTP transport + Starlette over the stdlib stdio server
**Status:** accepted. Slice 11 (2026-06-18).
## Context
The MCP server in `lore_engine-poc/scripts/05_mcp_server.py`
(slice 2.6) is stdlib-only and speaks JSON-RPC 2.0 over
stdin/stdout. That works for stdio-native clients (Claude Desktop,
Continue, Cline) but excludes any HTTP-based consumer: Open WebUI,
plain `curl`, a browser, a sidecar, a webhook. We need an HTTP
transport on the same dispatcher.
## Decisions
### 1. Streamable HTTP, not legacy HTTP+SSE
The MCP 2024-11-05 HTTP+SSE transport split endpoints, conventionally
`GET /sse` for the server-to-client event stream and `POST /messages`
for client-to-server requests. The 2025-03-26 spec revision deprecated
that in favor of a **Streamable HTTP** transport, which the 2025-06-18
spec retains: a single `POST /mcp` endpoint whose response body is
either `application/json` (one-shot) or `text/event-stream`
(SSE-upgraded), per the client's `Accept` header.
We adopt Streamable HTTP because:
- It matches the request/response shape of every tool in the
registry. None of the 36 tools are long-lived or push server-side
state; forcing them into an SSE envelope would be ceremony.
- It's the path the MCP spec authors are steering clients toward.
Adopting it now means a single transport serves both current and
near-future clients.
- A single endpoint is easier to operate, secure, and
reverse-proxy. Two-endpoint transports invite deployment mistakes
(CORS on one, not the other; load-balancer routing on `/messages`
but not `/sse`; etc.).
### 2. Starlette + uvicorn, not FastAPI or the `mcp` PyPI lib
The HTTP layer is a thin transport adapter over the existing
dispatcher. Starlette gives us an ASGI app + `JSONResponse` +
`StreamingResponse` and is the smallest reasonable dep set.
uvicorn is the standard ASGI server. We do **not** need:
- FastAPI's pydantic models — our request body is an untyped JSON
dict the dispatcher validates. Adding pydantic per tool schema
would be a second source of truth.
- The `mcp` / `fastmcp` PyPI packages — the `scripts/05_mcp_server.py`
docstring explicitly says "we deliberately avoid `mcp` /
`fastmcp` pip dependencies" because the stdlib dispatch is
smaller, more auditable, and decoupled from the spec library's
release cadence. Mirror that here: the HTTP path is also a
thin hand-rolled adapter.
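To illustrate how thin such an adapter can be, here is a minimal framework-free sketch of the request-handling core that a Starlette route could wrap. It is not the contents of `mcp_http.py`: `handle_post` and `echo_dispatch` are hypothetical names, and the `-32600` branch for non-object bodies is an assumption. The 400 / `-32700` and 202 / empty-body behaviors follow the slice 11 acceptance criteria.

```python
import json
from typing import Callable, Optional, Tuple

Dispatch = Callable[[dict], Optional[dict]]


def _error(code: int, message: str) -> bytes:
    """Serialize a JSON-RPC error with a null id (the request id is unknown)."""
    payload = {"jsonrpc": "2.0", "id": None,
               "error": {"code": code, "message": message}}
    return json.dumps(payload).encode("utf-8")


def handle_post(raw_body: bytes, dispatch: Dispatch) -> Tuple[int, bytes]:
    """Map one POST body to (HTTP status, response body)."""
    try:
        msg = json.loads(raw_body)
    except ValueError:  # covers JSONDecodeError and bad UTF-8
        return 400, _error(-32700, "Parse error")
    if not isinstance(msg, dict):
        # Assumption: reject non-object bodies instead of dispatching them.
        return 400, _error(-32600, "Invalid Request")

    reply = dispatch(msg)
    if reply is None:
        return 202, b""  # notification: accepted, nothing to send back
    return 200, json.dumps(reply).encode("utf-8")


def echo_dispatch(msg: dict) -> Optional[dict]:
    """Hypothetical stand-in for MCPServer.handle_message."""
    if "id" not in msg:
        return None
    return {"jsonrpc": "2.0", "id": msg["id"], "result": {}}
```

Keeping this core free of framework types means the same function can be exercised in unit tests without an ASGI client; the route handler only has to read the body and build the response object.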
### 3. Stateless in v1 — no `Mcp-Session-Id`
The Streamable HTTP spec allows a stateful session identified by
an `Mcp-Session-Id` header. We don't need it because:
- The dispatcher is stateless across requests.
- The graph is loaded once at startup and never re-read.
- No tool mutates per-session state (write tools mutate the graph,
but that's process-global, not session-scoped).
Adding session bookkeeping would buy us nothing and complicate
the curl example. Re-evaluate when we add server-pushed events
or per-client cursors.
### 4. One-shot JSON is the common case; SSE is for protocol compliance
Every lore tool is synchronous and finishes in <1s. JSON is the
default. SSE is opt-in via `Accept: text/event-stream` and serves
two purposes: (a) protocol compliance so future clients that
expect SSE frames work, (b) a future migration path to
server-initiated events if we ever add them (notifications,
progress, long-running tool status).
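The negotiation described above can be sketched as follows. The function names are hypothetical, and one behavior is an assumption the source does not state: when a client lists both `application/json` and `text/event-stream`, this sketch prefers JSON so that one-shot stays the common case. The frame layout follows the pattern in acceptance criterion 11.8.

```python
import json
from typing import Tuple


def wants_sse(accept_header: str) -> bool:
    """True when text/event-stream is listed and application/json is not."""
    media_types = {part.split(";")[0].strip().lower()
                   for part in accept_header.split(",")}
    # Assumption: JSON wins when both media types are offered.
    return ("text/event-stream" in media_types
            and "application/json" not in media_types)


def sse_frame(reply: dict) -> str:
    """Wrap one JSON-RPC reply in a single SSE `message` event."""
    # json.dumps without indent emits no raw newlines, so one data: line suffices.
    return f"event: message\ndata: {json.dumps(reply)}\n\n"


def render(reply: dict, accept_header: str = "") -> Tuple[str, str]:
    """Return (content_type, body) for a dispatcher reply."""
    if wants_sse(accept_header):
        return "text/event-stream", sse_frame(reply)
    return "application/json", json.dumps(reply)
```

An absent or JSON-only `Accept` header falls through to the JSON branch, which matches criterion 11.9.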
### 5. Append to `requirements.txt`, not a separate file
The HTTP deps (`starlette`, `uvicorn`, `httpx`) are added to
the existing manifest. The stdio entry script doesn't import
them; the only consumers are `mcp_http.py` and the new tests.
A separate `requirements-http.txt` would force operators to
remember which deps go with which transport; the unified
manifest is honest about what's installed.
A split is the right move if/when we add a multi-stage Docker
build or a serverless deployment. Not yet.
### 6. Bake `.graph.pkl` (165 KB) into the image; mount is the override
The cached graph is small enough (165 KB today) that baking it
into the image is faster than a bind mount and removes a runtime
dependency. As the codex grows toward ~10 MB, switch to a
mandatory bind mount with a documented path.
## What this means
- The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
**transport-agnostic**. Stdio (slice 2.6) and HTTP (slice 11)
are both thin adapters over `handle_message(msg) -> Optional[dict]`.
- The HTTP path is a single `POST /mcp` endpoint. No GET listener,
no `/sse` legacy path, no `Mcp-Session-Id`.
- Single-process uvicorn. Multi-worker is intentionally not
exposed. The graph is in-memory per-process; multi-worker
would diverge silently.
- Write tools (`add_entity`, `add_relation`, `add_lore_source`,
slice 10's `update_entity`, `delete_entity`, `set_alias`, ...)
mutate the in-memory graph only. Restart to reset. Document
this in the README — same caveat as stdio.
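A minimal sketch of what a transport-agnostic `handle_message(msg) -> Optional[dict]` can look like is shown here. It is not the real `MCPServer`: `TOY_REGISTRY` is a hypothetical two-tool stand-in for `TOOL_REGISTRY`, and the result payload shape is simplified. The error conventions (`-32602` for an unknown tool, `isError` inside an ordinary result for a tool exception) are the ones listed in the slice 11 acceptance criteria.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry standing in for TOOL_REGISTRY.
TOY_REGISTRY: Dict[str, Callable[..., object]] = {
    "echo": lambda **kwargs: {"echoed": kwargs},
    "boom": lambda **kwargs: 1 / 0,  # always raises, to show the isError path
}


def handle_message(msg: dict) -> Optional[dict]:
    """Dispatch one JSON-RPC message; return None for notifications."""
    if "id" not in msg:
        return None  # notifications get no reply on any transport

    def ok(result: dict) -> dict:
        return {"jsonrpc": "2.0", "id": msg["id"], "result": result}

    def err(code: int, message: str) -> dict:
        return {"jsonrpc": "2.0", "id": msg["id"],
                "error": {"code": code, "message": message}}

    method = msg.get("method")
    params = msg.get("params") or {}
    if method == "tools/list":
        return ok({"tools": [{"name": name} for name in TOY_REGISTRY]})
    if method == "tools/call":
        tool = TOY_REGISTRY.get(params.get("name"))
        if tool is None:
            return err(-32602, "Unknown tool")
        try:
            payload = tool(**params.get("arguments", {}))
        except Exception as exc:  # tool failures stay inside the result envelope
            return ok({"isError": True,
                       "content": [{"type": "text", "text": str(exc)}]})
        return ok({"isError": False,
                   "content": [{"type": "text", "text": repr(payload)}]})
    return err(-32601, "Method not found")
```

Because the function takes and returns plain dicts, a stdio loop and an HTTP route can both call it without knowing about each other.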
## Deferred (explicitly)
- CORS — out of scope until a browser client appears. Open WebUI
proxies server-side.
- OpenAI / Open WebUI tool-call bridge — slice 12. The HTTP
transport itself is consumable from any HTTP MCP client today.
- In-container Cognee ingest — the container serves a pre-built
graph. Ingest-in-container is a later slice.


@@ -0,0 +1,103 @@
# Slice 11 — Streamable HTTP transport + container
**Status:** ✅ shipped 2026-06-18. Commits `0088360`, `1e239d9`,
`225253b`, `4513255`, `ca5b197` (impl repo
`git.homelab.local/kaykayyali/lore-engine-poc-v3`). The suite was 529/529
passing at slice 11.0 and 554/554 at slice 11.5. ADR `0010`
captures the spec + framework decisions.
## Goal
Add a Streamable HTTP transport (MCP 2025-06-18) on top of the
existing transport-agnostic `MCPServer` dispatcher, and ship a
Docker image for the HTTP path. The stdio path stays byte-identical
to slice 2.6.
## What shipped
| Component | Where | Notes |
|---|---|---|
| ASGI app | `lore_engine_poc/mcp_http.py` | Starlette adapter over `MCPServer.handle_message`. One endpoint `POST /mcp`. |
| Entry script | `scripts/06_mcp_http_server.py` | Eager-load graph, `uvicorn.run(...)`. Env: `LORE_GRAPH_PATH`, `LORE_HTTP_HOST`, `LORE_HTTP_PORT`. |
| Docker image | `Dockerfile` | `python:3.12-slim`, baked `.graph.pkl`, `HEALTHCHECK` against `/mcp`. |
| Compose | `docker-compose.yml` | One service, port 8765, bind-mount override (commented). |
| Deps | `requirements.txt` | `starlette>=0.40`, `uvicorn>=0.30`, `httpx>=0.27`. |
| Tests | `tests/test_mcp/test_mcp_http_module.py` (14), `test_scripts_06.py` (7), `test_dockerfile.py` (4) | In-process, subprocess, docker. |
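As a rough illustration of how the entry script's three environment variables might be read, consider the sketch below. The variable names come from the table above; `HttpConfig`, `load_config`, and all three default values are assumptions (8765 is chosen only because it is the compose port), not the behavior of `scripts/06_mcp_http_server.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class HttpConfig:
    graph_path: str
    host: str
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> HttpConfig:
    """Read the documented env vars; every default here is an assumption."""
    return HttpConfig(
        graph_path=env.get("LORE_GRAPH_PATH", ".graph.pkl"),  # assumed default
        host=env.get("LORE_HTTP_HOST", "127.0.0.1"),          # assumed default
        port=int(env.get("LORE_HTTP_PORT", "8765")),          # compose port
    )
```

Passing the environment as a mapping keeps the function testable without mutating `os.environ`; setting `LORE_HTTP_PORT=0` would then yield port 0, which the subprocess tests use to request an OS-assigned port.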
## What was NOT shipped (deferred)
- **Session management** (`Mcp-Session-Id`). Stateless in v1; re-evaluate when we add server-initiated events or per-session state.
- **CORS**. Out of scope until a browser client appears. Open WebUI proxies server-side.
- **Write-back to disk from write tools.** Same caveat as stdio: write tools mutate the in-memory graph only; restart to reset.
- **Multi-worker uvicorn.** Intentionally not exposed. The graph is in-memory per-process; multi-worker would diverge silently.
- **In-container Cognee ingest.** The container serves a pre-built graph. Ingest-in-container is a later slice.
- **OpenAI / Open WebUI bridge.** The HTTP transport is consumable from any HTTP MCP client (Claude Desktop via `docker`, curl, httpx). A bridge for Open WebUI's OpenAI-compatible tool protocol is slice 12.
## Acceptance criteria
| # | Criterion | Test |
|---|---|---|
| 11.1 | `POST /mcp initialize` returns `protocolVersion == "2024-11-05"` + `serverInfo.name == "lore-engine-poc"` | `test_mcp_http_module.py:1`, `test_scripts_06.py:1` |
| 11.2 | `POST /mcp tools/list` returns 36 tools | `test_mcp_http_module.py:2`, `test_scripts_06.py:2` |
| 11.3 | `POST /mcp tools/call was_true_at` returns `was_true: true` for the Roland Raventhorne / House Raventhorne / `3rd_age.year_345` query | `test_mcp_http_module.py:4`, `test_scripts_06.py:3` |
| 11.4 | Malformed JSON → HTTP 400 + JSON-RPC `-32700` | `test_mcp_http_module.py:5` |
| 11.5 | `notifications/initialized` → HTTP 202 with empty body | `test_mcp_http_module.py:7`, `test_scripts_06.py:5` |
| 11.6 | Unknown tool name → JSON-RPC `-32602` | `test_mcp_http_module.py:8` |
| 11.7 | Tool-body exception → `isError: True` in result envelope (HTTP 200) | `test_mcp_http_module.py:9` |
| 11.8 | `Accept: text/event-stream` → `Content-Type: text/event-stream` + body matches `^event: message\ndata: \{.*\}\n\n$` | `test_mcp_http_module.py:11`, `test_scripts_06.py:4` |
| 11.9 | `Accept: application/json` (or absent) → JSON body | `test_mcp_http_module.py:10` |
| 11.10 | `docker build -t lore-engine-mcp:test .` exits 0 | `test_dockerfile.py:1` |
| 11.11 | `docker run` exposes a healthy `/mcp` endpoint | `test_dockerfile.py:2`, `:4` |
| 11.12 | `docker compose up` brings the service up; round-trip works; `LORE_HTTP_PORT` env var overrides the host port | `test_dockerfile.py:3` |
| 11.13 | The 529 existing tests still pass; stdio path is unchanged | `python3 -m pytest tests/ -q` |
## Test plan
- **In-process** (`tests/test_mcp/test_mcp_http_module.py`, 14 tests) — `httpx.AsyncClient` + `ASGITransport`, no real socket. Reuses the trivial test-double registry from `test_server.py` for dispatcher-shape tests and the real `TOOL_REGISTRY` + cached graph for the end-to-end `was_true_at` test. Each async test uses `asyncio.run` (no `pytest-asyncio` dep).
- **Subprocess** (`tests/test_mcp/test_scripts_06.py`, 7 tests) — boots `scripts/06_mcp_http_server.py` on `LORE_HTTP_PORT=0` (OS-assigned), parses the bound port from the "Uvicorn running on" line, then exercises the server over a real socket via `httpx`. Mirrors `test_scripts_05.py`.
- **Docker** (`tests/test_mcp/test_dockerfile.py`, 4 tests) — gated on `shutil.which("docker")`. Build, run + round-trip, compose up + round-trip, healthcheck reaches `healthy`. The compose test uses `COMPOSE_PROJECT_NAME` + `LORE_HTTP_PORT` to allow parallel CI runs.
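The port-discovery step in the subprocess tests could look roughly like this. The helper name is hypothetical, and the exact log format is an assumption based on uvicorn's usual startup line (`Uvicorn running on http://HOST:PORT (Press CTRL+C to quit)`); the real test may parse it differently.

```python
import re
from typing import Iterable, Optional

# Assumed log shape: "... Uvicorn running on http://127.0.0.1:54321 (Press CTRL+C to quit)"
# The lookahead pins the match to the last ":digits" group in the URL, so
# bracketed IPv6 hosts such as [::1] are handled too.
_RUNNING = re.compile(r"Uvicorn running on https?://\S+:(\d+)(?=\s|$)")


def parse_bound_port(lines: Iterable[str]) -> Optional[int]:
    """Return the port from the first startup line that matches, else None."""
    for line in lines:
        match = _RUNNING.search(line)
        if match:
            return int(match.group(1))
    return None
```

In a test, `lines` would typically be the child process's stderr iterated line by line, with a timeout around the loop so a server that never starts fails fast.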
## Architectural decisions
- **Streamable HTTP over legacy HTTP+SSE.** See ADR 0010. The 2024-11-05 transport delivered every response over a long-lived SSE `GET` stream, which doesn't match a request/response tool API. Streamable HTTP is the transport in the 2025-06-18 spec, matches `httpx`/`curl` ergonomics, and is the path the spec authors are steering clients toward.
- **Starlette + uvicorn over FastAPI or the `mcp` PyPI lib.** Mirrors the stdlib-only discipline of `scripts/05_mcp_server.py`. The HTTP layer is ~50 lines of glue around the existing dispatcher; FastAPI's pydantic models and OpenAPI docs would be overhead we don't need.
- **Stateless in v1.** No `Mcp-Session-Id` round-trip. The dispatcher is already stateless across requests; the graph is loaded once at startup; no tool mutates per-session state.
- **One-shot JSON as the common case.** Every lore tool is synchronous and finishes in <1s. SSE is for protocol compliance and future server-push, not for the current tool set.
- **Bake `.graph.pkl` (165 KB) in the image, mount as documented override.** Reassess at ~10 MB.
- **Single-process uvicorn.** Documented limitation. Multi-worker would diverge silently because the graph is in-memory per-process.
## Files
- **new:** `lore_engine_poc/mcp_http.py`
- **new:** `scripts/06_mcp_http_server.py`
- **new:** `tests/test_mcp/test_mcp_http_module.py`
- **new:** `tests/test_mcp/test_scripts_06.py`
- **new:** `tests/test_mcp/test_dockerfile.py`
- **new:** `Dockerfile`, `.dockerignore`, `docker-compose.yml`
- **modified:** `requirements.txt` (added `starlette`, `uvicorn`, `httpx`)
- **modified:** `lore_engine-poc/README.md` (new "MCP transports" + "Docker" sections; pull from the impl repo)
- **modified:** `docs/CONTEXT.md` (new transport + container section; design repo)
- **new:** `docs/adr/0010-streamable-http-transport.md` (design repo)
## How to use
```bash
# Local (stdio, unchanged)
python3 scripts/05_mcp_server.py
# Local (HTTP)
python3 scripts/06_mcp_http_server.py --host 127.0.0.1 --port 8765
# Container
docker build -t lore-engine-mcp .
docker run --rm -p 8765:8765 lore-engine-mcp
# Compose
docker compose up
# Round-trip
curl -s http://localhost:8765/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
```
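For callers that prefer Python over `curl`, the same round-trip can be sketched with the standard library alone. The helper names are hypothetical; the URL, headers, and JSON-RPC envelope mirror the `curl` example above, and the sketch assumes the server answers with a JSON body as it does for `Accept: application/json`.

```python
import json
import urllib.request


def build_rpc_request(url: str, method: str, params: dict,
                      rpc_id: int = 1) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 POST that asks for a one-shot JSON reply."""
    body = json.dumps({"jsonrpc": "2.0", "id": rpc_id,
                       "method": method, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
    )


def call(url: str, method: str, params: dict) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_rpc_request(url, method, params)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


# Example, with the HTTP server from the commands above running locally:
#   call("http://localhost:8765/mcp", "initialize", {})
```

Splitting request construction from sending keeps the first half testable offline.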