docs(slice 11): plan doc + ADR 0010 + CONTEXT cross-ref

* docs/plan/11-slice-mcp-http-docker.md: slice plan with AC + test plan, mirroring the 11 existing plan docs.
* docs/adr/0010-streamable-http-transport.md: captures Streamable HTTP vs legacy HTTP+SSE, Starlette vs FastAPI/mcp-lib, stateless v1 (no Mcp-Session-Id), one-shot JSON as common case, append to requirements vs split, baked graph + mount override.
* CONTEXT.md: new 'Transports' section under Operations documenting the transport-agnostic dispatcher + stdio + Streamable HTTP.
21
CONTEXT.md
@@ -66,6 +66,27 @@ A record of one execution of the consistency engine (started_at, rules_run, viol
**Cognee**:
The substrate: the MIT-licensed framework that owns extraction, embedding, and the `remember`/`recall`/`forget` API. Pinned at v1.1.2 (ADR 0006). Graph backend pinned to Neo4j, overriding Cognee's Kuzu default (ADR 0008).

### Transports

**MCP (Model Context Protocol)**:
The wire protocol clients use to talk to the server. We implement
MCP 2024-11-05 (the `protocolVersion` returned by `initialize`) over
two transports:

- **stdio** (slice 2.6): JSON-RPC 2.0 over stdin/stdout, stdlib only.
  Entry: `scripts/05_mcp_server.py`. Used by Claude Desktop, Continue,
  Cline, and any stdio-native MCP client.
- **Streamable HTTP** (slice 11, ADR 0010): single `POST /mcp`
  endpoint. The response body is either `application/json` (one-shot,
  the common case) or `text/event-stream` (SSE-upgraded via the
  client's `Accept` header). Entry: `scripts/06_mcp_http_server.py`.
  Container: `lore-engine-mcp` (Dockerfile + docker-compose.yml).

The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
**transport-agnostic**: both transports are thin adapters over
`handle_message(msg) -> Optional[dict]`. The 36-tool registry
(`TOOL_REGISTRY` in `mcp_tools.py`) is shared verbatim between
them.

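The adapter split described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `SketchServer`, `serve_stdio`, and `serve_http_body` are hypothetical names, and the only detail taken from the text is the `handle_message(msg) -> Optional[dict]` contract.

```python
import io
import json
from typing import Callable, Dict, Optional


class SketchServer:
    """Hypothetical stand-in for MCPServer; only handle_message's shape is from the docs."""

    def __init__(self, tools: Dict[str, Callable[[dict], object]]):
        self.tools = tools

    def handle_message(self, msg: dict) -> Optional[dict]:
        if "id" not in msg:
            # JSON-RPC notification: there is nothing to send back.
            return None
        if msg.get("method") == "tools/list":
            result = {"tools": [{"name": name} for name in sorted(self.tools)]}
            return {"jsonrpc": "2.0", "id": msg["id"], "result": result}
        return {
            "jsonrpc": "2.0",
            "id": msg["id"],
            "error": {"code": -32601, "message": "Method not found"},
        }


def serve_stdio(server, stdin, stdout) -> None:
    """stdio adapter: one JSON-RPC message per input line, one reply per output line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        reply = server.handle_message(json.loads(line))
        if reply is not None:
            stdout.write(json.dumps(reply) + "\n")


def serve_http_body(server, body: bytes) -> Optional[bytes]:
    """HTTP adapter core: request body in, response body (or None) out."""
    reply = server.handle_message(json.loads(body))
    return None if reply is None else json.dumps(reply).encode()
```

Both adapters only move bytes in and out; everything protocol-specific stays in `handle_message`, which is why one registry can back both transports.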
## Avoid list (global)

- **world_id**: use `setting_id`.
119
docs/adr/0010-streamable-http-transport.md
Normal file
@@ -0,0 +1,119 @@
# Streamable HTTP transport + Starlette over the stdlib stdio server

**Status:** accepted. Slice 11 (2026-06-18).

## Context

The MCP server in `lore_engine-poc/scripts/05_mcp_server.py`
(slice 2.6) is stdlib-only and speaks JSON-RPC 2.0 over
stdin/stdout. That works for stdio-native clients (Claude Desktop,
Continue, Cline) but excludes any HTTP-based consumer: Open WebUI,
plain `curl`, a browser, a sidecar, a webhook. We need an HTTP
transport on the same dispatcher.

## Decisions

### 1. Streamable HTTP, not legacy HTTP+SSE

The MCP 2024-11-05 HTTP+SSE transport split endpoints: conventionally
`GET /sse` for the server-to-client event stream and `POST /messages`
for client-to-server requests. The 2025-03-26 spec revision replaced
that with a **Streamable HTTP** transport, which the 2025-06-18
revision carries forward: a single `POST /mcp` endpoint whose
response body is either `application/json` (one-shot) or
`text/event-stream` (SSE-upgraded), per the client's `Accept` header.

We adopt Streamable HTTP because:

- It matches the request/response shape of every tool in the
  registry. None of the 36 tools is long-running or pushes
  server-side state; forcing them into an SSE envelope would be
  ceremony.
- It's the path the MCP spec authors are steering clients toward.
  Adopting it now means a single transport serves both current and
  near-future clients.
- A single endpoint is easier to operate, secure, and
  reverse-proxy. Two-endpoint transports invite deployment mistakes
  (CORS on one, not the other; load-balancer routing on `/messages`
  but not `/sse`; etc.).

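The `Accept`-driven choice between the two response bodies could be sketched like this. It is an illustration under stated assumptions: `wants_sse` and `encode_reply` are hypothetical helpers, the `event: message` / `data:` framing follows the shape given in the slice plan's acceptance criteria, and treating any mention of `text/event-stream` in the header as an opt-in is a guess, since the ADR does not say how a header listing both media types is handled.

```python
import json
import re


def wants_sse(accept):
    """True only when the Accept header explicitly lists text/event-stream."""
    if not accept:
        return False
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "text/event-stream" in media_types


def encode_reply(reply, accept):
    """Return (content_type, body_bytes) for one JSON-RPC reply."""
    # Compact separators keep the payload on a single line, which SSE data needs.
    payload = json.dumps(reply, separators=(",", ":"))
    if wants_sse(accept):
        frame = f"event: message\ndata: {payload}\n\n"
        return "text/event-stream", frame.encode()
    return "application/json", payload.encode()
```

Either way the dispatcher's reply is serialized once; only the envelope changes, which is what keeps SSE support cheap for a tool set that never streams.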
### 2. Starlette + uvicorn, not FastAPI or the `mcp` PyPI lib

The HTTP layer is a thin transport adapter over the existing
dispatcher. Starlette gives us an ASGI app + `JSONResponse` +
`StreamingResponse` and is the smallest reasonable dep set.
uvicorn is the standard ASGI server. We do **not** need:

- FastAPI's pydantic models: our request body is an untyped JSON
  dict the dispatcher validates. Adding pydantic per tool schema
  would be a second source of truth.
- The `mcp` / `fastmcp` PyPI packages: the `scripts/05_mcp_server.py`
  docstring explicitly says "we deliberately avoid `mcp` /
  `fastmcp` pip dependencies" because the stdlib dispatch is
  smaller, more auditable, and decoupled from the spec library's
  release cadence. Mirror that here: the HTTP path is also a
  thin hand-rolled adapter.

### 3. Stateless in v1: no `Mcp-Session-Id`

The Streamable HTTP spec allows a stateful session identified by
an `Mcp-Session-Id` header. We don't need it because:

- The dispatcher is stateless across requests.
- The graph is loaded once at startup and never re-read.
- No tool mutates per-session state (write tools mutate the graph,
  but that's process-global, not session-scoped).

Adding session bookkeeping would buy us nothing and complicate
the curl example. Re-evaluate when we add server-pushed events
or per-client cursors.

### 4. One-shot JSON is the common case; SSE is for protocol compliance

Every lore tool is synchronous and finishes in <1s. JSON is the
default. SSE is opt-in via `Accept: text/event-stream` and serves
two purposes: (a) protocol compliance so future clients that
expect SSE frames work, (b) a future migration path to
server-initiated events if we ever add them (notifications,
progress, long-running tool status).

### 5. Append to `requirements.txt`, not a separate file

The HTTP deps (`starlette`, `uvicorn`, `httpx`) are added to
the existing manifest. The stdio entry script doesn't import
them; the only consumers are `mcp_http.py` and the new tests.
A separate `requirements-http.txt` would force operators to
remember which deps go with which transport; the unified
manifest is honest about what's installed.

A split is the right move if/when we add a multi-stage Docker
build or a serverless deployment. Not yet.

### 6. Bake `.graph.pkl` (165 KB) into the image; mount is the override

The cached graph is small enough (165 KB today) that baking it
into the image is faster than a bind mount and removes a runtime
dependency. As the codex grows toward ~10 MB, switch to a
mandatory bind mount with a documented path.

## What this means

- The dispatcher in `lore_engine_poc.mcp_server.MCPServer` is
  **transport-agnostic**. Stdio (slice 2.6) and HTTP (slice 11)
  are both thin adapters over `handle_message(msg) -> Optional[dict]`.
- The HTTP path is a single `POST /mcp` endpoint. No GET listener,
  no `/sse` legacy path, no `Mcp-Session-Id`.
- Single-process uvicorn. Multi-worker is intentionally not
  exposed. The graph is in-memory per-process; multi-worker
  would diverge silently.
- Write tools (`add_entity`, `add_relation`, `add_lore_source`,
  slice 10's `update_entity`, `delete_entity`, `set_alias`, ...)
  mutate the in-memory graph only. Restart to reset. Document
  this in the README; the same caveat applies to stdio.

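The multi-worker and restart caveats can be illustrated with a small simulation. It is only a sketch: the dict-shaped graph, `start_worker`, and this toy `add_entity` are hypothetical stand-ins, and `copy.deepcopy` plays the role of each process loading its own copy of the baked graph.

```python
import copy

# Hypothetical graph shape: entity name -> attributes.
baked_graph = {"Roland Raventhorne": {"house": "House Raventhorne"}}


def start_worker(graph):
    """Each worker process ends up with its own private copy of the graph."""
    return copy.deepcopy(graph)


def add_entity(graph, name, attrs):
    """Stand-in for a write tool: it mutates only the copy it is handed."""
    graph[name] = attrs


worker_a = start_worker(baked_graph)
worker_b = start_worker(baked_graph)

add_entity(worker_a, "New Entity", {})      # the write lands on worker A only
seen_by_a = "New Entity" in worker_a
seen_by_b = "New Entity" in worker_b        # B never observes the write
after_restart = start_worker(baked_graph)   # a restart reloads the baked graph
```

A client whose requests alternate between the two workers would see the new entity appear and disappear, with no error anywhere, which is the silent divergence the single-process rule avoids.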
## Deferred (explicitly)

- CORS: out of scope until a browser client appears. Open WebUI
  proxies server-side.
- OpenAI / Open WebUI tool-call bridge: slice 12. The HTTP
  transport itself is consumable from any HTTP MCP client today.
- In-container Cognee ingest: the container serves a pre-built
  graph. Ingest-in-container is a later slice.
103
docs/plan/11-slice-mcp-http-docker.md
Normal file
@@ -0,0 +1,103 @@
# Slice 11: Streamable HTTP transport + container

**Status:** ✅ shipped 2026-06-18. Commits `0088360`, `1e239d9`,
`225253b`, `4513255`, `ca5b197` (impl repo
`git.homelab.local/kaykayyali/lore-engine-poc-v3`). Test count:
529/529 at slice 11.0; 554/554 at slice 11.5. ADR `0010`
captures the spec + framework decisions.

## Goal

Add a Streamable HTTP transport (MCP 2025-06-18) on top of the
existing transport-agnostic `MCPServer` dispatcher, and ship a
Docker image for the HTTP path. The stdio path stays byte-identical
to slice 2.6.

## What shipped

| Component | Where | Notes |
|---|---|---|
| ASGI app | `lore_engine_poc/mcp_http.py` | Starlette adapter over `MCPServer.handle_message`. One endpoint `POST /mcp`. |
| Entry script | `scripts/06_mcp_http_server.py` | Eager-load graph, `uvicorn.run(...)`. Env: `LORE_GRAPH_PATH`, `LORE_HTTP_HOST`, `LORE_HTTP_PORT`. |
| Docker image | `Dockerfile` | `python:3.12-slim`, baked `.graph.pkl`, `HEALTHCHECK` against `/mcp`. |
| Compose | `docker-compose.yml` | One service, port 8765, bind-mount override (commented). |
| Deps | `requirements.txt` | `starlette>=0.40`, `uvicorn>=0.30`, `httpx>=0.27`. |
| Tests | `tests/test_mcp/test_mcp_http_module.py` (14), `test_scripts_06.py` (7), `test_dockerfile.py` (4) | In-process, subprocess, docker. |

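How the entry script might resolve its three environment variables is sketched below. The variable names come from the table; everything else is an assumption made for illustration, including the `HttpConfig` and `load_config` names and all three default values (`.graph.pkl`, `127.0.0.1`, `8765`).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HttpConfig:
    graph_path: str
    host: str
    port: int


def load_config(env=None):
    """Read LORE_* settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumed default port; 0 is allowed so the OS can pick a free one.
    port = int(env.get("LORE_HTTP_PORT", "8765"))
    if not 0 <= port <= 65535:
        raise ValueError(f"LORE_HTTP_PORT out of range: {port}")
    return HttpConfig(
        graph_path=env.get("LORE_GRAPH_PATH", ".graph.pkl"),  # assumed default
        host=env.get("LORE_HTTP_HOST", "127.0.0.1"),          # assumed default
        port=port,
    )
```

Taking the mapping as a parameter keeps the function testable without touching the real process environment; the resulting values would then be handed to `uvicorn.run(...)`.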
## What was NOT shipped (deferred)

- **Session management** (`Mcp-Session-Id`). Stateless in v1; re-evaluate when we add server-initiated events or per-session state.
- **CORS**. Out of scope until a browser client appears. Open WebUI proxies server-side.
- **Write-back to disk from write tools.** Same caveat as stdio: write tools mutate the in-memory graph only; restart to reset.
- **Multi-worker uvicorn.** Intentionally not exposed. The graph is in-memory per-process; multi-worker would diverge silently.
- **In-container Cognee ingest.** The container serves a pre-built graph. Ingest-in-container is a later slice.
- **OpenAI / Open WebUI bridge.** The HTTP transport is consumable from any HTTP MCP client (Claude Desktop via `docker`, curl, httpx). A bridge for Open WebUI's OpenAI-compatible tool protocol is slice 12.

## Acceptance criteria

| # | Criterion | Test |
|---|---|---|
| 11.1 | `POST /mcp initialize` returns `protocolVersion == "2024-11-05"` + `serverInfo.name == "lore-engine-poc"` | `test_mcp_http_module.py:1`, `test_scripts_06.py:1` |
| 11.2 | `POST /mcp tools/list` returns 36 tools | `test_mcp_http_module.py:2`, `test_scripts_06.py:2` |
| 11.3 | `POST /mcp tools/call was_true_at` returns `was_true: true` for the Roland Raventhorne / House Raventhorne / `3rd_age.year_345` query | `test_mcp_http_module.py:4`, `test_scripts_06.py:3` |
| 11.4 | Malformed JSON → HTTP 400 + JSON-RPC `-32700` | `test_mcp_http_module.py:5` |
| 11.5 | `notifications/initialized` → HTTP 202 with empty body | `test_mcp_http_module.py:7`, `test_scripts_06.py:5` |
| 11.6 | Unknown tool name → JSON-RPC `-32602` | `test_mcp_http_module.py:8` |
| 11.7 | Tool-body exception → `isError: True` in result envelope (HTTP 200) | `test_mcp_http_module.py:9` |
| 11.8 | `Accept: text/event-stream` → `Content-Type: text/event-stream` + body matches `^event: message\ndata: \{.*\}\n\n$` | `test_mcp_http_module.py:11`, `test_scripts_06.py:4` |
| 11.9 | `Accept: application/json` (or absent) → JSON body | `test_mcp_http_module.py:10` |
| 11.10 | `docker build -t lore-engine-mcp:test .` exits 0 | `test_dockerfile.py:1` |
| 11.11 | `docker run` exposes a healthy `/mcp` endpoint | `test_dockerfile.py:2`, `:4` |
| 11.12 | `docker compose up` brings the service up; round-trip works; `LORE_HTTP_PORT` env var overrides the host port | `test_dockerfile.py:3` |
| 11.13 | The 529 existing tests still pass; stdio path is unchanged | `python3 -m pytest tests/ -q` |

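Criteria 11.4, 11.5, and 11.7 together imply a small mapping from dispatcher outcome to HTTP status, which might look roughly like the sketch below. `respond` is a hypothetical helper, not the shipped adapter, and passing every non-`None` dispatcher reply through as HTTP 200 (including JSON-RPC error replies such as `-32602`) is an assumption, since the table only states the status for the tool-exception case.

```python
import json

PARSE_ERROR = -32700  # JSON-RPC 2.0 "Parse error"


def respond(raw_body, handle_message):
    """Return (http_status, body_bytes) for one POST /mcp request body."""
    try:
        msg = json.loads(raw_body)
    except ValueError:
        # Covers json.JSONDecodeError and UnicodeDecodeError (both ValueError subclasses).
        error = {
            "jsonrpc": "2.0",
            "id": None,
            "error": {"code": PARSE_ERROR, "message": "Parse error"},
        }
        return 400, json.dumps(error).encode()

    reply = handle_message(msg)
    if reply is None:
        # Notification: accepted, nothing to return.
        return 202, b""
    return 200, json.dumps(reply).encode()
```

Only the parse failure is decided at the HTTP layer; every other outcome is whatever the dispatcher returned, which keeps the adapter free of tool-specific logic.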
## Test plan

- **In-process** (`tests/test_mcp/test_mcp_http_module.py`, 14 tests): `httpx.AsyncClient` + `ASGITransport`, no real socket. Reuses the trivial test-double registry from `test_server.py` for dispatcher-shape tests and the real `TOOL_REGISTRY` + cached graph for the end-to-end `was_true_at` test. Each async test uses `asyncio.run` (no `pytest-asyncio` dep).
- **Subprocess** (`tests/test_mcp/test_scripts_06.py`, 7 tests): boots `scripts/06_mcp_http_server.py` on `LORE_HTTP_PORT=0` (OS-assigned), parses the bound port from the "Uvicorn running on" line, then exercises the server over a real socket via `httpx`. Mirrors `test_scripts_05.py`.
- **Docker** (`tests/test_mcp/test_dockerfile.py`, 4 tests): gated on `shutil.which("docker")`. Build, run + round-trip, compose up + round-trip, healthcheck reaches `healthy`. The compose test uses `COMPOSE_PROJECT_NAME` + `LORE_HTTP_PORT` to allow parallel CI runs.

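The port-discovery step in the subprocess tests could be done along these lines. This is a sketch, not the test suite's code: `parse_bound_port` and `first_bound_port` are hypothetical helpers, and the pattern assumes uvicorn's usual startup line of the form `Uvicorn running on http://HOST:PORT (Press CTRL+C to quit)`.

```python
import re

# Host is either a bracketed IPv6 literal or a run of non-colon characters.
_RUNNING = re.compile(
    r"Uvicorn running on https?://(?:\[[^\]]+\]|[^:\s]+):(?P<port>\d+)"
)


def parse_bound_port(log_line):
    """Return the port from a 'Uvicorn running on' line, or None if absent."""
    match = _RUNNING.search(log_line)
    return int(match.group("port")) if match else None


def first_bound_port(lines):
    """Scan an iterable of log lines and return the first bound port found."""
    for line in lines:
        port = parse_bound_port(line)
        if port is not None:
            return port
    return None
```

In a test, `first_bound_port` would be fed the server subprocess's log stream line by line, ideally under a timeout so a server that fails to boot does not hang the run.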
## Architectural decisions

- **Streamable HTTP over legacy HTTP+SSE.** See ADR 0010. The 2024-11-05 transport delivered every response over an SSE stream held open by a long-lived GET, which doesn't match a request/response tool API. Streamable HTTP is part of the 2025-06-18 spec, matches `httpx`/`curl` ergonomics, and is the path the spec authors themselves are steering clients toward.
- **Starlette + uvicorn over FastAPI or the `mcp` PyPI lib.** Mirrors the stdlib-only discipline of `scripts/05_mcp_server.py`. The HTTP layer is ~50 lines of glue around the existing dispatcher; FastAPI's pydantic models and OpenAPI docs would be overhead we don't need.
- **Stateless in v1.** No `Mcp-Session-Id` round-trip. The dispatcher is already stateless across requests; the graph is loaded once at startup; no tool mutates per-session state.
- **One-shot JSON as the common case.** Every lore tool is synchronous and finishes in <1s. SSE is for protocol compliance and future server-push, not for the current tool set.
- **Bake `.graph.pkl` (165 KB) in the image, mount as documented override.** Reassess at ~10 MB.
- **Single-process uvicorn.** Documented limitation. Multi-worker would diverge silently because the graph is in-memory per-process.

## Files

- **new:** `lore_engine_poc/mcp_http.py`
- **new:** `scripts/06_mcp_http_server.py`
- **new:** `tests/test_mcp/test_mcp_http_module.py`
- **new:** `tests/test_mcp/test_scripts_06.py`
- **new:** `tests/test_mcp/test_dockerfile.py`
- **new:** `Dockerfile`, `.dockerignore`, `docker-compose.yml`
- **modified:** `requirements.txt` (added `starlette`, `uvicorn`, `httpx`)
- **modified:** `lore_engine-poc/README.md` (new "MCP transports" + "Docker" sections; pull from the impl repo)
- **modified:** `CONTEXT.md` (new transport + container section; design repo)
- **new:** `docs/adr/0010-streamable-http-transport.md` (design repo)

## How to use

```bash
# Local (stdio, unchanged)
python3 scripts/05_mcp_server.py

# Local (HTTP)
python3 scripts/06_mcp_http_server.py --host 127.0.0.1 --port 8765

# Container
docker build -t lore-engine-mcp .
docker run --rm -p 8765:8765 lore-engine-mcp

# Compose
docker compose up

# Round-trip
curl -s http://localhost:8765/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
```
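For callers that prefer Python to `curl`, the same round-trip can be sketched with the standard library alone. The helper names `build_initialize_request` and `initialize` are illustrative, and the base URL simply mirrors the `curl` example above.

```python
import json
import urllib.request


def build_initialize_request(base_url="http://localhost:8765"):
    """Build the same POST /mcp initialize request as the curl example."""
    payload = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
    return urllib.request.Request(
        f"{base_url}/mcp",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def initialize(base_url="http://localhost:8765", timeout=5.0):
    """Send the request to a running server and return the decoded JSON reply."""
    request = build_initialize_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)

# With the server running locally, initialize() returns the JSON-RPC reply as a dict.
```

Building the request separately from sending it makes the headers and body easy to inspect without a live server.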