docs(plan): slice 5 — lock Cypher-with-allowlist query language

Templates declare named queries in a Cypher subset. The watcher
validates each query against an allowlist (MATCH/OPTIONAL MATCH/
WHERE/RETURN/ORDER BY/LIMIT/SKIP/count/coalesce) and rejects
disallowed constructs (CREATE/MERGE/DELETE/SET/CALL/UNION/WITH/
arbitrary functions) with line numbers.

User input only reaches the Cypher body via $parameter
substitution, never via string interpolation. This is the
read-only Cypher dialect that template authors learn — powerful
enough for multi-hop traversal (the ci -> b OPTIONAL MATCH
example), constrained enough that no template can mutate state.

Added criteria 5.8-5.12 covering the allowlist rules.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-17 13:12:18 -04:00
parent 8cce0204a2
commit f24b638a33

@@ -18,8 +18,12 @@ NPC secret knowledge) without touching Python.
YAML, registers new templates at runtime (hot-reload).
3. `services/template-registry/` — persists template specs
alongside Cognee storage.
4. Dynamic tool generator: generic handler that runs queries
generated from `TypeTemplate` specs.
4. **Dynamic tool generator** with a Cypher-with-allowlist query
language. Every template declares one or more named queries
in a tiny Cypher subset. The watcher validates each query
against an **allowlist**; the tool generator parameterizes
user input and dispatches. See the "Query language" section
below.
5. `list_template_tools` MCP tool.
6. Four example templates from `docs/14-examples.md`:
- Thieves-guild mission (agent, target, payout, complication)
@@ -29,6 +33,97 @@ NPC secret knowledge) without touching Python.
danger_if_revealed)
7. Update the reasoning harness to mention template tools.
## Query language
Templates declare one or more **named queries**. Each query has a
Cypher body and a parameter schema. The watcher validates the body
against an allowlist; the tool generator registers one MCP tool
per query, with the parameters as the tool's arguments.
```yaml
template:
  id: cursed_item
  domain: Item
  fields:
    - {name: curse, type: string, required: true}
    - {name: bearer, type: Person, required: false}
    - {name: removal_condition, type: string, required: true}
  relations:
    - {name: CURSES, from: cursed_item, to: bearer}
  queries:
    - id: list_cursed_items
      description: "List cursed items, optionally filtered by bearer."
      cypher: |
        MATCH (ci:DomainEntity {type: "cursed_item"})
        OPTIONAL MATCH (ci)-[:CURSES]->(b:Person)
        WHERE $bearer IS NULL OR b.name = $bearer
        RETURN ci, b
      parameters:
        - {name: bearer, type: string, optional: true}
    - id: get_cursed_item
      description: "Look up one cursed item by name."
      cypher: |
        MATCH (ci:DomainEntity {type: "cursed_item", name: $name})
        RETURN ci
      parameters:
        - {name: name, type: string}
```
### Allowlist
The watcher accepts only these Cypher constructs:

| Allowed | Disallowed |
|---|---|
| `MATCH`, `OPTIONAL MATCH` | `CREATE`, `MERGE`, `DELETE`, `SET`, `REMOVE` |
| `WHERE`, `AND`, `OR`, `NOT`, `IS NULL`, `IS NOT NULL` | `CALL` (subqueries), `UNION`, `WITH` (except in specific patterns) |
| `RETURN` (one or more variables or property accessors) | Side-effect clauses (no `CREATE` after `MATCH`) |
| `ORDER BY`, `LIMIT`, `SKIP` | `FOREACH`, `LOAD CSV`, raw strings |
| `count()`, `coalesce()`, basic predicates | Arbitrary function calls |

The watcher **rejects the template** with a line number if any
disallowed construct appears. The result is a query language
powerful enough to express multi-hop graph traversal (the
`OPTIONAL MATCH` for `ci → b` in the example), but constrained
enough that no template author can write a query that mutates
state or runs arbitrary code.
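The rejection rule above can be sketched as a line-by-line scan. This is a minimal illustration, not the watcher's actual implementation: `validate_query`, the keyword lists, and the error wording are all hypothetical, and a production validator would more likely parse the query than pattern-match it. The keyword list is taken from the "Disallowed" column; because `WITH` is on it, this sketch would also flag `STARTS WITH` and `ENDS WITH`, which the table does not rule on either way.

```python
import re

# Assumed lists, copied from the allowlist table; not the watcher's real config.
DISALLOWED_KEYWORDS = ("CREATE", "MERGE", "DELETE", "SET", "REMOVE",
                       "CALL", "UNION", "WITH", "FOREACH", "LOAD CSV")
ALLOWED_FUNCTIONS = {"count", "coalesce"}
# Clause keywords that may be followed by "(" (a node pattern or a
# parenthesised predicate) and must not be mistaken for function calls.
CLAUSE_KEYWORDS = {"match", "where", "and", "or", "not",
                   "return", "by", "limit", "skip"}

_STRING = re.compile(r'"(?:[^"\\\n]|\\.)*"|\'(?:[^\'\\\n]|\\.)*\'')
# The lookbehind skips property names (ci.set), labels (:Set) and $params.
_KEYWORD = re.compile(
    r"(?<![\w.:$])("
    + "|".join(k.replace(" ", r"\s+") for k in DISALLOWED_KEYWORDS)
    + r")\b",
    re.IGNORECASE,
)
_CALL = re.compile(r"(?<![\w.:$])([A-Za-z_][\w.]*)\s*\(")
_KNOWN = ALLOWED_FUNCTIONS | CLAUSE_KEYWORDS | {k.lower() for k in DISALLOWED_KEYWORDS}


def validate_query(cypher: str) -> list[str]:
    """Return one error per disallowed construct, prefixed with its line number."""
    # Blank out string literals so a value such as "SET" is not flagged.
    masked = _STRING.sub(lambda m: " " * len(m.group()), cypher)
    errors = []
    for lineno, line in enumerate(masked.splitlines(), start=1):
        for m in _KEYWORD.finditer(line):
            errors.append(
                f"line {lineno}: {m.group(1).upper()} is not allowed in template queries"
            )
        for m in _CALL.finditer(line):
            if m.group(1).lower() not in _KNOWN:
                errors.append(
                    f"line {lineno}: function {m.group(1)}() is not on the allowlist"
                )
    return errors


example = """MATCH (ci:DomainEntity {type: "cursed_item"})
OPTIONAL MATCH (ci)-[:CURSES]->(b:Person)
WHERE $bearer IS NULL OR b.name = $bearer
RETURN ci, b"""
print(validate_query(example))
print(validate_query("MATCH (ci)\nDELETE ci"))
```

An empty list means the query passes; any entry would cause the whole template to be rejected.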
### Parameterization
User input never reaches the Cypher string directly. Every
parameter declared in `parameters:` becomes a `$name` reference
inside the Cypher body. The tool generator passes the user's
tool-call arguments as Cypher parameters (`$bearer = "Elysia"`),
which Cognee (via the Neo4j or Kuzu driver) treats as a bound
parameter, not as a string-concatenated query fragment. This
is the only safe path; raw string interpolation would open a
Cypher-injection hole, the same class of bug as SQL injection.
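A small sketch of the dispatch path described above, under stated assumptions: `make_tool_handler` and `fake_runner` are hypothetical names, and the runner stands in for whatever call Cognee or the graph driver actually exposes. The point it illustrates is that the Cypher text is forwarded unchanged while the user's values travel in a separate parameter map.

```python
from typing import Any, Callable

# Hypothetical runner signature: (query text, bound parameters) -> rows.
Runner = Callable[[str, dict[str, Any]], list[dict[str, Any]]]


def make_tool_handler(cypher: str, declared: list[dict[str, Any]], run_query: Runner):
    """Build a handler that forwards tool arguments as bound parameters."""
    names = {p["name"] for p in declared}
    required = {p["name"] for p in declared if not p.get("optional", False)}

    def handler(**arguments: Any) -> list[dict[str, Any]]:
        unknown = set(arguments) - names
        missing = required - set(arguments)
        if unknown or missing:
            raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
        # Optional parameters default to None so `$bearer IS NULL` can match.
        params = {name: arguments.get(name) for name in names}
        # The query string is never formatted or concatenated with user input.
        return run_query(cypher, params)

    return handler


QUERY = "MATCH (ci:DomainEntity {type: \"cursed_item\", name: $name}) RETURN ci"
calls: list[tuple[str, dict[str, Any]]] = []


def fake_runner(query: str, params: dict[str, Any]) -> list[dict[str, Any]]:
    # Records what a real driver would receive; returns no rows.
    calls.append((query, params))
    return []


handler = make_tool_handler(QUERY, [{"name": "name", "type": "string"}], fake_runner)
hostile = 'x"}) DETACH DELETE ci //'
handler(name=hostile)
print(calls[0][0] == QUERY, calls[0][1])
```

Even with a hostile value, the query text the runner receives is byte-for-byte the template's own; the value is only ever data.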
### Tool registration
For the example template, the watcher registers two new tools:

| Tool name | Parameters |
|---|---|
| `list_cursed_items` | `bearer?` |
| `get_cursed_item` | `name` |

These appear in `tools/list` after the watcher reloads. Each tool
returns the `RETURN` clause's value(s) as a list of dicts, plus
the standard `{sources, confidence}` wrapper.
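To make the mapping from template to tools concrete, here is a rough sketch that derives the table above from a parsed template. The function names, the dict layout (a direct mirror of the YAML example), and the `results` key in the wrapper are assumptions for illustration; the plan only specifies that rows come back as a list of dicts alongside `sources` and `confidence`.

```python
from typing import Any


def tool_specs(template: dict[str, Any]) -> list[dict[str, Any]]:
    """Derive one tool descriptor per named query in a parsed template."""
    specs = []
    for query in template.get("queries", []):
        specs.append({
            "name": query["id"],
            "description": query.get("description", ""),
            # Optional parameters are marked with a trailing "?", as in the table.
            "parameters": [
                p["name"] + ("?" if p.get("optional", False) else "")
                for p in query.get("parameters", [])
            ],
        })
    return specs


def wrap_response(rows: list[dict[str, Any]], sources: list[str],
                  confidence: float) -> dict[str, Any]:
    """Attach the wrapper fields; the 'results' key name is an assumption."""
    return {"results": rows, "sources": sources, "confidence": confidence}


cursed_item = {
    "id": "cursed_item",
    "queries": [
        {"id": "list_cursed_items",
         "description": "List cursed items, optionally filtered by bearer.",
         "parameters": [{"name": "bearer", "type": "string", "optional": True}]},
        {"id": "get_cursed_item",
         "description": "Look up one cursed item by name.",
         "parameters": [{"name": "name", "type": "string"}]},
    ],
}

for spec in tool_specs(cursed_item):
    print(spec["name"], spec["parameters"])
```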
### Authoring experience
A world-builder who wants to add "thieves-guild mission"
templates writes a `mission.yaml` with a few `MATCH` patterns and
some `RETURN` clauses. The watcher validates; if the query
violates the allowlist, the world-builder gets a clear error
("`DELETE` is not allowed in template queries — slice 5 is
read-only"). No Go code change between "template added" and
"tool available."
## Acceptance criteria
| # | Criterion |
@@ -40,6 +135,11 @@ NPC secret knowledge) without touching Python.
| 5.5 | `list_template_tools` returns the available template tools |
| 5.6 | Template-driven queries return the documented response shape |
| 5.7 | Ingesting a `mission.yaml` produces a queryable `ThievesGuildMission` instance |
| 5.8 | Cypher-with-allowlist: `MATCH`/`OPTIONAL MATCH`/`WHERE`/`RETURN`/`ORDER BY`/`LIMIT`/`SKIP` accepted; `CREATE`/`MERGE`/`DELETE`/`SET`/`CALL`/`UNION`/`WITH` rejected with line numbers |
| 5.9 | User input reaches the Cypher body only via `$parameter` substitution, never via string interpolation |
| 5.10 | Registering a second template with an `id` that is already registered is rejected |
| 5.11 | Parameter declared in `parameters:` but unused in Cypher → rejected (typo guard) |
| 5.12 | Parameter used in Cypher but not declared in `parameters:` → rejected (typo guard) |
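Criteria 5.10 through 5.12 could be checked roughly as follows. This is a sketch under assumptions: `check_parameters` and `TemplateRegistry` are hypothetical names, and the `$name` scan is a plain regular expression that would also count a `$` inside a string literal, which a real validator should exclude.

```python
import re
from typing import Any

_PARAM = re.compile(r"\$([A-Za-z_]\w*)")


def check_parameters(cypher: str, declared: list[dict[str, Any]]) -> list[str]:
    """Compare $names used in the Cypher body with the declared parameters."""
    used = set(_PARAM.findall(cypher))
    declared_names = {p["name"] for p in declared}
    errors = [f"parameter '{n}' is declared but never used"          # 5.11
              for n in sorted(declared_names - used)]
    errors += [f"parameter '${n}' is used but not declared"          # 5.12
               for n in sorted(used - declared_names)]
    return errors


class TemplateRegistry:
    """In-memory registry that refuses a second template with the same id."""

    def __init__(self) -> None:
        self._templates: dict[str, dict[str, Any]] = {}

    def register(self, template: dict[str, Any]) -> None:
        template_id = template["id"]
        if template_id in self._templates:                           # 5.10
            raise ValueError(f"template id '{template_id}' is already registered")
        self._templates[template_id] = template


print(check_parameters("MATCH (ci {name: $nmae}) RETURN ci",
                       [{"name": "name", "type": "string"}]))
```

A single typo such as `$nmae` trips both guards at once, which is what makes the pair useful as a typo check.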
## Test plan