#164 Stage B requires exposing whether cancellation was observed at the
turn-result level. This commit adds the infrastructure field:
Changes:
- TurnResult.cancel_observed: bool = False (query_engine.py)
- _build_timeout_result() accepts cancel_observed parameter (runtime.py)
- Two timeout paths now pass cancel_event.is_set() to signal observation (runtime.py)
- bootstrap command includes cancel_observed in turn JSON (main.py)
- SCHEMAS.md documents Turn Result Fields with cancel_observed contract
Usage:
When a turn timeout occurs, cancel_observed=true indicates that the
engine observed the cancellation event being set. This allows callers
to distinguish:
- timeout with no cancel → infrastructure/network stall
- timeout with cancel observed → cooperative cancellation was triggered
Backward compat:
- Existing TurnResult construction without cancel_observed defaults to False
- bootstrap JSON output still validates per SCHEMAS.md (new field is always present)
Test results: 182 passing, 3 skipped, zero regression.
Related: #161 (wall-clock timeout), #164 (cancellation observability protocol)
ROADMAP continues #164 with Stage C (test coverage for cancellation + turn envelope).
Closes the #161 follow-up gap identified in review: wall-clock timeout
bounded caller-facing wait but did not cancel the underlying provider
thread, which could silently mutate mutable_messages / transcript_store /
permission_denials / total_usage after the caller had already observed
stop_reason='timeout'. A ghost turn committed post-deadline would poison
any session that got persisted afterwards.
Stage A scope (this commit): runtime + engine layer cooperative cancel.
Engine layer (src/query_engine.py):
- submit_message now accepts cancel_event: threading.Event | None = None
- Two safe checkpoints:
1. Entry (before max_turns / budget projection) — earliest possible return
2. Post-budget (after output synthesis, before mutation) — catches cancel
that arrives while output was being computed
- Both checkpoints return stop_reason='cancelled' with state UNCHANGED
(mutable_messages, transcript_store, permission_denials, total_usage
all preserved exactly as on entry)
- cancel_event=None preserves legacy behaviour with zero overhead (no
checkpoint checks at all)
Runtime layer (src/runtime.py):
- run_turn_loop creates one cancel_event per invocation when a deadline
is in play (and None otherwise, preserving legacy fast path)
- Passes the same event to every submit_message call across turns, so a
late cancel on turn N-1 affects turn N
- On timeout (either pre-call or mid-call), runtime explicitly calls
cancel_event.set() before future.cancel() + synthesizing the timeout
TurnResult. This upgrades #161's best-effort future.cancel() (which
only cancels not-yet-started futures) to cooperative mid-flight cancel.
Stop reason taxonomy after Stage A:
'completed' — turn committed, state mutated exactly once
'max_budget_reached' — overflow, state unchanged (#162)
'max_turns_reached' — capacity exceeded, state unchanged
'cancelled' — cancel_event observed, state unchanged (#164 Stage A)
'timeout' — synthesised by runtime, not engine (#161)
The 'cancelled' vs 'timeout' split matters:
- 'timeout' is the runtime's best-effort signal to the caller: deadline hit
- 'cancelled' is the engine's confirmation: cancel was observed + honoured
If the provider call wedges entirely (never reaches a checkpoint), the
caller still sees 'timeout' and the thread is leaked — but any NEXT
submit_message call on the same engine observes the event at entry and
returns 'cancelled' immediately, preventing ghost-turn accumulation.
This is the honest cooperative limit in Python threading land; true
preemption requires async-native provider IO (future work, not Stage A).
Tests (29 new tests, tests/test_submit_message_cancellation.py + tests/
test_run_turn_loop_cancellation.py):
Engine-layer (12 tests):
- TestCancellationBeforeCall (5): pre-set event returns 'cancelled' immediately;
mutable_messages, transcript_store, usage, permission_denials all preserved
- TestCancellationAfterBudgetCheck (1): cancel set mid-call (after projection,
before commit) still honoured; output synthesised but state untouched
- TestCancellationAfterCommit (2): post-commit cancel not observable (honest
limit) BUT next call on same engine observes it + returns 'cancelled'
- TestLegacyCallersUnchanged (3): cancel_event=None preserves #162 atomicity
+ max_turns contract with zero behaviour change
- TestCancellationVsOtherStopReasons (2): cancel precedes max_turns check;
cancel does not retroactively override a completed turn
Runtime-layer (5 tests):
- TestTimeoutPropagatesCancelEvent (3): submit_message receives a real Event
object when deadline is set; None in legacy mode; timeout actually calls
event.set() so in-flight threads observe at their next checkpoint
- TestCancelEventSharedAcrossTurns (1): same event object passed to every
turn (object identity check) — late cancel on turn N-1 must affect turn N
Regression: 3 existing timeout test mocks updated to accept cancel_event
kwarg (mocks that previously had signature (prompt, commands, tools, denials)
now have (prompt, commands, tools, denials, cancel_event=None) since runtime
passes cancel_event positionally on the timeout path).
Full suite: 97 → 114 passing, zero regression.
Closes ROADMAP #164 Stage A.
What's explicitly NOT in Stage A:
- Preemptive cancellation of wedged provider IO (requires asyncio-native
provider path; larger refactor)
- Timeout on the legacy unbounded run_turn_loop path (by design: legacy
callers opt out of cancellation entirely)
- CLI exposure of 'cancelled' as a distinct exit code (currently 'cancelled'
maps to the same stop_reason != 'completed' break condition as others;
CLI surface for cancel is a separate pinpoint if warranted)
Previously, QueryEnginePort.submit_message() checked the token budget AFTER
appending the prompt to mutable_messages, transcript_store, and permission_denials,
and AFTER calling compact_messages_if_needed(). On overflow it set
stop_reason='max_budget_reached' but the overflow turn was already committed.
Any caller that persisted the session afterwards wrote the rejected prompt to
disk — the session was silently poisoned even though the TurnResult said the
turn never completed.
Fix:
- Restructure submit_message so the budget check early-returns BEFORE any
mutation of mutable_messages, transcript_store, permission_denials, or
total_usage.
- The returned TurnResult.usage reflects pre-call state (overflow never
advanced the usage counter).
- Normal (in-budget) path unchanged: mutation happens exactly once, at the
end, only on 'completed' results.
This closes the atomicity gap: submit_message is now either 'turn committed'
(stop_reason='completed') or 'turn rejected, state untouched'
(stop_reason in {'max_budget_reached', 'max_turns_reached'}). Callers can
safely retry with a fresh budget or a smaller prompt without worrying about
phantom committed turns from prior rejections.
Tests (tests/test_submit_message_budget.py, 10 tests):
- TestBudgetOverflowDoesNotMutate (5): mutable_messages / transcript /
permission_denials / total_usage / TurnResult.usage all pre-mutation after overflow
- TestOverflowPersistence (2): first-turn overflow persists empty session;
successful-turn-then-overflow persists only the successful turn
- TestEngineUsableAfterOverflow (2): subsequent in-budget call still works
with no residue; repeated overflows don't accumulate hidden state
- TestNormalPathStillCommits (1): regression guard — non-overflow path still
commits mutable_messages/transcript/usage as expected
Full suite: 59/59 passing, zero regression.
Blocker: none. Closes ROADMAP #162.
The old tracked TypeScript snapshot has been removed from the repository history and the root directory is now a Python porting workspace. README and tests now describe and verify the Python-first layout instead of treating the exposed snapshot as the active source tree.
A local archive can still exist outside Git, but the tracked repository now presents only the Python porting surface, related essay context, and OmX workflow artifacts.
Constraint: Tracked history should collapse to a single commit while excluding the archived snapshot from Git
Rejected: Keep the exposed TypeScript tree in tracked history under an archive path | user explicitly wanted only the Python porting repo state in Git
Confidence: medium
Scope-risk: broad
Reversibility: messy
Directive: Keep future tracked additions focused on the Python port itself; do not reintroduce the exposed snapshot into Git history
Tested: python3 -m unittest discover -s tests -v; python3 -m src.main summary; git diff --check
Not-tested: Behavioral parity with the original TypeScript system beyond the current Python workspace surface