913 Commits

Author SHA1 Message Date
YeonGyu-Kim
2eb6e0c1ee ROADMAP #84: dump-manifests bakes build machine's absolute path into binary
Dogfooded 2026-04-17 on main HEAD 70a0f0c from /tmp/cd4.
'claw dump-manifests' with no arguments emits:
  error: Manifest source files are missing.
    repo root: /Users/yeongyu/clawd/claw-code
    missing: src/commands.ts, src/tools.ts, src/entrypoints/cli.tsx

That path is the *build machine*'s absolute filesystem layout, baked
in via env!('CARGO_MANIFEST_DIR') at rusty-claude-cli/src/main.rs:2016.
strings on the binary reveals the raw path verbatim. JSON surface
(--output-format json) leaks the same path identically.

Three problems: (1) broken default for any user running a distributed
binary because the path won't exist on their machine; (2) privacy
leak -- build user's $HOME segment embedded in the binary and
surfaced to every recipient; (3) reproducibility violation -- two
binaries built from the same commit on different machines produce
different runtime behavior. Same compile-time-vs-runtime family as
ROADMAP #83 (build date injected as 'today').

Fix shape (<=20 lines): drop env!('CARGO_MANIFEST_DIR') from the
runtime default, require CLAUDE_CODE_UPSTREAM / --manifests-dir /
settings entry, reword error to name the required config instead of
leaking a path the user never asked for. Optional polish: add a
settings.json [upstream] entry.

Acceptance: strings <binary> | grep '^/Users/' returns empty for the
shipped binary. Default error surface contains zero absolute paths
from the build machine.

Filed in response to Clawhip pinpoint nudge 1494661235336282248 in
#clawcode-building-in-public.
2026-04-17 20:36:51 +09:00
YeonGyu-Kim
70a0f0cf44 ROADMAP #83: DEFAULT_DATE injects build date as 'today' in live system prompt
Dogfooded 2026-04-17 on main HEAD e58c194 against /tmp/cd3. Binary
built 2026-04-10; today is 2026-04-17. 'claw system-prompt' emits
'Today's date is 2026-04-10.' The same DEFAULT_DATE constant
(rusty-claude-cli/src/main.rs:69-72) is threaded into
build_system_prompt() at :6173-6180 and every ClaudeCliSession /
StreamingCliSession / non-interactive runner (lines 3649, 3746,
4165, 4211, ...), so the stale date lives in the LIVE agent prompt,
not just the system-prompt subcommand.

Agents reason from 'today = compile day,' which silently breaks any
task that depends on real time (freshness, deadlines, staleness,
expiry). Violates ROADMAP principle #4 (branch freshness before
blame) and mixes compile-time context into runtime behavior,
producing different prompts for two agents on the same main HEAD
built a week apart.

Fix shape (~30 lines): compute current_date at runtime via
chrono::Utc::now().date_naive(), sweep DEFAULT_DATE call sites in
main.rs, keep --date override and --version's build-date meaning,
add CLAWD_OVERRIDE_DATE env escape for reproducible tests.

Filed in response to Clawhip pinpoint nudge 1494653681222811751 in
#clawcode-building-in-public.
2026-04-17 20:02:37 +09:00
YeonGyu-Kim
e58c1947c1 ROADMAP #82: macOS sandbox filesystem_active=true is a lie
Dogfooded 2026-04-17 on main HEAD 1743e60 against /tmp/claw-dogfood-2.
claw --output-format json sandbox on macOS reports filesystem_active=
true, filesystem_mode=workspace-only but the actual enforcement is
only HOME/TMPDIR env-var rebasing at bash.rs:205-209 / :228-232.
build_linux_sandbox_command is cfg(target_os=linux)-gated and returns
None on macOS, so the fallback path is sh -lc <command> with env
tweaks and nothing else. Direct escape proof: a child with
HOME=/ws/.sandbox-home TMPDIR=/ws/.sandbox-tmp writes
/tmp/claw-escape-proof.txt and mkdir /tmp/claw-probe-target without
error.

Clawability problem: claws/orchestrators read SandboxStatus JSON and
branch on filesystem_active && filesystem_mode=='workspace-only' to
decide whether a worker can safely touch /tmp or $HOME. Today that
branch lies on macOS.

Fix shape option A (low-risk, ~15 lines): compute filesystem_active
only where an enforcement path exists, so macOS reports false by
default and fallback_reason surfaces the real story. Option B:
wire a Seatbelt (sandbox-exec) profile for actual macOS enforcement.

Filed in response to Clawhip pinpoint nudge 1494646135317598239 in
#clawcode-building-in-public.
2026-04-17 19:33:06 +09:00
YeonGyu-Kim
1743e600e1 ROADMAP #81: claw status Project root lies about session scope
Dogfooded 2026-04-17 on main HEAD a48575f inside claw-code itself
and reproduced on /tmp/claw-split-17. SessionStore::from_cwd at
session_control.rs:32-40 uses the raw CWD as input to
workspace_fingerprint() (line 295-303), not the project root
surfaced in claw status. Result: two CWDs in the same git repo
(e.g. ~/clawd/claw-code vs ~/clawd/claw-code/rust) report the same
Project root in status but land in two disjoint .claw/sessions/
<fp>/ partitions. claw --resume latest from one CWD returns
'no managed sessions found' even though the adjacent CWD has a
live session visible via /session list.

Status-layer truth (Project root) and session-layer truth
(fingerprint-of-CWD) disagree and neither surface exposes the
disagreement -- classic split-truth per ROADMAP pain point #2.

Fix shape (<=40 lines): (a) fingerprint the project root instead
of raw CWD, or (b) surface partition key explicitly in status.

Filed in response to Clawhip pinpoint nudge 1494638583481372833
in #clawcode-building-in-public.
2026-04-17 19:05:12 +09:00
Jobdori
a48575fd83 ROADMAP #80: session-lookup error copy lies about on-disk layout
Dogfooded 2026-04-17 on main HEAD 688295e against /tmp/claw-d4.
SessionStore::from_cwd at session_control.rs:32-40 places sessions
under .claw/sessions/<workspace_fingerprint>/ (16-char FNV-1a hex
at line 295-303), but format_no_managed_sessions and
format_missing_session_reference at line 516-526 advertise plain
.claw/sessions/ with no fingerprint context.

Concrete repro: fresh workspace, no sessions yet, .claw/sessions/
contains foo/ (hash dir, empty) + ffffffffffffffff/foreign.jsonl
(foreign workspace session). 'claw --resume latest' still says
'no managed sessions found in .claw/sessions/' even though that
directory is not empty -- the sessions just belong to other
workspace partitions.

Fix shape is ~30 lines: plumb the resolved sessions_root/workspace
into the two format helpers, optionally enumerate sibling partitions
so error copy tells the operator where sessions from other workspaces
are and why they're invisible.

Filed in response to Clawhip pinpoint nudge 1494615932222439456 in
#clawcode-building-in-public.
2026-04-17 17:33:05 +09:00
Jobdori
688295ea6c ROADMAP #79: claw --output-format json init discards structured InitReport
Dogfooded 2026-04-17 on main HEAD 9deaa29. init.rs:38-113 already
builds a fully-typed InitReport { project_root, artifacts: Vec<
InitArtifact { name, status: InitStatus }> } but main.rs:5436-5454
calls .render() on it and throws the structure away, emitting only
{kind, message: '<prose>'} via init_json_value(). Downstream claws
have to regex 'created|updated|skipped' out of the message string
to know per-artifact state.

version/system-prompt/acp/bootstrap-plan all emit structured payloads
on the same binary -- init is the sole odd-one-out. Fix shape is ~20
lines: add InitReport::to_json_value + InitStatus::as_str, switch
run_init to hold the report instead of .render()-ing it eagerly,
preserve message for backward compat, add output_format_contract
regression.

Filed in response to Clawhip pinpoint nudge 1494608389068558386 in
#clawcode-building-in-public.
2026-04-17 17:02:58 +09:00
Jobdori
9deaa29710 ROADMAP #78: claw plugins CLI route is a dead constructor
Dogfooded 2026-04-17 on main HEAD d05c868. CliAction::Plugins variant
is declared at main.rs:303-307 and wired to LiveCli::print_plugins at
main.rs:202-206, but parse_args has no "plugins" arm, so
claw plugins / claw plugins list / claw --output-format json plugins
all fall through to the LLM-prompt catch-all and emit a missing
Anthropic credentials error. This is the sole documented-shaped
subcommand that does NOT resolve to a local CLI route:
agents, mcp, skills, acp, init, dump-manifests, bootstrap-plan,
system-prompt, export all work. grep confirms CliAction::Plugins has
exactly one hit in crates/ (the handler), not a constructor anywhere.

Filed with a ~15 line parser fix shape plus help/test wiring, matching
the pattern already used by agents/mcp/skills.

Filed in response to Clawhip pinpoint nudge 1494600832652546151 in
#clawcode-building-in-public.
2026-04-17 16:33:09 +09:00
Jobdori
d05c8686b8 ROADMAP #77: typed error-kind contract for --output-format json errors
Dogfooded 2026-04-17 against main HEAD 00d0eb6. Five distinct failure
classes (missing credentials, missing manifests, missing worker state,
session not found, CLI parse) all emit the same {type,error} envelope
with no machine-readable kind/code, so downstream claws have to regex
the prose to route failures. Success payloads already carry a stable
'kind' discriminator; error payloads do not. Fix shape proposes an
ErrorKind discriminant plus hint/context fields to match the success
side contract.

Filed in response to Clawhip pinpoint nudge 1494593284180414484 in
#clawcode-building-in-public.
2026-04-17 16:08:41 +09:00
Yeachan-Heo
00d0eb61d4 US-024: Add token limit metadata for kimi models
Add ModelTokenLimit entries for kimi-k2.5 and kimi-k1.5 to enable
preflight context window validation. Per Moonshot AI documentation:
- Context window: 256,000 tokens
- Max output: 16,384 tokens

Includes 3 unit tests:
- returns_context_window_metadata_for_kimi_models
- kimi_alias_resolves_to_kimi_k25_token_limits
- preflight_blocks_oversized_requests_for_kimi_models

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 04:15:38 +00:00
Yeachan-Heo
8d8e2c3afd Mark prd.json status as completed
All 23 stories (US-001 through US-023) are now complete.
Updated status from "in_progress" to "completed".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 20:05:13 +00:00
Yeachan-Heo
d037f9faa8 Fix strip_routing_prefix to handle kimi provider prefix (US-023)
Add "kimi" to the strip_routing_prefix matches so that models like
"kimi/kimi-k2.5" have their prefix stripped before sending to the
DashScope API (consistent with qwen/openai/xai/grok handling).

Also add unit test strip_routing_prefix_strips_kimi_provider_prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:50:15 +00:00
Yeachan-Heo
330dc28fc2 Mark US-023 as complete in prd.json
- Move US-023 from inProgressStories to completedStories
- All acceptance criteria met and verified

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:45:56 +00:00
Yeachan-Heo
cec8d17ca8 Implement US-023: Add automatic routing for kimi models to DashScope
Changes in rust/crates/api/src/providers/mod.rs:
- Add 'kimi' alias to MODEL_REGISTRY resolving to 'kimi-k2.5' with DashScope config
- Add kimi/kimi- prefix routing to DashScope endpoint in metadata_for_model()
- Add resolve_model_alias() handling for kimi -> kimi-k2.5
- Add unit tests: kimi_prefix_routes_to_dashscope, kimi_alias_resolves_to_kimi_k2_5

Users can now use:
- --model kimi (resolves to kimi-k2.5)
- --model kimi-k2.5 (auto-routes to DashScope)
- --model kimi/kimi-k2.5 (explicit provider prefix)

All 127 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:44:21 +00:00
Yeachan-Heo
4cb1db9faa Implement US-022: Enhanced error context for API failures
Add structured error context to API failures:
- Request ID tracking across retries with full context in error messages
- Provider-specific error code mapping with actionable suggestions
- Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)
- Added suggested_action field to ApiError::Api variant
- Updated enrich_bearer_auth_error to preserve suggested_action

Files changed:
- rust/crates/api/src/error.rs: Add suggested_action field, update Display
- rust/crates/api/src/providers/openai_compat.rs: Add suggested_action_for_status()
- rust/crates/api/src/providers/anthropic.rs: Update error handling

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:15:00 +00:00
Yeachan-Heo
5e65b33042 US-021: Add request body size pre-flight check for OpenAI-compatible provider 2026-04-16 17:41:57 +00:00
Yeachan-Heo
87b982ece5 US-011: Performance optimization for API request serialization
Added criterion benchmarks and optimized flatten_tool_result_content:
- Added criterion dev-dependency and request_building benchmark suite
- Optimized flatten_tool_result_content to pre-allocate capacity and avoid
  intermediate Vec construction (was collecting to Vec then joining)
- Made key functions public for benchmarking: translate_message,
  build_chat_completion_request, flatten_tool_result_content,
  is_reasoning_model, model_rejects_is_error_field

Benchmark results:
- flatten_tool_result_content/single_text: ~17ns
- translate_message/text_only: ~200ns
- build_chat_completion_request/10 messages: ~16.4µs
- is_reasoning_model detection: ~26-42ns

All 119 unit tests and 29 integration tests pass.
cargo clippy passes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:11:45 +00:00
Yeachan-Heo
f65d15fb2f US-010: Add model compatibility documentation
Created comprehensive MODEL_COMPATIBILITY.md documenting:
- Kimi models is_error exclusion (prevents 400 Bad Request)
- Reasoning models tuning parameter stripping (o1, o3, o4, grok-3-mini, qwen-qwq)
- GPT-5 max_completion_tokens requirement
- Qwen model routing through DashScope

Includes implementation details, key functions table, guide for adding new
models, and testing commands. Cross-referenced with existing code comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:55:58 +00:00
Yeachan-Heo
3e4e1585b5 US-009: Add comprehensive unit tests for kimi model compatibility fix
Added 4 unit tests to verify is_error field handling for kimi models:
- model_rejects_is_error_field_detects_kimi_models: Detects kimi-k2.5, kimi-k1.5, dashscope/kimi-k2.5 (case insensitive)
- translate_message_includes_is_error_for_non_kimi_models: Verifies gpt-4o, grok-3, claude include is_error
- translate_message_excludes_is_error_for_kimi_models: Verifies kimi models exclude is_error (prevents 400 Bad Request)
- build_chat_completion_request_kimi_vs_non_kimi_tool_results: Full integration test for request building

All 119 unit tests and 29 integration tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:54:48 +00:00
Yeachan-Heo
110d568bcf Mark US-008 complete: kimi-k2.5 API compatibility fix
- translate_message now conditionally includes is_error field
- kimi models (kimi-k2.5, kimi-k1.5, etc.) exclude is_error
- Other models (openai, grok, xai) keep is_error support
2026-04-16 10:12:36 +00:00
Yeachan-Heo
866ae7562c Fix formatting in task_packet.rs for CI 2026-04-16 09:35:18 +00:00
Yeachan-Heo
6376694669 Mark all 7 roadmap stories complete with PRD and progress record
- Update prd.json: mark US-001 through US-007 as passes: true
- Add progress.txt: detailed implementation summary for all stories

All acceptance criteria verified:
- US-001: Startup failure evidence bundle + classifier
- US-002: Lane event schema with provenance and deduplication
- US-003: Stale branch detection with policy integration
- US-004: Recovery recipes with ledger
- US-005: Typed task packet format with TaskScope
- US-006: Policy engine for autonomous coding
- US-007: Plugin/MCP lifecycle maturity
2026-04-16 09:31:47 +00:00
Yeachan-Heo
1d5748f71f US-005: Typed task packet format with TaskScope enum
- Add TaskScope enum with Workspace, Module, SingleFile, Custom variants
- Update TaskPacket struct with scope_path and worktree fields
- Add validation for scope-specific requirements
- Fix tests in task_packet.rs, task_registry.rs, and tools/src/lib.rs
- Export TaskScope from runtime crate

Closes US-005 (Phase 4)
2026-04-16 09:28:42 +00:00
Yeachan-Heo
77fb62a9f1 Implement LaneEvent schema extensions for event ordering, provenance, and dedupe (US-002)
Adds comprehensive metadata support to LaneEvent for the canonical lane event schema:

- EventProvenance enum: live_lane, test, healthcheck, replay, transport
- SessionIdentity: title, workspace, purpose, with placeholder support
- LaneOwnership: owner, workflow_scope, watcher_action (Act/Observe/Ignore)
- LaneEventMetadata: seq, provenance, session_identity, ownership, nudge_id,
  event_fingerprint, timestamp_ms
- LaneEventBuilder: fluent API for constructing events with full metadata
- is_terminal_event(): detects Finished, Failed, Superseded, Closed, Merged
- compute_event_fingerprint(): deterministic fingerprint for terminal events
- dedupe_terminal_events(): suppresses duplicate terminal events by fingerprint

Provides machine-readable event provenance, session identity at creation,
monotonic sequence ordering, nudge deduplication, and terminal event suppression.

Adds 10 regression tests covering:
- Monotonic sequence ordering
- Provenance serialization round-trip
- Session identity completeness
- Ownership and workflow scope binding
- Watcher action variants
- Terminal event detection
- Fingerprint determinism and uniqueness
- Terminal event deduplication
- Builder construction with metadata
- Metadata serialization round-trip

Closes Phase 2 (partial) from ROADMAP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 09:12:31 +00:00
Yeachan-Heo
21909da0b5 Implement startup-no-evidence evidence bundle + classifier (US-001)
Adds typed worker.startup_no_evidence event with evidence bundle when worker
startup times out. The classifier attempts to down-rank the vague bucket into
specific failure classifications:
- trust_required
- prompt_misdelivery
- prompt_acceptance_timeout
- transport_dead
- worker_crashed
- unknown

Evidence bundle includes:
- Last known worker lifecycle state
- Pane/command being executed
- Prompt-send timestamp
- Prompt-acceptance state
- Trust-prompt detection result
- Transport health summary
- MCP health summary
- Elapsed seconds since worker creation

Includes 6 regression tests covering:
- Evidence bundle serialization
- Transport dead classification
- Trust required classification
- Prompt acceptance timeout
- Worker crashed detection
- Unknown fallback

Closes Phase 1.6 from ROADMAP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 09:05:33 +00:00
Yeachan-Heo
ac45bbec15 Make ACP/Zed status obvious before users go source-diving
ROADMAP #21, #22, and #23 were already closed on current main, so the next real repo-local backlog item was the ACP/Zed discoverability gap. This adds a local `claw acp` status surface plus aliases, updates help/docs, and separates the shipped discoverability fix from the still-open daemon/protocol follow-up so editor-first users get a crisp answer immediately.

Constraint: No ACP/Zed daemon or protocol server exists in claw-code yet, so the new surface must be explicit status guidance rather than a fake implementation
Rejected: Add a pretend `acp serve` daemon path | would imply supported protocol behavior that does not exist
Rejected: Docs-only clarification | still leaves `claw --help` unable to answer the editor-launch question directly
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep ROADMAP discoverability fixes separate from future ACP daemon/protocol work so help text and backlog IDs stay unambiguous
Tested: cargo fmt --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; cargo run -q -p rusty-claude-cli -- acp; cargo run -q -p rusty-claude-cli -- --output-format json acp; architect review APPROVED
Not-tested: Real ACP/Zed daemon launch because no protocol-serving surface exists yet
2026-04-16 03:13:50 +00:00
Yeachan-Heo
64e058f720 refresh 2026-04-16 02:50:54 +00:00
Yeachan-Heo
e874bc6a44 Improve malformed hook failures so operators can diagnose broken JSON
Malformed hook stdout that looks like JSON was collapsing into low-signal failure text during hook execution. This change preserves plain-text hook feedback for normal text hooks, but upgrades malformed JSON-like output into an explicit hook_invalid_json diagnostic that includes phase, tool, command, and bounded stdout/stderr previews. It also adds a regression test for malformed-but-nonempty output.

Constraint: User scoped the implementation to rust/crates/runtime/src/hooks.rs and tests only
Constraint: Existing plain-text hook feedback must remain intact for non-JSON hook output
Rejected: Treat every non-JSON stdout payload as invalid JSON | would break legitimate plain-text hook feedback
Confidence: high
Scope-risk: narrow
Directive: Keep malformed-hook diagnostics bounded and preserve the plain-text fallback for hooks that intentionally emit text
Tested: cargo test --manifest-path rust/Cargo.toml -p runtime hooks::tests:: -- --nocapture
Tested: cargo test --manifest-path rust/Cargo.toml -p runtime -- --nocapture
Tested: cargo clippy --manifest-path rust/Cargo.toml -p runtime --all-targets -- -D warnings
Not-tested: Full workspace clippy/test sweep outside runtime crate
2026-04-13 12:44:52 +00:00
Yeachan-Heo
6a957560bd Make recovery handoffs explain why a lane resumed instead of leaking control prose
Recent OMX dogfooding kept surfacing raw `[OMX_TMUX_INJECT]`
messages as lane results, which told operators that tmux reinjection
happened but not why or what lane/state it applied to. The lane-finished
persistence path now recognizes that control prose, stores structured
recovery metadata, and emits a human-meaningful fallback summary instead
of preserving the raw marker as the primary result.

Constraint: Keep the fix in the existing lane-finished metadata surface rather than inventing a new runtime channel
Rejected: Treat all reinjection prose as ordinary quality-floor mush | loses the recovery cause and target lane operators actually need
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Recovery classification is heuristic; extend the parser only when new operator phrasing shows up in real dogfood evidence
Tested: cargo fmt --all --check
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo test --workspace
Tested: LSP diagnostics on rust/crates/tools/src/lib.rs (0 errors)
Tested: Architect review (APPROVE)
Not-tested: Additional reinjection phrasings beyond the currently observed `[OMX_TMUX_INJECT]` / current-mode-state variants
Related: ROADMAP #68
2026-04-12 15:50:39 +00:00
Yeachan-Heo
42bb6cdba6 Keep local clawhip artifacts from tripping routine repo work
Dogfooding kept reproducing OMX team merge conflicts on
`.clawhip/state/prompt-submit.json`, so the init bootstrap now
teaches repos to ignore `.clawhip/` alongside the existing local
`.claw/` artifacts. This also updates the current repo ignore list
so the fix helps immediately instead of only on future `claw init`
runs.

Constraint: Keep the fix narrow and centered on repo-local ignore hygiene
Rejected: Broader team merge-hygiene changes | unnecessary for the proven local root cause
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If more runtime-local artifact directories appear, extend the shared init gitignore list instead of patching repos ad hoc
Tested: cargo fmt --all --check
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo test --workspace
Tested: Architect review (APPROVE)
Not-tested: Existing clones with already-tracked `.clawhip` files still need manual cleanup
Related: ROADMAP #75
2026-04-12 14:47:40 +00:00
Yeachan-Heo
f91d156f85 Keep poisoned test locks from cascading across unrelated regressions
The repo-local backlog was effectively exhausted, so this sweep promoted the
newly observed test-lock poisoning pain point into ROADMAP #74 and fixed it in
place. Test-only env/cwd lock acquisition now recovers poisoned mutexes in the
remaining strict call sites, and each affected surface has a regression that
proves a panic no longer permanently poisons later tests.

Constraint: Keep the fix test-only and avoid widening runtime behavior changes
Rejected: Refactor shared helper signatures across broader call paths | unnecessary churn beyond the remaining strict test sites
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: These guards only recover the mutex; tests that mutate env or cwd still must restore process-global state explicitly
Tested: cargo fmt --all --check
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo test --workspace
Tested: Architect review (APPROVE)
Not-tested: Additional fault-injection around partially restored env/cwd state after panic
Related: ROADMAP #74
2026-04-12 13:52:41 +00:00
Yeachan-Heo
6b4bb4ac26 Keep finished lanes from leaving stale reminders armed
The next repo-local sweep target was ROADMAP #66: reminder/cron
state could stay enabled after the associated lane had already
finished, which left stale nudges firing into completed work. The
fix teaches successful lane persistence to disable matching enabled
cron entries and record which reminder ids were shut down on the
finished event.

Constraint: Preserve existing cron/task registries and add the shutdown behavior only on the successful lane-finished path
Rejected: Add a separate reminder-cleanup command that operators must remember to run | leaves the completion leak unfixed at the source
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If cron-matching heuristics change later, update `disable_matching_crons`, its regression, and the ROADMAP closeout together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: Cross-process cron/reminder persistence beyond the in-memory registry used in this repo
2026-04-12 12:52:27 +00:00
Yeachan-Heo
e75d67dfd3 Make successful lanes explain what artifacts they actually produced
The next repo-local sweep target was ROADMAP #64: downstream consumers
still had to infer artifact provenance from prose even though the repo
already emitted structured lane events. The fix extends `lane.finished`
metadata with structured artifact provenance so successful completions
can report roadmap ids, files, diff stat, verification state, and commit
sha without relying on narration alone.

Constraint: Preserve the existing commit-created event and lane-finished metadata paths while adding structured provenance to successful completions
Rejected: Introduce a separate artifact event type first | unnecessary for this focused closeout because `lane.finished` already carries structured data and existing consumers can read it there
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If artifact provenance extraction rules change later, update `extract_artifact_provenance`, its regression payload, and the ROADMAP closeout together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: Downstream consumers that ignore `lane.finished.data.artifactProvenance` and still parse only prose output
2026-04-12 11:56:00 +00:00
Yeachan-Heo
2e34949507 Keep latest-session timestamps increasing under tight loops
The next repo-local sweep target was ROADMAP #73: repeated backlog
sweeps exposed that session writes could share the same wall-clock
millisecond, which made semantic recency fragile and forced the
resume-latest regression to sleep between saves. The fix makes session
timestamps monotonic within the process and removes the timing hack
from the test so latest-session selection stays stable under tight
loops.

Constraint: Preserve the existing session file format while changing only the timestamp source semantics
Rejected: Keep the sleep-based test workaround | hides the real ordering hazard instead of fixing timestamp generation
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Any future session-recency logic must keep `current_time_millis`, ordering tests, and latest-session expectations aligned
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: Cross-process monotonicity when multiple binaries write sessions concurrently
2026-04-12 10:51:19 +00:00
Yeachan-Heo
8f53524bd3 Make backlog-scan lanes say what they actually selected
The next repo-local sweep target was ROADMAP #65: backlog-scanning
lanes could stop with prose-only summaries naming roadmap items, but
there was no machine-readable record of which items were chosen,
which were skipped, or whether the lane intended to execute, review,
or no-op. The fix teaches completed lane persistence to extract a
structured selection outcome while preserving the existing quality-
floor and review-verdict behavior for other lanes.

Constraint: Keep selection-outcome extraction on the existing `lane.finished` metadata path instead of inventing a separate event stream
Rejected: Add a dedicated selection event type first | unnecessary for this focused closeout because `lane.finished` already persists structured data downstream can read
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If backlog-scan summary conventions change later, update `extract_selection_outcome`, its regression test, and the ROADMAP closeout wording together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE after roadmap closeout update
Not-tested: Downstream consumers that may still ignore `lane.finished.data.selectionOutcome`
2026-04-12 09:54:37 +00:00
Yeachan-Heo
b5e30e2975 Make completed review lanes emit machine-readable verdicts
The next repo-local sweep target was ROADMAP #67: scoped review lanes
could stop with prose-only output, leaving downstream consumers to infer
approval or rejection from later chatter. The fix teaches completed lane
persistence to recognize review-style `APPROVE`/`REJECT`/`BLOCKED`
results, attach structured verdict metadata to `lane.finished`, and keep
ordinary non-review lanes on the existing quality-floor path.

Constraint: Preserve the existing non-review lane summary path while enriching only review-style completions
Rejected: Add a brand-new lane event type just for review results | unnecessary when `lane.finished` already carries structured metadata and downstream consumers can read it there
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If review verdict parsing changes later, update `extract_review_outcome`, the finished-event payload fields, and the review-lane regression together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: External consumers that may still ignore `lane.finished.data.reviewVerdict`
2026-04-12 08:49:40 +00:00
Yeachan-Heo
dbc2824a3e Keep latest session selection tied to real session recency
The next repo-local sweep target was ROADMAP #72: the `latest`
managed-session alias could depend on filesystem mtime before the
session's own persisted recency markers, which made the selection
path vulnerable to coarse or misleading file timestamps. The fix
promotes `updated_at_ms` into the summary/order path, keeps CLI
wrappers in sync, and locks the mtime-vs-session-recency case with
regression coverage.

Constraint: Preserve existing managed-session storage layout while changing only the ordering signal
Rejected: Keep sorting by filesystem mtime and just sleep longer in tests | hides the semantic ordering bug instead of fixing it
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Any future managed-session ordering change must keep runtime and CLI summary structs aligned on the same recency fields
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: Cross-filesystem behavior where persisted session JSON cannot be read and fallback ordering uses mtime only
2026-04-12 07:49:32 +00:00
Yeachan-Heo
f309ff8642 Stop repo lanes from executing the wrong task payload
The next repo-local sweep target was ROADMAP #71: a claw-code lane
accepted an unrelated KakaoTalk/image-analysis prompt even though the
lane itself was supposed to be repo-scoped work. This extends the
existing prompt-misdelivery guardrail with an optional structured task
receipt so worker boot can reject visible wrong-task context before the
lane continues executing.

Constraint: Keep the fix inside the existing worker_boot / WorkerSendPrompt control surface instead of inventing a new external OMX-only protocol
Rejected: Treat wrong-task receipts as generic shell misdelivery | loses the expected-vs-observed task context needed to debug contaminated lanes
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If task-receipt fields change later, update the WorkerSendPrompt schema, worker payload serialization, and wrong-task regression together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: External orchestrators that have not yet started populating the optional task_receipt field
2026-04-12 07:00:07 +00:00
Yeachan-Heo
3b806702e7 Make the CLI point users at the real install source
The next repo-local backlog item was ROADMAP #70: users could
mistake third-party pages or the deprecated `cargo install
claw-code` path for the official install route. The CLI now
surfaces the source of truth directly in `claw doctor` and
`claw --help`, and the roadmap closeout records the change.

Constraint: Keep the fix inside repo-local Rust CLI surfaces instead of relying on docs alone
Rejected: Close #70 with README-only wording | the bug was user-facing CLI ambiguity, so the warning needed to appear in runtime help/doctor output
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If install guidance changes later, update both the doctor check payload and the help-text warning together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: Third-party websites outside this repo that may still present stale install instructions
2026-04-12 04:50:03 +00:00
Yeachan-Heo
26b89e583f Keep completed lanes from ending on mushy stop summaries
The next repo-local sweep target was ROADMAP #69: completed lane
runs could persist vague control text like “commit push everyting,
keep sweeping $ralph”, which made downstream stop summaries
operationally useless. The fix adds a lane-finished quality floor
that preserves strong summaries, rewrites empty/control-only/too-
short-without-context summaries into a contextual fallback, and
records structured metadata explaining when the fallback fired.

Constraint: Keep legitimate concise lane summaries intact while improving only low-signal completions
Rejected: Blanket-rewrite every completed summary into a templated sentence | would erase useful model-authored detail from good lane outputs
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If lane-finished summary heuristics change later, update the structured `qualityFloorApplied/rawSummary/reasons/wordCount` contract and its regression tests together
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: External OMX consumers that may still ignore the new lane.finished data payload
2026-04-12 03:23:39 +00:00
YeonGyu-Kim
17e21bc4ad docs(roadmap): add #70 — install-source ambiguity misleads users
User treated claw-code.io as official, hit clawcode vs deprecated
claw-code naming collision. Adding requirement for canonical
docs to explicitly state official source and warn against
deprecated crate.

Source: gaebal-gajae community watch 2026-04-12
2026-04-12 12:08:52 +09:00
Yeachan-Heo
4f83a81cf6 Make dump-manifests recoverable outside the inferred build tree
The backlog sweep found that the user-cited #21-#23 items were already
closed, and the next real pain point was `claw dump-manifests` failing
without a direct way to point at the upstream manifest source. This adds
an explicit `--manifests-dir` path, upgrades the failure messages to say
whether the source root or required files are missing, and updates the
ROADMAP closeout to reflect that #45 is now fixed.

Constraint: Preserve existing dump-manifests behavior when no explicit override is supplied
Rejected: Require CLAUDE_CODE_UPSTREAM for every invocation | breaks existing build-tree workflows and is unnecessarily rigid
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep manifest-source override guidance centralized so future error-path edits do not drift
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: Manual invocation against every legacy env-based manifest lookup layout
2026-04-12 02:57:11 +00:00
Yeachan-Heo
1d83e67802 Keep the backlog sweep from chasing external executor notes
ROADMAP #31 described acpx/droid executor quirks, but a fresh repo-local
search showed no implementation surface outside ROADMAP.md. This rewrites
the local unpushed team checkpoint commits into one docs-only closeout so the
branch reflects the real claw-code backlog instead of runtime-generated state.

Constraint: Current evidence is limited to repo-local search plus existing prior closeouts
Rejected: Leave team auto-checkpoint commits intact | they pollute the branch with runtime state and obscure the actual closeout
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep generated .clawhip prompt-submit artifacts out of backlog closeout commits
Tested: Repo-local grep evidence for #31/#63-#68 terms; ROADMAP.md line review; architect approval x2
Not-tested: Fresh remote/backlog audit beyond the current repo-local evidence set
2026-04-12 02:57:11 +00:00
YeonGyu-Kim
763437a0b3 docs(roadmap): add #69 — lane stop summary quality floor
clawcode-human session stopped with sloppy summary
('commit push everyting, keep sweeping '). Adding
requirement for minimum stop/result summary standards.

Source: gaebal-gajae dogfood analysis 2026-04-12
2026-04-12 11:18:18 +09:00
Yeachan-Heo
491386f0a5 Keep external orchestration gaps out of the claw-code sweep path
ROADMAP #63-#68 describe OMX/Ultraclaw orchestration behavior, but a repo-local
search shows those implementation markers do not exist in claw-code source.
Marking that scope boundary directly in the roadmap keeps future backlog sweeps
from repeatedly targeting the wrong repository.

Constraint: Stay within claw-code repo scope while continuing the user-requested backlog sweep
Rejected: Attempt repo-local fixes for #63-#68 | implementation surface is absent from this repository
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Treat #63-#68 as external tracking notes unless claw-code later grows the corresponding orchestration/runtime surface
Tested: Repo-local search for acpx/ultraclaw/roadmap-nudge-10min/OMX_TMUX_INJECT outside ROADMAP.md
Not-tested: No code/test/static-analysis rerun because the change is docs-only
2026-04-12 02:14:43 +00:00
Yeachan-Heo
5c85e5ad12 Keep the worker-state backlog honest with current main behavior
ROADMAP #62 was stale. Current main already emits `.claw/worker-state.json`
on worker status transitions and exposes the documented `claw state`
reader surface, so leaving the item open would keep sending future
backlog passes after already-landed work.

Fresh verification on the exact branch confirmed the implementation and
left the workspace green, so this commit closes the item with current
proof instead of duplicating the feature.

Constraint: User required fresh cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace before push
Constraint: OMX team runtime was explicitly requested, but the verification lane stalled before producing any diff
Rejected: Re-implement the worker-state feature from scratch | current main already contains the runtime hook, CLI surface, and regression coverage
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reopen #62 only with a fresh repro showing missing `.claw/worker-state.json` writes or a broken `claw state` surface on current main
Tested: cargo test -p runtime emit_state_file_writes_worker_status_on_transition -- --nocapture; cargo test -p tools recovery_loop_state_file_reflects_transitions -- --nocapture; cargo test -p rusty-claude-cli removed_login_and_logout_subcommands_error_helpfully -- --nocapture; cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace; architect review APPROVE
Not-tested: No dedicated automated end-to-end CLI regression for reading `.claw/worker-state.json` beyond parser coverage and focused smoke validation
2026-04-12 01:51:15 +00:00
Yeachan-Heo
b825713db3 Retire the stale slash-command backlog item without breaking verification
ROADMAP #39 was stale: current main already hides the unimplemented slash
commands from the help/completion surfaces that triggered the original report,
so the backlog entry should be marked done with current evidence instead of
staying open forever.

While rerunning the user's required Rust verification gates on the exact commit
we planned to push, clippy exposed duplicate and unused imports in the plugin
state-isolation files. Folding those cleanup fixes into the same closeout keeps
the proof honest and restores a green workspace before the backlog retirement
lands.

Constraint: User required fresh cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace before push
Rejected: Push the roadmap-only closeout without fixing the workspace | would violate the required verification gate and leave main red
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Re-run the full Rust workspace gates on the exact commit you intend to push when retiring stale roadmap items
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: No manual interactive REPL completion/help smoke test beyond the existing automated coverage
2026-04-12 00:59:29 +00:00
YeonGyu-Kim
06d1b8ac87 docs(roadmap): add #68 — internal reinjection/resume path opacity
OMX lanes leaking internal control prose like [OMX_TMUX_INJECT]
instead of operator-meaningful state. Adding requirement for
structured recovery/reinject events with clear cause, preserved
state, and target lane info.

Also fixes merge conflict in test_isolation.rs.

Source: gaebal-gajae dogfood analysis 2026-04-12
2026-04-12 08:53:10 +09:00
Yeachan-Heo
4f84607ad6 Align the plugin-state isolation roadmap note with current green verification
The roadmap still implied that the ambient-plugin-state isolation work sat
outside a green full-workspace verification story. Current main already has
both the test-isolation helpers and the host-plugin-leakage regression, and
the required workspace fmt/clippy/test sequence is green. This updates the
remaining stale roadmap wording to match reality.

Constraint: User required fresh cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace before closeout
Rejected: Leave the stale note in place | contradicts the current verified workspace state
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When backlog items are retired as stale, update any nearby stale verification caveats in the same pass
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: No additional runtime behavior beyond already-covered regression paths
2026-04-11 23:51:00 +00:00
Yeachan-Heo
8eb93e906c Retire the stale bare-word skill discovery backlog item
ROADMAP #36 remained open even though current main already resolves bare
project skill names in the REPL through `resolve_skill_invocation()` instead
of forwarding them to the model. This change adds direct regression coverage
for the known-skill dispatch path and the unknown-skill/non-skill bypass, then
marks the roadmap item done with fresh proof.

Constraint: User required fresh cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace before closeout
Rejected: Leave #36 open because the implementation already existed | keeps the immediate backlog inaccurate and invites duplicate work
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reopen #36 only with a fresh repro showing a listed project skill still falls through to plain prompt handling on current main
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: No interactive manual REPL session beyond the new bare-skill unit coverage
2026-04-11 23:45:46 +00:00
Yeachan-Heo
264fdc214e Retire the stale bare-skill dispatch backlog item
ROADMAP #36 remained open even though current main already dispatches bare
skill names in the REPL through skill resolution instead of forwarding them
to the model. This change adds a direct regression test for that behavior
and marks the backlog item done with fresh verification evidence.

Constraint: User required fresh cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace before closeout
Rejected: Leave #36 open because the implementation already existed | keeps the immediate backlog inaccurate and invites duplicate work
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reopen #36 only with a fresh repro showing a listed project skill still falls through to plain prompt handling on current main
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: No interactive manual REPL session beyond the new bare-skill unit coverage
2026-04-11 22:50:28 +00:00