diff --git a/ROADMAP.md b/ROADMAP.md index 7d6d5ab..2e2e990 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -496,7 +496,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes 62. **Worker state file surface not implemented** — **done (verified 2026-04-12):** current `main` already wires `emit_state_file(worker)` into the worker transition path in `rust/crates/runtime/src/worker_boot.rs`, atomically writes `.claw/worker-state.json`, and exposes the documented reader surface through `claw state` / `claw state --output-format json` in `rust/crates/rusty-claude-cli/src/main.rs`. Fresh proof exists in `runtime` regression `emit_state_file_writes_worker_status_on_transition`, the end-to-end `tools` regression `recovery_loop_state_file_reflects_transitions`, and direct CLI parsing coverage for `state` / `state --output-format json`. Source: Jobdori dogfood. -**Scope note (verified 2026-04-12):** ROADMAP #31, #43, and #63-#68 currently appear to describe acpx/droid or upstream OMX/server orchestration behavior, not claw-code source already present in this repository. Repo-local searches for `acpx`, `use-droid`, `run-acpx`, `commit-wrapper`, `ultraclaw`, `roadmap-nudge-10min`, `OMX_TMUX_INJECT`, `/hooks/health`, and `/hooks/status` found no implementation hits outside `ROADMAP.md`, and the earlier state-surface note already records that the HTTP server is not owned by claw-code. With #45 now fixed, the remaining unresolved items in this section look like external tracking notes rather than confirmed repo-local backlog; re-check if new repo-local evidence appears. +**Scope note (verified 2026-04-12):** ROADMAP #31, #43, and #63-#68 currently appear to describe acpx/droid or upstream OMX/server orchestration behavior, not claw-code source already present in this repository. Repo-local searches for `acpx`, `use-droid`, `run-acpx`, `commit-wrapper`, `ultraclaw`, `roadmap-nudge-10min`, `OMX_TMUX_INJECT`, `/hooks/health`, and `/hooks/status` found no implementation hits outside `ROADMAP.md`, and the earlier state-surface note already records that the HTTP server is not owned by claw-code. With #45 and #69 now fixed, the remaining unresolved items in this section look like external tracking notes rather than confirmed repo-local backlog; re-check if new repo-local evidence appears. 63. **Droid session completion semantics broken: code arrives after "status: completed"** — dogfooded 2026-04-12. Ultraclaw droid sessions (use-droid via acpx) report `session.status: completed` before file writes are fully flushed/synced to the working tree. Discovered +410 lines of "late-arriving" droid output that appeared after I had already assessed 8 sessions as "no code produced." This creates false-negative assessments and duplicate work. **Fix shape:** (a) droid agent should only report completion after explicit file-write confirmation (fsync or existence check); (b) or, claw-code should expose a `pending_writes` status that indicates "agent responded, disk flush pending"; (c) lane orchestrators should poll for file changes for N seconds after completion before final assessment. **Blocker:** none. Source: Jobdori ultraclaw dogfood 2026-04-12. @@ -510,6 +510,6 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes 68. **Internal reinjection/resume paths leak opaque control prose** — dogfooded 2026-04-12. OMX lanes stopping with `Continue from current mode state. [OMX_TMUX_INJECT]` expose internal implementation details instead of operator-meaningful state. The event tells us *that* tmux reinjection happened, but not *why* (retry after failure? resume after idle? manual recovery?), *what state was preserved*, or *what the lane was trying to do*. **Fix shape:** recovery/reinject events should emit structured cause like: `resume_after_stop`, `retry_after_tool_failure`, `tmux_reinject_after_idle`, `manual_recovery` plus preserved state / target lane info. Never leak bare internal markers like `[OMX_TMUX_INJECT]` as the primary summary. Blocker: none. Source: gaebal-gajae dogfood analysis 2026-04-12. -69. **Lane stop summaries have no minimum quality floor** — dogfooded 2026-04-12. `clawcode-human` session stopped with summary `commit push everyting, keep sweeping $ralph` — vague, typo-ridden, operationally useless. Unlike well-scoped review lanes, this summary regressed to mushy command prose with no outcome clarity. **Fix shape:** (a) enforce minimum stop/result summary standards: what was done (outcome), what was scoped (target), what's next (state); (b) typo/grammar validation; (c) reject summaries that are shorter than N words or contain only control verbs without context. Blocker: none. Source: gaebal-gajae dogfood analysis 2026-04-12. +69. **Lane stop summaries have no minimum quality floor** — **done (verified 2026-04-12):** completed lane persistence in `rust/crates/tools/src/lib.rs` now normalizes vague/control-only stop summaries into a contextual fallback that includes the lane target and status, while preserving structured metadata about whether the quality floor fired (`qualityFloorApplied`, `rawSummary`, `reasons`, `wordCount`). Regression coverage locks both the pass-through path for good summaries and the fallback path for mushy summaries like `commit push everyting, keep sweeping $ralph`. **Original filing below.** -70. **Install-source ambiguity misleads real users** — community observation 2026-04-12. User treated `claw-code.io` as official, then hit `clawcode` vs deprecated `claw-code` naming collision and concluded install story was inconsistent. Source-of-truth is not obvious when website/repo/crates naming diverges. **Fix shape:** canonical repo docs should explicitly state which site is official; installation guidance should visibly warn against deprecated `claw-code` crate and ambiguous third-party pages. Blocker: none. Source: gaebal-gajae community watch 2026-04-12. \ No newline at end of file +70. **Install-source ambiguity misleads real users** — community observation 2026-04-12. User treated `claw-code.io` as official, then hit `clawcode` vs deprecated `claw-code` naming collision and concluded install story was inconsistent. Source-of-truth is not obvious when website/repo/crates naming diverges. **Fix shape:** canonical repo docs should explicitly state which site is official; installation guidance should visibly warn against deprecated `claw-code` crate and ambiguous third-party pages. Blocker: none. Source: gaebal-gajae community watch 2026-04-12. diff --git a/rust/crates/tools/src/lib.rs b/rust/crates/tools/src/lib.rs index 04e0039..d843dcc 100644 --- a/rust/crates/tools/src/lib.rs +++ b/rust/crates/tools/src/lib.rs @@ -3743,12 +3743,13 @@ fn persist_agent_terminal_state( .push(LaneEvent::failed(iso8601_now(), &blocker)); } else { next_manifest.current_blocker = None; - let compressed_detail = result - .filter(|value| !value.trim().is_empty()) - .map(|value| compress_summary_text(value.trim())); - next_manifest - .lane_events - .push(LaneEvent::finished(iso8601_now(), compressed_detail)); + let finished_summary = build_lane_finished_summary(&next_manifest, result); + next_manifest.lane_events.push( + LaneEvent::finished(iso8601_now(), finished_summary.detail).with_data( + serde_json::to_value(&finished_summary.data) + .expect("lane summary metadata should serialize"), + ), + ); if let Some(provenance) = maybe_commit_provenance(result) { next_manifest.lane_events.push(LaneEvent::commit_created( iso8601_now(), @@ -3760,6 +3761,152 @@ fn persist_agent_terminal_state( write_agent_manifest(&next_manifest) } +const MIN_LANE_SUMMARY_WORDS: usize = 7; +const CONTROL_ONLY_SUMMARY_WORDS: &[&str] = &[ + "ack", + "commit", + "continue", + "everyting", + "everything", + "keep", + "next", + "push", + "ralph", + "resume", + "retry", + "run", + "stop", + "sweep", + "sweeping", + "team", +]; +const CONTEXTUAL_SUMMARY_WORDS: &[&str] = &[ + "added", + "audited", + "blocked", + "completed", + "documented", + "failed", + "finished", + "fixed", + "implemented", + "investigated", + "merged", + "pushed", + "refactored", + "removed", + "reviewed", + "tested", + "updated", + "verified", +]; + +#[derive(Debug, Clone, Serialize)] +struct LaneFinishedSummaryData { + #[serde(rename = "qualityFloorApplied")] + quality_floor_applied: bool, + reasons: Vec, + #[serde(rename = "rawSummary", skip_serializing_if = "Option::is_none")] + raw_summary: Option, + #[serde(rename = "wordCount")] + word_count: usize, +} + +#[derive(Debug, Clone)] +struct LaneFinishedSummary { + detail: Option, + data: LaneFinishedSummaryData, +} + +#[derive(Debug)] +struct LaneSummaryAssessment { + apply_quality_floor: bool, + reasons: Vec, + word_count: usize, +} + +fn build_lane_finished_summary( + manifest: &AgentOutput, + result: Option<&str>, +) -> LaneFinishedSummary { + let raw_summary = result.map(str::trim).filter(|value| !value.is_empty()); + let assessment = assess_lane_summary_quality(raw_summary.unwrap_or_default()); + let detail = match raw_summary { + Some(summary) if !assessment.apply_quality_floor => Some(compress_summary_text(summary)), + Some(summary) => Some(compose_lane_summary_fallback(manifest, Some(summary))), + None => Some(compose_lane_summary_fallback(manifest, None)), + }; + + LaneFinishedSummary { + detail, + data: LaneFinishedSummaryData { + quality_floor_applied: raw_summary.is_none() || assessment.apply_quality_floor, + reasons: assessment.reasons, + raw_summary: raw_summary.map(str::to_string), + word_count: assessment.word_count, + }, + } +} + +fn assess_lane_summary_quality(summary: &str) -> LaneSummaryAssessment { + let words = summary + .split(|ch: char| !(ch.is_ascii_alphanumeric() || ch == '-' || ch == '#')) + .filter(|token| !token.is_empty()) + .map(str::to_ascii_lowercase) + .collect::>(); + + let word_count = words.len(); + let mut reasons = Vec::new(); + if summary.trim().is_empty() { + reasons.push(String::from("empty")); + } + + let control_only = !words.is_empty() + && words + .iter() + .all(|word| CONTROL_ONLY_SUMMARY_WORDS.contains(&word.as_str())); + if control_only { + reasons.push(String::from("control_only")); + } + + let has_context_signal = summary.contains('`') + || summary.contains('/') + || summary.contains(':') + || summary.contains('#') + || words + .iter() + .any(|word| CONTEXTUAL_SUMMARY_WORDS.contains(&word.as_str())); + if word_count < MIN_LANE_SUMMARY_WORDS && !has_context_signal { + reasons.push(String::from("too_short_without_context")); + } + + LaneSummaryAssessment { + apply_quality_floor: !reasons.is_empty(), + reasons, + word_count, + } +} + +fn compose_lane_summary_fallback(manifest: &AgentOutput, raw_summary: Option<&str>) -> String { + let target = manifest.description.trim(); + let base = format!( + "Completed lane `{}` for target: {}. Status: completed.", + manifest.name, + if target.is_empty() { + "unspecified task" + } else { + target + } + ); + match raw_summary { + Some(summary) => format!( + "{base} Original stop summary was too vague to keep as the lane result: \"{}\".", + summary.trim() + ), + None => format!("{base} No usable stop summary was produced by the lane."), + } +} + fn derive_agent_state( status: &str, result: Option<&str>, @@ -7240,6 +7387,14 @@ mod tests { completed_manifest_json["laneEvents"][1]["event"], "lane.finished" ); + assert_eq!( + completed_manifest_json["laneEvents"][1]["data"]["qualityFloorApplied"], + false + ); + assert_eq!( + completed_manifest_json["laneEvents"][1]["detail"], + "Finished successfully in commit abc1234" + ); assert_eq!( completed_manifest_json["laneEvents"][2]["event"], "lane.commit.created" @@ -7301,6 +7456,51 @@ mod tests { ); assert_eq!(failed_manifest_json["derivedState"], "truly_idle"); + let normalized = execute_agent_with_spawn( + AgentInput { + description: "Sweep the next backlog item".to_string(), + prompt: "Produce a low-signal stop summary".to_string(), + subagent_type: Some("Explore".to_string()), + name: Some("summary-floor".to_string()), + model: None, + }, + |job| { + persist_agent_terminal_state( + &job.manifest, + "completed", + Some("commit push everyting, keep sweeping $ralph"), + None, + ) + }, + ) + .expect("normalized agent should succeed"); + + let normalized_manifest = std::fs::read_to_string(&normalized.manifest_file) + .expect("normalized manifest should exist"); + let normalized_manifest_json: serde_json::Value = + serde_json::from_str(&normalized_manifest).expect("normalized manifest json"); + assert_eq!( + normalized_manifest_json["laneEvents"][1]["event"], + "lane.finished" + ); + let normalized_detail = normalized_manifest_json["laneEvents"][1]["detail"] + .as_str() + .expect("normalized detail"); + assert!(normalized_detail.contains("Completed lane `summary-floor`")); + assert!(normalized_detail.contains("Sweep the next backlog item")); + assert_eq!( + normalized_manifest_json["laneEvents"][1]["data"]["qualityFloorApplied"], + true + ); + assert_eq!( + normalized_manifest_json["laneEvents"][1]["data"]["rawSummary"], + "commit push everyting, keep sweeping $ralph" + ); + assert_eq!( + normalized_manifest_json["laneEvents"][1]["data"]["reasons"][0], + "control_only" + ); + let spawn_error = execute_agent_with_spawn( AgentInput { description: "Spawn error task".to_string(),