everything-claude-code/docs/ECC-2.0-GA-ROADMAP.md
2026-05-12 09:02:55 -04:00

265 lines
12 KiB
Markdown

# ECC 2.0 GA Roadmap
This roadmap is the durable repo mirror for the Linear project:
<https://linear.app/ecctools/project/ecc-20-ga-harness-os-security-platform-de2a0ecace6f>
Linear issue creation is currently blocked by the workspace active issue limit,
so the live execution truth is split across:
- the Linear project description, status updates, and milestones;
- this repo document;
- merged PR evidence;
- handoffs under `~/.cluster-swarm/handoffs/`.
## Current Evidence
As of 2026-05-12:
- Public GitHub queues are clean across `everything-claude-code`,
`agentshield`, `JARVIS`, `ECC-Tools`, and `ECC-website`.
- `npm run harness:audit -- --format json` reports 70/70 on current `main`.
- `npm run observability:ready` reports 14/14 readiness on current `main`.
- `docs/architecture/harness-adapter-compliance.md` maps Claude Code, Codex,
OpenCode, Cursor, Gemini, Zed-adjacent, dmux, Orca, Superset, Ghast, and
terminal-only support to install paths, verification commands, and risk
notes.
- `npm run harness:adapters -- --check` validates that the public adapter
matrix still matches the source data in
`scripts/lib/harness-adapter-compliance.js`.
- `docs/releases/2.0.0-rc.1/publication-readiness.md` gates GitHub release,
npm dist-tag, Claude plugin, Codex plugin, OpenCode package, billing, and
announcement publication on fresh evidence fields.
- `docs/legacy-artifact-inventory.md` records that no `_legacy-documents-*`
directories exist in the current checkout, inventories the two sibling
workspace-level `_legacy-documents-*` repos as sanitized extraction sources,
and classifies `legacy-command-shims/` as an opt-in archive/no-action
surface.
- `docs/stale-pr-salvage-ledger.md` records stale PR salvage outcomes,
skipped PRs, superseded work, and the remaining #1687 translator/manual
review tail.
- AgentShield PR #53 reduced two context-rule false positives and closed the
remaining AgentShield issues.
- AgentShield PR #55 added GitHub Action organization-policy enforcement with
`policy` / `fail-on-policy` inputs, `policy-status` /
`policy-violations` outputs, job-summary evidence, and policy violation
annotations.
- AgentShield PR #56 added SARIF/code-scanning output for organization-policy
violations as `agentshield-policy/*` results.
- AgentShield PR #57 added OSS, team, enterprise, regulated,
high-risk-hooks/MCP, and CI-enforcement policy-pack presets plus
`agentshield policy init --pack`.
- AgentShield PR #58 added MCP package provenance fields and report-level
counts for npm vs git, pinned vs unpinned, known-good, and registry-backed
supply-chain evidence.
- AgentShield PR #59 added self-contained HTML executive summaries with risk
posture, critical/high priority findings, category exposure, README/API
docs, built-CLI smoke validation, and 1,704-test coverage.
- AgentShield PR #60 added category-level built-in corpus benchmark output,
a `readyForRegressionGate` signal, terminal `--corpus` category coverage,
README/API docs, built-CLI smoke validation, and 1,705-test coverage.
- ECC PR #1778 recovered the useful stale #1413 network/homelab architect-agent
concepts.
- ECC-Tools PR #26 added cost/token-risk predictive follow-ups for AI routing,
Claude/model calls, usage limits, quota, and analysis-budget changes that lack
budget, quota, rate-limit, or cost validation evidence.
- ECC-Tools PR #27 added the non-blocking `ECC Tools / PR Risk Taxonomy`
check-run for Security Evidence, Harness Drift, Install Manifest Integrity,
CI/CD Recommendation, Cost/Token Risk, and Agent Config Review buckets.
- ECC-Tools PR #28 added billing readiness audit checks for plan limits,
entitlements, Marketplace plan shape, subscription source, seats, and
overage metering.
- ECC-Tools PR #29 added deterministic Reference Set Validation signals for
analyzer, skill, agent, command, and harness-guidance changes that lack eval,
golden trace, benchmark, or reference-set evidence.
- ECC-Tools PR #30 capped follow-up generation to three new GitHub issues and
one draft PR per run, then emits the remaining deterministic findings as a
project sync backlog for Linear/status tracking without flooding trackers.
- ECC-Tools PR #31 added review follow-up signals to analysis completion
comments for outstanding change requests, unresolved or outdated review
threads, and review activity without an explicit approval.
- ECC-Tools PR #32 added CI failure-mode predictive follow-ups for workflow
and test-runner changes that lack failure fixtures, captured logs,
troubleshooting notes, dry-run evidence, or regression coverage.
- ECC-Tools PR #33 added harness-config quality predictive follow-ups for MCP,
plugin, agent, hook, command, and harness config changes that lack harness
audit, adapter matrix, cross-harness docs, or compatibility regression
evidence.
## Operating Rules
- Keep public PRs and issues below 20, with zero as the preferred release-lane
target.
- Maintain 70/70 harness audit and 14/14 observability readiness after every
GA-readiness batch.
- Do not publish release or social announcements until the GitHub release,
npm/package state, billing state, and plugin submission surfaces are verified
with fresh evidence.
- Do not treat closed stale PRs as discarded. Pair each cleanup batch with a
salvage pass: inspect the closed diffs, port useful compatible work on
maintainer-owned branches, and credit the source PR.
- Do not create new Linear issues until the active issue limit is cleared.
## Reference Pressure
The GA roadmap is informed by these reference surfaces:
- `stablyai/orca` and `superset-sh/superset` for worktree-native parallel agent
UX, review loops, and workspace presets.
- `standardagents/dmux` and `aidenybai/ghast` for terminal/worktree
multiplexing, session grouping, and lifecycle hooks.
- `jarrodwatts/claude-hud` for always-visible status, tool, agent, todo, and
context telemetry.
- `stanford-iris-lab/meta-harness` and `greyhaven-ai/autocontext` for
evaluation-driven harness improvement, traces, playbooks, and promotion
loops.
- `NousResearch/hermes-agent` for operator shell, gateway, memory, skills, and
multi-platform command patterns.
- `anthropics/claude-code`, active `sst/opencode` / `anomalyco/opencode`, Zed,
Codex, Cursor, Gemini, and terminal-only workflows for adapter expectations.
The output of this reference work should be concrete ECC deltas, not a second
strategy memo.
## Milestones
### 1. GA Release, Naming, And Plugin Publication Readiness
Target: 2026-05-24
Acceptance:
- Naming matrix covers product name, npm package, Claude plugin, Codex plugin,
OpenCode package, marketplace metadata, docs, and migration copy.
- GitHub release, npm dist-tag, plugin publication, and announcement gates are
mapped to fresh command evidence.
- Release notes, migration guide, known issues, quickstart, X thread, LinkedIn
post, and GitHub release copy are ready but not posted before release URLs
exist.
- Plugin publication/contact paths for Claude and Codex are documented with
owner, required artifacts, and submission status.
### 2. Harness Adapter Compliance Matrix And Scorecard Onramp
Target: 2026-05-31
Acceptance:
- Adapter matrix covers Claude Code, Codex, OpenCode, Cursor, Gemini,
Zed-adjacent surfaces, dmux, Orca, Superset, Ghast, and terminal-only use.
- Each adapter has supported assets, unsupported surfaces, install path,
verification command, and risk notes.
- Harness audit remains 70/70 and gains a public onramp that explains how teams
use the scorecard.
- Reference findings are converted into concrete adapter, observability, or
operator-surface deltas.
### 3. Local Observability, HUD/Status, And Session Control Plane
Target: 2026-06-07
Acceptance:
- Observability readiness remains 14/14 and is backed by JSONL traces, status
snapshots, risk ledger, and exportable handoff contracts.
- HUD/status model covers context, tool calls, active agents, todos, checks,
cost, risk, and queue state.
- Worktree/session controls cover create, resume, status, stop, diff, PR,
merge queue, and conflict queue.
- Linear/GitHub/handoff sync model is explicit enough for real-time progress
tracking.
### 4. Self-Improving Harness Evaluation Loop
Target: 2026-06-10
Acceptance:
- Scenario specs, verifier contracts, traces, playbooks, and regression gates
are documented and at least one read-only prototype exists.
- The loop separates observation, proposal, verification, and promotion.
- Team and individual setups can be scored and improved without blindly
mutating configs.
- RAG/reference-set design covers vetted ECC patterns, team history, CI
failures, diffs, review outcomes, and harness config quality.
### 5. AgentShield Enterprise Security Platform
Target: 2026-06-14
Acceptance:
- Formal policy schema exists for org baselines, exceptions, owners,
expiration, severity, and audit trails.
- SARIF/code-scanning output is implemented and tested.
- GitHub Action policy gates expose organization policy status and violation
counts for branch-protection and CI evidence.
- Policy packs are defined for OSS, team, enterprise, regulated, high-risk
hooks/MCP, and CI enforcement.
- Supply-chain intelligence covers MCP package provenance and has an extension
path for npm/pip reputation, CVEs, typosquats, and dependency risk.
- Prompt-injection corpus and regression benchmark are ready for continuous
rule hardening with category-level coverage and regression-gate output.
- Enterprise reports include JSON plus self-contained HTML executive output
with risk posture, priority findings, and category exposure.
### 6. ECC Tools Billing, Deep Analysis, PR Checks, And Linear Sync
Target: 2026-06-21
Acceptance:
- Native GitHub Marketplace billing announcement is backed by verified
implementation and docs.
- Internal billing readiness audit covers plan limits, seats, entitlement
mapping, Marketplace plan shape, subscription state, overage hooks, and
failure modes.
- Deep analyzer covers diff patterns, CI/CD workflows, dependency/security
surface, PR review behavior, failure history, harness config, skill quality,
and reference-set/RAG comparison.
- PR check suite taxonomy includes Security Evidence, Harness Drift, Install
Manifest Integrity, CI/CD Recommendation, Cost/Token Risk, and Agent Config
Review.
- Cost/token-risk predictive follow-ups flag AI routing, model-call, usage,
quota, and budget changes when budget evidence is missing.
- Reference-set validation follow-ups flag analyzer, skill, agent, command, and
harness-guidance changes that lack eval, golden trace, benchmark, or
maintained reference-set evidence.
- PR analysis comments summarize review follow-up signals for requested
changes, unresolved or outdated review threads, and missing approvals.
- CI failure-mode predictive follow-ups flag workflow and test-runner changes
that lack failure fixtures, captured logs, troubleshooting notes, dry-run
evidence, or regression coverage.
- Harness-config quality predictive follow-ups flag MCP, plugin, agent, hook,
command, and harness config changes that lack audit, adapter matrix,
cross-harness doc, or compatibility regression evidence.
- Linear sync design maps findings to issues/status without flooding the
workspace.
- Follow-up generation caps automatic GitHub object creation and keeps overflow
findings in a copy-ready project sync backlog.
### 7. Legacy Audit And Stale-Work Salvage Closure
Target: 2026-06-15
Acceptance:
- Legacy directories and orphaned handoffs are inventoried.
- Each useful artifact is marked landed, Linear/project-tracked, salvage
branch, or archive/no-action.
- Workspace-level legacy repos are mined only through sanitized maintainer
branches; raw context, secrets, personal paths, local settings, and private
drafts are never imported wholesale.
- Stale PR salvage policy stays in force: close stale/conflicted PRs first,
record a salvage ledger item, then port useful compatible content on
maintainer branches with attribution.
- #1687 localization leftovers are handled only by translator/manual review,
not blind cherry-pick.
## Next Engineering Slices
1. Decide whether AgentShield PDF export adds value beyond the merged HTML
executive report and corpus benchmark output.
2. Extend ECC Tools deep analysis and Linear/project sync without flooding the
workspace.