13 KiB
ECC 2.0 Reference Architecture
Current execution mirror:
ECC-2.0-GA-ROADMAP.md.
This document turns the May 2026 reference sweep into concrete ECC backlog shape. It is not a second strategy memo: every reference pressure below should land as an adapter, check, observable signal, security policy, PR review surface, or release-readiness gate.
Reference Baseline
Snapshot date: 2026-05-12.
| Reference | Primary pressure on ECC 2.0 | Concrete ECC delta |
|---|---|---|
stablyai/orca |
Worktree-native multi-agent IDE with terminals, source control, GitHub integration, SSH, notifications, design/browser mode, account switching, and per-worktree context. | Treat worktree lifecycle, review state, notification state, and account/provider identity as first-class adapter signals. |
superset-sh/superset |
Desktop AI-agent workspace with parallel execution, worktree isolation, diff review, workspace presets, and broad CLI-agent compatibility. | Add workspace preset taxonomy and make ECC2 session/worktree state exportable enough for external editors to consume. |
standardagents/dmux |
Tmux/worktree orchestration, lifecycle hooks, multi-select agent control, smart merging, file browser, notifications, and cleanup. | Add lifecycle-hook coverage to the harness matrix and define merge/conflict queue events. |
aidenybai/ghast |
Native macOS terminal multiplexer with cwd-grouped workspaces, panes, tabs, drag/drop, search, and notifications. | Preserve terminal-native ergonomics while adding cwd/session grouping and searchable handoff/session records. |
jarrodwatts/claude-hud |
Always-visible Claude Code statusline for context, tools, agents, todos, and transcript-backed activity. | Formalize the ECC HUD/status payload for context, cost, tool calls, active agents, todos, queue state, checks, and risk. |
stanford-iris-lab/meta-harness |
Automated search over task-specific harness design: what to store, retrieve, and show. | Split ECC improvement loops into scenario spec, proposer trace, verifier result, and promoted playbook. |
greyhaven-ai/autocontext |
Recursive harness improvement using traces, reports, artifacts, datasets, playbooks, and role-separated evaluators. | Store reusable traces and playbooks before mutating installed harness assets. |
NousResearch/hermes-agent |
Self-improving operator shell with memories, skills, scheduler, gateways, subagents, terminal backends, and migration tooling. | Keep ECC portable across local, SSH, container, and hosted terminal backends without hiding the underlying commands. |
anthropics/claude-code, sst/opencode, Zed, Codex, Cursor, Gemini |
Different agent harnesses expose different hooks, plugin surfaces, session stores, config files, and review loops. | Maintain a public adapter compliance matrix instead of treating one harness as the canonical UX. |
| Local Claude Code source review | Session, tool, permission, hook, remote, analytics, task, and context-suggestion surfaces are more structured than the public CLI UX suggests. | Model status and risk events around session messages, permission requests, tool progress, context pressure, and summary state. |
Architecture Shape
ECC 2.0 should be a harness operating system, not only a catalog of commands, agents, and skills.
┌──────────────────────────────────────────────────────────────┐
│ Operator Surface │
│ CLI, plugin, TUI, HUD/statusline, release gates, PR checks │
├──────────────────────────────────────────────────────────────┤
│ Harness Adapter Layer │
│ Claude Code, Codex, OpenCode, Cursor, Gemini, Zed, dmux, │
│ Orca, Superset, Ghast, terminal-only │
├──────────────────────────────────────────────────────────────┤
│ Worktree, Session, And Queue Runtime │
│ worktrees, panes, sessions, todos, checks, merge/conflict │
│ queues, notification state, ownership, handoff exports │
├──────────────────────────────────────────────────────────────┤
│ Observability And Evaluation Loop │
│ JSONL traces, status snapshots, risk ledger, harness audit, │
│ scenario specs, verifiers, promoted playbooks, RAG sets │
├──────────────────────────────────────────────────────────────┤
│ Security And Commercial Platform │
│ AgentShield policies/SARIF, ECC Tools checks, billing, │
│ Linear/GitHub sync, enterprise reports │
└──────────────────────────────────────────────────────────────┘
Reference-To-Backlog Map
Worktree And Session Orchestration
Adopt from Orca, Superset, dmux, and Ghast:
- Worktree lifecycle events: create, resume, pause, stop, diff, review, PR, merge-ready, conflict, stale, close, salvage.
- Session grouping by repo, branch, cwd, task, owner, and harness.
- Workspace presets for release lane, PR triage lane, docs lane, security lane, and test-writer lane.
- Notifications for blocked CI, dirty worktrees, merge conflicts, stale review, and finished autonomous runs.
- Review loops that can annotate diffs and PRs without taking ownership away from maintainers.
Repo work:
everything-claude-code: extend the adapter compliance matrix and public scorecard onramp.ecc2: surface session/worktree state through a stable local payload before adding hosted telemetry.ECC-Tools: consume the same lifecycle events for PR checks, issue routing, and Linear sync.
Verification:
npm run harness:audit -- --format jsonnpm run observability:ready- targeted adapter matrix tests once the matrix moves from docs to data
HUD, Status, And Observability
Adopt from Claude HUD and the Claude Code source review:
- Context pressure: usage, compaction risk, large-result warnings, and summary state.
- Tool activity: active tool, recent tools, duration, risky operations, and permission requests.
- Agent activity: active subagents, delegated task, branch/worktree, and wait state.
- Queue activity: open PRs/issues, CI state, stale/conflict batches, review state, and closed-stale salvage backlog.
- Cost/risk: token cost estimate, destructive-operation risk, hook/MCP risk, and security scan state.
Repo work:
- Keep
docs/architecture/observability-readiness.mdas the operator-facing readiness gate. - Define a versioned HUD/status JSON contract that both ECC2 and ECC Tools can consume.
- Add sample exports from
loop-status,session-inspect, harness audit, and risk ledger into a fixture directory before building visual UI.
Verification:
npm run observability:ready- fixture validation for every status payload
- cross-platform smoke test for commands that read session history
Self-Improving Harness Loop
Adopt from Meta-Harness, Autocontext, and Hermes Agent:
- Separate the loop into observation, proposal, verification, promotion, and rollback.
- Store every proposed improvement as trace plus artifact, not only as a final changed file.
- Promote playbooks only after a verifier proves that they improve a scenario without widening blast radius.
- Use RAG/reference sets for vetted ECC patterns, team history, CI failures, review outcomes, harness config quality, and security decisions.
Repo work:
everything-claude-code: document scenario specs, verifier contracts, and playbook promotion rules.ECC-Tools: map analyzer findings to PR comments, check runs, and Linear tasks without flooding the workspace.agentshield: feed prompt-injection and config-risk findings into regression suites.
Verification:
- read-only prototype that emits a trace, report, candidate playbook, and verifier result
- regression fixture proving a bad proposal is rejected
AgentShield Enterprise Security Platform
AgentShield should move from useful scanner to enterprise security platform.
Backlog shape:
- Policy schema for org baseline, rule severity, owner, exception, expiration, evidence, and audit trail.
- SARIF output for GitHub code scanning.
- Policy packs for OSS, team, enterprise, regulated, high-risk hooks/MCP, and CI enforcement.
- Supply-chain intelligence for MCP packages, npm/pip provenance, CVEs, typosquats, and dependency reputation.
- Prompt-injection corpus and regression benchmark.
- JSON plus executive HTML/PDF report output.
Verification:
- schema unit tests
- SARIF fixture tests
- policy-pack golden tests
- false-positive regression tests from the public issue history
ECC Tools Commercial And Review Platform
ECC Tools should become the GitHub-native layer for billing, deep analysis, PR checks, and Linear progress tracking.
Backlog shape:
- Native GitHub Marketplace billing audit before any payments announcement: plans, seats, org/account mapping, subscription state, overage behavior, downgrade/cancel behavior, and failure modes.
- Deep analyzer comparable in scope to the useful parts of GitGuardian, Dependabot, CodeRabbit, and Greptile: security evidence, dependency risk, CI/CD recommendations, PR review behavior, config quality, token/cost risk, and harness drift.
- RAG/reference set over vetted ECC patterns, historical PR outcomes, dependency advisories, CI failures, review decisions, and team-specific conventions.
- Linear sync that maps findings to project status, milestone evidence, and owner-ready issues without exhausting issue limits.
Verification:
- check-run fixture tests
- billing webhook replay tests
- analyzer golden PR fixtures
- Linear sync dry-run fixture
Closed-Stale Salvage Lane
Closing stale PRs keeps the public queue usable, but useful work should not be lost because a contributor no longer has time to rebase.
Execution rule:
- Close stale, conflicted, or obsolete PRs with a clear courtesy comment.
- Record them in a salvage ledger with source PR, author, reason closed, useful files/concepts, risk, and recommended maintainer action.
- After the cleanup batch, inspect each closed PR diff manually.
- Cherry-pick only when the patch still applies cleanly and preserves current architecture. Otherwise reimplement the useful idea in a fresh maintainer branch.
- Preserve attribution in the commit body or PR body.
- Comment back on the source PR when useful work lands, linking the maintainer PR or merged commit.
- Mark the ledger item as landed, superseded, Linear-tracked, or no-action.
Required safeguards:
- Never blind cherry-pick generated churn, bulk localization, or dependency major-version changes.
- Prefer small maintainer PRs over one salvage megabranch.
- Run the same validation gates as normal code, docs, or catalog changes.
- Keep contributor credit even when the final implementation is rewritten.
Near-Term Implementation Order
- Extend the harness adapter matrix and public scorecard onramp.
- Add the release/name/plugin publication checklist with evidence fields.
- Define the HUD/status JSON contract and fixture directory.
- Start AgentShield policy schema plus SARIF fixtures.
- Audit ECC Tools billing and check-run surfaces.
- Inventory legacy folders and closed-stale PRs into the salvage ledger.
- Port useful stale work in small attributed maintainer PRs.
Non-Goals
- Hosted telemetry before the local event model is useful and testable.
- Automatic mutation of user harness configs without verifier evidence.
- Treating any one agent harness as the canonical interface.
- Release or payments announcements before command, package, marketplace, and billing evidence is fresh.