elias/everything-claude-code

Fork 0

mirror of https://github.com/affaan-m/everything-claude-code.git synced 2026-05-13 18:00:35 +08:00

Affaan Mustafa 60bd26fadf

docs: refresh ECC 2.0 reference architecture (#1783 )

2026-05-12 02:03:07 -04:00

13 KiB

Raw Blame History

ECC 2.0 Reference Architecture

Current execution mirror: ECC-2.0-GA-ROADMAP.md.

This document turns the May 2026 reference sweep into concrete ECC backlog shape. It is not a second strategy memo: every reference pressure below should land as an adapter, check, observable signal, security policy, PR review surface, or release-readiness gate.

Reference Baseline

Snapshot date: 2026-05-12.

Reference	Primary pressure on ECC 2.0	Concrete ECC delta
`stablyai/orca`	Worktree-native multi-agent IDE with terminals, source control, GitHub integration, SSH, notifications, design/browser mode, account switching, and per-worktree context.	Treat worktree lifecycle, review state, notification state, and account/provider identity as first-class adapter signals.
`superset-sh/superset`	Desktop AI-agent workspace with parallel execution, worktree isolation, diff review, workspace presets, and broad CLI-agent compatibility.	Add workspace preset taxonomy and make ECC2 session/worktree state exportable enough for external editors to consume.
`standardagents/dmux`	Tmux/worktree orchestration, lifecycle hooks, multi-select agent control, smart merging, file browser, notifications, and cleanup.	Add lifecycle-hook coverage to the harness matrix and define merge/conflict queue events.
`aidenybai/ghast`	Native macOS terminal multiplexer with cwd-grouped workspaces, panes, tabs, drag/drop, search, and notifications.	Preserve terminal-native ergonomics while adding cwd/session grouping and searchable handoff/session records.
`jarrodwatts/claude-hud`	Always-visible Claude Code statusline for context, tools, agents, todos, and transcript-backed activity.	Formalize the ECC HUD/status payload for context, cost, tool calls, active agents, todos, queue state, checks, and risk.
`stanford-iris-lab/meta-harness`	Automated search over task-specific harness design: what to store, retrieve, and show.	Split ECC improvement loops into scenario spec, proposer trace, verifier result, and promoted playbook.
`greyhaven-ai/autocontext`	Recursive harness improvement using traces, reports, artifacts, datasets, playbooks, and role-separated evaluators.	Store reusable traces and playbooks before mutating installed harness assets.
`NousResearch/hermes-agent`	Self-improving operator shell with memories, skills, scheduler, gateways, subagents, terminal backends, and migration tooling.	Keep ECC portable across local, SSH, container, and hosted terminal backends without hiding the underlying commands.
`anthropics/claude-code`, `sst/opencode`, Zed, Codex, Cursor, Gemini	Different agent harnesses expose different hooks, plugin surfaces, session stores, config files, and review loops.	Maintain a public adapter compliance matrix instead of treating one harness as the canonical UX.
Local Claude Code source review	Session, tool, permission, hook, remote, analytics, task, and context-suggestion surfaces are more structured than the public CLI UX suggests.	Model status and risk events around session messages, permission requests, tool progress, context pressure, and summary state.

Architecture Shape

ECC 2.0 should be a harness operating system, not only a catalog of commands, agents, and skills.

┌──────────────────────────────────────────────────────────────┐
│ Operator Surface                                             │
│ CLI, plugin, TUI, HUD/statusline, release gates, PR checks   │
├──────────────────────────────────────────────────────────────┤
│ Harness Adapter Layer                                        │
│ Claude Code, Codex, OpenCode, Cursor, Gemini, Zed, dmux,     │
│ Orca, Superset, Ghast, terminal-only                         │
├──────────────────────────────────────────────────────────────┤
│ Worktree, Session, And Queue Runtime                         │
│ worktrees, panes, sessions, todos, checks, merge/conflict    │
│ queues, notification state, ownership, handoff exports       │
├──────────────────────────────────────────────────────────────┤
│ Observability And Evaluation Loop                            │
│ JSONL traces, status snapshots, risk ledger, harness audit,  │
│ scenario specs, verifiers, promoted playbooks, RAG sets      │
├──────────────────────────────────────────────────────────────┤
│ Security And Commercial Platform                             │
│ AgentShield policies/SARIF, ECC Tools checks, billing,       │
│ Linear/GitHub sync, enterprise reports                       │
└──────────────────────────────────────────────────────────────┘

Reference-To-Backlog Map

Worktree And Session Orchestration

Adopt from Orca, Superset, dmux, and Ghast:

Worktree lifecycle events: create, resume, pause, stop, diff, review, PR, merge-ready, conflict, stale, close, salvage.
Session grouping by repo, branch, cwd, task, owner, and harness.
Workspace presets for release lane, PR triage lane, docs lane, security lane, and test-writer lane.
Notifications for blocked CI, dirty worktrees, merge conflicts, stale review, and finished autonomous runs.
Review loops that can annotate diffs and PRs without taking ownership away from maintainers.

Repo work:

everything-claude-code: extend the adapter compliance matrix and public scorecard onramp.
ecc2: surface session/worktree state through a stable local payload before adding hosted telemetry.
ECC-Tools: consume the same lifecycle events for PR checks, issue routing, and Linear sync.

Verification:

npm run harness:audit -- --format json
npm run observability:ready
targeted adapter matrix tests once the matrix moves from docs to data

HUD, Status, And Observability

Adopt from Claude HUD and the Claude Code source review:

Context pressure: usage, compaction risk, large-result warnings, and summary state.
Tool activity: active tool, recent tools, duration, risky operations, and permission requests.
Agent activity: active subagents, delegated task, branch/worktree, and wait state.
Queue activity: open PRs/issues, CI state, stale/conflict batches, review state, and closed-stale salvage backlog.
Cost/risk: token cost estimate, destructive-operation risk, hook/MCP risk, and security scan state.

Repo work:

Keep docs/architecture/observability-readiness.md as the operator-facing readiness gate.
Define a versioned HUD/status JSON contract that both ECC2 and ECC Tools can consume.
Add sample exports from loop-status, session-inspect, harness audit, and risk ledger into a fixture directory before building visual UI.

Verification:

npm run observability:ready
fixture validation for every status payload
cross-platform smoke test for commands that read session history

Self-Improving Harness Loop

Adopt from Meta-Harness, Autocontext, and Hermes Agent:

Separate the loop into observation, proposal, verification, promotion, and rollback.
Store every proposed improvement as trace plus artifact, not only as a final changed file.
Promote playbooks only after a verifier proves that they improve a scenario without widening blast radius.
Use RAG/reference sets for vetted ECC patterns, team history, CI failures, review outcomes, harness config quality, and security decisions.

Repo work:

everything-claude-code: document scenario specs, verifier contracts, and playbook promotion rules.
ECC-Tools: map analyzer findings to PR comments, check runs, and Linear tasks without flooding the workspace.
agentshield: feed prompt-injection and config-risk findings into regression suites.

Verification:

read-only prototype that emits a trace, report, candidate playbook, and verifier result
regression fixture proving a bad proposal is rejected

AgentShield Enterprise Security Platform

AgentShield should move from useful scanner to enterprise security platform.

Backlog shape:

Policy schema for org baseline, rule severity, owner, exception, expiration, evidence, and audit trail.
SARIF output for GitHub code scanning.
Policy packs for OSS, team, enterprise, regulated, high-risk hooks/MCP, and CI enforcement.
Supply-chain intelligence for MCP packages, npm/pip provenance, CVEs, typosquats, and dependency reputation.
Prompt-injection corpus and regression benchmark.
JSON plus executive HTML/PDF report output.

Verification:

schema unit tests
SARIF fixture tests
policy-pack golden tests
false-positive regression tests from the public issue history

ECC Tools Commercial And Review Platform

ECC Tools should become the GitHub-native layer for billing, deep analysis, PR checks, and Linear progress tracking.

Backlog shape:

Native GitHub Marketplace billing audit before any payments announcement: plans, seats, org/account mapping, subscription state, overage behavior, downgrade/cancel behavior, and failure modes.
Deep analyzer comparable in scope to the useful parts of GitGuardian, Dependabot, CodeRabbit, and Greptile: security evidence, dependency risk, CI/CD recommendations, PR review behavior, config quality, token/cost risk, and harness drift.
RAG/reference set over vetted ECC patterns, historical PR outcomes, dependency advisories, CI failures, review decisions, and team-specific conventions.
Linear sync that maps findings to project status, milestone evidence, and owner-ready issues without exhausting issue limits.

Verification:

check-run fixture tests
billing webhook replay tests
analyzer golden PR fixtures
Linear sync dry-run fixture

Closed-Stale Salvage Lane

Closing stale PRs keeps the public queue usable, but useful work should not be lost because a contributor no longer has time to rebase.

Execution rule:

Close stale, conflicted, or obsolete PRs with a clear courtesy comment.
Record them in a salvage ledger with source PR, author, reason closed, useful files/concepts, risk, and recommended maintainer action.
After the cleanup batch, inspect each closed PR diff manually.
Cherry-pick only when the patch still applies cleanly and preserves current architecture. Otherwise reimplement the useful idea in a fresh maintainer branch.
Preserve attribution in the commit body or PR body.
Comment back on the source PR when useful work lands, linking the maintainer PR or merged commit.
Mark the ledger item as landed, superseded, Linear-tracked, or no-action.

Required safeguards:

Never blind cherry-pick generated churn, bulk localization, or dependency major-version changes.
Prefer small maintainer PRs over one salvage megabranch.
Run the same validation gates as normal code, docs, or catalog changes.
Keep contributor credit even when the final implementation is rewritten.

Near-Term Implementation Order

Extend the harness adapter matrix and public scorecard onramp.
Add the release/name/plugin publication checklist with evidence fields.
Define the HUD/status JSON contract and fixture directory.
Start AgentShield policy schema plus SARIF fixtures.
Audit ECC Tools billing and check-run surfaces.
Inventory legacy folders and closed-stale PRs into the salvage ledger.
Port useful stale work in small attributed maintainer PRs.

Non-Goals

Hosted telemetry before the local event model is useful and testable.
Automatic mutation of user harness configs without verifier evidence.
Treating any one agent harness as the canonical interface.
Release or payments announcements before command, package, marketplace, and billing evidence is fresh.

13 KiB Raw Blame History

ECC 2.0 Reference Architecture

Reference Baseline

Architecture Shape

Reference-To-Backlog Map

Worktree And Session Orchestration

HUD, Status, And Observability

Self-Improving Harness Loop

AgentShield Enterprise Security Platform

ECC Tools Commercial And Review Platform

Closed-Stale Salvage Lane

Near-Term Implementation Order

Non-Goals

13 KiB

Raw Blame History