- agent.yaml: register epic-* commands (#2236) and vue-review (#2241)
- package.json files: drop stray skills/ml-adoption-playbook entry (follows orphan-skill publish pattern; not in install-modules.json)
- unicode-safety: strip decorative emoji from dashboard-web.js (#2100) and brand-discovery refs (#2221) to pass the CI gate
- agent-compress: raise catalog token canary 5000 -> 6000 for the 67-agent catalog
Full suite green (2836/2836).
* docs(skills): document tdd plan handoff evidence
Address issue #2138 by clarifying how tdd-workflow should continue from a plan file, preserve human-readable test guarantees, and retain RED/GREEN evidence across squash merges.
* docs(skills): harden tdd plan handoff guidance
Address review feedback on #2235: use angle-bracket argument hint, treat plan files as untrusted input, and prefer project-local documentation paths for TDD evidence reports.
* docs(skills): clarify plan handoff injection guard
Address review feedback by explicitly stating that plan file content is data, not AI instructions, and that validation commands from untrusted plans require sanitization and approval before execution.
* Update skills/tdd-workflow/SKILL.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* docs(skills): address tdd workflow review nits
Clarify plan handoff safety decisions, remove redundant untrusted-input wording, and show consistent TDD evidence path examples.
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* fix: V-001 security vulnerability
Automated security fix generated by OrbisAI Security
* fix: sanitize subprocess call in runner.py
The runner
* fix: address PR review comments on V-001 allowlist and test coverage
Remove dangerous interpreters (python, python3, node, curl, wget) from
ALLOWED_SETUP_EXECUTABLES — they can execute arbitrary code via argument
flags and are not needed for sandbox setup. Rewrite test_invariant_runner
to call _setup_sandbox directly instead of spawning runner.py as a
subprocess (which had no __main__ entrypoint and never exercised the fix).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- suggest-compact hook now reads the latest usage record from the session
transcript and suggests /compact at a window-scaled token threshold
(160k/200k window, 250k/1M window; COMPACT_CONTEXT_THRESHOLD and
COMPACT_CONTEXT_INTERVAL overridable), re-firing per 60k-token growth
bucket; tool-call count stays as the secondary signal (#2155)
- Codex repo marketplace now points at ./plugins/ecc instead of ./ — Codex
never discovers plugins whose local marketplace source.path is the
marketplace root (verified on Codex CLI 0.137.0); plugins/ecc is a thin
folder referencing root skills/.mcp.json per maintainer direction on
#2097; docs flag plugin mode as experimental with the upstream blocker
openai/codex#26037 linked (#2128)
- README badges for installs/stars/forks now use shields endpoint badges
backed by api.ecc.tools (live install count 3,712 vs the stale static
150), which also eliminates shields' 'Unable to select next GitHub token
from pool' render in the stars badge
Closes#2155Closes#2128
- competitive-platform-analysis: add ## Examples section per ECC
guidelines (8-axis taxonomy walkthrough + pre-filter scoring matrix)
- competitive-report-structure: clarify dimension 9 poles are client-
specific (e.g., Memorability/Hireability) not hard-coded names
- brand-discovery: fix terminal state — set inProgressModule to null
after 90_SYNTHESIS.md is complete to prevent misleading resumption
All fixes mirrored to .agents/ copies.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds four community skills covering brand identity discovery and a
three-skill competitive benchmarking pipeline.
**brand-discovery** — Adaptive multi-session brand identity interview
spanning 8 modules (purpose, positioning, audience, personality, voice,
narrative, founder-brand tension, synthesis). Uses laddering, 5 Whys,
and projective techniques. State persisted to disk via state.json so
sessions resume across conversations without losing elicited knowledge.
Frameworks: Sinek, Dunford, Baker, Enns, Kapferer, Aaker, Neumeier,
Mark & Pearson, Lencioni. Includes 8 module output templates in
references/.
**competitive-platform-analysis** — Scopes and tiers a competitor set
before benchmarking begins. Categorizes candidates along 8 generic
creative-industry axes (positioning stance, specialization, size/model,
engagement format, distinctiveness posture, evidence model, brand
strength, market/reach) into Direct / Adjacent / Aspirational tiers.
Includes a pre-filter scoring matrix. First step in the pipeline.
**benchmark-methodology** — Scores each competitor across 9 weighted
dimensions (positioning 18%, brand voice 15%, visual craft 15%, offer
packaging 12%, evidence 12%, enterprise-readiness 10%, thought
leadership 8%, pricing 5%, client's strategic tension 5%) with explicit
1–5 rubrics and bias controls. Produces one profile card per competitor.
**competitive-report-structure** — Assembles scored cards into a
decision-grade report: executive summary, landscape map, competitor
tiers, heatmap matrix, deep dives, white-space and threats, strategic
recommendations, sources appendix.
brand-discovery complements brand-voice (ECC): brand-voice extracts a
style profile from existing source material; brand-discovery elicits
identity from scratch through structured interviews when no prior
material exists.
A competitive set scoped without the client's positioning brief is
noise, not intelligence — each skill enforces this by requiring the
brief before proceeding. The 9-dimension scoring framework deliberately
reports the client's strategic tension as two separate poles (never
averaged) because the gap between them is the strategic finding.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)
run-with-flags.js capped stdin at 1MB but every fallthrough path still
echoed the truncated string to stdout. The harness parses hook stdout as
JSON, got a document cut mid-stream, and blocked the tool call — so any
Edit/Write with a >1MB hook payload was permanently blocked by every
registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS
gating could run.
- Exit 0 with empty stdout (no opinion) when the stdin cap trips, before
any echo or gating logic.
- Flush stdout via write callback before process.exit: exiting right
after stdout.write() dropped everything past the ~64KB pipe buffer,
cutting even sub-cap pass-through payloads mid-JSON.
Regression tests cover the enabled, disabled, and missing-arg paths for
oversized payloads plus full echo of sub-cap >64KB payloads.
* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)
The Codex MCP merge declared exa with a url key, but Codex's
[mcp_servers.*] TOML schema is stdio-only — the url key makes the
entire config.toml fail to load, bricking both the codex CLI and the
desktop app. Every install/update re-injected the line because the
urlEntry branch treated the broken entry as present.
- ECC_SERVERS now emits only the current default set per
docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args).
Retired servers (supabase, playwright, context7, exa, github, memory,
sequential-thinking) are never re-emitted; existing user-managed
entries are untouched.
- The merge now repairs the exact ECC-emitted broken form (url-only
exa entry) on every run so re-running the installer fixes broken
configs instead of preserving them. User stdio exa entries
(command + mcp-remote) are left alone.
- check-codex-global-state.sh requires chrome-devtools instead of the
retired set, and flags url-only exa entries with a repair hint.
Tests cover repair, re-run idempotence, stdio-entry preservation, and
no-retired-server emission in add, update, dry-run, and disabled modes.
* fix(hooks): never echo truncated stdin from Stop hooks (#2090)
Stop hooks follow the ECC pass-through convention (echo stdin on
stdout), but every echoing Stop hook capped stdin and echoed the capped
string. The Stop payload carries last_assistant_message, so a long
final assistant message produced a JSON document cut mid-stream on
stdout, which the harness reports as 'Stop hook error: JSON validation
failed' across the whole Stop chain.
Reproduced: a Stop payload with a >64KB last_assistant_message run
through run-with-flags + cost-tracker emitted exactly 65536 bytes of
invalid JSON (cost-tracker capped stdin at 64KB — far below realistic
Stop payloads).
- cost-tracker: raise the cap to 1MB (matching all other hooks) and
suppress the pass-through echo when stdin was truncated.
- check-console-log, stop-format-typecheck, desktop-notify: suppress
the echo when stdin was truncated; flush stdout before process.exit
so sub-cap payloads are not cut at the ~64KB pipe buffer.
- All hooks keep exiting 0 (fail-open); diagnostics go to stderr.
New stop-hooks-stdout test asserts the contract for every registered
Stop hook: stdout is empty or valid JSON, exit code 0 — for realistic
100KB payloads and oversized >1MB payloads, via the production runner
and via direct invocation. Updated the old hooks.test.js case that
codified the truncated-echo behavior.
* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)
In long autonomous sessions the fact-force gate produced 10+
near-identical 'state facts -> blocked -> restate -> retry' blocks in
one context window, which measurably raises the odds of the model
collapsing into a degenerate single-token repetition loop.
- Track a per-session fact_force_denials counter in GateGuard state
(merged max across concurrent writers, reset with the session, robust
to malformed on-disk values).
- The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep
the full four-fact block; later denials emit a condensed single-line
message that carries the denial ordinal, so consecutive denials are
structurally different and never textually identical.
- True retries of the same target remain allowed without re-prompting
(unchanged). Destructive-Bash and routine-Bash gates are unchanged,
as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches.
Eight new tests cover budget counting, condensed format, ordinal
advancement, retry pass-through, env tuning, malformed state, MultiEdit
dampening, and destructive-gate exemption.
* fix(hooks): keep security hooks able to block on oversized stdin (#2222)
Refine the truncation fail-open: instead of skipping the hook entirely,
the runner now suppresses only its own raw-echo when stdin was
truncated. The hook still executes and receives the truncated flag
(run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps
blocking truncated protected-config payloads (its test requires exit 2)
while pass-through hooks fail open with empty stdout as before.
* style: apply repo formatter to touched hook files
- Add top-level hooks wrapper to second JSON example (consistent with hooks.json format)
- Extract hardcoded thresholds as module-level constants (WALL_OF_TEXT_WORDS,
SUMMARY_CHECK_WORDS, SUMMARY_CHECK_FIRST_N, TASK_OUTPUT_RATIO_HIGH/MEDIUM)
Skipped (not applicable):
- 'Scoring defaults to 5/5' — by design for heuristic fallback; SKILL.md already
documents pairing with LLM judge for production use
- '--output silently ignored' — already fixed by _read_input refactor (checks
args.output directly, not elif args.task and args.output)
Validator (scripts/ci/validate-hooks.js line 182-184) only errors when
matcher is missing for non-EVENTS_WITHOUT_MATCHER events. For Stop (in
EVENTS_WITHOUT_MATCHER), matcher is optional — presence is allowed and
validated for type correctness, absence is also accepted.
- Replace httpx.Retry references with correct httpx API usage across all files
(httpx has no built-in Retry class; use HTTPTransport/Limits instead)
- Fix _check_summary to check first 100 words (not 100 characters)
- Fix template to only show → improvement arrow for non-5 scores
- Clarify hook documentation: hook echoes reminder, does not run evaluator
- Add return type annotation to main()
- Make required parameter keyword-only in _read_file_or_text
- evaluate.py: add CRITICAL ISSUES (axes ≤ 2) section, VERDICT line
- agent-evaluator.md: match format_report output exactly (title, evidence markers, bar graphs)
- templates/evaluation-report.md: match evaluate.py output format
- All now produce identical AGENT SELF-EVALUATION REPORT structure
Single authoritative format: evaluate.py's format_report() output.
- Added ml-adoption-playbook to structure the agentic workflow for adopting ML into non-ML projects.
- Registered the ML playbook in package.json.
- Synchronized catalog counts across documentation and plugin manifests.
Distills a named-genre aesthetic vocabulary (angelcore / cloud-trance /
hyperpop family), a mood + color + light system, and a beat-synced editing
grammar into a creative-direction layer that sits on top of the existing ECC
video skills and chains them (video-editing -> fal-ai-media ->
remotion-video-creation -> motion-* -> content-engine) into one pipeline.
Includes beat math (138 BPM), a section-by-section shot plan, fal.ai prompt
presets per mood, FFmpeg reframe/beat-cut recipes, a Remotion beat-synced
composition skeleton, and a companion genre-taxonomy reference.
* feat: add orch-* orchestrator skill family
Lightweight wrappers that orchestrate existing ECC agents through a gated Research -> Plan -> TDD -> Review -> Commit pipeline, right-sized per task.
- orch-pipeline: shared engine (phases, size classifier, two gates, agent map)
- orch-add-feature/change-feature/fix-defect/refine-code/build-mvp: thin wrappers delegating to the engine
* chore: register orch-* family in catalog, command registry, and agent.yaml (post-rebase onto green main)
---------
Co-authored-by: ECC Test <ecc@example.test>
ROOT CAUSE: hooks load plugin-hook-bootstrap.js via
`node -e "...; process.argv.splice(1,0,s); require(s)"`. On Node 21+,
require.main is `undefined` under --eval, so the `if (require.main === module)`
guard was false and main() never ran — every plugin hook silently no-op'd
(e.g. the MCP-health PreToolUse hook stopped blocking). CI (Node 18/20) hid
this; it only surfaces on Node 21+. Fix: also run main() when require.main is
undefined (the eval-bootstrap case), while staying dormant on real imports.
Also clears pre-existing main debt the full local suite enforces:
- catalog:sync — README/docs agent+skill counts drifted after recent merges
- tests/ci/supply-chain-watch-workflow: update checkout SHA to the merged v6.0.3 (#2183)
- markdownlint + check-unicode-safety --write across docs/skills
Suite: 2683/2683 green under Node v25; lint + unicode clean.
Co-authored-by: ECC Test <ecc@example.test>
* feat(skills): add codehealth-mcp skill and CodeScene MCP config
* docs(skills): add When to Use, How It Works, and Examples sections
* docs(skills): clarify MCP opt-in, data boundaries, and offline behavior
Address security review on PR #2077: no bundled credentials, document what
tools read locally, failure behavior when MCP is unavailable, and README
wording that Code Health MCP is optional and not enabled by default.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: adnasalk-notus <adna.salkovic@notus.hr>
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat: add intent-driven-development skill
Converts ambiguous feature or engineering requests into scoped,
verifiable acceptance criteria before implementation starts.
- Chooses between Quick Capture (low/moderate risk) and Full
Acceptance Brief (security, data, migration, cross-system changes)
- Reads repo context before asking questions; only asks what cannot
be inferred
- Non-blocking by default: records criteria and proceeds unless a
real risk requires confirmation
- Rule 9: when an AC fails mid-implementation due to architectural
constraints, marks it [revised], updates scope/verification method,
and re-presents only changed criteria rather than silently dropping
- Output template includes Revision Log for traceability across
multiple implementation cycles
* fix: add canonical When to Activate, How It Works, and Examples sections
Required for auto-activation mechanism detection per CONTRIBUTING.md
and existing skill conventions. Sections inserted after the intro
and before Operating Rules.
* fix: strengthen intent-driven-development skill per review
Address skill-quality review feedback on the intent-driven-development PR:
- Business/product constraints: add Operating Rule 2 forbidding inference
of business rules, compliance/SLAs, pricing, retention, prioritization,
and target users from code; surface the technical-vs-business split in
How It Works, Discover Context, and a dedicated 'supplied, not inferred'
section in the brief template.
- Eval-style pass/fail: add a Pass/Fail Examples section (failing vs
passing AC, plus a misplaced business-rule context entry) and a 5-point
Pass/Fail Rubric users can apply to the output.
- Renumber Operating Rules 1-10 accordingly; markdownlint clean.
* fix: surface legacy data warning in instinct-cli status (#2036)
When the data directory moved from ~/.claude/homunculus/ to the
XDG-compliant ~/.local/share/ecc-homunculus/, legacy installs with data
still in the old path saw "No instincts found" with no explanation.
Add _warn_legacy_data() to cmd_status so users get a clear, actionable
warning pointing them to the migration script or the CLV2_HOMUNCULUS_DIR
override. Wrap the directory scan in try/except to handle permission
errors gracefully.
Closes#2036
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix: address review feedback — drop unused f-strings, resolve absolute migrate path
Remove extraneous f-prefix from strings without interpolation (ruff F541).
Resolve migrate-homunculus.sh path relative to instinct-cli.py instead of
hard-coding a repo-relative path that only works from the repo root.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix: quote migrate script path to handle spaces
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: kky <lingmu141592@gmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Several published examples contained APIs that no longer exist, code that
does not run, or model versions that drifted from reality:
- agents/performance-optimizer.md used the web-vitals v3 API
(getCLS/getFID/getLCP/getFCP/getTTFB) and reported FID. web-vitals v4
renamed the imports to onCLS/onINP/onLCP/onFCP/onTTFB and FID was
replaced by INP (target < 200ms)
- rules/common/performance.md pinned stale model versions in the
model-selection guidance; refresh to the versions the repo itself uses
(agent.yaml pins claude-opus-4-6) and add the PowerShell variant for
MAX_THINKING_TOKENS next to the bash export
- skills/python-patterns/SKILL.md: both get_value examples referenced
default_value without declaring the parameter (NameError); add
default_value: Any = None to the EAFP and LBYL signatures
- skills/frontend-patterns/SKILL.md: the custom useQuery example rebuilt
refetch whenever callers passed inline fetchers/options, re-triggering
the effect after every state update (infinite fetch loop). Keep the
latest fetcher/options in refs so refetch stays referentially stable.
The PASS-labelled useMemo example mutated its input with in-place sort;
copy before sorting
- skills/coding-standards/SKILL.md repeated the same PASS-labelled
in-place-sort-in-useMemo example; same fix
- rules/typescript/security.md used a vendor-specific OPENAI_API_KEY in
generic guidance; switch to a neutral API_KEY
Every hand-maintained copy of the affected content is synced in the same
change: locale mirrors (ja-JP, ko-KR, pt-BR, tr, zh-CN, zh-TW - each only
where it carries the affected file) and the .agents/.kiro/.cursor harness
mirrors. Two structural divergences are left alone and noted here:
.kiro/steering/performance.md has no extended-thinking control list to
carry the PowerShell variant, and docs/zh-TW/rules/performance.md keeps an
older condensed thinking section without the budget-cap line.
rules/zh/performance.md is intentionally untouched - the rules/zh tree is
being retired in a separate change
- multi-{plan,execute,backend,frontend,workflow}.md: add an in-file
prerequisite note for the external ccg-workflow runtime. README.md already
warns these commands need codeagent-wrapper and the .ccg prompt tree, but
users meeting them via the installed slash commands never see the README;
the commands-core module still installs all five by default
- quality-gate.md: describe what scripts/hooks/quality-gate.js actually does.
The doc advertised '/quality-gate [path] [--fix] [--strict]' with lint/type
checks, but the script reads the file path from hook stdin JSON, toggles
behavior via ECC_QUALITY_GATE_FIX / ECC_QUALITY_GATE_STRICT env vars, and
runs formatters only (Biome/Prettier, gofmt, ruff format)
- claude-devfleet SKILL.md: add a Setup section pointing at the DevFleet
server repository (github.com/LEC-AI/claude-devfleet, already disclosed in
mcp-configs/mcp-servers.json) plus the SECURITY.md port-verification note;
the skill previously assumed a running instance with no way to obtain one
- regenerate docs/COMMAND-REGISTRY.json for the quality-gate description
* feat(skills): add kubernetes-patterns skill
* fix(skills): address CodeRabbit review on kubernetes-patterns
- Add When to Use alias section (repo skill-format requirement)
- Add How It Works overview section (required schema)
- Add Examples quick-reference table (required schema)
- Fix RBAC: split into Pattern A (no API, token disabled) and
Pattern B (needs API, token enabled) to resolve contradiction
between automountServiceAccountToken: false and Role/RoleBinding
- Fix missing -n my-namespace flag on OOMKilled kubectl describe command
* fix(observer): auto-scale max_turns by analysis batch size (#2035)
The hardcoded default of MAX_TURNS=20 is insufficient when
MAX_ANALYSIS_LINES=500 (also the default). Claude exhausts its turn
budget before it can write all discovered instinct files, producing:
Error: Reached max turns (20)
Fix: when ECC_OBSERVER_MAX_TURNS is not explicitly set, compute
max_turns proportionally to the actual analysis batch size:
max_turns = clamp(analysis_count / 10, 20, 100)
This gives:
- 20–199 lines → 20 turns (existing floor, unchanged)
- 500 lines → 50 turns (resolves the reported failure)
- 1000 lines → 100 turns (cap)
Explicitly setting ECC_OBSERVER_MAX_TURNS still overrides the
auto-scaled value, preserving the existing escape hatch.
* test(observer): update max_turns test for auto-scaling; document validation
The max-turns budget test in tests/hooks/hooks.test.js still asserted the removed literal max_turns="${ECC_OBSERVER_MAX_TURNS:-20}", which would fail against the new auto-scaling logic. Assert the auto-scale formula and the 20/100 clamp bounds instead.
Also add the explanatory comment CodeRabbit requested above the max_turns sanitization block, clarifying it guards the explicit ECC_OBSERVER_MAX_TURNS override path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
The observe.sh Layer 1 entrypoint guard short-circuits with exit 0 when
CLAUDE_CODE_ENTRYPOINT is not in {cli, sdk-ts, claude-desktop}. Claude
Code's VS Code extension sets CLAUDE_CODE_ENTRYPOINT=claude-vscode, so
VS Code users see no observations recorded — observations.jsonl never
gets created and the instinct pipeline stays empty.
Add claude-vscode to the allowlist, mirroring the precedent in #1522
which added claude-desktop the same way.
Add a regression test that spawns observe.sh under bash -x for each
allowed entrypoint (cli, sdk-ts, claude-desktop, claude-vscode) and
each denied entrypoint (unknown-host, claude-cody, mcp), asserting
that allowed entrypoints reach Layer 2's ECC_HOOK_PROFILE check while
denied entrypoints stop at Layer 1.
Fixes#2102