mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-06-14 07:13:35 +08:00
* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)

  run-with-flags.js capped stdin at 1MB but every fallthrough path still echoed the truncated string to stdout. The harness parses hook stdout as JSON, got a document cut mid-stream, and blocked the tool call, so any Edit/Write with a >1MB hook payload was permanently blocked by every registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS gating could run.

  - Exit 0 with empty stdout (no opinion) when the stdin cap trips, before any echo or gating logic.
  - Flush stdout via write callback before process.exit: exiting right after stdout.write() dropped everything past the ~64KB pipe buffer, cutting even sub-cap pass-through payloads mid-JSON.

  Regression tests cover the enabled, disabled, and missing-arg paths for oversized payloads plus full echo of sub-cap >64KB payloads.

* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)

  The Codex MCP merge declared exa with a url key, but Codex's [mcp_servers.*] TOML schema is stdio-only; the url key makes the entire config.toml fail to load, bricking both the codex CLI and the desktop app. Every install/update re-injected the line because the urlEntry branch treated the broken entry as present.

  - ECC_SERVERS now emits only the current default set per docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args). Retired servers (supabase, playwright, context7, exa, github, memory, sequential-thinking) are never re-emitted; existing user-managed entries are untouched.
  - The merge now repairs the exact ECC-emitted broken form (url-only exa entry) on every run so re-running the installer fixes broken configs instead of preserving them. User stdio exa entries (command + mcp-remote) are left alone.
  - check-codex-global-state.sh requires chrome-devtools instead of the retired set, and flags url-only exa entries with a repair hint.

  Tests cover repair, re-run idempotence, stdio-entry preservation, and no-retired-server emission in add, update, dry-run, and disabled modes.

* fix(hooks): never echo truncated stdin from Stop hooks (#2090)

  Stop hooks follow the ECC pass-through convention (echo stdin on stdout), but every echoing Stop hook capped stdin and echoed the capped string. The Stop payload carries last_assistant_message, so a long final assistant message produced a JSON document cut mid-stream on stdout, which the harness reports as 'Stop hook error: JSON validation failed' across the whole Stop chain.

  Reproduced: a Stop payload with a >64KB last_assistant_message run through run-with-flags + cost-tracker emitted exactly 65536 bytes of invalid JSON (cost-tracker capped stdin at 64KB, far below realistic Stop payloads).

  - cost-tracker: raise the cap to 1MB (matching all other hooks) and suppress the pass-through echo when stdin was truncated.
  - check-console-log, stop-format-typecheck, desktop-notify: suppress the echo when stdin was truncated; flush stdout before process.exit so sub-cap payloads are not cut at the ~64KB pipe buffer.
  - All hooks keep exiting 0 (fail-open); diagnostics go to stderr.

  New stop-hooks-stdout test asserts the contract for every registered Stop hook: stdout is empty or valid JSON, exit code 0, for realistic 100KB payloads and oversized >1MB payloads, via the production runner and via direct invocation. Updated the old hooks.test.js case that codified the truncated-echo behavior.

* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)

  In long autonomous sessions the fact-force gate produced 10+ near-identical 'state facts -> blocked -> restate -> retry' blocks in one context window, which measurably raises the odds of the model collapsing into a degenerate single-token repetition loop.

  - Track a per-session fact_force_denials counter in GateGuard state (merged max across concurrent writers, reset with the session, robust to malformed on-disk values).
  - The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep the full four-fact block; later denials emit a condensed single-line message that carries the denial ordinal, so consecutive denials are structurally different and never textually identical.
  - True retries of the same target remain allowed without re-prompting (unchanged).

  Destructive-Bash and routine-Bash gates are unchanged, as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches. Eight new tests cover budget counting, condensed format, ordinal advancement, retry pass-through, env tuning, malformed state, MultiEdit dampening, and destructive-gate exemption.

* fix(hooks): keep security hooks able to block on oversized stdin (#2222)

  Refine the truncation fail-open: instead of skipping the hook entirely, the runner now suppresses only its own raw-echo when stdin was truncated. The hook still executes and receives the truncated flag (run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps blocking truncated protected-config payloads (its test requires exit 2) while pass-through hooks fail open with empty stdout as before.

* style: apply repo formatter to touched hook files

---
name: gateguard
description: Fact-forcing gate that blocks Edit/Write/Bash (including MultiEdit) and demands concrete investigation (importers, data schemas, user instruction) before allowing the action. In two A/B tests, gated agents scored an average of 2.25 points higher (on a 10-point scale) than ungated agents.
origin: community
---

# GateGuard: Fact-Forcing Pre-Action Gate

A PreToolUse hook that forces Claude to investigate before editing. Instead of asking for self-evaluation ("are you sure?"), it demands concrete facts. The act of investigating creates awareness that self-evaluation never does.

## When to Activate

- Working on any codebase where file edits affect multiple modules
- Projects with data files that have specific schemas or date formats
- Teams where AI-generated code must match existing patterns
- Any workflow where Claude tends to guess instead of investigating

## Core Concept

LLM self-evaluation doesn't work. Ask "did you violate any policies?" and the answer is always "no." This is verified experimentally.

But asking "list every file that imports this module" forces the LLM to run Grep and Read. The investigation itself creates context that changes the output.

**Three-stage gate:**

```
1. DENY: block the first Edit/Write/Bash attempt
2. FORCE: tell the model exactly which facts to gather
3. ALLOW: permit retry after facts are presented
```

No competitor does all three. Most stop at deny.

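
The three stages above can be sketched as a small state machine. This is a minimal illustration, not the hook's actual code: `createGate`, `check`, and the returned object shape are hypothetical names, and the sketch assumes that a retry of the same target means the facts were presented, since it cannot inspect the conversation.

```javascript
// Minimal sketch of the DENY -> FORCE -> ALLOW flow (all names are hypothetical).
// `seen` remembers which targets have already been denied once in this session.
function createGate(requiredFacts) {
  const seen = new Set();
  return function check(target) {
    if (seen.has(target)) {
      // ALLOW: a retry of the same target passes (assumes facts were presented).
      return { decision: 'allow' };
    }
    seen.add(target);
    // DENY + FORCE: block the first attempt and name the facts to gather.
    const lines = [`Before touching ${target}, present these facts:`].concat(
      requiredFacts.map((fact, i) => `${i + 1}. ${fact}`)
    );
    return { decision: 'deny', reason: lines.join('\n') };
  };
}

const gate = createGate([
  'List ALL files that import/require this file (use Grep)',
  "Quote the user's current instruction verbatim",
]);
const first = gate('src/analytics.js'); // first attempt: denied with instructions
const retry = gate('src/analytics.js'); // retry of the same target: allowed
console.log(first.decision, retry.decision);
```

The deny reason doubles as the FORCE step: the block message itself is the list of facts to gather, so the model never has to guess what would unblock it.
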
## Evidence

Two independent A/B tests, each running identical agents on the same task with and without the gate (scores out of 10):

| Task | Gated | Ungated | Gap |
| --- | --- | --- | --- |
| Analytics module | 8.0/10 | 6.5/10 | +1.5 |
| Webhook validator | 10.0/10 | 7.0/10 | +3.0 |
| **Average** | **9.0** | **6.75** | **+2.25** |

Both agents produce code that runs and passes tests. The difference is design depth.

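
The averages in the table can be re-derived from the two per-task rows. The snippet below is only a worked check of that arithmetic; the `results` array and the `mean` helper are illustrative names, not part of GateGuard.

```javascript
// Per-task scores copied from the table above.
const results = [
  { task: 'Analytics module', gated: 8.0, ungated: 6.5 },
  { task: 'Webhook validator', gated: 10.0, ungated: 7.0 },
];

// Arithmetic mean of a list of numbers.
const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

const gatedAvg = mean(results.map((r) => r.gated));
const ungatedAvg = mean(results.map((r) => r.ungated));
const avgGap = gatedAvg - ungatedAvg;
console.log(gatedAvg, ungatedAvg, avgGap);
```
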
## Gate Types

### Edit / MultiEdit Gate (first edit per file)

MultiEdit is handled identically: each file in the batch is gated individually.

```
Before editing {file_path}, present these facts:

1. List ALL files that import/require this file (use Grep)
2. List the public functions/classes affected by this change
3. If this file reads/writes data files, show field names, structure,
   and date format (use redacted or synthetic values, not raw production data)
4. Quote the user's current instruction verbatim
```

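
As a rough illustration of how the `{file_path}` placeholder and the per-file MultiEdit rule could fit together, here is a small sketch. `renderGateMessage` and `gateMultiEdit` are hypothetical helpers; the real hook may build its messages differently.

```javascript
// Template text taken from the Edit gate above; {file_path} is the placeholder.
const EDIT_GATE_TEMPLATE = [
  'Before editing {file_path}, present these facts:',
  '',
  '1. List ALL files that import/require this file (use Grep)',
  '2. List the public functions/classes affected by this change',
  '3. If this file reads/writes data files, show field names, structure,',
  '   and date format (use redacted or synthetic values, not raw production data)',
  "4. Quote the user's current instruction verbatim",
].join('\n');

// Substitute the placeholder. split/join avoids the special meaning that
// "$" sequences have in String.prototype.replace replacement strings.
function renderGateMessage(template, filePath) {
  return template.split('{file_path}').join(filePath);
}

// MultiEdit sketch: one gate message per distinct file in the batch.
function gateMultiEdit(filePaths) {
  return [...new Set(filePaths)].map((p) => renderGateMessage(EDIT_GATE_TEMPLATE, p));
}

const messages = gateMultiEdit(['src/a.js', 'src/b.js', 'src/a.js']);
console.log(messages.length);
```
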
### Write Gate (first new file creation)

```
Before creating {file_path}, present these facts:

1. Name the file(s) and line(s) that will call this new file
2. Confirm no existing file serves the same purpose (use Glob)
3. If this file reads/writes data files, show field names, structure,
   and date format (use redacted or synthetic values, not raw production data)
4. Quote the user's current instruction verbatim
```

### Destructive Bash Gate (every destructive command)

Triggers on: `rm -rf`, `git reset --hard`, `git push --force`, `drop table`, etc.

```
1. List all files/data this command will modify or delete
2. Write a one-line rollback procedure
3. Quote the user's current instruction verbatim
```

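
One plausible way to recognize the trigger commands is a short list of regular expressions. The sketch below covers only the four examples named above; the hook's real pattern list (the "etc.") is not shown in this document, so `DESTRUCTIVE_PATTERNS` and `isDestructive` are illustrative assumptions.

```javascript
// Illustrative patterns for the four examples named above (not the hook's real list).
const DESTRUCTIVE_PATTERNS = [
  /\brm\s+-rf\b/,              // rm -rf <path>
  /\bgit\s+reset\s+--hard\b/,  // git reset --hard
  /\bgit\s+push\s+--force\b/,  // git push --force (also matches --force-with-lease)
  /\bdrop\s+table\b/i,         // SQL DROP TABLE, case-insensitive
];

// True when any pattern matches somewhere in the command line.
function isDestructive(command) {
  return DESTRUCTIVE_PATTERNS.some((re) => re.test(command));
}

console.log(isDestructive('rm -rf build/'), isDestructive('git status'));
```

A production matcher would also need to handle variants such as `rm -fr` or `git push -f`; they are left out here to keep the sketch tied to the commands the document actually lists.
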
### Routine Bash Gate (once per session)

```
1. The current user request in one sentence
2. What this specific command verifies or produces
```

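
The difference between the two Bash gates is mostly a matter of session state. The following sketch shows one way to model it, assuming in-memory state and a caller-supplied `isDestructive` predicate; `createBashGate` and the returned labels are hypothetical, not the hook's API.

```javascript
// Sketch: routine Bash is gated once per session, while each distinct
// destructive command is gated on its first attempt. State is in-memory only.
function createBashGate(isDestructive) {
  let routineGateFired = false;
  const gatedDestructive = new Set();
  return function check(command) {
    if (isDestructive(command)) {
      // A retry of the same destructive command passes after its gate fired.
      if (gatedDestructive.has(command)) return 'allow';
      gatedDestructive.add(command);
      return 'destructive-gate';
    }
    if (!routineGateFired) {
      routineGateFired = true;
      return 'routine-gate';
    }
    return 'allow';
  };
}

// Toy predicate for the demo; a real one would match more commands.
const checkBash = createBashGate((cmd) => cmd.includes('rm -rf'));
const decisions = [
  'npm test',
  'npm run lint',
  'rm -rf dist',
  'rm -rf dist',
  'rm -rf build',
].map((cmd) => checkBash(cmd));
console.log(decisions);
```
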
## Quick Start

### Option A: Use the ECC hook (zero install)

The hook at `scripts/hooks/gateguard-fact-force.js` is included in this plugin. Enable it via `hooks.json`.

If GateGuard blocks setup or repair work, start the session with `ECC_GATEGUARD=off`. For hook-level control, keep using `ECC_DISABLED_HOOKS` with the GateGuard hook ID.

In long sessions, only the first `GATEGUARD_FACT_FORCE_FULL_DENIALS` fact-force denials (default 3) emit the full four-fact block; later denials are condensed to a single line carrying the denial ordinal, so near-identical blocks cannot accumulate in the context window and amplify model repetition loops (#2142). Retrying the same file or command after presenting facts never re-triggers the gate.

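
The denial dampening described above can be approximated with a counter and a budget. This sketch assumes `GATEGUARD_FACT_FORCE_FULL_DENIALS` is read from the environment and keeps the counter in memory; `parseBudget`, `createDenialFormatter`, and the condensed message wording are hypothetical, and the real hook persists its counter in GateGuard session state.

```javascript
// Parse the full-denial budget, falling back to the documented default of 3
// when the value is missing or malformed.
function parseBudget(raw, fallback = 3) {
  const n = Number.parseInt(raw, 10);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

// Returns a formatter that emits the full block for the first `fullBudget`
// denials and a condensed one-line message (with the ordinal) afterwards.
function createDenialFormatter(fullBudget) {
  let denials = 0;
  return function format(target, fullBlock) {
    denials += 1;
    if (denials <= fullBudget) return fullBlock;
    // Hypothetical condensed wording; the ordinal keeps each message distinct.
    return `GateGuard denial #${denials} for ${target}: present the required facts, then retry.`;
  };
}

// Assumption: the setting is supplied as an environment variable.
const budget = parseBudget(process.env.GATEGUARD_FACT_FORCE_FULL_DENIALS);
const format = createDenialFormatter(budget);
console.log(format('src/a.js', 'FULL FOUR-FACT BLOCK'));
```

Because the ordinal changes on every call, two consecutive condensed denials are never textually identical, which is the property the paragraph above relies on.
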
### Option B: Full package with config

```bash
pip install gateguard-ai
gateguard init
```

This adds `.gateguard.yml` for per-project configuration (custom messages, ignore paths, gate toggles).

## Anti-Patterns

- **Don't use self-evaluation instead.** "Are you sure?" always gets "yes." This is experimentally verified.
- **Don't skip the data schema check.** Both A/B test agents assumed ISO-8601 dates when real data used `%Y/%m/%d %H:%M`. Checking data structure (with redacted values) prevents this entire class of bugs.
- **Don't gate every single Bash command.** The routine Bash gate fires once per session; the destructive Bash gate fires on every destructive command. This balance avoids slowdown while catching real risks.

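
To make the date-format pitfall from the list above concrete, the sketch below parses the `%Y/%m/%d %H:%M` layout strictly and shows that an ISO-8601 check rejects the same value. `parseSlashDateTime` and `ISO_8601_PREFIX` are hypothetical names, and interpreting the timestamp as UTC is an assumption, since the format carries no time zone.

```javascript
// Strict parser for "%Y/%m/%d %H:%M" (for example "2024/03/05 14:30").
// Returns a Date, or null when the text does not match or overflows.
// Assumption: the timestamp is interpreted as UTC.
function parseSlashDateTime(text) {
  const m = /^(\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2})$/.exec(text);
  if (!m) return null;
  const [year, month, day, hour, minute] = m.slice(1).map(Number);
  const date = new Date(Date.UTC(year, month - 1, day, hour, minute));
  // Reject values that Date silently rolls over, such as 2024/02/31 or 24:00.
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day &&
    date.getUTCHours() === hour &&
    date.getUTCMinutes() === minute;
  return valid ? date : null;
}

// What code that assumes ISO-8601 would typically check for.
const ISO_8601_PREFIX = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}/;

const sample = '2024/03/05 14:30'; // synthetic value, not production data
console.log(ISO_8601_PREFIX.test(sample), parseSlashDateTime(sample) !== null);
```
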
## Best Practices

- Let the gate fire naturally. Don't try to pre-answer the gate questions; the investigation itself is what improves quality.
- Customize gate messages for your domain. If your project has specific conventions, add them to the gate prompts.
- Use `.gateguard.yml` to ignore paths like `.venv/`, `node_modules/`, `.git/`.

## Related Skills

- `safety-guard`: Runtime safety checks (complementary, not overlapping)
- `code-reviewer`: Post-edit review (GateGuard is pre-edit investigation)