* Restore delivery-gate: Stop hook with learning capture enforcement (auto-closed by fork sync, now on clean branch)
* Fix bot findings: log level→INFO (DISK_REMIND dead code), count_edits full transcript (not truncated), memory-dir-absent warning (not silent pass), SKILL.md description accuracy
* Fix CodeRabbit feedback: treat missing memory-dir as all-stale on complex tasks (fail-close instead of fail-open)
* Trigger bot re-review (no logic changes)
* Fix: handle both stdin formats — raw transcript AND JSON with transcript_path (Greptile feedback)
* Add debug log for memory-dir lookup path
* Fix path encoding: replace colon with dash (not strip), matching Claude Code actual encoding on Windows
* Fix SKILL.md: update How It Works for JSON+transcript_path, add English translation to CLAUDE.md block (Greptile feedback)
* Fix: memory-dir absent → warn but don't block (prevents deadlock for new users per Greptile feedback)
* fix: restore daltino-approved voice (thinking quality/收尾铁律) with technical patches
Reverts 'session hygiene' rebranding. Preserves original approved framing
while keeping technical improvements:
- JSON transcript_path parsing documentation
- filesystem mtime staleness check
- 'skip tests for now' rationalization pattern
- disk critically low explicit block condition
* fix: remove stdout JSON echo — Stop hooks write feedback to stderr, not stdout
Previously sys.stdout.write(raw) echoed the raw hook JSON payload to stdout,
which Claude Code displays as the hook's response message. When the hook
blocked (exit 2), Claude saw {"transcript_path":"...","session_id":"..."}
instead of the actual blocking reason from stderr.
This made the gate functionally silent from Claude's perspective — it could
not guide Claude to the corrective action (update growth-log / free disk).
Fix per Greptile feedback: stop echo, let stderr messages reach Claude.
* fix: remove duplicate disk-critical log line
* docs(delivery-gate): v1.1.0 — accurate scope (deterministic checks, not reasoning), warning vs block table, CI/CD analogy, limitations section, self-audit pairing
* fix(delivery-gate): expand rationalization regex coverage (R3/R4) — match "we can fix" and "integration tests" variants
* chore: bump version to 1.1.1 to re-trigger CI checks
4.9 KiB
name, description, version, metadata
| name | description | version | metadata | ||
|---|---|---|---|---|---|
| delivery-gate | Stop hook that blocks Claude from finishing until quality checks pass. Detects rationalization patterns (surface text heuristics), stale learning logs (filesystem mtime), and low disk space. Complements self-audit by mechanically enforcing learning capture habits. | 1.1.1 |
|
Delivery Gate — Mechanical Quality Gate for Claude Code
A Stop hook that checks three things before Claude can finish a session, using only deterministic checks — file modification timestamps, disk usage, and regex patterns on the transcript text. No AI inference.
This is distinct from reasoning gates (like self-audit): delivery-gate checks machine-verifiable facts; self-audit checks output quality across four reasoning dimensions. Together they form defense in depth:
- delivery-gate: "Was the learning library touched today? Is disk space safe?"
- self-audit: "Is the file content correct, complete, and honest?"
This is the same pattern as CI pipeline gates — automated, deterministic checks that verify machine-readable facts rather than trusting self-reported status.
What It Checks
| Check | Mechanism | On Hit |
|---|---|---|
| Rationalization patterns | Regex on transcript tail | Warning only (never blocks) |
| Stale learning libraries | mtime on 5 configurable paths | Warning if some stale; Block if >=3 stale OR growth-log stale + complex task |
| Disk space < 50GB | shutil.disk_usage |
Warning |
| Disk space < 15GB | shutil.disk_usage |
Block (exit 2) |
Rationalization detection warns about patterns like "skip tests for now" and "pre-existing bug" — surface signals that thinking may have been cut short. It never blocks on its own, because regex heuristics can false-positive. The blocking conditions are: disk critical, >=3 learning libs stale, OR growth-log specifically stale (all require complex task >=3 edits).
Why
Claude Code's built-in checks cover code quality (build → type → lint → test). But there's a different failure mode: the agent produces working code while the session hygiene was neglected — learning not captured, rationalized shortcuts, disk running out silently.
Over many sessions of "ship and forget," the human hasn't grown. This hook enforces the habit: complex task → must touch learning libraries.
Install
cp quality-gate.py ~/.claude/scripts/
Add to ~/.claude/settings.json:
{
"hooks": {
"Stop": [{
"hooks": [{
"type": "command",
"command": "python3 ~/.claude/scripts/quality-gate.py",
"timeout": 5000
}]
}]
}
}
Learning Libraries
Create these files in your project's memory directory. The hook checks if at least one was updated today:
memory/
├── growth-log/ # Daily learning entries (directory)
├── decisions/log.md # Decision log
├── output-index.md # Index of session outputs
├── ratings-tracker.md # Skill ratings over time
└── tooling_capabilities.md # Known tools inventory
Customize the LIBS dict to match your own file structure.
Configuration
Edit quality-gate.py:
| Variable | Default | Purpose |
|---|---|---|
RATIONALIZE |
4 patterns | Regex patterns for rationalization detection |
LIBS |
5 libraries | Files/dirs to check for today's updates |
COMPLEX_THRESHOLD |
3 | Edit/Write calls to classify as complex |
DISK_WARN_GB |
50 | Warn below this |
DISK_CRIT_GB |
15 | Block below this |
Examples
Simple session — allowed:
edit_count=1 (< 3, not complex) → exit 0
Complex task, learning captured — allowed:
edit_count=5 (complex) → checks LIBS → growth-log updated today → exit 0
Complex task, no learning — BLOCKED:
edit_count=4 (complex) → checks LIBS → all 5 stale → exit 2
stderr: "Blocked: complex task completed but no learning captured today."
Low disk space — BLOCKED:
disk_free=12GB < 15GB critical → exit 2
stderr: "Blocked: disk space at 12GB (threshold: 15GB)."
Limitations
The hook enforces the habit of touching learning libraries, not the quality of what was recorded. If output-index.md is updated but growth-log is skipped, the hook passes (1 of 5 libraries touched). This is by design: mechanical gates check machine-verifiable facts. For content quality verification, pair with self-audit.
Compatibility
- Python 3.8+ (uses
from __future__ import annotations) - Cross-platform: Windows, macOS, Linux
- Zero dependencies beyond stdlib
Quality
This code went through 4 rounds of automated code review (CodeRabbit + Greptile) with 9 real bugs found and fixed.
See Also
self-audit— Reasoning quality gate (completeness/consistency/groundedness/honesty)verification-loop— Code quality checks (build/type/lint/test)gateguard— PreToolUse safety gate