everything-claude-code/skills/continuous-learning-v2
Gaurav Dubey a6d12ec21e
fix(clv2): surface SIGALRM timeout drops in observe.sh (#2373)
* fix(clv2): surface SIGALRM timeout drops in observe.sh

The inline-Python observation writers in observe.sh arm a signal.SIGALRM
alarm (8s) so they self-terminate before the async hook's 10s timeout can
orphan them (#2278). The handler _ecc_bail called sys.exit(0) with no
logging, so when the alarm fired the in-flight observation was silently
dropped: nothing was logged, no partial write occurred, and the shell saw
a clean exit. There was no way to detect or count how many observations
were being lost.

Add a single stderr visibility line to both _ecc_bail handlers (the
parse-error fallback path and the main observation-writing path) before
sys.exit(0), using the repo's "[observe]" log prefix. Exit code stays 0:
in a Claude Code hook a non-zero exit signals a block, so changing it
would turn an internal timeout into a user-facing tool block. The warning
goes to stderr (not stdout) because both blocks redirect stdout into the
observations file.

Add tests/hooks/observe-signal-timeout.test.js: a static regression guard
that every _ecc_bail handler logs to stderr before exiting and keeps exit
0, plus a behavioral check that runs the real handler text extracted from
observe.sh and confirms a fired alarm exits 0 and emits the [observe]
warning on stderr only.

Fixes #2300

* test(clv2): exercise both _ecc_bail handlers end-to-end

The behavioral SIGALRM-fire test ran only handlers[0] (the parse-error
fallback path); the main observation-write path (handlers[1]) was covered
only by the static regex guard. The write path is the higher-value one to
verify end-to-end since it carries valid, parseable data that would succeed
given more time, so a silent drop there is the worst case.

Loop the behavioral check over every extracted handler so a regression that
silenced the second handler's stderr write is caught at runtime, not just by
the static guard.

* test(clv2): select timeout handlers by marker, not array index

The behavioral check looped over all extracted _ecc_bail handlers by index.
If an unrelated _ecc_bail were ever added to observe.sh, the loop would
either test the wrong block or be diluted. Filter the handlers to those
carrying the "[observe] SIGALRM timeout" marker so the live SIGALRM check
stays pinned to the two #2300 timeout handlers regardless of array order or
future additions.

* test(clv2): fail fast when python is missing in SIGALRM check

The behavioral test returned early when no python interpreter was found,
which the test harness records as a PASS — so the SIGALRM contract could go
entirely unverified yet still look green. Throw instead, matching the
existing insaits-security-monitor convention of failing when a required
Python runtime is absent, and drop the in-test console.log.
2026-06-29 18:43:28 -07:00
..