Gaurav Dubey a89b32c2b5
fix(clv2): serialize observer signal-counter to stop dropped increments (#2372)
observe.sh bumps the SIGUSR1 throttle counter in
${PROJECT_DIR}/.observer-signal-counter with an unlocked read-modify-write.
The hook runs on every tool call, so concurrent invocations can read the same
value, each write back that value plus one, and lose an increment. The observer
is then signaled at unpredictable intervals, which defeats the #521 throttle.
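
The lost update can be reproduced deterministically by replaying the racy
interleaving by hand. This is an illustrative sketch only: the function name
simulate_lost_update and the temp file are hypothetical and are not part of
observe.sh.

```shell
#!/usr/bin/env bash
# Illustrative only: replays the racy interleaving sequentially.
simulate_lost_update() {
  local f a b
  f="$(mktemp)"
  echo 5 > "$f"
  a="$(cat "$f")"           # invocation A reads 5
  b="$(cat "$f")"           # invocation B reads 5 before A writes back
  echo "$((a + 1))" > "$f"  # A writes 6
  echo "$((b + 1))" > "$f"  # B also writes 6, so A's increment is lost
  cat "$f"
  rm -f "$f"
}
```

Two increments are attempted, but the counter only advances from 5 to 6.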

Serialize the read-modify-write under a lock, and only ever bump the counter
while that lock is held:

- Prefer flock with a bounded -w wait (the OS auto-releases it when the fd
  closes or the process dies, so there is no stale lock and no lost increment);
  on a timeout the tick is skipped rather than bumped unlocked.
- Fall back to an atomic mkdir lock on platforms without flock, with a bounded
  spin. An EXIT trap cleans up on normal completion; INT/TERM traps release the
  lock and exit, so a signal cannot drop the lock and then continue the
  read-modify-write without ownership. If the lock cannot be acquired in the
  budget the tick is skipped rather than raced. No hand-rolled PID stale-reclaim
  (which is racy and can delete a live re-acquirer's lock).
- Guard the counter read against a corrupt (non-integer) file that would abort
  the hook under set -e.
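
The three points above could be sketched roughly as follows. This is a
hypothetical sketch, not the code in observe.sh: the function names
(read_counter, bump_locked, bump_counter), the lock paths, the 2 second flock
wait, and the 50 x 0.1 s spin budget are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual observe.sh code. The function names,
# lock paths, and timeouts are assumptions chosen for illustration.

# Print the counter value, treating a missing, empty, non-integer, or
# zero-padded file as 0 so shell arithmetic cannot abort under `set -e`.
read_counter() {
  local raw=""
  if [ -f "$1" ]; then
    raw="$(cat "$1" 2>/dev/null || true)"
  fi
  case "$raw" in
    '' | *[!0-9]* | 0?*) echo 0 ;;
    *) echo "$raw" ;;
  esac
}

# Read-modify-write; must only be called while the lock is held.
bump_locked() {
  local n
  n="$(read_counter "$1")"
  echo "$((n + 1))" > "$1"
}

bump_counter() {
  local counter_file="$1"
  if command -v flock >/dev/null 2>&1; then
    (
      # Bounded wait on fd 9; on timeout skip the tick instead of bumping
      # unlocked. The kernel drops the lock when fd 9 closes.
      flock -w 2 9 || exit 0
      bump_locked "$counter_file"
    ) 9>"${counter_file}.lock"
  else
    (
      lock_dir="${counter_file}.lock.d"
      owned=0
      release() {
        if [ "$owned" -eq 1 ]; then
          owned=0
          rmdir "$lock_dir" 2>/dev/null || true
        fi
      }
      trap release EXIT                # normal completion
      trap 'release; exit 1' INT TERM  # release, then stop: never continue unowned
      tries=0
      until mkdir "$lock_dir" 2>/dev/null; do  # mkdir is atomic: one winner
        tries=$((tries + 1))
        if [ "$tries" -ge 50 ]; then exit 0; fi  # budget spent: skip the tick
        sleep 0.1  # fractional sleep is accepted by GNU and BSD sleep
      done
      owned=1
      bump_locked "$counter_file"
    )
  fi
}
```

Running the fallback in a subshell keeps the traps scoped to the lock holder,
so they do not replace any traps the calling hook may have installed.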

Add tests/hooks/observe-signal-counter-race.test.js:

- 20 concurrent observe.sh invocations must not lose increments (exact under
  flock; at most one dropped on the best-effort mkdir fallback).
- The runner rejects on any hook execution failure or hang.
- Content guards cover the lock and the corrupt-counter handling.

Fixes #2296
2026-06-29 18:43:23 -07:00