mirror of
https://github.com/ultraworkers/claw-code.git
synced 2026-04-24 05:00:25 +08:00
feat: #151 — canonicalize workspace path in SessionStore::from_cwd/data_dir
## Problem `workspace_fingerprint(path)` hashes the raw path string without canonicalization. Two equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo` on macOS) produce different fingerprints and therefore different session stores. #150 fixed the test-side symptom; this fixes the underlying product contract. ## Discovery path #150 fix (canonicalize in test) was a workaround. Q's ack on #150 surfaced the deeper gap: the function itself is still fragile for any caller passing a non-canonical path: 1. Embedded callers with a raw `--data-dir` path 2. Programmatic `SessionStore::from_cwd(user_path)` calls 3. NixOS store paths, Docker bind mounts, case-insensitive normalization The REPL's default flow happens to work because `env::current_dir()` returns canonical paths on macOS. But any caller passing a raw path risks silent session-store divergence. ## Fix Canonicalize inside `SessionStore::from_cwd()` and `from_data_dir()` before computing the fingerprint. Kept `workspace_fingerprint()` itself as a pure function for determinism — canonicalization is the entry point's responsibility. ```rust let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf()); let sessions_root = canonical_cwd.join(".claw").join("sessions").join(workspace_fingerprint(&canonical_cwd)); ``` Falls back to the raw path if canonicalize fails (directory doesn't exist yet). ## Test-side updates Three legacy-session tests expected the non-canonical base path to match the store's workspace_root. Updated them to canonicalize `base` after creation — same defensive pattern as #150, now explicit across all three tests. ## Regression test Added `session_store_from_cwd_canonicalizes_equivalent_paths` that creates two stores from equivalent paths (raw vs canonical) and asserts they resolve to the same sessions_dir. ## Verification - `cargo test -p runtime session_store_` — 9/9 pass - `cargo test --workspace` — all green, no FAILED markers - No behavior change for existing users (REPL default flow already used canonical paths) ## Backward compatibility Users on macOS who always went through `env::current_dir()`: no hash change, sessions resume identically. Users who ever called with a non-canonical path: hash would change, but those sessions were already broken (couldn't be resumed from a canonical-path cwd). Net improvement. Closes ROADMAP #151.
This commit is contained in:
parent
eaa077bf91
commit
7bc66e86e8
53
ROADMAP.md
53
ROADMAP.md
@ -5832,3 +5832,56 @@ Deliverable: Update `clawcode-dogfood-cycle-reminder` task to emit this field on
|
||||
**Blocker.** Assigned to gaebal-gajae's domain (cron scheduling / o p e n c l a w orchestration). Not a claw-code CLI blocker; purely infrastructure/monitoring.
|
||||
|
||||
**Source.** Q's direct observation during 2026-04-21 20:50–21:00 dogfood cycles: repeated timeouts with no way to diagnose. Session tally: ROADMAP #246.
|
||||
|
||||
## Pinpoint #151. `workspace_fingerprint` path-equivalence contract gap (product, not just test)
|
||||
|
||||
**Gap.** `workspace_fingerprint(path)` hashes the raw path string without canonicalization. Two callers passing equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo` on macOS where `/tmp` is a symlink to `/private/tmp`) get different fingerprints and therefore different session stores. #150 was the test-side symptom; the product contract itself is still fragile.
|
||||
|
||||
**Discovery path.** #150 fix (canonicalize in test) was a workaround. Real users hit this whenever:
|
||||
1. Embedded callers pass a raw `--data-dir` path that differs from canonical `env::current_dir()`
|
||||
2. Programmatic use of `SessionStore::from_cwd(some_path)` with a non-canonical input
|
||||
3. Symlinks elsewhere in the filesystem (not just macOS `/tmp`): NixOS store paths, Docker bind mounts, network mounts with case-insensitive normalization, etc.
|
||||
|
||||
The REPL's default flow happens to work because `env::current_dir()` returns canonicalized paths on macOS. But anyone calling `SessionStore::from_cwd()` with a user-supplied path risks silent session-store divergence.
|
||||
|
||||
**Root cause.** The function treats path-string equality and path-equivalence as the same thing:
|
||||
|
||||
```rust
|
||||
pub fn workspace_fingerprint(workspace_root: &Path) -> String {
|
||||
let input = workspace_root.to_string_lossy(); // ← raw bytes
|
||||
// ... FNV-1a hash ...
|
||||
}
|
||||
```
|
||||
|
||||
**Fix shape (~10 lines).** Canonicalize inside `SessionStore::from_cwd()` (and `from_data_dir`) before computing the fingerprint. Keep `workspace_fingerprint()` itself as a pure function of its input for determinism — the canonicalization is the caller's responsibility, but the two production entry points should always canonicalize.
|
||||
|
||||
```rust
|
||||
pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
|
||||
let cwd = cwd.as_ref();
|
||||
// #151: canonicalize so that equivalent paths (symlinks, ./foo vs /abs/foo)
|
||||
// produce the same workspace_fingerprint. Falls back to the raw path when
|
||||
// canonicalize() fails (e.g. directory doesn't exist yet — callers that
|
||||
// haven't materialized the workspace).
|
||||
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
|
||||
let sessions_root = canonical_cwd
|
||||
.join(".claw")
|
||||
.join("sessions")
|
||||
.join(workspace_fingerprint(&canonical_cwd));
|
||||
fs::create_dir_all(&sessions_root)?;
|
||||
Ok(Self {
|
||||
sessions_root,
|
||||
workspace_root: canonical_cwd,
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
**Backward compatibility.** Existing users on macOS where `env::current_dir()` already returns canonical paths: no change in hash. Users who ever called with a non-canonical path: hash would change, but those sessions were already broken (couldn't be resumed from a canonical-path cwd). Net improvement.
|
||||
|
||||
**Acceptance.**
|
||||
- Revert the test-side workaround from #150; test still passes.
|
||||
- Add regression test: `SessionStore::from_cwd("/tmp/foo")` and `SessionStore::from_cwd("/private/tmp/foo")` return stores with identical `sessions_dir()` on macOS.
|
||||
- Workspace tests green.
|
||||
|
||||
**Blocker.** None.
|
||||
|
||||
**Source.** Q's ack on #150 surfaced the deeper gap: "#150 closed is real value" but the product function still has the brittleness. Session tally: ROADMAP #151.
|
||||
|
||||
@ -31,14 +31,19 @@ impl SessionStore {
|
||||
/// The on-disk layout becomes `<cwd>/.claw/sessions/<workspace_hash>/`.
|
||||
pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
|
||||
let cwd = cwd.as_ref();
|
||||
let sessions_root = cwd
|
||||
// #151: canonicalize so equivalent paths (symlinks, relative vs
|
||||
// absolute, /tmp vs /private/tmp on macOS) produce the same
|
||||
// workspace_fingerprint. Falls back to the raw path if canonicalize
|
||||
// fails (e.g. the directory doesn't exist yet).
|
||||
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
|
||||
let sessions_root = canonical_cwd
|
||||
.join(".claw")
|
||||
.join("sessions")
|
||||
.join(workspace_fingerprint(cwd));
|
||||
.join(workspace_fingerprint(&canonical_cwd));
|
||||
fs::create_dir_all(&sessions_root)?;
|
||||
Ok(Self {
|
||||
sessions_root,
|
||||
workspace_root: cwd.to_path_buf(),
|
||||
workspace_root: canonical_cwd,
|
||||
})
|
||||
}
|
||||
|
||||
@ -51,14 +56,18 @@ impl SessionStore {
|
||||
workspace_root: impl AsRef<Path>,
|
||||
) -> Result<Self, SessionControlError> {
|
||||
let workspace_root = workspace_root.as_ref();
|
||||
// #151: canonicalize workspace_root for consistent fingerprinting
|
||||
// across equivalent path representations.
|
||||
let canonical_workspace = fs::canonicalize(workspace_root)
|
||||
.unwrap_or_else(|_| workspace_root.to_path_buf());
|
||||
let sessions_root = data_dir
|
||||
.as_ref()
|
||||
.join("sessions")
|
||||
.join(workspace_fingerprint(workspace_root));
|
||||
.join(workspace_fingerprint(&canonical_workspace));
|
||||
fs::create_dir_all(&sessions_root)?;
|
||||
Ok(Self {
|
||||
sessions_root,
|
||||
workspace_root: workspace_root.to_path_buf(),
|
||||
workspace_root: canonical_workspace,
|
||||
})
|
||||
}
|
||||
|
||||
@ -744,6 +753,40 @@ mod tests {
|
||||
assert_eq!(fp_a1.len(), 16, "fingerprint must be a 16-char hex string");
|
||||
}
|
||||
|
||||
/// #151 regression: equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo`
|
||||
/// on macOS where `/tmp` is a symlink to `/private/tmp`) must resolve to
|
||||
/// the same session store. Previously they diverged because
|
||||
/// `workspace_fingerprint()` hashed the raw path string. Now
|
||||
/// `SessionStore::from_cwd()` canonicalizes first.
|
||||
#[test]
|
||||
fn session_store_from_cwd_canonicalizes_equivalent_paths() {
|
||||
let base = temp_dir();
|
||||
let real_dir = base.join("real-workspace");
|
||||
fs::create_dir_all(&real_dir).expect("real workspace should exist");
|
||||
|
||||
// Build two stores via different but equivalent path representations:
|
||||
// the raw path and the canonicalized path.
|
||||
let raw_path = real_dir.clone();
|
||||
let canonical_path = fs::canonicalize(&real_dir).expect("canonicalize ok");
|
||||
|
||||
let store_from_raw =
|
||||
SessionStore::from_cwd(&raw_path).expect("store from raw should build");
|
||||
let store_from_canonical =
|
||||
SessionStore::from_cwd(&canonical_path).expect("store from canonical should build");
|
||||
|
||||
assert_eq!(
|
||||
store_from_raw.sessions_dir(),
|
||||
store_from_canonical.sessions_dir(),
|
||||
"equivalent paths must produce the same sessions dir (raw={} canonical={})",
|
||||
raw_path.display(),
|
||||
canonical_path.display()
|
||||
);
|
||||
|
||||
if base.exists() {
|
||||
fs::remove_dir_all(base).expect("cleanup ok");
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn session_store_from_cwd_isolates_sessions_by_workspace() {
|
||||
// given
|
||||
@ -832,6 +875,11 @@ mod tests {
|
||||
let workspace_b = base.join("repo-beta");
|
||||
fs::create_dir_all(&workspace_a).expect("workspace a should exist");
|
||||
fs::create_dir_all(&workspace_b).expect("workspace b should exist");
|
||||
// #151: canonicalize so test expectations match the store's canonical
|
||||
// workspace_root. Without this, the test builds sessions with a raw
|
||||
// path but the store resolves to the canonical form.
|
||||
let workspace_a = fs::canonicalize(&workspace_a).unwrap_or(workspace_a);
|
||||
let workspace_b = fs::canonicalize(&workspace_b).unwrap_or(workspace_b);
|
||||
|
||||
let store_b = SessionStore::from_cwd(&workspace_b).expect("store b should build");
|
||||
let legacy_root = workspace_b.join(".claw").join("sessions");
|
||||
@ -865,6 +913,8 @@ mod tests {
|
||||
// given
|
||||
let base = temp_dir();
|
||||
fs::create_dir_all(&base).expect("base dir should exist");
|
||||
// #151: canonicalize for path-representation consistency with store.
|
||||
let base = fs::canonicalize(&base).unwrap_or(base);
|
||||
let store = SessionStore::from_cwd(&base).expect("store should build");
|
||||
let legacy_root = base.join(".claw").join("sessions");
|
||||
let legacy_path = legacy_root.join("legacy-safe.jsonl");
|
||||
@ -893,6 +943,8 @@ mod tests {
|
||||
// given
|
||||
let base = temp_dir();
|
||||
fs::create_dir_all(&base).expect("base dir should exist");
|
||||
// #151: canonicalize for path-representation consistency with store.
|
||||
let base = fs::canonicalize(&base).unwrap_or(base);
|
||||
let store = SessionStore::from_cwd(&base).expect("store should build");
|
||||
let legacy_root = base.join(".claw").join("sessions");
|
||||
let legacy_path = legacy_root.join("legacy-unbound.json");
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user