mirror of
https://github.com/ultraworkers/claw-code.git
synced 2026-04-25 21:54:09 +08:00
feat: #151 — canonicalize workspace path in SessionStore::from_cwd/data_dir
## Problem

`workspace_fingerprint(path)` hashes the raw path string without canonicalization. Two equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo` on macOS) produce different fingerprints and therefore different session stores. #150 fixed the test-side symptom; this fixes the underlying product contract.

## Discovery path

The #150 fix (canonicalize in test) was a workaround. Q's ack on #150 surfaced the deeper gap: the function itself is still fragile for any caller passing a non-canonical path:

1. Embedded callers with a raw `--data-dir` path
2. Programmatic `SessionStore::from_cwd(user_path)` calls
3. NixOS store paths, Docker bind mounts, case-insensitive normalization

The REPL's default flow happens to work because `env::current_dir()` returns canonical paths on macOS. But any caller passing a raw path risks silent session-store divergence.

## Fix

Canonicalize inside `SessionStore::from_cwd()` and `from_data_dir()` before computing the fingerprint. Kept `workspace_fingerprint()` itself as a pure function for determinism; canonicalization is the entry point's responsibility.

```rust
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
let sessions_root = canonical_cwd
    .join(".claw")
    .join("sessions")
    .join(workspace_fingerprint(&canonical_cwd));
```

Falls back to the raw path if canonicalize fails (directory doesn't exist yet).

## Test-side updates

Three legacy-session tests expected the non-canonical base path to match the store's workspace_root. Updated them to canonicalize `base` after creation, the same defensive pattern as #150, now explicit across all three tests.

## Regression test

Added `session_store_from_cwd_canonicalizes_equivalent_paths`, which creates two stores from equivalent paths (raw vs canonical) and asserts they resolve to the same sessions_dir.

## Verification

- `cargo test -p runtime session_store_`: 9/9 pass
- `cargo test --workspace`: all green, no FAILED markers
- No behavior change for existing users (REPL default flow already used canonical paths)

## Backward compatibility

Users on macOS who always went through `env::current_dir()`: no hash change, sessions resume identically. Users who ever called with a non-canonical path: hash would change, but those sessions were already broken (couldn't be resumed from a canonical-path cwd). Net improvement.

Closes ROADMAP #151.
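As an illustration of the fallback described under Fix, the expression can be pulled into a standalone helper. This is a minimal sketch rather than code from the commit: the name `canonicalize_or_raw` is hypothetical, and the only real API used is `std::fs::canonicalize`.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper (the commit inlines this expression instead).
/// Resolves symlinks and `.`/`..` components when the path exists;
/// otherwise returns the path unchanged, e.g. when the directory has
/// not been created yet.
pub fn canonicalize_or_raw(path: &Path) -> PathBuf {
    fs::canonicalize(path).unwrap_or_else(|_| path.to_path_buf())
}
```

One consequence of the fallback is worth keeping in mind. If the first call happens before the directory exists and the path goes through a symlink, that call hashes the raw path while later calls hash the canonical one, so the two fingerprints can still differ.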
parent eaa077bf91
commit 7bc66e86e8
53
ROADMAP.md
@@ -5832,3 +5832,56 @@ Deliverable: Update `clawcode-dogfood-cycle-reminder` task to emit this field on
 **Blocker.** Assigned to gaebal-gajae's domain (cron scheduling / openclaw orchestration). Not a claw-code CLI blocker; purely infrastructure/monitoring.
 
 **Source.** Q's direct observation during 2026-04-21 20:50–21:00 dogfood cycles: repeated timeouts with no way to diagnose. Session tally: ROADMAP #246.
+
+## Pinpoint #151. `workspace_fingerprint` path-equivalence contract gap (product, not just test)
+
+**Gap.** `workspace_fingerprint(path)` hashes the raw path string without canonicalization. Two callers passing equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo` on macOS where `/tmp` is a symlink to `/private/tmp`) get different fingerprints and therefore different session stores. #150 was the test-side symptom; the product contract itself is still fragile.
+
+**Discovery path.** #150 fix (canonicalize in test) was a workaround. Real users hit this whenever:
+1. Embedded callers pass a raw `--data-dir` path that differs from canonical `env::current_dir()`
+2. Programmatic use of `SessionStore::from_cwd(some_path)` with a non-canonical input
+3. Symlinks elsewhere in the filesystem (not just macOS `/tmp`): NixOS store paths, Docker bind mounts, network mounts with case-insensitive normalization, etc.
+
+The REPL's default flow happens to work because `env::current_dir()` returns canonicalized paths on macOS. But anyone calling `SessionStore::from_cwd()` with a user-supplied path risks silent session-store divergence.
+
+**Root cause.** The function treats path-string equality and path-equivalence as the same thing:
+
+```rust
+pub fn workspace_fingerprint(workspace_root: &Path) -> String {
+    let input = workspace_root.to_string_lossy();  // ← raw bytes
+    // ... FNV-1a hash ...
+}
+```
+
+**Fix shape (~10 lines).** Canonicalize inside `SessionStore::from_cwd()` (and `from_data_dir`) before computing the fingerprint. Keep `workspace_fingerprint()` itself as a pure function of its input for determinism; the canonicalization is the caller's responsibility, but the two production entry points should always canonicalize.
+
+```rust
+pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
+    let cwd = cwd.as_ref();
+    // #151: canonicalize so that equivalent paths (symlinks, ./foo vs /abs/foo)
+    // produce the same workspace_fingerprint. Falls back to the raw path when
+    // canonicalize() fails (e.g. directory doesn't exist yet, as with callers
+    // that haven't materialized the workspace).
+    let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
+    let sessions_root = canonical_cwd
+        .join(".claw")
+        .join("sessions")
+        .join(workspace_fingerprint(&canonical_cwd));
+    fs::create_dir_all(&sessions_root)?;
+    Ok(Self {
+        sessions_root,
+        workspace_root: canonical_cwd,
+    })
+}
+```
+
+**Backward compatibility.** Existing users on macOS where `env::current_dir()` already returns canonical paths: no change in hash. Users who ever called with a non-canonical path: hash would change, but those sessions were already broken (couldn't be resumed from a canonical-path cwd). Net improvement.
+
+**Acceptance.**
+- Revert the test-side workaround from #150; test still passes.
+- Add regression test: `SessionStore::from_cwd("/tmp/foo")` and `SessionStore::from_cwd("/private/tmp/foo")` return stores with identical `sessions_dir()` on macOS.
+- Workspace tests green.
+
+**Blocker.** None.
+
+**Source.** Q's ack on #150 surfaced the deeper gap: "#150 closed is real value" but the product function still has the brittleness. Session tally: ROADMAP #151.
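To make the Gap in the roadmap entry concrete, here is a minimal sketch of a 64-bit FNV-1a fingerprint over the raw path string. It rests on assumptions: the roadmap snippet elides the hash body, so the use of the standard FNV-1a constants and zero-padded hex formatting is a guess, and the name `fingerprint_sketch` is hypothetical. The existing tests confirm only the 16-character hex length.

```rust
use std::path::Path;

/// Sketch of a raw-string fingerprint using 64-bit FNV-1a.
/// Assumption: the real `workspace_fingerprint` may differ in constants
/// or formatting; this only illustrates hashing the path text as-is.
pub fn fingerprint_sketch(workspace_root: &Path) -> String {
    // Standard 64-bit FNV-1a parameters.
    const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

    // The path is hashed exactly as spelled; no symlink resolution happens.
    let input = workspace_root.to_string_lossy();
    let mut hash = FNV_OFFSET_BASIS;
    for byte in input.as_bytes() {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    // Zero-padded lowercase hex, always 16 characters for a u64.
    format!("{:016x}", hash)
}
```

Under this sketch, `/tmp/foo` and `/private/tmp/foo` hash to different 16-character strings even though they name the same directory on macOS. That is the divergence the entry points now avoid by canonicalizing before hashing.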
@@ -31,14 +31,19 @@ impl SessionStore {
     /// The on-disk layout becomes `<cwd>/.claw/sessions/<workspace_hash>/`.
     pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
         let cwd = cwd.as_ref();
-        let sessions_root = cwd
+        // #151: canonicalize so equivalent paths (symlinks, relative vs
+        // absolute, /tmp vs /private/tmp on macOS) produce the same
+        // workspace_fingerprint. Falls back to the raw path if canonicalize
+        // fails (e.g. the directory doesn't exist yet).
+        let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
+        let sessions_root = canonical_cwd
             .join(".claw")
             .join("sessions")
-            .join(workspace_fingerprint(cwd));
+            .join(workspace_fingerprint(&canonical_cwd));
         fs::create_dir_all(&sessions_root)?;
         Ok(Self {
             sessions_root,
-            workspace_root: cwd.to_path_buf(),
+            workspace_root: canonical_cwd,
         })
     }
 
@@ -51,14 +56,18 @@ impl SessionStore {
         workspace_root: impl AsRef<Path>,
     ) -> Result<Self, SessionControlError> {
         let workspace_root = workspace_root.as_ref();
+        // #151: canonicalize workspace_root for consistent fingerprinting
+        // across equivalent path representations.
+        let canonical_workspace = fs::canonicalize(workspace_root)
+            .unwrap_or_else(|_| workspace_root.to_path_buf());
         let sessions_root = data_dir
             .as_ref()
             .join("sessions")
-            .join(workspace_fingerprint(workspace_root));
+            .join(workspace_fingerprint(&canonical_workspace));
         fs::create_dir_all(&sessions_root)?;
         Ok(Self {
             sessions_root,
-            workspace_root: workspace_root.to_path_buf(),
+            workspace_root: canonical_workspace,
         })
     }
 
@@ -744,6 +753,40 @@ mod tests {
         assert_eq!(fp_a1.len(), 16, "fingerprint must be a 16-char hex string");
     }
 
+    /// #151 regression: equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo`
+    /// on macOS where `/tmp` is a symlink to `/private/tmp`) must resolve to
+    /// the same session store. Previously they diverged because
+    /// `workspace_fingerprint()` hashed the raw path string. Now
+    /// `SessionStore::from_cwd()` canonicalizes first.
+    #[test]
+    fn session_store_from_cwd_canonicalizes_equivalent_paths() {
+        let base = temp_dir();
+        let real_dir = base.join("real-workspace");
+        fs::create_dir_all(&real_dir).expect("real workspace should exist");
+
+        // Build two stores via different but equivalent path representations:
+        // the raw path and the canonicalized path.
+        let raw_path = real_dir.clone();
+        let canonical_path = fs::canonicalize(&real_dir).expect("canonicalize ok");
+
+        let store_from_raw =
+            SessionStore::from_cwd(&raw_path).expect("store from raw should build");
+        let store_from_canonical =
+            SessionStore::from_cwd(&canonical_path).expect("store from canonical should build");
+
+        assert_eq!(
+            store_from_raw.sessions_dir(),
+            store_from_canonical.sessions_dir(),
+            "equivalent paths must produce the same sessions dir (raw={} canonical={})",
+            raw_path.display(),
+            canonical_path.display()
+        );
+
+        if base.exists() {
+            fs::remove_dir_all(base).expect("cleanup ok");
+        }
+    }
+
     #[test]
     fn session_store_from_cwd_isolates_sessions_by_workspace() {
         // given
@@ -832,6 +875,11 @@ mod tests {
         let workspace_b = base.join("repo-beta");
         fs::create_dir_all(&workspace_a).expect("workspace a should exist");
         fs::create_dir_all(&workspace_b).expect("workspace b should exist");
+        // #151: canonicalize so test expectations match the store's canonical
+        // workspace_root. Without this, the test builds sessions with a raw
+        // path but the store resolves to the canonical form.
+        let workspace_a = fs::canonicalize(&workspace_a).unwrap_or(workspace_a);
+        let workspace_b = fs::canonicalize(&workspace_b).unwrap_or(workspace_b);
 
         let store_b = SessionStore::from_cwd(&workspace_b).expect("store b should build");
         let legacy_root = workspace_b.join(".claw").join("sessions");
@@ -865,6 +913,8 @@ mod tests {
         // given
         let base = temp_dir();
         fs::create_dir_all(&base).expect("base dir should exist");
+        // #151: canonicalize for path-representation consistency with store.
+        let base = fs::canonicalize(&base).unwrap_or(base);
         let store = SessionStore::from_cwd(&base).expect("store should build");
         let legacy_root = base.join(".claw").join("sessions");
         let legacy_path = legacy_root.join("legacy-safe.jsonl");
@@ -893,6 +943,8 @@ mod tests {
         // given
         let base = temp_dir();
         fs::create_dir_all(&base).expect("base dir should exist");
+        // #151: canonicalize for path-representation consistency with store.
+        let base = fs::canonicalize(&base).unwrap_or(base);
         let store = SessionStore::from_cwd(&base).expect("store should build");
         let legacy_root = base.join(".claw").join("sessions");
         let legacy_path = legacy_root.join("legacy-unbound.json");
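The regression test added in this commit compares a raw temp path against its canonical form. On a system where the temp directory is already canonical, those two paths may be identical, so the check may not exercise a real alias. The sketch below is one way to force a distinct spelling with an explicit symlink. It is an illustration under assumptions and not part of the commit: `resolve_workspace` and `symlink_alias_resolves_to_same_root` are hypothetical names, and the example is Unix-only because it uses `std::os::unix::fs::symlink`.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Stand-in for the canonicalize-with-fallback expression used in `from_cwd`.
fn resolve_workspace(path: &Path) -> PathBuf {
    fs::canonicalize(path).unwrap_or_else(|_| path.to_path_buf())
}

/// Creates `<base>/real` plus a symlink `<base>/alias` pointing at it, then
/// reports whether both spellings resolve to the same workspace root.
/// Removes `<base>` before returning on success.
#[cfg(unix)]
pub fn symlink_alias_resolves_to_same_root(base: &Path) -> std::io::Result<bool> {
    let real = base.join("real");
    let alias = base.join("alias");
    fs::create_dir_all(&real)?;
    std::os::unix::fs::symlink(&real, &alias)?;

    // The two inputs differ as strings but name the same directory.
    let same = resolve_workspace(&real) == resolve_workspace(&alias);

    fs::remove_dir_all(base)?;
    Ok(same)
}
```

A test built this way would fail against the pre-#151 behavior on any Unix platform, because `real` and `alias` are never the same string.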