37 Commits

Author SHA1 Message Date
YeonGyu-Kim
37d3086658 test(atlas): reset session state instead of module mocking 2026-02-14 22:06:34 +09:00
YeonGyu-Kim
fd99a29d6e feat(atlas): add notepad reading step to boulder verification reminders
Instructs the orchestrator to read subagent notepad files
(.sisyphus/notepads/{planName}/) after task completion, ensuring
learnings, issues, and problems are propagated to subsequent delegations.
2026-02-11 13:52:20 +09:00
YeonGyu-Kim
d60697bb13 fix: guard session_ids with optional chaining to prevent crash
boulderState?.session_ids.includes() only guards boulderState, not
session_ids. If boulder.json is corrupted or missing the field,
session_ids is undefined and .includes() crashes silently, losing
subagent results.

Changes:
- readBoulderState: validate parsed JSON is object, default session_ids to []
- atlas hook line 427: boulderState?.session_ids?.includes
- atlas hook line 655: boulderState?.session_ids?.includes
- prometheus-md-only line 93: boulderState?.session_ids?.includes
- appendSessionId: guard with ?. and initialize to [] if missing

Fixes #1672
2026-02-11 13:27:18 +09:00
YeonGyu-Kim
fba916db60 fix(atlas): await injectBoulderContinuation and handle errors
The async call was fire-and-forget with no error handling. Now properly
awaited with try/catch that logs failures and increments promptFailureCount.
2026-02-11 00:45:51 +09:00
YeonGyu-Kim
6694082a7e fix(atlas): correct plan path from .sisyphus/tasks/*.yaml to .sisyphus/plans/*.md
The verification reminder template was pointing at the wrong directory;
actual plan files are stored under .sisyphus/plans/ as markdown.
2026-02-11 00:45:51 +09:00
YeonGyu-Kim
257eb9277b fix(atlas): restrict boulder continuation to sessions in boulder session_ids
Main session was unconditionally allowed through the boulder session guard,
causing continuation injection into sessions not part of the active boulder.
Now only sessions explicitly in boulder's session_ids (or background tasks)
receive boulder continuation, matching todo-continuation-enforcer behavior.
2026-02-10 22:15:28 +09:00
YeonGyu-Kim
2b87719c83 docs: document intentional design decisions in atlas, todo-continuation, and delegation hooks 2026-02-10 22:00:54 +09:00
YeonGyu-Kim
45dfc4ec66 feat(atlas): enforce mandatory manual code review and direct boulder state checks
- VERIFICATION_REMINDER: add Step 2 manual code review (non-negotiable)
  - Require Read of EVERY changed file line by line
  - Cross-check subagent claims vs actual code
  - Verify logic correctness, completeness, edge cases, patterns
- Add Step 5: direct boulder state check via Read plan file
  - Count remaining tasks directly, no cached state
- BOULDER_CONTINUATION_PROMPT: add first rule to read plan file immediately
- verification-reminders.ts: restructure steps 5-8 for boulder/todo checks
- Atlas default.ts (Claude): enhance 3.4 QA with A/B/C/D sections
  - A: Automated verification
  - B: Manual code review (non-negotiable)
  - C: Hands-on QA (if applicable)
  - D: Check boulder state directly
- Atlas gpt.ts (GPT-5.2): apply same QA enhancements with GPT-optimized structure
- verification_rules: update both Claude and GPT versions with manual review requirements

Addresses issue where Atlas would skip manual code inspection after delegation,
leading to rubber-stamping of broken or incomplete work.
2026-02-10 22:00:54 +09:00
YeonGyu-Kim
44675fb57f fix(atlas): allow boulder continuation for Sisyphus sessions
When boulderState.agent is not explicitly set (defaults to 'atlas'),
allow continuation for sessions where the last agent is 'sisyphus'.
This fixes the issue where boulder continuation was skipped when
Sisyphus took over the conversation after boulder creation.
2026-02-10 11:41:44 +09:00
YeonGyu-Kim
f3f6ba47fe merge: integrate origin/dev into modular-enforcement branch
Resolves all merge conflicts, preserving our split module structure
while integrating all dev changes:
- Custom agent summaries support (parseRegisteredAgentSummaries)
- Background notification queue (enqueueNotificationForParent)
- Atlas shared git-worktree module (collectGitDiffStats, formatFileChanges)
- Ralph-loop withTimeout + DEFAULT_API_TIMEOUT=5000
- Session recovery assistant_prefill_unsupported error type
- Atlas agentOverrides forwarding
- Config handler plan model demotion (buildPlanDemoteConfig)
- Delegate-task agentOverrides, promptSyncWithModelSuggestionRetry, variant
- LSP init timeout + stale init detection
- isPlanFamily function + task-continuation-enforcer hook
- Handoff command
2026-02-08 17:34:47 +09:00
YeonGyu-Kim
c7122b4127 fix: resolve all test failures and Cubic review issues
- Fix unstable-agent-babysitter: add promptAsync to test mock
- Fix claude-code-mcp-loader: isolate tests from user home configs
- Fix npm-dist-tags: encode packageName for scoped packages
- Fix agent-builder: clone source to prevent shared object mutation
- Fix add-plugin-to-opencode-config: handle JSONC with leading comments
- Fix auth-plugins/add-provider-config: error on parse failures
- Fix bun-install: clear timeout on completion
- Fix git-diff-stats: include untracked files in diff summary
2026-02-08 15:31:32 +09:00
YeonGyu-Kim
119e18c810 refactor: wave 2 - split atlas, auto-update-checker, session-recovery, todo-enforcer, background-task hooks
- Extract atlas/ into 15 focused modules (hook, event handler, tool policies, types, etc.)
- Split auto-update-checker into checker/ and hook/ subdirectories with single-purpose files
- Decompose session-recovery into separate recovery strategy files per error type
- Extract todo-continuation-enforcer from monolith to directory with dedicated modules
- Split background-task/tools.ts into individual tool creator files
- Extract command-executor, tmux-utils into focused sub-modules
- Split config/schema.ts into domain-specific schema files
- Decompose cli/config-manager.ts into focused modules
- Rollback skill-mcp-manager, model-availability, index.ts splits that broke tests
- Fix all import path depths for moved files (../../ -> ../../../)
- Add explicit type annotations to resolve TS7006 implicit any errors

Typecheck: 0 errors
Tests: 2359 pass, 5 fail (all pre-existing)
2026-02-08 15:01:42 +09:00
YeonGyu-Kim
6ce482668b refactor: extract git worktree parser from atlas hook 2026-02-08 13:30:00 +09:00
Peïo Thibault
414cecd7df test: add promptAsync mocks to all test files for promptAsync migration 2026-02-07 14:41:46 +01:00
Peïo Thibault
46e02b9457 fix(hooks): switch session.prompt to promptAsync in all hooks 2026-02-07 13:42:24 +01:00
YeonGyu-Kim
f980e256dd fix: boulder continuation now respects /stop-continuation guard
Add isContinuationStopped check to atlas hook's session.idle handler
so boulder continuation stops when user runs /stop-continuation.

Previously, todo continuation and session recovery checked the guard,
but boulder continuation did not — causing work to resume after stop.

Fixes #1575
2026-02-07 18:50:13 +09:00
YeonGyu-Kim
a691a3ac0a refactor: migrate delegate_task to task tool with metadata fixes
- Rename delegate_task tool to task across codebase (100 files)
- Update model references: claude-opus-4-6 → 4-5, gpt-5.3-codex → 5.2-codex
- Add tool-metadata-store to restore metadata overwritten by fromPlugin()
- Add session ID polling for BackgroundManager task sessions
- Await async ctx.metadata() calls in tool executors
- Add ses_ prefix guard to getMessageDir for performance
- Harden BackgroundManager with idle deferral and error handling
- Fix duplicate task key in sisyphus-junior test object literals
- Fix unawaited showOutputToUser in ast_grep_replace
- Fix background=true → run_in_background=true in ultrawork prompt
- Fix duplicate task/task references in docs and comments
2026-02-06 21:35:30 +09:00
YeonGyu-Kim
aec5624122 fix(atlas): stop continuation retry loop on repeated prompt failures 2026-02-06 17:34:14 +09:00
YeonGyu-Kim
1f64920453 chore: update claude-opus-4-5 references to claude-opus-4-6 (excludes antigravity models) 2026-02-06 15:09:07 +09:00
Rishi Vhavle
169ccb6b05 fix: use boulder agent instead of hardcoded Atlas check for continuation
Address code review: continuation was blocked unless last agent was Atlas,
making the new agent parameter ineffective. Now the idle handler checks if
the last session agent matches boulderState.agent (defaults to 'atlas'),
allowing non-Atlas agents to resume when properly configured.

- Add getLastAgentFromSession helper for agent lookup
- Replace isCallerOrchestrator gate with boulder-agent-aware check
- Add test for non-Atlas agent continuation scenario
2026-02-04 21:21:57 +05:30
Rishi Vhavle
d8137c0c90 fix: track agent in boulder state to fix session continuation (fixes #927)
Add 'agent' field to BoulderState to track which agent (atlas) should
resume on session continuation. Previously, when user typed 'continue'
after interruption, Prometheus (planner) resumed instead of Sisyphus
(executor), causing all delegate_task calls to get READ-ONLY mode.

Changes:
- Add optional 'agent' field to BoulderState interface
- Update createBoulderState() to accept agent parameter
- Set agent='atlas' when /start-work creates boulder.json
- Use stored agent on boulder continuation (defaults to 'atlas')
- Add tests for new agent field functionality
2026-02-04 21:21:57 +05:30
YeonGyu-Kim
7226836472 atlas reminder reinforce 2026-02-03 10:31:34 +09:00
justsisyphus
7f9fcc708f fix(tests): properly stub notifyParentSession and fix timer-based tests
- Add stubNotifyParentSession implementation to stub manager's notifyParentSession method
- Add stubNotifyParentSession calls to checkAndInterruptStaleTasks tests
- Add messages mock to client mocks for completeness
- Fix timer-based tests by using real timers (fakeTimers.restore) with wait()
- Increase timeout for tests that need real time delays
2026-02-01 18:33:06 +09:00
YeonGyu-Kim
f146aeff0f
refactor: major codebase cleanup - BDD comments, file splitting, bug fixes (#1350)
* style(tests): normalize BDD comments from '// #given' to '// given'

- Replace 4,668 Python-style BDD comments across 107 test files
- Patterns changed: // #given -> // given, // #when -> // when, // #then -> // then
- Also handles no-space variants: //#given -> // given

* fix(rules-injector): prefer output.metadata.filePath over output.title

- Extract file path resolution to dedicated output-path.ts module
- Prefer metadata.filePath which contains actual file path
- Fall back to output.title only when metadata unavailable
- Fixes issue where rules weren't injected when tool output title was a label

* feat(slashcommand): add optional user_message parameter

- Add user_message optional parameter for command arguments
- Model can now call: command='publish' user_message='patch'
- Improves error messages with clearer format guidance
- Helps LLMs understand correct parameter usage

* feat(hooks): restore compaction-context-injector hook

- Restore hook deleted in cbbc7bd0 for session compaction context
- Injects 7 mandatory sections: User Requests, Final Goal, Work Completed,
  Remaining Tasks, Active Working Context, MUST NOT Do, Agent Verification State
- Re-register in hooks/index.ts and main plugin entry

* refactor(background-agent): split manager.ts into focused modules

- Extract constants.ts for TTL values and internal types (52 lines)
- Extract state.ts for TaskStateManager class (204 lines)
- Extract spawner.ts for task creation logic (244 lines)
- Extract result-handler.ts for completion handling (265 lines)
- Reduce manager.ts from 1377 to 755 lines (45% reduction)
- Maintain backward compatible exports

* refactor(agents): split prometheus-prompt.ts into subdirectory

- Move 1196-line prometheus-prompt.ts to prometheus/ subdirectory
- Organize prompt sections into separate files for maintainability
- Update agents/index.ts exports

* refactor(delegate-task): split tools.ts into focused modules

- Extract categories.ts for category definitions and routing
- Extract executor.ts for task execution logic
- Extract helpers.ts for utility functions
- Extract prompt-builder.ts for prompt construction
- Reduce tools.ts complexity with cleaner separation of concerns

* refactor(builtin-skills): split skills.ts into individual skill files

- Move each skill to dedicated file in skills/ subdirectory
- Create barrel export for backward compatibility
- Improve maintainability with focused skill modules

* chore: update import paths and lockfile

- Update prometheus import path after refactor
- Update bun.lock

* fix(tests): complete BDD comment normalization

- Fix remaining #when/#then patterns missed by initial sed
- Affected: state.test.ts, events.test.ts

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-02-01 16:47:50 +09:00
Sisyphus
8f6ed5b20f
fix(hooks): add null guard for tool.execute.after output (#1054)
/review command and some Claude Code built-in commands trigger
tool.execute.after hooks with undefined output, causing crashes
when accessing output.metadata or output.output.

Fixes #1035

Co-authored-by: sisyphus-dev-ai <sisyphus-dev-ai@users.noreply.github.com>
2026-01-28 16:26:40 +09:00
justsisyphus
9b59ef66e4 test: fix flaky tests caused by mock.module pollution across parallel test files 2026-01-28 00:54:20 +09:00
justsisyphus
5c7eb02d5b
chore(test): sync agent name casing in tests (#1128)
Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 12:10:30 +09:00
YeonGyu-Kim
0aa8f486af
feat(hooks): add sisyphus-junior-notepad hook for conditional notepad rules injection (#1092)
* refactor(shared): extract isCallerOrchestrator to session-utils

* refactor(atlas): use shared isCallerOrchestrator, change to prepend

* refactor(prometheus-md-only): change to prepend pattern

* refactor(sisyphus-junior): remove Work_Context (moved to hook)

* feat(hooks): add sisyphus-junior-notepad hook

* fix(shared): replace dynamic require with static import in session-utils

- Change from dynamic require to static import for better bundler compatibility
- Fix import path: ../../features -> ../features
- Add barrel export to src/shared/index.ts

* feat(hooks): register sisyphus-junior-notepad hook

- Add to HookNameSchema in schema.ts
- Export from hooks/index.ts
- Register with isHookEnabled in index.ts
- Auto-generated schema.json update

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-25 14:52:11 +09:00
justsisyphus
14f450bd25 refactor: sync delegate_task schema with OpenCode Task tool (resume→session_id, add command param) 2026-01-25 13:57:45 +09:00
justsisyphus
7ed7bf5c66 fix(agents): use lowercase agent names in API calls
- atlas/index.ts: agent: 'Atlas' -> 'atlas'
- start-work/index.ts: updateSessionAgent(..., 'Atlas') -> 'atlas'
- builtin-commands/commands.ts: agent: 'Atlas' -> 'atlas'
- Updated tests to match lowercase convention
2026-01-24 02:39:12 +09:00
justsisyphus
c2247aec60 refactor(agents): add prometheus agent and normalize agent key lookups
- Add 'prometheus' to BuiltinAgentNameSchema enum
- Update delegate_task parameter names in documentation (agent → subagent_type, background → run_in_background)
- Make agent name comparison case-insensitive in Atlas hook
- Implement case-insensitive agent config lookup in shared utilities
- Relax type signature for disabled agents parameter

🤖 Generated with assistance of OhMyOpenCode
2026-01-24 02:00:17 +09:00
justsisyphus
599fad0e86 fix(atlas): capture stderr from git commands to prevent help text leak
When git commands fail (e.g., not in a repo, invalid HEAD), git outputs
help text to stderr. Without explicit stdio option, execSync inherits
the parent's stdio causing help text to appear on terminal during
delegate_task execution.

Add stdio: ['pipe', 'pipe', 'pipe'] to capture stderr instead of
letting it leak to terminal.
2026-01-23 15:42:35 +09:00
justsisyphus
bdbc8d73cb fix(hooks): minor fixes across multiple hooks
Apply consistent naming and small improvements to atlas, background-compaction, claude-code-hooks, and non-interactive-env hooks.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-22 22:47:31 +09:00
justsisyphus
d419bc302c refactor: rename atlas agent to Atlas for naming consistency
- Rename agent name from 'atlas' to 'Atlas' (PascalCase like Sisyphus, Metis, Momus)
- Add migration for lowercase 'atlas' -> 'Atlas' for backward compatibility
- Keep hook name as 'atlas' (lowercase) to match other hook naming conventions
- Update all references in types, schema, hooks, commands, and tests
2026-01-21 11:19:00 +09:00
justsisyphus
c46d57f3aa refactor(hooks): rename sisyphus-orchestrator to atlas
- Rename SisyphusOrchestratorHookOptions → AtlasHookOptions
- Rename createSisyphusOrchestratorHook → createAtlasHook
- Update hook name in schema: sisyphus-orchestrator → atlas
- Update all documentation references
- Rename image: orchestrator-sisyphus.png → orchestrator-atlas.png
2026-01-20 16:52:20 +09:00
justsisyphus
52d9b30035 refactor(plugin): update atlas references
Update plugin handlers, commands, and integration points to use 'atlas' agent name. Start-work command and config handler now reference 'atlas'.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-20 15:55:09 +09:00
justsisyphus
c4b862cbc4 refactor(hooks): rename sisyphus-orchestrator to atlas
Hook name is now 'atlas' (not 'atlas-orchestrator'). Directory renamed to src/hooks/atlas/. All references updated.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-20 15:52:13 +09:00