39 Commits

Author SHA1 Message Date
YeonGyu-Kim
b954afca90
fix(model-requirements): use supported variant for gemini-3-pro (#1463)
* fix(model-requirements): use supported variant for gemini-3-pro

* fix(delegate-task): update artistry variant to high for gemini-3-pro

- Update DEFAULT_CATEGORIES artistry variant from 'max' to 'high'
- Update related test comment
- gemini-3-pro only supports low/high thinking levels, not max
- Addresses Oracle review feedback
2026-02-04 11:26:17 +09:00
YeonGyu-Kim
8441f70c2b
fix(delegate-task): honor sisyphus-junior model override precedence (#1404) 2026-02-03 10:41:26 +09:00
YeonGyu-Kim
6080bc8caf refactor(delegate-task): improve session title format and add task_metadata block
- Change session title from 'Task: {desc}' to '{desc} (@{agent} subagent)'
- Move session_id to structured <task_metadata> block for better parsing
- Add category tracking to BackgroundTask type and LaunchInput
- Add tests for new title format and metadata block
2026-02-01 19:44:22 +09:00
YeonGyu-Kim
64825158a7
feat(agents): add Hephaestus - autonomous deep worker agent (#1287)
* refactor(keyword-detector): split constants into domain-specific modules

* feat(shared): add requiresAnyModel and isAnyFallbackModelAvailable

* feat(config): add hephaestus to agent schemas

* feat(agents): add Hephaestus autonomous deep worker

* feat(cli): update model-fallback for hephaestus support

* feat(plugin): add hephaestus to config handler with ordering

* test(delegate-task): update tests for hephaestus agent

* docs: update AGENTS.md files for hephaestus

* docs: add hephaestus to READMEs

* chore: regenerate config schema

* fix(delegate-task): bypass requiresModel check when user provides explicit config

* docs(hephaestus): add 4-part context structure for explore/librarian prompts

* docs: fix review comments from cubic (non-breaking changes)

- Move Hephaestus from Primary Agents to Subagents (uses own fallback chain)
- Fix Hephaestus fallback chain documentation (claude-opus-4-5 → gemini-3-pro)
- Add settings.local.json to claude-code-hooks config sources
- Fix delegate_task parameters in ultrawork prompt (agent→subagent_type, background→run_in_background, add load_skills)
- Update line counts in AGENTS.md (index.ts: 788, manager.ts: 1440)

* docs: fix additional documentation inconsistencies from oracle review

- Fix delegate_task parameters in Background Agents example (docs/features.md)
- Fix Hephaestus fallback chain in root AGENTS.md to match model-requirements.ts

* docs: clarify Hephaestus has no fallback (requires gpt-5.2-codex only)

Hephaestus uses requiresModel constraint - it only activates when gpt-5.2-codex
is available. The fallback chain in code is unreachable, so documentation
should not mention fallbacks.

* fix(hephaestus): remove unreachable fallback chain entries

Hephaestus has requiresModel: gpt-5.2-codex which means the agent only
activates when that specific model is available. The fallback entries
(claude-opus-4-5, gemini-3-pro) were unreachable and misleading.

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-02-01 19:26:57 +09:00
YeonGyu-Kim
c73314f643 feat(skill-mcp-manager): enhance manager with improved test coverage
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-01 19:01:40 +09:00
justsisyphus
ab54e6ccdc chore: treat minimax as unstable model requiring background monitoring 2026-02-01 17:20:01 +09:00
YeonGyu-Kim
f146aeff0f
refactor: major codebase cleanup - BDD comments, file splitting, bug fixes (#1350)
* style(tests): normalize BDD comments from '// #given' to '// given'

- Replace 4,668 Python-style BDD comments across 107 test files
- Patterns changed: // #given -> // given, // #when -> // when, // #then -> // then
- Also handles no-space variants: //#given -> // given

* fix(rules-injector): prefer output.metadata.filePath over output.title

- Extract file path resolution to dedicated output-path.ts module
- Prefer metadata.filePath which contains actual file path
- Fall back to output.title only when metadata unavailable
- Fixes issue where rules weren't injected when tool output title was a label

* feat(slashcommand): add optional user_message parameter

- Add user_message optional parameter for command arguments
- Model can now call: command='publish' user_message='patch'
- Improves error messages with clearer format guidance
- Helps LLMs understand correct parameter usage

* feat(hooks): restore compaction-context-injector hook

- Restore hook deleted in cbbc7bd0 for session compaction context
- Injects 7 mandatory sections: User Requests, Final Goal, Work Completed,
  Remaining Tasks, Active Working Context, MUST NOT Do, Agent Verification State
- Re-register in hooks/index.ts and main plugin entry

* refactor(background-agent): split manager.ts into focused modules

- Extract constants.ts for TTL values and internal types (52 lines)
- Extract state.ts for TaskStateManager class (204 lines)
- Extract spawner.ts for task creation logic (244 lines)
- Extract result-handler.ts for completion handling (265 lines)
- Reduce manager.ts from 1377 to 755 lines (45% reduction)
- Maintain backward compatible exports

* refactor(agents): split prometheus-prompt.ts into subdirectory

- Move 1196-line prometheus-prompt.ts to prometheus/ subdirectory
- Organize prompt sections into separate files for maintainability
- Update agents/index.ts exports

* refactor(delegate-task): split tools.ts into focused modules

- Extract categories.ts for category definitions and routing
- Extract executor.ts for task execution logic
- Extract helpers.ts for utility functions
- Extract prompt-builder.ts for prompt construction
- Reduce tools.ts complexity with cleaner separation of concerns

* refactor(builtin-skills): split skills.ts into individual skill files

- Move each skill to dedicated file in skills/ subdirectory
- Create barrel export for backward compatibility
- Improve maintainability with focused skill modules

* chore: update import paths and lockfile

- Update prometheus import path after refactor
- Update bun.lock

* fix(tests): complete BDD comment normalization

- Fix remaining #when/#then patterns missed by initial sed
- Affected: state.test.ts, events.test.ts

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-02-01 16:47:50 +09:00
YeonGyu-Kim
8c2625cfb0
🏆 test: optimize test suite with FakeTimers and race condition fixes (#1284)
* fix: exclude prompt/permission from plan agent config

plan agent should only inherit model settings from prometheus,
not the prompt or permission. This ensures plan agent uses
OpenCode's default behavior while only overriding the model.

* test(todo-continuation-enforcer): use FakeTimers for 15x faster tests

- Add custom FakeTimers implementation (~100 lines)
- Replace all real setTimeout waits with fakeTimers.advanceBy()
- Test time: 104.6s → 7.01s

* test(callback-server): fix race conditions with Promise.all and Bun.fetch

- Use Bun.fetch.bind(Bun) to avoid globalThis.fetch mock interference
- Use Promise.all pattern for concurrent fetch/waitForCallback
- Add Bun.sleep(10) in afterEach for port release

* test(concurrency): replace placeholder assertions with getCount checks

Replace 6 meaningless expect(true).toBe(true) assertions with
actual getCount() verifications for test quality improvement

* refactor(config-handler): simplify planDemoteConfig creation

Remove unnecessary IIFE and destructuring, use direct spread instead

* test(executor): use FakeTimeouts for faster tests

- Add custom FakeTimeouts implementation
- Replace setTimeout waits with fakeTimeouts.advanceBy()
- Test time reduced from ~26s to ~6.8s

* test: fix gemini model mock for artistry unstable mode

* test: fix model list mock payload shape

* test: mock provider models for artistry category

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-30 22:10:52 +09:00
justsisyphus
80ee52fe3b fix: improve model resolution with client API fallback and explicit model passing
- fetchAvailableModels now falls back to client.model.list() when cache is empty
- provider-models cache empty → models.json → client API (3-tier fallback)
- look-at tool explicitly passes registered agent's model to session.prompt
- Ensures multimodal-looker uses correctly resolved model (e.g., gemini-3-flash-preview)
- Add comprehensive tests for fuzzy matching and fallback scenarios
2026-01-30 16:57:21 +09:00
justsisyphus
10bdb6c694 chore: update artistry category description for creative problem-solving 2026-01-30 14:53:50 +09:00
justsisyphus
c06f38693e refactor: revamp ultrabrain category with deep work mindset
- Add variant: max to ultrabrain's gemini-3-pro fallback entry
- Rename STRATEGIC_CATEGORY_PROMPT_APPEND to ULTRABRAIN_CATEGORY_PROMPT_APPEND
- Keep original strategic advisor prompt content (no micromanagement instructions)
- Update description: use only for genuinely hard tasks, give clear goals only
- Update tests to match renamed constant
2026-01-30 14:53:50 +09:00
justsisyphus
c905e1cb7a fix(delegate-task): restore resolved.model to category userModel chain (#1227)
PR #1227 incorrectly removed resolved.model from the userModel chain,
assuming it was bypassing the fallback chain. However, resolved.model
contained the category's DEFAULT_CATEGORIES model (e.g., quick ->
claude-haiku-4-5), not the main session model.

Without resolved.model, when connectedProvidersCache is null and
availableModels is empty, category model resolution falls through to
systemDefaultModel (opus) instead of using the category's default.

This fix restores the original priority:
1. User category model override
2. Category default model (from resolved.model)
3. sisyphusJuniorModel
4. Fallback chain
5. System default
2026-01-30 11:45:19 +09:00
YeonGyu-Kim
34aaef2219
fix(delegate-task): pass registered agent model explicitly for subagent_type (#1225)
When delegate_task uses subagent_type, extract the matched agent's model
object and pass it explicitly to session.prompt/manager.launch. This
ensures the model is always in the correct object format regardless of
how OpenCode handles string→object conversion for plugin-registered
agents.

Closes #1225
2026-01-29 11:27:07 +09:00
justsisyphus
dee89c1556 feat(delegate-task): add prometheus self-delegation block and delegate_task permission
- Block prometheus from delegating to itself via delegate_task
- Grant delegate_task permission to prometheus when called as subagent
- Other subagents still have delegate_task disabled
2026-01-28 18:24:15 +09:00
justsisyphus
b2d618e851 fix: mock provider cache in delegate-task tests for CI stability
- Add spyOn for readConnectedProvidersCache to return connected providers
- Tests now work consistently regardless of actual provider cache state
- Fixes CI failures for category variant and unstable agent tests
2026-01-28 14:27:34 +09:00
justsisyphus
6f348a8a5c fix: resolve CI test timeouts with configurable timing
- Add timing.ts module for test-only timing configuration
- Replace hardcoded wait times with getTimingConfig()
- Enable all previously skipped tests (ralph-loop, session-state, delegate-task)
- Tests now complete in ~2s instead of timing out
2026-01-28 14:17:56 +09:00
justsisyphus
1d27d78127 test: skip flaky sync variant test (CI timeout) 2026-01-28 01:07:14 +09:00
justsisyphus
acded4ba2a fix(delegate-task): add clear error when model not configured (#1139) 2026-01-27 10:35:20 +09:00
justsisyphus
7e065dfe12 feat(delegate-task): prepend system prompt for plan agent invocations
When plan agent (plan/prometheus/planner) is invoked via delegate_task,
automatically prepend a <system> prompt instructing the agent to:
- Launch explore/librarian agents in background to gather context
- Summarize user request and list uncertainties
- Ask clarifying questions until requirements are 100% clear
2026-01-26 18:25:06 +09:00
justsisyphus
3ee519c7b0
feat: make systemDefaultModel optional for OpenCode fallback (#1136)
- Remove mandatory model requirement from plugin initialization
- Allow OpenCode to use its built-in model fallback when user doesn't specify
- Update model-resolver to handle undefined systemDefaultModel
- Remove throw errors in config-handler, utils, atlas, delegate-task
- Add tests for optional model scenarios

Closes #1129

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-26 17:01:08 +09:00
justsisyphus
3a79b8761b
feat(shared): add connected-providers-cache for model availability (#1121)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-26 11:53:41 +09:00
YeonGyu-Kim
3af30b0a21
feat(skills): add agent-browser option for browser automation (#1090)
Add configurable browser automation allowing users to choose between
Playwright MCP (default) and Vercel's agent-browser CLI.

Changes:
- Add browser_automation_engine.provider config option
- Dynamic skill loading based on provider selection
- Comprehensive agent-browser CLI reference (inline in skills.ts)
- Propagate browserProvider to delegate_task and buildAgent
- Update documentation with provider comparison

Co-authored-by: Suyeol Jeon <devxoul@gmail.com>
Co-authored-by: YeonGyu Kim <code.yeongyu@gmail.com>
2026-01-25 15:02:41 +09:00
justsisyphus
14f450bd25 refactor: sync delegate_task schema with OpenCode Task tool (resume→session_id, add command param) 2026-01-25 13:57:45 +09:00
YeonGyu-Kim
04633ba208
fix(models): update model names to match OpenCode Zen catalog (#1048)
* fix(models): update model names to match OpenCode Zen catalog

OpenCode Zen recently updated their official model catalog, deprecating
several preview and free model variants:

DEPRECATED → NEW (Official Zen Names):
- gemini-3-pro-preview → gemini-3-pro
- gemini-3-flash-preview → gemini-3-flash
- grok-code → gpt-5-nano (FREE tier maintained)
- glm-4.7-free → big-pickle (FREE tier maintained)
- glm-4.6v → glm-4.6

Changes:
- Updated 6 source files (model-requirements, delegate-task, think-mode, etc.)
- Updated 9 documentation files (installation, configurations, features, etc.)
- Updated 14 test files with new model references
- Regenerated snapshots to reflect catalog changes
- Removed duplicate think-mode entries for preview variants

Impact:
- FREE tier access preserved via gpt-5-nano and big-pickle
- All 55 model-related tests passing
- Zero breaking changes - pure string replacement
- Aligns codebase with official OpenCode Zen model catalog

Verified:
- Zero deprecated model names in codebase
- All model-related tests pass (55/55)
- Snapshots regenerated and validated

Affects: 30 files (6 source, 9 docs, 14 tests, 1 snapshot)

* fix(multimodal-looker): update fallback chain with glm-4.6v and gpt-5-nano

- Change glm-4.6 to glm-4.6v for zai-coding-plan provider
- Add opencode/gpt-5-nano as 4th fallback (FREE tier)
- Push gpt-5.2 to 5th position

Fallback chain now:
1. gemini-3-flash (google, github-copilot, opencode)
2. claude-haiku-4-5 (anthropic, github-copilot, opencode)
3. glm-4.6v (zai-coding-plan)
4. gpt-5-nano (opencode) - FREE
5. gpt-5.2 (openai, github-copilot, opencode)

* chore: update bun.lock

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-24 15:30:35 +09:00
justsisyphus
444fbe396a fix(delegate-task): use lowercase sisyphus-junior agent name in API calls
Previous fix (7ed7bf5c) only updated Atlas → atlas, but missed Sisyphus-Junior.
OpenCode does case-sensitive agent lookup, causing crash when delegate_task
tried to spawn 'Sisyphus-Junior' (registered as 'sisyphus-junior').

- SISYPHUS_JUNIOR_AGENT constant: 'Sisyphus-Junior' → 'sisyphus-junior'
- agent-tool-restrictions key: 'Sisyphus-Junior' → 'sisyphus-junior'
- Updated related test mocks
2026-01-24 03:00:58 +09:00
justsisyphus
f4348885f2 fix: model fallback properly falls through to system default
- Remove Step 3 in model-resolver that forced first fallbackChain entry
  even when unavailable, blocking system default fallback
- Add sisyphusJuniorModel option to delegate_task so agents["Sisyphus-Junior"]
  model override is respected in category-based delegation
- Update tests to reflect new fallback behavior
2026-01-23 10:56:31 +09:00
justsisyphus
71474bb4a2 fix(delegate-task): reset model cache between tests 2026-01-23 02:59:03 +09:00
justsisyphus
aa2b052d28 refactor(delegate-task): enhance delegation with dynamic descriptions
Generate tool description dynamically from available categories and skills. Remove hardcoded DELEGATE_TASK_DESCRIPTION constant. Improve parameter handling with unified 'subagent_type' field replacing 'agent'.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-22 22:46:23 +09:00
justsisyphus
516edb445c fix(delegate-task): wait for result when forcing unstable agents to background
Previously, unstable agents (gemini models, is_unstable_agent=true) were
forced to background mode but returned immediately with task ID. This
caused callers to lose visibility into results.

Now: launch as background for monitoring stability, but poll and wait
for completion, returning actual task output like sync mode.

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2026-01-21 11:19:00 +09:00
justsisyphus
46189eef8a test(delegate-task): add tests for DEFAULT_CATEGORIES variant handling
Adds test coverage for variant propagation from DEFAULT_CATEGORIES
to background tasks when no user categories are defined.
2026-01-20 18:55:08 +09:00
justsisyphus
3d3d3e493b fix(delegate-task): category built-in model takes precedence over inherited model
Previously, when using categories like 'quick', the parent session's model
(e.g., Opus 4.5) would override the category's built-in model (e.g., Haiku).

Fixed priority: userConfig.model → category built-in → systemDefault

The inherited model from parent session no longer affects category-based
delegation - categories have their own explicit models.
2026-01-20 17:05:17 +09:00
justsisyphus
8cc995891e refactor(delegate-task): restructure category system for unbiased model selection
- Remove temperature from all categories
- Consolidate CATEGORY_MODEL_CATALOG into DEFAULT_CATEGORIES
- Replace 'general' and 'most-capable' with 'unspecified-low' and 'unspecified-high'
- Add Selection_Gate to unspecified categories to force deliberate selection
- Update quick category to use claude-haiku-4-5
- Update all references and tests across codebase
2026-01-20 16:22:53 +09:00
justsisyphus
52d9b30035 refactor(plugin): update atlas references
Update plugin handlers, commands, and integration points to use 'atlas' agent name. Start-work command and config handler now reference 'atlas'.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-20 15:55:09 +09:00
Kenny
6956ce0a19 fix: correct config.data.model access pattern and use dynamic config paths
- Fixed delegate-task/tools.ts to access openCodeConfig.data.model instead of openCodeConfig.model
- Updated error messages to use getOpenCodeConfigPaths() for cross-platform paths
- Fixed 12 test mocks to use correct { data: { model: ... } } structure
- Fixes delegation failing with 'requires a default model' even when model is configured
2026-01-18 22:08:27 -05:00
Kenny
c910820cdb restore gitignore 2026-01-17 20:40:55 -05:00
Kenny
c698a5b888 fix: remove hardcoded model defaults from categories and agents
BREAKING CHANGE: Model resolution overhauled

- Created centralized model-resolver.ts with priority chain:
  userModel → inheritedModel → systemDefaultModel
- Removed model field from all 7 DEFAULT_CATEGORIES entries
- Removed DEFAULT_MODEL constants from 10 agents
- Removed singleton agent exports (use factories instead)
- Made CategoryConfigSchema.model optional
- CLI no longer generates model overrides
- Empty strings treated as unset (uses fallback)

Users must now:
1. Use factory functions (createOracleAgent, etc.) instead of singletons
2. Provide model explicitly or use systemDefaultModel
3. Configure category models explicitly if needed

Fixes model fallback bug where hardcoded defaults overrode
user's OpenCode configured model.
2026-01-17 12:51:03 -05:00
justsisyphus
255f535a50 refactor(delegate-task): use empty array instead of null for skills parameter
- Change skills type from string[] | null to string[]
- Allow skills=[] for no skills, reject skills=null
- Remove emojis from error messages and prompts
- Update tests accordingly
2026-01-17 21:25:02 +09:00
justsisyphus
cb6f1c9f75 fix(delegate-task): category default model takes precedence over parent model
Previously, parent model string would override category default model,
causing categories like 'ultrabrain' to use the parent's model (e.g., sonnet)
instead of the intended category default (e.g., gpt-5.2).

Model priority is now:
1. userConfig.model (oh-my-opencode.json override)
2. defaultConfig.model (category default)
3. parentModelString (fallback)
4. systemDefaultModel (last resort)
2026-01-16 18:45:19 +09:00
justsisyphus
188bbef018 refactor: rename sisyphus_task to delegate_task
- Rename directories: sisyphus-task → delegate-task
- Rename types: SisyphusTaskArgs → DelegateTaskArgs, etc.
- Rename functions: createSisyphusTask → createDelegateTask
- Rename constants: SISYPHUS_TASK_* → DELEGATE_TASK_*
- Update tool name: sisyphus_task → delegate_task
- Update all prompts, docs, and tests
2026-01-16 18:31:08 +09:00