25 Commits

Author SHA1 Message Date
Jeon Suyeol
3eb7dc73b7 block remote URLs in look-at file_path validation 2026-02-11 18:50:51 +09:00
YeonGyu-Kim
13d960f3ca fix(look-at): revert to sync prompt to fix race condition with async polling
df0b9f76 regressed look_at from synchronous prompt (session.prompt) to
async prompt (session.promptAsync) + pollSessionUntilIdle polling. This
introduced a race condition where the poller fires before the server
registers the session as busy, causing it to return immediately with no
messages available.

Fix: restore promptSyncWithModelSuggestionRetry (blocking HTTP call) and
remove polling entirely. Catch prompt errors gracefully and still attempt
to fetch messages, since session.prompt may throw even on success.
2026-02-11 09:59:00 +09:00
YeonGyu-Kim
df0b9f7664 fix(delegate-task): Wave 1 - fix polling timeout, resource cleanup, tool restrictions, idle dedup, auth-plugins JSONC, CLI runner hang
- fix(delegate-task): return error on poll timeout instead of silent null
- fix(delegate-task): ensure toast and session cleanup on all error paths with try/finally
- fix(delegate-task): apply agent tool restrictions in sync-prompt-sender
- fix(plugin): add symmetric idle dedup to prevent double hook triggers
- fix(cli): replace regex-based JSONC editing with jsonc-parser in auth-plugins
- fix(cli): abort event stream after completion and restore no-timeout default

All changes verified with tests and typecheck.
2026-02-10 22:00:54 +09:00
YeonGyu-Kim
f22f14d9d1 fix(look-at): catch prompt errors gracefully instead of re-throwing
session.prompt() may throw {} or JSON parse errors even when the server
successfully processes the request. Instead of crashing the tool, catch
all errors and proceed to fetch messages — if the response is available,
return it; otherwise return a clean error string.
2026-02-09 14:18:24 +09:00
YeonGyu-Kim
480dcff420 refactor(look-at): split tools.ts into argument parsing and extraction modules
Extract multimodal look-at tool internals:
- look-at-arguments.ts: argument validation and parsing
- assistant-message-extractor.ts: response extraction
- mime-type-inference.ts: file type detection
- multimodal-agent-metadata.ts: agent metadata constants
2026-02-08 16:24:21 +09:00
YeonGyu-Kim
3d4ed912d7 fix(look-at): use synchronous prompt to fix race condition (#1620 regression)
PR #1620 migrated all prompt calls from session.prompt (blocking) to
session.promptAsync (fire-and-forget HTTP 204). This broke look_at which
needs the multimodal-looker response to be available immediately after
the prompt call returns.

Fix: add promptSyncWithModelSuggestionRetry() that uses session.prompt
(blocking) with model suggestion retry support. look_at now uses this
sync variant while all other callers keep using promptAsync.

- Add promptSyncWithModelSuggestionRetry to model-suggestion-retry.ts
- Switch look_at from promptWithModelSuggestionRetry to sync variant
- Add comprehensive tests for the new sync function
- No changes to other callers (delegate-task, background-agent)
2026-02-08 02:36:27 +09:00
Peïo Thibault
414cecd7df test: add promptAsync mocks to all test files for promptAsync migration 2026-02-07 14:41:46 +01:00
Peïo Thibault
fad7354b13 fix(look-at): remove isJsonParseError band-aid (root cause fixed) 2026-02-07 13:46:03 +01:00
lihaitao
d099b0255f feat(look_at): add image_data parameter for clipboard/pasted image support
Closes #704

Add support for base64-encoded image data in the look_at tool,
enabling analysis of clipboard/pasted images without requiring
a file path.

Changes:
- Add optional image_data parameter to LookAtArgs type
- Update validateArgs to accept either file_path or image_data
- Add inferMimeTypeFromBase64 function to detect image format
- Add try/catch around atob() to handle invalid base64 gracefully
- Update execute to handle both file path and data URL inputs
- Add comprehensive tests for image_data functionality
2026-02-04 12:24:00 +08:00
YeonGyu-Kim
49c933961e fix(background-cancel): skip notification when user explicitly cancels tasks
- Add skipNotification option to cancelTask method
- Apply skipNotification to background_cancel tool
- Prevents unwanted notifications when user cancels via tool
2026-02-03 16:56:40 +09:00
YeonGyu-Kim
f146aeff0f
refactor: major codebase cleanup - BDD comments, file splitting, bug fixes (#1350)
* style(tests): normalize BDD comments from '// #given' to '// given'

- Replace 4,668 Python-style BDD comments across 107 test files
- Patterns changed: // #given -> // given, // #when -> // when, // #then -> // then
- Also handles no-space variants: //#given -> // given

* fix(rules-injector): prefer output.metadata.filePath over output.title

- Extract file path resolution to dedicated output-path.ts module
- Prefer metadata.filePath which contains actual file path
- Fall back to output.title only when metadata unavailable
- Fixes issue where rules weren't injected when tool output title was a label

* feat(slashcommand): add optional user_message parameter

- Add user_message optional parameter for command arguments
- Model can now call: command='publish' user_message='patch'
- Improves error messages with clearer format guidance
- Helps LLMs understand correct parameter usage

* feat(hooks): restore compaction-context-injector hook

- Restore hook deleted in cbbc7bd0 for session compaction context
- Injects 7 mandatory sections: User Requests, Final Goal, Work Completed,
  Remaining Tasks, Active Working Context, MUST NOT Do, Agent Verification State
- Re-register in hooks/index.ts and main plugin entry

* refactor(background-agent): split manager.ts into focused modules

- Extract constants.ts for TTL values and internal types (52 lines)
- Extract state.ts for TaskStateManager class (204 lines)
- Extract spawner.ts for task creation logic (244 lines)
- Extract result-handler.ts for completion handling (265 lines)
- Reduce manager.ts from 1377 to 755 lines (45% reduction)
- Maintain backward compatible exports

* refactor(agents): split prometheus-prompt.ts into subdirectory

- Move 1196-line prometheus-prompt.ts to prometheus/ subdirectory
- Organize prompt sections into separate files for maintainability
- Update agents/index.ts exports

* refactor(delegate-task): split tools.ts into focused modules

- Extract categories.ts for category definitions and routing
- Extract executor.ts for task execution logic
- Extract helpers.ts for utility functions
- Extract prompt-builder.ts for prompt construction
- Reduce tools.ts complexity with cleaner separation of concerns

* refactor(builtin-skills): split skills.ts into individual skill files

- Move each skill to dedicated file in skills/ subdirectory
- Create barrel export for backward compatibility
- Improve maintainability with focused skill modules

* chore: update import paths and lockfile

- Update prometheus import path after refactor
- Update bun.lock

* fix(tests): complete BDD comment normalization

- Fix remaining #when/#then patterns missed by initial sed
- Affected: state.test.ts, events.test.ts

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-02-01 16:47:50 +09:00
justsisyphus
08439a511a fix(test): add missing ToolContext fields to test mocks
@opencode-ai/plugin ToolContext now requires directory, worktree,
metadata, and ask fields. Updated all tool test mocks to comply.
2026-02-01 14:16:28 +09:00
YeonGyu-Kim
4a82ff40fb
Consolidate duplicate patterns and simplify codebase (#1317)
* refactor(shared): unify binary downloader and session path storage

- Create binary-downloader.ts for common download/extract logic
- Create session-injected-paths.ts for unified path tracking
- Refactor comment-checker, ast-grep, grep downloaders to use shared util
- Consolidate directory injector types into shared module

* feat(shared): implement unified model resolution pipeline

- Create ModelResolutionPipeline for centralized model selection
- Refactor model-resolver to use pipeline
- Update delegate-task and config-handler to use unified logic
- Ensure consistent model resolution across all agent types

* refactor(agents): simplify agent utils and metadata management

- Extract helper functions for config merging and env context
- Register prompt metadata for all agents
- Simplify agent variant detection logic

* cleanup: inline utilities and remove unused exports

- Remove case-insensitive.ts (inline with native JS)
- Simplify opencode-version helpers
- Remove unused getModelLimit, createCompactionContextInjector exports
- Inline transcript entry creation in claude-code-hooks
- Update tests accordingly

---------

Co-authored-by: justsisyphus <justsisyphus@users.noreply.github.com>
2026-01-31 15:46:14 +09:00
justsisyphus
80ee52fe3b fix: improve model resolution with client API fallback and explicit model passing
- fetchAvailableModels now falls back to client.model.list() when cache is empty
- provider-models cache empty → models.json → client API (3-tier fallback)
- look-at tool explicitly passes registered agent's model to session.prompt
- Ensures multimodal-looker uses correctly resolved model (e.g., gemini-3-flash-preview)
- Add comprehensive tests for fuzzy matching and fallback scenarios
2026-01-30 16:57:21 +09:00
YeonGyu-Kim
3ab4529bc7
fix(look-at): handle JSON parse errors from session.prompt gracefully (#1216)
When multimodal-looker agent returns empty/malformed response, the SDK
throws 'JSON Parse error: Unexpected EOF'. This commit adds try-catch
around session.prompt() to provide user-friendly error message with
troubleshooting guidance.

- Add error handling for JSON parse errors with detailed guidance
- Add error handling for generic prompt failures
- Add test cases for both error scenarios
2026-01-28 23:58:01 +09:00
YeonGyu-Kim
3dd80889a5
fix(tools): add permission field to session.create() for consistency (#1192) (#1199)
- Add permission field to look_at and call_omo_agent session.create()
- Match pattern used in delegate_task and background-agent
- Add better error messages for Unauthorized failures
- Provide actionable guidance in error messages

This addresses potential session creation failures by ensuring
consistent session configuration across all tools that create
child sessions.
2026-01-28 17:35:25 +09:00
justsisyphus
89fa9ff167 fix(look-at): add path alias and validation for LLM compatibility
LLMs often call look_at with 'path' instead of 'file_path' parameter,
causing TypeError and infinite retry loops.

- Add normalizeArgs() to accept both 'path' and 'file_path'
- Add validateArgs() with clear error messages showing correct usage
- Add tests for normalization and validation

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-15 14:49:56 +09:00
Oussama Douhou
9e98cef182 fix(background-agent): inherit parent session directory for background tasks
Background tasks were defaulting to $HOME instead of the parent session's
working directory. This caused background agents to scan the entire home
directory instead of the project directory, leading to:
- High CPU/memory load from scanning unrelated files
- Permission errors on system directories
- Task failures and timeouts

The fix retrieves the parent session's directory before creating a new
background session and passes it via the query.directory parameter.

Files modified:
- manager.ts: Look up parent session directory in launch()
- call-omo-agent/tools.ts: Same fix for sync mode
- look-at/tools.ts: Same fix for look_at tool
- sisyphus-task/tools.ts: Same fix + interface update for directory prop
- index.ts: Pass directory to sisyphusTask factory
2026-01-13 06:27:56 +01:00
YeonGyu-Kim
2c778d9352 fix: extend look_at MIME type support for Gemini API media formats
- Add HEIC/HEIF image format support
- Add video formats (mp4, mpeg, mov, avi, flv, webm, wmv, 3gpp)
- Add audio formats (wav, mp3, aiff, aac, ogg, flac)
- Add CSV and Python document formats
- Remove unsupported formats (gif, svg, bmp, ico, css, ts)
- Update tool description to clarify purpose

🤖 Generated with assistance of OhMyOpenCode
2025-12-30 11:47:50 +09:00
YeonGyu-Kim
1d2dc69ae5
fix: use pathToFileURL for Windows-compatible file URLs in look_at tool (#279)
Fixes #276 - The look_at tool was constructing invalid file:// URLs on Windows
by using template literals. Now uses Node.js pathToFileURL() which correctly
handles backslashes, spaces, and the triple-slash prefix required on Windows.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2025-12-27 23:47:59 +09:00
Lukin
2246d1c5ef
feat: add Claude Code plugin support (#240) 2025-12-27 18:56:40 +09:00
YeonGyu-Kim
e752032ea6
fix(look-at): use direct file passthrough instead of Read tool (#173)
- Embed files directly in message parts using file:// URL format
- Remove dependency on Read tool for multimodal-looker agent
- Add inferMimeType helper for proper MIME type detection
- Disable read tool in agent tools config (no longer needed)
- Upgrade multimodal-looker model to gemini-3-flash
- Update all README docs to reflect gemini-3-flash change

Fixes #126

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2025-12-23 11:22:59 +09:00
YeonGyu-Kim
715756b68a
Optimize tool descriptions for token efficiency (#73)
* Optimize background-task tool descriptions for token efficiency

- BACKGROUND_TASK_DESCRIPTION: 571 chars → 127 chars
- BACKGROUND_OUTPUT_DESCRIPTION: 268 chars → 95 chars
- BACKGROUND_CANCEL_DESCRIPTION: 374 chars → 83 chars

Follows token efficiency improvements pattern from PR #71.

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* Optimize call-omo-agent tool description for token efficiency

- CALL_OMO_AGENT_DESCRIPTION: 841 chars → 156 chars (~81% reduction)
- Follows pattern from PR #71 where LSP tool descriptions were optimized
- Maintains core information while removing redundant explanations

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* Optimize look-at tool description for token efficiency

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* Optimize interactive-bash tool description for token efficiency

346 chars → 130 chars (~62% reduction), following PR #71 pattern.

🤖 Generated with assistance of OhMyOpenCode
2025-12-17 00:38:38 +09:00
YeonGyu-Kim
424723f7ce refactor(agents): Complete rewrite of OmO system prompt with Task Complexity assessment
- Added comprehensive Task Complexity assessment before agent delegation (TRIVIAL/EXPLORATION/IMPLEMENTATION/ORCHESTRATION)
- Redefined Explore agent as 'contextual grep' - cheap, parallel background agent for internal codebase search (Level 2 in search strategy)
- Restricted Librarian agent to 3 explicit use cases: Official Documentation, GitHub Context, Famous OSS Implementation
- Added mandatory delegation gate (GATE 2.5) for ALL frontend files (.tsx/.jsx/.vue/.svelte/.css/.scss) - NO direct edits allowed
- Implemented obsessive Todo Management framework with BLOCKING evidence requirements for every action
- Introduced comprehensive Search Strategy Framework with 3-level approach (Direct Tools → Explore → Librarian)
- Restructured Blocking Gates with explicit Pre-Search gate and Pre-Completion verification
- Enhanced Delegation Rules with clear agent purposes and parallelization strategies
- Added Implementation Flow and Exploration Flow with phase-based workflows
- Introduced Decision Matrix for quick action selection
- Enhanced Anti-Patterns section with comprehensive BLOCKING rules for frontend work
- Updated Tool Selection guide with clear preferences (Direct Tools > Agent Tools)
- Improved parallel execution guidelines for explore/librarian agents
- Strengthened verification protocol with evidence requirements

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2025-12-15 19:14:06 +09:00
YeonGyu-Kim
a3938e8c25 feat: add look_at tool and multimodal-looker agent
Add a new tool and agent for analyzing media files (PDFs, images, diagrams)
that require visual interpretation beyond raw text.

- Add `multimodal-looker` agent using Gemini 2.5 Flash model
- Add `look_at` tool that spawns multimodal-looker sessions
- Restrict multimodal-looker from calling task/call_omo_agent/look_at tools

Inspired by Sourcegraph Ampcode's look_at tool design.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2025-12-13 15:28:59 +09:00