4 Commits

Author SHA1 Message Date
YeonGyu-Kim
2c778d9352 fix: extend look_at MIME type support for Gemini API media formats
- Add HEIC/HEIF image format support
- Add video formats (mp4, mpeg, mov, avi, flv, webm, wmv, 3gpp)
- Add audio formats (wav, mp3, aiff, aac, ogg, flac)
- Add CSV and Python document formats
- Remove unsupported formats (gif, svg, bmp, ico, css, ts)
- Update tool description to clarify purpose

🤖 Generated with assistance of OhMyOpenCode
2025-12-30 11:47:50 +09:00
YeonGyu-Kim
715756b68a
Optimize tool descriptions for token efficiency (#73)
* Optimize background-task tool descriptions for token efficiency

- BACKGROUND_TASK_DESCRIPTION: 571 chars → 127 chars
- BACKGROUND_OUTPUT_DESCRIPTION: 268 chars → 95 chars
- BACKGROUND_CANCEL_DESCRIPTION: 374 chars → 83 chars

Follows token efficiency improvements pattern from PR #71.

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* Optimize call-omo-agent tool description for token efficiency

- CALL_OMO_AGENT_DESCRIPTION: 841 chars → 156 chars (~81% reduction)
- Follows pattern from PR #71 where LSP tool descriptions were optimized
- Maintains core information while removing redundant explanations

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* Optimize look-at tool description for token efficiency

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* Optimize interactive-bash tool description for token efficiency

346 chars → 130 chars (~62% reduction), following PR #71 pattern.

🤖 Generated with assistance of OhMyOpenCode
2025-12-17 00:38:38 +09:00
YeonGyu-Kim
424723f7ce refactor(agents): Complete rewrite of OmO system prompt with Task Complexity assessment
- Added comprehensive Task Complexity assessment before agent delegation (TRIVIAL/EXPLORATION/IMPLEMENTATION/ORCHESTRATION)
- Redefined Explore agent as 'contextual grep' - cheap, parallel background agent for internal codebase search (Level 2 in search strategy)
- Restricted Librarian agent to 3 explicit use cases: Official Documentation, GitHub Context, Famous OSS Implementation
- Added mandatory delegation gate (GATE 2.5) for ALL frontend files (.tsx/.jsx/.vue/.svelte/.css/.scss) - NO direct edits allowed
- Implemented obsessive Todo Management framework with BLOCKING evidence requirements for every action
- Introduced comprehensive Search Strategy Framework with 3-level approach (Direct Tools → Explore → Librarian)
- Restructured Blocking Gates with explicit Pre-Search gate and Pre-Completion verification
- Enhanced Delegation Rules with clear agent purposes and parallelization strategies
- Added Implementation Flow and Exploration Flow with phase-based workflows
- Introduced Decision Matrix for quick action selection
- Enhanced Anti-Patterns section with comprehensive BLOCKING rules for frontend work
- Updated Tool Selection guide with clear preferences (Direct Tools > Agent Tools)
- Improved parallel execution guidelines for explore/librarian agents
- Strengthened verification protocol with evidence requirements

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2025-12-15 19:14:06 +09:00
YeonGyu-Kim
a3938e8c25 feat: add look_at tool and multimodal-looker agent
Add a new tool and agent for analyzing media files (PDFs, images, diagrams)
that require visual interpretation beyond raw text.

- Add `multimodal-looker` agent using Gemini 2.5 Flash model
- Add `look_at` tool that spawns multimodal-looker sessions
- Restrict multimodal-looker from calling task/call_omo_agent/look_at tools

Inspired by Sourcegraph Ampcode's look_at tool design.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
2025-12-13 15:28:59 +09:00