feat(prompts): strengthen post-task reminders with actionable guidance
- Rewrite VERIFICATION_REMINDER with 3-step action flow (verify → determine QA → add to todo) - Add explicit BLOCKING directive to prevent premature task progression - Enhance buildOrchestratorReminder with clear post-verification actions - Improve capture-pane block message with concrete Bash examples
This commit is contained in:
parent
74f355322a
commit
9bed597e46
@ -63,34 +63,45 @@ RULES:
|
|||||||
- Do not stop until all tasks are complete
|
- Do not stop until all tasks are complete
|
||||||
- If blocked, document the blocker and move to the next task`
|
- If blocked, document the blocker and move to the next task`
|
||||||
|
|
||||||
const VERIFICATION_REMINDER = `**MANDATORY VERIFICATION - SUBAGENTS LIE**
|
const VERIFICATION_REMINDER = `**MANDATORY: WHAT YOU MUST DO RIGHT NOW**
|
||||||
|
|
||||||
Subagents FREQUENTLY claim completion when:
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
- Tests are actually FAILING
|
|
||||||
- Code has type/lint ERRORS
|
|
||||||
- Implementation is INCOMPLETE
|
|
||||||
- Patterns were NOT followed
|
|
||||||
|
|
||||||
**YOU MUST VERIFY EVERYTHING YOURSELF:**
|
⚠️ CRITICAL: Subagents FREQUENTLY LIE about completion.
|
||||||
|
Tests FAILING, code has ERRORS, implementation INCOMPLETE - but they say "done".
|
||||||
|
|
||||||
1. Run \`lsp_diagnostics\` on changed files - Must be CLEAN
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
2. Run tests yourself - Must PASS (not "agent said it passed")
|
|
||||||
3. Read the actual code - Must match requirements
|
|
||||||
4. Check build/typecheck - Must succeed
|
|
||||||
|
|
||||||
DO NOT TRUST THE AGENT'S SELF-REPORT.
|
**STEP 1: VERIFY WITH YOUR OWN TOOL CALLS (DO THIS NOW)**
|
||||||
VERIFY EACH CLAIM WITH YOUR OWN TOOL CALLS.
|
|
||||||
|
|
||||||
**HANDS-ON QA REQUIRED (after ALL tasks complete):**
|
Run these commands YOURSELF - do NOT trust agent's claims:
|
||||||
|
1. \`lsp_diagnostics\` on changed files → Must be CLEAN
|
||||||
|
2. \`bash\` to run tests → Must PASS
|
||||||
|
3. \`bash\` to run build/typecheck → Must succeed
|
||||||
|
4. \`Read\` the actual code → Must match requirements
|
||||||
|
|
||||||
| Deliverable Type | Verification Tool | Action |
|
**STEP 2: DETERMINE IF HANDS-ON QA IS NEEDED**
|
||||||
|------------------|-------------------|--------|
|
|
||||||
| **Frontend/UI** | \`/playwright\` skill | Navigate, interact, screenshot evidence |
|
|
||||||
| **TUI/CLI** | \`interactive_bash\` (tmux) | Run interactively, verify output |
|
|
||||||
| **API/Backend** | \`bash\` with curl | Send requests, verify responses |
|
|
||||||
|
|
||||||
Static analysis CANNOT catch: visual bugs, animation issues, user flow breakages, integration problems.
|
| Deliverable Type | QA Method | Tool |
|
||||||
**FAILURE TO DO HANDS-ON QA = INCOMPLETE WORK.**`
|
|------------------|-----------|------|
|
||||||
|
| **Frontend/UI** | Browser interaction | \`/playwright\` skill |
|
||||||
|
| **TUI/CLI** | Run interactively | \`interactive_bash\` (tmux) |
|
||||||
|
| **API/Backend** | Send real requests | \`bash\` with curl |
|
||||||
|
|
||||||
|
Static analysis CANNOT catch: visual bugs, animation issues, user flow breakages.
|
||||||
|
|
||||||
|
**STEP 3: IF QA IS NEEDED - ADD TO TODO IMMEDIATELY**
|
||||||
|
|
||||||
|
\`\`\`
|
||||||
|
todowrite([
|
||||||
|
{ id: "qa-X", content: "HANDS-ON QA: [specific verification action]", status: "pending", priority: "high" }
|
||||||
|
])
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
|
|
||||||
|
**BLOCKING: DO NOT proceed to next task until Steps 1-3 are complete.**
|
||||||
|
**FAILURE TO DO QA = INCOMPLETE WORK = USER WILL REJECT.**`
|
||||||
|
|
||||||
const ORCHESTRATOR_DELEGATION_REQUIRED = `
|
const ORCHESTRATOR_DELEGATION_REQUIRED = `
|
||||||
|
|
||||||
@ -183,20 +194,38 @@ function buildOrchestratorReminder(planName: string, progress: { total: number;
|
|||||||
return `
|
return `
|
||||||
---
|
---
|
||||||
|
|
||||||
**State:** Plan: ${planName} | ${progress.completed}/${progress.total} done, ${remaining} left
|
**BOULDER STATE:** Plan: \`${planName}\` | ✅ ${progress.completed}/${progress.total} done | ⏳ ${remaining} remaining
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
${buildVerificationReminder(sessionId)}
|
${buildVerificationReminder(sessionId)}
|
||||||
|
|
||||||
ALL pass? → commit atomic unit, mark \`[x]\`, next task.`
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
|
|
||||||
|
**AFTER VERIFICATION PASSES - YOUR NEXT ACTIONS (IN ORDER):**
|
||||||
|
|
||||||
|
1. **COMMIT** atomic unit (only verified changes)
|
||||||
|
2. **MARK** \`[x]\` in plan file for completed task
|
||||||
|
3. **PROCEED** to next task immediately
|
||||||
|
|
||||||
|
**DO NOT STOP. ${remaining} tasks remain. Keep bouldering.**`
|
||||||
}
|
}
|
||||||
|
|
||||||
function buildStandaloneVerificationReminder(sessionId: string): string {
|
function buildStandaloneVerificationReminder(sessionId: string): string {
|
||||||
return `
|
return `
|
||||||
---
|
---
|
||||||
|
|
||||||
${buildVerificationReminder(sessionId)}`
|
${buildVerificationReminder(sessionId)}
|
||||||
|
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
|
|
||||||
|
**AFTER VERIFICATION - CHECK YOUR TODO LIST:**
|
||||||
|
|
||||||
|
1. Run \`todoread\` to see remaining tasks
|
||||||
|
2. If QA tasks exist → execute them BEFORE marking complete
|
||||||
|
3. Mark completed tasks → proceed to next pending task
|
||||||
|
|
||||||
|
**NO TODO = NO TRACKING = INCOMPLETE WORK. Use todowrite aggressively.**`
|
||||||
}
|
}
|
||||||
|
|
||||||
function extractSessionIdFromOutput(output: string): string {
|
function extractSessionIdFromOutput(output: string): string {
|
||||||
|
|||||||
@ -64,7 +64,29 @@ export const interactive_bash: ToolDefinition = tool({
|
|||||||
|
|
||||||
const subcommand = parts[0].toLowerCase()
|
const subcommand = parts[0].toLowerCase()
|
||||||
if (BLOCKED_TMUX_SUBCOMMANDS.includes(subcommand)) {
|
if (BLOCKED_TMUX_SUBCOMMANDS.includes(subcommand)) {
|
||||||
return `Error: '${parts[0]}' is blocked. Use bash tool instead for capturing/printing terminal output.`
|
const sessionIdx = parts.findIndex(p => p === "-t" || p.startsWith("-t"))
|
||||||
|
let sessionName = "omo-session"
|
||||||
|
if (sessionIdx !== -1) {
|
||||||
|
if (parts[sessionIdx] === "-t" && parts[sessionIdx + 1]) {
|
||||||
|
sessionName = parts[sessionIdx + 1]
|
||||||
|
} else if (parts[sessionIdx].startsWith("-t")) {
|
||||||
|
sessionName = parts[sessionIdx].slice(2)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return `Error: '${parts[0]}' is blocked in interactive_bash.
|
||||||
|
|
||||||
|
**USE BASH TOOL INSTEAD:**
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# Capture terminal output
|
||||||
|
tmux capture-pane -p -t ${sessionName}
|
||||||
|
|
||||||
|
# Or capture with history (last 1000 lines)
|
||||||
|
tmux capture-pane -p -t ${sessionName} -S -1000
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
The Bash tool can execute these commands directly. Do NOT retry with interactive_bash.`
|
||||||
}
|
}
|
||||||
|
|
||||||
const proc = Bun.spawn([tmuxPath, ...parts], {
|
const proc = Bun.spawn([tmuxPath, ...parts], {
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user