fix(athena): parallelize council launches and gate handoff actions

This commit is contained in:
ismeth 2026-02-18 19:26:20 +01:00 committed by YeonGyu-Kim
parent b0e2630db1
commit 697c4c6341
8 changed files with 314 additions and 410 deletions

View File

@ -51,8 +51,8 @@ Question({
}) })
**Shortcut skip the Question tool if:** **Shortcut skip the Question tool if:**
- The user already specified models in their message (e.g., "ask GPT and Claude about X") call athena_council directly with those members. - The user already specified models in their message (e.g., "ask GPT and Claude about X") call athena_council once per specified member.
- The user says "all", "everyone", "the whole council" call athena_council without the members parameter. - The user says "all", "everyone", "the whole council" call athena_council once per configured member.
DO NOT: DO NOT:
- Read files yourself - Read files yourself
@ -65,16 +65,32 @@ You are an ORCHESTRATOR, not an analyst. Your council members do the analysis. Y
## Workflow ## Workflow
Step 1: Present the Question tool multi-select for council member selection (see above). Once the user responds, call athena_council with the user's question. If the user selected specific members, pass their names in the members parameter. If the user selected "All Members", omit the members parameter. Step 1: Present the Question tool multi-select for council member selection (see above).
Step 2: Call athena_council with the question and selected members. The tool launches all council members in parallel, waits for them to complete, and returns ALL of their responses in a single result. This may take a few minutes that is expected. Step 2: Resolve the selected member list:
- If user selected "All Members", resolve to every configured member listed in the athena_council tool description.
- Otherwise resolve to the explicitly selected member labels.
Step 3: Synthesize the findings returned by athena_council: Step 3: Call athena_council ONCE PER MEMBER (member per tool call):
- For each selected member, call athena_council with:
- question: the user's original question
- members: ["<exact member name or model>"] // single-item array only
- Launch all selected members first (one athena_council call per member) so they run in parallel.
- Track every returned task_id and member mapping.
Step 4: Collect all member outputs after launch:
- For each tracked task_id, call background_output with block=true.
- Gather each member's final output/status.
- Do not proceed until every launched member has reached a terminal status (completed, error, cancelled, timeout).
- Do not ask the final action question while any launched member is still pending.
- Do not present interim synthesis from partial results. Wait for all members first.
Step 5: Synthesize the findings returned by all collected member outputs:
- Group findings by agreement level: unanimous, majority, minority, solo - Group findings by agreement level: unanimous, majority, minority, solo
- Solo findings are potential false positives flag the risk explicitly - Solo findings are potential false positives flag the risk explicitly
- Add your own assessment and rationale to each finding - Add your own assessment and rationale to each finding
Step 4: Present synthesized findings to the user grouped by agreement level (unanimous first, then majority, minority, solo). Then use the Question tool to ask which action to take: Step 6: Present synthesized findings to the user grouped by agreement level (unanimous first, then majority, minority, solo). Then use the Question tool to ask which action to take:
Question({ Question({
questions: [{ questions: [{
@ -89,7 +105,7 @@ Question({
}] }]
}) })
Step 5: After the user selects an action: Step 7: After the user selects an action:
- **"Fix now (Atlas)"** Call switch_agent with agent="atlas" and context containing the confirmed findings summary, the original question, and instruction to implement the fixes. - **"Fix now (Atlas)"** Call switch_agent with agent="atlas" and context containing the confirmed findings summary, the original question, and instruction to implement the fixes.
- **"Create plan (Prometheus)"** Call switch_agent with agent="prometheus" and context containing the confirmed findings summary, the original question, and instruction to create a phased plan. - **"Create plan (Prometheus)"** Call switch_agent with agent="prometheus" and context containing the confirmed findings summary, the original question, and instruction to create a phased plan.
- **"No action"** Acknowledge and end. Do not delegate. - **"No action"** Acknowledge and end. Do not delegate.
@ -99,7 +115,10 @@ The switch_agent tool switches the active agent. After you call it, end your res
## Constraints ## Constraints
- Use the Question tool for member selection BEFORE calling athena_council (unless user pre-specified). - Use the Question tool for member selection BEFORE calling athena_council (unless user pre-specified).
- Use the Question tool for action selection AFTER synthesis (unless user already stated intent). - Use the Question tool for action selection AFTER synthesis (unless user already stated intent).
- Do NOT use background_output athena_council returns all member responses directly. - Use background_output (block=true) to collect member outputs after launch.
- Do NOT call athena_council with multiple members in one call.
- Do NOT ask "How should we proceed" until all selected member calls have finished.
- Do NOT present or summarize partial council findings while any selected member is still running.
- Do NOT write or edit files directly. - Do NOT write or edit files directly.
- Do NOT delegate without explicit user confirmation via Question tool. - Do NOT delegate without explicit user confirmation via Question tool.
- Do NOT ignore solo finding false-positive warnings. - Do NOT ignore solo finding false-positive warnings.

View File

@ -65,6 +65,7 @@ export function createPluginInterface(args: {
"tool.execute.before": createToolExecuteBeforeHandler({ "tool.execute.before": createToolExecuteBeforeHandler({
ctx, ctx,
hooks, hooks,
backgroundManager: managers.backgroundManager,
}), }),
"tool.execute.after": createToolExecuteAfterHandler({ "tool.execute.after": createToolExecuteAfterHandler({

View File

@ -2,6 +2,109 @@ const { describe, expect, test } = require("bun:test")
const { createToolExecuteBeforeHandler } = require("./tool-execute-before") const { createToolExecuteBeforeHandler } = require("./tool-execute-before")
describe("createToolExecuteBeforeHandler", () => { describe("createToolExecuteBeforeHandler", () => {
test("blocks Athena question tool while council members are still running", async () => {
//#given
const ctx = {
client: {
session: {
messages: async () => ({
data: [{ info: { role: "assistant", agent: "Athena (Council)" } }],
}),
},
},
}
const backgroundManager = {
getTasksByParentSession: () => [
{ agent: "council-member", status: "running" },
],
}
const handler = createToolExecuteBeforeHandler({
ctx,
hooks: {},
backgroundManager,
})
//#when
const run = handler(
{ tool: "question", sessionID: "ses_athena", callID: "call_1" },
{ args: { questions: [] } }
)
//#then
await expect(run).rejects.toThrow("Council members are still running")
})
test("blocks Athena switch_agent while council members are still running", async () => {
//#given
const ctx = {
client: {
session: {
messages: async () => ({
data: [{ info: { role: "assistant", agent: "Athena (Council)" } }],
}),
},
},
}
const backgroundManager = {
getTasksByParentSession: () => [
{ agent: "council-member", status: "pending" },
],
}
const handler = createToolExecuteBeforeHandler({
ctx,
hooks: {},
backgroundManager,
})
//#when
const run = handler(
{ tool: "switch_agent", sessionID: "ses_athena", callID: "call_1" },
{ args: { agent: "atlas", context: "ctx" } }
)
//#then
await expect(run).rejects.toThrow("Council members are still running")
})
test("allows Athena question tool when no council members are pending", async () => {
//#given
const ctx = {
client: {
session: {
messages: async () => ({
data: [{ info: { role: "assistant", agent: "Athena (Council)" } }],
}),
},
},
}
const backgroundManager = {
getTasksByParentSession: () => [
{ agent: "council-member", status: "completed" },
{ agent: "council-member", status: "cancelled" },
],
}
const handler = createToolExecuteBeforeHandler({
ctx,
hooks: {},
backgroundManager,
})
//#when
const run = handler(
{ tool: "question", sessionID: "ses_athena", callID: "call_1" },
{ args: { questions: [] } }
)
//#then
await expect(run).resolves.toBeUndefined()
})
test("does not execute subagent question blocker hook for question tool", async () => { test("does not execute subagent question blocker hook for question tool", async () => {
//#given //#given
const ctx = { const ctx = {

View File

@ -1,23 +1,51 @@
import type { PluginContext } from "./types" import type { PluginContext } from "./types"
import type { BackgroundManager } from "../features/background-agent"
import { getMainSessionID } from "../features/claude-code-session-state" import { getMainSessionID } from "../features/claude-code-session-state"
import { clearBoulderState } from "../features/boulder-state" import { clearBoulderState } from "../features/boulder-state"
import { log } from "../shared" import { log } from "../shared"
import { resolveSessionAgent } from "./session-agent-resolver" import { resolveSessionAgent } from "./session-agent-resolver"
import { parseRalphLoopArguments } from "../hooks/ralph-loop/command-arguments" import { parseRalphLoopArguments } from "../hooks/ralph-loop/command-arguments"
import { getAgentConfigKey } from "../shared/agent-display-names"
import type { CreatedHooks } from "../create-hooks" import type { CreatedHooks } from "../create-hooks"
export function createToolExecuteBeforeHandler(args: { export function createToolExecuteBeforeHandler(args: {
ctx: PluginContext ctx: PluginContext
hooks: CreatedHooks hooks: CreatedHooks
backgroundManager?: Pick<BackgroundManager, "getTasksByParentSession">
}): ( }): (
input: { tool: string; sessionID: string; callID: string }, input: { tool: string; sessionID: string; callID: string },
output: { args: Record<string, unknown> }, output: { args: Record<string, unknown> },
) => Promise<void> { ) => Promise<void> {
const { ctx, hooks } = args const { ctx, hooks, backgroundManager } = args
function hasPendingCouncilMembers(sessionID: string): boolean {
if (!backgroundManager) {
return false
}
const tasks = backgroundManager.getTasksByParentSession(sessionID)
return tasks.some((task) =>
task.agent === "council-member" &&
(task.status === "pending" || task.status === "running")
)
}
return async (input, output): Promise<void> => { return async (input, output): Promise<void> => {
const toolNameLower = input.tool?.toLowerCase()
if (toolNameLower === "question" || toolNameLower === "askuserquestion" || toolNameLower === "ask_user_question" || toolNameLower === "switch_agent") {
const sessionAgent = await resolveSessionAgent(ctx.client, input.sessionID)
const sessionAgentKey = sessionAgent ? getAgentConfigKey(sessionAgent) : undefined
if (sessionAgentKey === "athena" && hasPendingCouncilMembers(input.sessionID)) {
throw new Error(
"Council members are still running. Wait for all launched members to finish and collect their outputs before asking next-step questions or switching agents."
)
}
}
await hooks.writeExistingFileGuard?.["tool.execute.before"]?.(input, output) await hooks.writeExistingFileGuard?.["tool.execute.before"]?.(input, output)
await hooks.questionLabelTruncator?.["tool.execute.before"]?.(input, output) await hooks.questionLabelTruncator?.["tool.execute.before"]?.(input, output)
await hooks.claudeCodeHooks?.["tool.execute.before"]?.(input, output) await hooks.claudeCodeHooks?.["tool.execute.before"]?.(input, output)

View File

@ -1,9 +1,10 @@
export const ATHENA_COUNCIL_TOOL_DESCRIPTION_TEMPLATE = `Execute Athena's multi-model council. Launches council members as background tasks and returns their task IDs immediately. export const ATHENA_COUNCIL_TOOL_DESCRIPTION_TEMPLATE = `Execute Athena's multi-model council for exactly ONE member per call.
Optionally pass a members array of member names or model IDs to consult only specific council members. If omitted, all configured members are consulted. Pass members as a single-item array containing one member name or model ID. Athena should call this tool once per selected member.
This tool launches the selected member as a background task and returns task/session metadata immediately.
Use background_output(task_id=..., block=true) to collect each member result.
{members} {members}
Use background_output(task_id=...) to retrieve each member's response. The system will notify you when tasks complete.
IMPORTANT: This tool is designed for Athena agent use only. It requires council configuration to be present.` IMPORTANT: This tool is designed for Athena agent use only. It requires council configuration to be present.`

View File

@ -1,116 +0,0 @@
import type { BackgroundManager } from "../../features/background-agent"
import type { CouncilLaunchedMember } from "../../agents/athena/types"
import type { BackgroundOutputClient, BackgroundOutputMessagesResult } from "../background-task/clients"
import { extractMessages, getErrorMessage } from "../background-task/session-messages"
const POLL_INTERVAL_MS = 2_000
const DEFAULT_TIMEOUT_MS = 5 * 60 * 1_000
export interface CollectedMemberResult {
name: string
model: string
taskId: string
status: "completed" | "error" | "cancelled" | "timeout"
content: string
}
export interface CollectedCouncilResults {
results: CollectedMemberResult[]
allCompleted: boolean
}
/**
* Waits for all launched council members to complete, then fetches their
* session messages and returns extracted text content.
*
* This replaces the previous flow where Athena had to manually poll
* background_output for each member, which created excessive UI noise.
*/
export async function collectCouncilResults(
launched: CouncilLaunchedMember[],
manager: BackgroundManager,
client: BackgroundOutputClient,
abort?: AbortSignal,
timeoutMs = DEFAULT_TIMEOUT_MS
): Promise<CollectedCouncilResults> {
const pendingIds = new Set(launched.map((m) => m.taskId))
const completedMap = new Map<string, "completed" | "error" | "cancelled">()
const deadline = Date.now() + timeoutMs
while (pendingIds.size > 0 && Date.now() < deadline) {
if (abort?.aborted) break
for (const taskId of pendingIds) {
const task = manager.getTask(taskId)
if (!task) {
completedMap.set(taskId, "error")
pendingIds.delete(taskId)
continue
}
if (task.status === "completed") {
completedMap.set(taskId, "completed")
pendingIds.delete(taskId)
} else if (task.status === "error" || task.status === "cancelled" || task.status === "interrupt") {
completedMap.set(taskId, task.status === "interrupt" ? "cancelled" : task.status)
pendingIds.delete(taskId)
}
}
if (pendingIds.size > 0) {
await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS))
}
}
const results: CollectedMemberResult[] = []
for (const entry of launched) {
const memberName = entry.member.name ?? entry.member.model
const status = completedMap.get(entry.taskId) ?? "timeout"
if (status !== "completed") {
results.push({ name: memberName, model: entry.member.model, taskId: entry.taskId, status, content: "" })
continue
}
const content = await fetchMemberContent(entry.taskId, manager, client)
results.push({ name: memberName, model: entry.member.model, taskId: entry.taskId, status, content })
}
return {
results,
allCompleted: pendingIds.size === 0,
}
}
async function fetchMemberContent(
taskId: string,
manager: BackgroundManager,
client: BackgroundOutputClient
): Promise<string> {
const task = manager.getTask(taskId)
if (!task?.sessionID) return "(No session available)"
const messagesResult: BackgroundOutputMessagesResult = await client.session.messages({
path: { id: task.sessionID },
})
const errorMsg = getErrorMessage(messagesResult)
if (errorMsg) return `(Error fetching results: ${errorMsg})`
const messages = extractMessages(messagesResult)
if (!Array.isArray(messages) || messages.length === 0) return "(No messages found)"
const assistantMessages = messages.filter((m) => m.info?.role === "assistant")
if (assistantMessages.length === 0) return "(No assistant response found)"
const textParts: string[] = []
for (const message of assistantMessages) {
for (const part of message.parts ?? []) {
if ((part.type === "text" || part.type === "reasoning") && part.text) {
textParts.push(part.text)
}
}
}
return textParts.join("\n\n") || "(No text content)"
}

View File

@ -3,21 +3,8 @@
import { describe, expect, test } from "bun:test" import { describe, expect, test } from "bun:test"
import type { BackgroundManager } from "../../features/background-agent" import type { BackgroundManager } from "../../features/background-agent"
import type { BackgroundTask } from "../../features/background-agent/types" import type { BackgroundTask } from "../../features/background-agent/types"
import type { BackgroundOutputClient } from "../background-task/clients"
import { createAthenaCouncilTool, filterCouncilMembers } from "./tools" import { createAthenaCouncilTool, filterCouncilMembers } from "./tools"
const mockClient = {
session: {
messages: async () => ({
data: [{
id: "msg-1",
info: { role: "assistant" },
parts: [{ type: "text", text: "Test analysis result" }],
}],
}),
},
} as unknown as BackgroundOutputClient
const mockManager = { const mockManager = {
getTask: () => undefined, getTask: () => undefined,
launch: async () => { launch: async () => {
@ -38,7 +25,7 @@ const configuredMembers = [
{ model: "google/gemini-3-pro" }, { model: "google/gemini-3-pro" },
] ]
function createCompletedTask(id: string): BackgroundTask { function createRunningTask(id: string, sessionID = `ses-${id}`): BackgroundTask {
return { return {
id, id,
parentSessionID: "session-1", parentSessionID: "session-1",
@ -46,288 +33,168 @@ function createCompletedTask(id: string): BackgroundTask {
description: `Council member task ${id}`, description: `Council member task ${id}`,
prompt: "prompt", prompt: "prompt",
agent: "council-member", agent: "council-member",
status: "completed", status: "running",
sessionID: `ses-${id}`, sessionID,
} }
} }
describe("filterCouncilMembers", () => { describe("filterCouncilMembers", () => {
test("returns all members when selection is undefined", () => { test("returns all members when selection is undefined", () => {
// #given const result = filterCouncilMembers(configuredMembers, undefined)
const selectedMembers = undefined
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then
expect(result.members).toEqual(configuredMembers) expect(result.members).toEqual(configuredMembers)
expect(result.error).toBeUndefined() expect(result.error).toBeUndefined()
}) })
test("returns all members when selection is empty", () => { test("returns all members when selection is empty", () => {
// #given const result = filterCouncilMembers(configuredMembers, [])
const selectedMembers: string[] = []
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then
expect(result.members).toEqual(configuredMembers) expect(result.members).toEqual(configuredMembers)
expect(result.error).toBeUndefined() expect(result.error).toBeUndefined()
}) })
test("filters members using case-insensitive name and model matching", () => { test("filters members using case-insensitive name and model matching", () => {
// #given const result = filterCouncilMembers(configuredMembers, ["gpt", "GOOGLE/GEMINI-3-PRO"])
const selectedMembers = ["gpt", "GOOGLE/GEMINI-3-PRO"]
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then
expect(result.members).toEqual([configuredMembers[1], configuredMembers[2]]) expect(result.members).toEqual([configuredMembers[1], configuredMembers[2]])
expect(result.error).toBeUndefined() expect(result.error).toBeUndefined()
}) })
test("returns helpful error when selected members are not configured", () => { test("returns helpful error when selected members are not configured", () => {
// #given const result = filterCouncilMembers(configuredMembers, ["mistral", "xai/grok-3"])
const selectedMembers = ["mistral", "xai/grok-3"]
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then
expect(result.members).toEqual([]) expect(result.members).toEqual([])
expect(result.error).toBe( expect(result.error).toBe(
"Unknown council members: mistral, xai/grok-3. Available members: Claude, GPT, google/gemini-3-pro." "Unknown council members: mistral, xai/grok-3. Available members: Claude, GPT, google/gemini-3-pro."
) )
}) })
test("selects named member by model ID when name differs from model", () => {
// #given - "Claude" has name "Claude" but model "anthropic/claude-sonnet-4-5"
const selectedMembers = ["anthropic/claude-sonnet-4-5"]
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then - should find the member by model ID even though it has a custom name
expect(result.members).toEqual([configuredMembers[0]])
expect(result.error).toBeUndefined()
})
test("deduplicates when same member is selected by both name and model", () => { test("deduplicates when same member is selected by both name and model", () => {
// #given const result = filterCouncilMembers(configuredMembers, ["Claude", "anthropic/claude-sonnet-4-5"])
const selectedMembers = ["Claude", "anthropic/claude-sonnet-4-5"]
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then - should return only one copy
expect(result.members).toEqual([configuredMembers[0]]) expect(result.members).toEqual([configuredMembers[0]])
expect(result.error).toBeUndefined() expect(result.error).toBeUndefined()
}) })
test("returns error listing only unmatched names when partially matched", () => {
// #given
const selectedMembers = ["claude", "non-existent"]
// #when
const result = filterCouncilMembers(configuredMembers, selectedMembers)
// #then
expect(result.members).toEqual([])
expect(result.error).toBe(
"Unknown council members: non-existent. Available members: Claude, GPT, google/gemini-3-pro."
)
})
}) })
describe("createAthenaCouncilTool", () => { describe("createAthenaCouncilTool", () => {
test("returns error when councilConfig is undefined", async () => { test("returns error when councilConfig is undefined", async () => {
// #given
const athenaCouncilTool = createAthenaCouncilTool({ const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: mockManager, backgroundManager: mockManager,
councilConfig: undefined, councilConfig: undefined,
client: mockClient,
}) })
// #when
const result = await athenaCouncilTool.execute({ question: "How should we proceed?" }, mockToolContext) const result = await athenaCouncilTool.execute({ question: "How should we proceed?" }, mockToolContext)
// #then
expect(result).toBe("Athena council not configured. Add agents.athena.council.members to your config.") expect(result).toBe("Athena council not configured. Add agents.athena.council.members to your config.")
}) })
test("returns error when councilConfig has empty members", async () => { test("returns error when councilConfig has empty members", async () => {
// #given
const athenaCouncilTool = createAthenaCouncilTool({ const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: mockManager, backgroundManager: mockManager,
councilConfig: { members: [] }, councilConfig: { members: [] },
client: mockClient,
}) })
// #when
const result = await athenaCouncilTool.execute({ question: "Any concerns?" }, mockToolContext) const result = await athenaCouncilTool.execute({ question: "Any concerns?" }, mockToolContext)
// #then
expect(result).toBe("Athena council not configured. Add agents.athena.council.members to your config.") expect(result).toBe("Athena council not configured. Add agents.athena.council.members to your config.")
}) })
test("uses expected description and question arg schema", () => {
// #given
const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: mockManager,
councilConfig: { members: [{ model: "openai/gpt-5.3-codex" }] },
client: mockClient,
})
// #then - description should be dynamic and include the member model
expect(athenaCouncilTool.description).toContain("openai/gpt-5.3-codex")
expect(athenaCouncilTool.description).toContain("Available council members:")
expect((athenaCouncilTool as { args: Record<string, unknown> }).args.question).toBeDefined()
expect((athenaCouncilTool as { args: Record<string, unknown> }).args.members).toBeDefined()
})
test("returns helpful error when members contains invalid names", async () => { test("returns helpful error when members contains invalid names", async () => {
// #given
const athenaCouncilTool = createAthenaCouncilTool({ const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: mockManager, backgroundManager: mockManager,
councilConfig: { members: configuredMembers }, councilConfig: { members: configuredMembers },
client: mockClient,
})
const toolArgs = {
question: "Who should investigate this?",
members: ["unknown-model"],
}
// #when
const result = await athenaCouncilTool.execute(toolArgs, mockToolContext)
// #then
expect(result).toBe("Unknown council members: unknown-model. Available members: Claude, GPT, google/gemini-3-pro.")
})
test("returns collected markdown results for all configured council members", async () => {
// #given
let launchCount = 0
const taskStore = new Map<string, BackgroundTask>()
const launchManager = {
launch: async () => {
launchCount += 1
const task = createCompletedTask(`bg-${launchCount}`)
taskStore.set(task.id, task)
return task
},
getTask: (id: string) => taskStore.get(id),
} as unknown as BackgroundManager
const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: launchManager,
councilConfig: { members: configuredMembers },
client: mockClient,
}) })
// #when
const result = await athenaCouncilTool.execute({ question: "How should we proceed?" }, mockToolContext)
// #then - returns markdown with council results, one section per member
expect(result).toContain("## Council Results")
expect(result).toContain("How should we proceed?")
expect(result).toContain("### Claude (anthropic/claude-sonnet-4-5)")
expect(result).toContain("### GPT (openai/gpt-5.3-codex)")
expect(result).toContain("### google/gemini-3-pro (google/gemini-3-pro)")
expect(result).toContain("Test analysis result")
})
test("returns collected results only for selected members", async () => {
// #given
let launchCount = 0
const taskStore = new Map<string, BackgroundTask>()
const launchManager = {
launch: async () => {
launchCount += 1
const task = createCompletedTask(`bg-${launchCount}`)
taskStore.set(task.id, task)
return task
},
getTask: (id: string) => taskStore.get(id),
} as unknown as BackgroundManager
const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: launchManager,
councilConfig: { members: configuredMembers },
client: mockClient,
})
// #when
const result = await athenaCouncilTool.execute( const result = await athenaCouncilTool.execute(
{ { question: "Who should investigate this?", members: ["unknown-model"] },
question: "Who should investigate this?",
members: ["GPT", "google/gemini-3-pro"],
},
mockToolContext mockToolContext
) )
// #then - only selected members appear in output expect(result).toBe("Unknown council members: unknown-model. Available members: Claude, GPT, google/gemini-3-pro.")
expect(result).toContain("### GPT (openai/gpt-5.3-codex)")
expect(result).toContain("### google/gemini-3-pro (google/gemini-3-pro)")
expect(result).not.toContain("### Claude")
expect(launchCount).toBe(2)
}) })
test("includes launch failures alongside successful member results", async () => { test("returns selection error when members are omitted", async () => {
// #given const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: mockManager,
councilConfig: { members: configuredMembers },
})
const result = await athenaCouncilTool.execute({ question: "How should we proceed?" }, mockToolContext)
expect(result).toBe(
"athena_council runs one member per call. Pass exactly one member in members (single-item array). Available members: Claude, GPT, google/gemini-3-pro."
)
})
test("returns selection error when multiple members are provided", async () => {
const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: mockManager,
councilConfig: { members: configuredMembers },
})
const result = await athenaCouncilTool.execute(
{ question: "How should we proceed?", members: ["Claude", "GPT"] },
mockToolContext
)
expect(result).toBe(
"athena_council runs one member per call. Pass exactly one member in members (single-item array). Available members: Claude, GPT, google/gemini-3-pro."
)
})
test("launches selected member and returns background task metadata", async () => {
let launchCount = 0 let launchCount = 0
const taskStore = new Map<string, BackgroundTask>() const taskStore = new Map<string, BackgroundTask>()
const launchManager = { const launchManager = {
launch: async () => { launch: async () => {
launchCount += 1 launchCount += 1
if (launchCount === 2) { const task = createRunningTask(`bg-${launchCount}`)
throw new Error("provider outage")
}
const task = createCompletedTask(`bg-${launchCount}`)
taskStore.set(task.id, task) taskStore.set(task.id, task)
return task return task
}, },
getTask: (id: string) => taskStore.get(id), getTask: (id: string) => taskStore.get(id),
} as unknown as BackgroundManager } as unknown as BackgroundManager
const athenaCouncilTool = createAthenaCouncilTool({ const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: launchManager, backgroundManager: launchManager,
councilConfig: { members: configuredMembers }, councilConfig: { members: configuredMembers },
client: mockClient,
}) })
// #when const result = await athenaCouncilTool.execute(
const result = await athenaCouncilTool.execute({ question: "Any concerns?" }, mockToolContext) { question: "Who should investigate this?", members: ["GPT"] },
mockToolContext
)
// #then - successful members have results, failed member listed in failures section expect(launchCount).toBe(1)
expect(result).toContain("### Claude (anthropic/claude-sonnet-4-5)") expect(result).toContain("Council member launched in background.")
expect(result).toContain("### google/gemini-3-pro (google/gemini-3-pro)") expect(result).toContain("Task ID: bg-1")
expect(result).toContain("Session ID: ses-bg-1")
expect(result).toContain("Member: GPT")
expect(result).toContain("Model: openai/gpt-5.3-codex")
expect(result).toContain("Status: running")
expect(result).toContain("background_output")
expect(result).toContain("task_id=\"bg-1\"")
expect(result).toContain("<task_metadata>")
expect(result).toContain("session_id: ses-bg-1")
})
test("returns launch failure details when selected member fails", async () => {
const launchManager = {
launch: async () => {
throw new Error("provider outage")
},
getTask: () => undefined,
} as unknown as BackgroundManager
const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: launchManager,
councilConfig: { members: configuredMembers },
})
const result = await athenaCouncilTool.execute(
{ question: "Any concerns?", members: ["GPT"] },
mockToolContext
)
expect(result).toContain("Failed to launch council member.")
expect(result).toContain("### Launch Failures") expect(result).toContain("### Launch Failures")
expect(result).toContain("**GPT**") expect(result).toContain("**GPT**")
expect(result).toContain("provider outage") expect(result).toContain("provider outage")
}) })
test("returns dedup error when council is already running in same session", async () => {
// #given - use a never-resolving launch to keep the first execution in-flight
const pendingLaunch = new Promise<BackgroundTask>(() => {})
const launchManager = {
launch: async () => pendingLaunch,
getTask: () => undefined,
} as unknown as BackgroundManager
const athenaCouncilTool = createAthenaCouncilTool({
backgroundManager: launchManager,
councilConfig: { members: [{ model: "openai/gpt-5.3-codex" }] },
client: mockClient,
})
// #when - first call starts but never resolves (stuck in launch)
// second call should be rejected by session guard
const _firstExecution = athenaCouncilTool.execute({ question: "First run" }, mockToolContext)
// Allow microtask queue to process so markCouncilRunning is called
await new Promise((resolve) => setTimeout(resolve, 0))
const secondExecution = await athenaCouncilTool.execute({ question: "Second run" }, mockToolContext)
// #then
expect(secondExecution).toBe("Council is already running for this session. Wait for the current council execution to complete.")
})
}) })

View File

@ -2,13 +2,11 @@ import { tool, type ToolDefinition } from "@opencode-ai/plugin"
import { executeCouncil } from "../../agents/athena/council-orchestrator" import { executeCouncil } from "../../agents/athena/council-orchestrator"
import type { CouncilConfig, CouncilMemberConfig } from "../../agents/athena/types" import type { CouncilConfig, CouncilMemberConfig } from "../../agents/athena/types"
import type { BackgroundManager } from "../../features/background-agent" import type { BackgroundManager } from "../../features/background-agent"
import type { BackgroundOutputClient } from "../background-task/clients"
import { ATHENA_COUNCIL_TOOL_DESCRIPTION_TEMPLATE } from "./constants" import { ATHENA_COUNCIL_TOOL_DESCRIPTION_TEMPLATE } from "./constants"
import { createCouncilLauncher } from "./council-launcher" import { createCouncilLauncher } from "./council-launcher"
import { isCouncilRunning, markCouncilDone, markCouncilRunning } from "./session-guard"
import { waitForCouncilSessions } from "./session-waiter" import { waitForCouncilSessions } from "./session-waiter"
import { collectCouncilResults } from "./result-collector"
import type { AthenaCouncilToolArgs } from "./types" import type { AthenaCouncilToolArgs } from "./types"
import { storeToolMetadata } from "../../features/tool-metadata-store"
function isCouncilConfigured(councilConfig: CouncilConfig | undefined): councilConfig is CouncilConfig { function isCouncilConfigured(councilConfig: CouncilConfig | undefined): councilConfig is CouncilConfig {
return Boolean(councilConfig && councilConfig.members.length > 0) return Boolean(councilConfig && councilConfig.members.length > 0)
@ -19,6 +17,11 @@ interface FilterCouncilMembersResult {
error?: string error?: string
} }
function buildSingleMemberSelectionError(members: CouncilMemberConfig[]): string {
const availableNames = members.map((member) => member.name ?? member.model).join(", ")
return `athena_council runs one member per call. Pass exactly one member in members (single-item array). Available members: ${availableNames}.`
}
export function filterCouncilMembers( export function filterCouncilMembers(
members: CouncilMemberConfig[], members: CouncilMemberConfig[],
selectedNames: string[] | undefined selectedNames: string[] | undefined
@ -77,9 +80,8 @@ function buildToolDescription(councilConfig: CouncilConfig | undefined): string
export function createAthenaCouncilTool(args: { export function createAthenaCouncilTool(args: {
backgroundManager: BackgroundManager backgroundManager: BackgroundManager
councilConfig: CouncilConfig | undefined councilConfig: CouncilConfig | undefined
client: BackgroundOutputClient
}): ToolDefinition { }): ToolDefinition {
const { backgroundManager, councilConfig, client } = args const { backgroundManager, councilConfig } = args
const description = buildToolDescription(councilConfig) const description = buildToolDescription(councilConfig)
return tool({ return tool({
@ -89,7 +91,7 @@ export function createAthenaCouncilTool(args: {
members: tool.schema members: tool.schema
.array(tool.schema.string()) .array(tool.schema.string())
.optional() .optional()
.describe("Optional list of council member names or models to consult. Defaults to all configured members."), .describe("Single-item list containing exactly one council member name or model ID."),
}, },
async execute(toolArgs: AthenaCouncilToolArgs, toolContext) { async execute(toolArgs: AthenaCouncilToolArgs, toolContext) {
if (!isCouncilConfigured(councilConfig)) { if (!isCouncilConfigured(councilConfig)) {
@ -100,83 +102,82 @@ export function createAthenaCouncilTool(args: {
if (filteredMembers.error) { if (filteredMembers.error) {
return filteredMembers.error return filteredMembers.error
} }
if (filteredMembers.members.length !== 1) {
if (isCouncilRunning(toolContext.sessionID)) { return buildSingleMemberSelectionError(councilConfig.members)
return "Council is already running for this session. Wait for the current council execution to complete."
} }
markCouncilRunning(toolContext.sessionID) const execution = await executeCouncil({
try { question: toolArgs.question,
const execution = await executeCouncil({ council: { members: filteredMembers.members },
question: toolArgs.question, launcher: createCouncilLauncher(backgroundManager),
council: { members: filteredMembers.members }, parentSessionID: toolContext.sessionID,
launcher: createCouncilLauncher(backgroundManager), parentMessageID: toolContext.messageID,
parentSessionID: toolContext.sessionID, parentAgent: toolContext.agent,
parentMessageID: toolContext.messageID, })
parentAgent: toolContext.agent,
})
// Register metadata for UI visibility (makes sessions clickable in TUI). if (execution.launched.length === 0) {
const metadataFn = (toolContext as Record<string, unknown>).metadata as return formatCouncilLaunchFailure(execution.failures)
| ((input: { title?: string; metadata?: Record<string, unknown> }) => Promise<void>) }
| undefined
if (metadataFn && execution.launched.length > 0) { const launched = execution.launched[0]
const sessions = await waitForCouncilSessions( const launchedMemberName = launched?.member.name ?? launched?.member.model
execution.launched, backgroundManager, toolContext.abort const launchedMemberModel = launched?.member.model ?? "unknown"
) const launchedTaskId = launched?.taskId ?? "unknown"
for (const session of sessions) {
await metadataFn({ const sessionInfos = await waitForCouncilSessions(execution.launched, backgroundManager, toolContext.abort)
title: `Council: ${session.memberName}`, const launchedSession = sessionInfos.find((session) => session.taskId === launchedTaskId)
metadata: { const sessionId = launchedSession?.sessionId ?? "pending"
sessionId: session.sessionId,
agent: "council-member", const metadataFn = (toolContext as Record<string, unknown>).metadata as
model: session.model, | ((input: { title?: string; metadata?: Record<string, unknown> }) => Promise<void>)
description: `Council member: ${session.memberName}`, | undefined
}, if (metadataFn) {
}) const memberMetadata = {
} title: `Council: ${launchedMemberName}`,
metadata: {
sessionId,
agent: "council-member",
model: launchedMemberModel,
description: `Council member: ${launchedMemberName}`,
},
} }
await metadataFn(memberMetadata)
// Wait for all members to complete and collect their actual results. const callID = (toolContext as Record<string, unknown>).callID
// This eliminates the need for Athena to poll background_output repeatedly. if (typeof callID === "string") {
const collected = await collectCouncilResults( storeToolMetadata(toolContext.sessionID, callID, memberMetadata)
execution.launched, backgroundManager, client, toolContext.abort }
)
return formatCouncilOutput(toolArgs.question, collected.results, execution.failures)
} catch (error) {
throw error
} finally {
markCouncilDone(toolContext.sessionID)
} }
return `Council member launched in background.
Task ID: ${launchedTaskId}
Session ID: ${sessionId}
Member: ${launchedMemberName}
Model: ${launchedMemberModel}
Status: running
Use \`background_output\` with task_id="${launchedTaskId}" to collect this member's result.
- block=true: Wait for completion and return the result
- full_session=true: Include full session messages when needed
<task_metadata>
session_id: ${sessionId}
</task_metadata>`
}, },
}) })
} }
function formatCouncilOutput( function formatCouncilLaunchFailure(
question: string,
results: Array<{ name: string; model: string; taskId: string; status: string; content: string }>,
failures: Array<{ member: { name?: string; model: string }; error: string }> failures: Array<{ member: { name?: string; model: string }; error: string }>
): string { ): string {
const sections: string[] = [] if (failures.length === 0) {
return "Failed to launch council member."
sections.push(`## Council Results\n\n**Question:** ${question}\n`)
for (const result of results) {
const header = `### ${result.name} (${result.model})`
if (result.status !== "completed") {
sections.push(`${header}\n\n*Status: ${result.status}*`)
continue
}
sections.push(`${header}\n\n${result.content}`)
} }
if (failures.length > 0) { const failureLines = failures
const failureLines = failures .map((failure) => `- **${failure.member.name ?? failure.member.model}**: ${failure.error}`)
.map((f) => `- **${f.member.name ?? f.member.model}**: ${f.error}`) .join("\n")
.join("\n")
sections.push(`### Launch Failures\n\n${failureLines}`)
}
return sections.join("\n\n---\n\n") return `Failed to launch council member.\n\n### Launch Failures\n\n${failureLines}`
} }