YeonGyu-Kim 8394926fe1
[ORCHESTRATOR TEST] feat(auth): multi-account Google Antigravity auth with automatic rotation (#579)
* feat(auth): add multi-account types and storage layer

Add foundation for multi-account Google Antigravity auth:
- ModelFamily, AccountTier, RateLimitState types for rate limit tracking
- AccountMetadata, AccountStorage, ManagedAccount interfaces
- Cross-platform storage module with XDG_DATA_HOME/APPDATA support
- Comprehensive test coverage for storage operations

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(auth): implement AccountManager for multi-account rotation

Add AccountManager class with automatic account rotation:
- Per-family rate limit tracking (claude, gemini-flash, gemini-pro)
- Paid tier prioritization in rotation logic
- Round-robin account selection within tier pools
- Account add/remove operations with index management
- Storage persistence integration

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(auth): add CLI prompts for multi-account setup

Add @clack/prompts-based CLI utilities:
- promptAddAnotherAccount() for multi-account flow
- promptAccountTier() for free/paid tier selection
- Non-TTY environment handling (graceful skip)

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(auth): integrate multi-account OAuth flow into plugin

Enhance OAuth flow for multi-account support:
- Prompt for additional accounts after first OAuth (up to 10)
- Collect email and tier for each account
- Save accounts to storage via AccountManager
- Load AccountManager in loader() from stored accounts
- Toast notifications for account authentication success
- Backward compatible with single-account flow

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(auth): add rate limit rotation to fetch interceptor

Integrate AccountManager into fetch for automatic rotation:
- Model family detection from URL (claude/gemini-flash/gemini-pro)
- Rate limit detection (429 with retry-after > 5s, 5xx errors)
- Mark rate-limited accounts and rotate to next available
- Recursive retry with new account on rotation
- Lazy load accounts from storage on first request
- Debug logging for account switches

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(cli): add auth account management commands

Add CLI commands for managing Google Antigravity accounts:
- `auth list`: Show all accounts with email, tier, rate limit status
- `auth remove <index|email>`: Remove account by index or email
- Help text with usage examples
- Active account indicator and remaining rate limit display

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(auth): address review feedback - remove duplicate ManagedAccount and reuse fetch function

- Remove unused ManagedAccount interface from types.ts (duplicate of accounts.ts)
- Reuse fetchFn in rate limit retry instead of creating new fetch closure
  Preserves cachedTokens, cachedProjectId, fetchInstanceId, accountsLoaded state

* fix(auth): address Cubic review feedback (8 issues)

P1 fixes:
- storage.ts: Use mode 0o600 for OAuth credentials file (security)
- fetch.ts: Return original 5xx status instead of synthesized 429
- accounts.ts: Adjust activeIndex/currentIndex in removeAccount
- plugin.ts: Fix multi-account migration to split on ||| not |

P2 fixes:
- cli.ts: Remove confusing cancel message when returning default
- auth.ts: Use strict parseInt check to prevent partial matches
- storage.test.ts: Use try/finally for env var cleanup

* refactor(test): import ManagedAccount from accounts.ts instead of duplicating

* fix(auth): address Oracle review findings (P1/P2)

P1 fixes:
- Clear cachedProjectId on account change to prevent stale project IDs
- Continue endpoint fallback for single-account users on rate limit
- Restore access/expires tokens from storage for non-active accounts
- Re-throw non-ENOENT filesystem errors (keep returning null for parse errors)
- Use atomic write (temp file + rename) for account storage

P2 fixes:
- Derive RateLimitState type from ModelFamily using mapped type
- Add MODEL_FAMILIES constant and use dynamic iteration in clearExpiredRateLimits
- Add missing else branch in storage.test.ts env cleanup
- Handle open() errors gracefully with user-friendly toast message

Tests updated to reflect correct behavior for token restoration.

* fix(auth): address Cubic review round 2 (5 issues)

P1: Return original 429/5xx response on last endpoint instead of generic 503
P2: Use unique temp filename (pid+timestamp) and cleanup on rename failure
P2: Clear cachedProjectId when first account introduced (lastAccountIndex null)
P3: Add console.error logging to open() catch block

* test(auth): add AccountManager removeAccount index tests

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* test(auth): add storage layer security and atomicity tests

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* fix(auth): address Cubic review round 3 (4 issues)

P1 Fixes:
- plugin.ts: Validate refresh_token before constructing first account
- plugin.ts: Validate additionalTokens.refresh_token before pushing accounts
- fetch.ts: Reset cachedTokens when switching accounts during rotation

P2 Fixes:
- fetch.ts: Improve model-family detection (parse model from body, fallback to URL)

* fix(auth): address Cubic review round 4 (3 issues)

P1 Fixes:
- plugin.ts: Close serverHandle before early return on missing refresh_token
- plugin.ts: Close additionalServerHandle before continue on missing refresh_token

P2 Fixes:
- fetch.ts: Remove overly broad 'pro' matching in getModelFamilyFromModelName

* fix(auth): address Cubic review round 5 (9 issues)

P1 Fixes:
- plugin.ts: Close additionalServerHandle after successful account auth
- fetch.ts: Cancel response body on 429/5xx to prevent connection leaks

P2 Fixes:
- plugin.ts: Close additionalServerHandle on OAuth error/missing code
- plugin.ts: Close additionalServerHandle on verifier mismatch
- auth.ts: Set activeIndex to -1 when all accounts removed
- storage.ts: Use shared getDataDir utility for consistent paths
- fetch.ts: Catch loadAccounts IO errors with graceful fallback
- storage.test.ts: Improve test assertions with proper error tracking

* feat(antigravity): add system prompt and thinking config constants

* feat(antigravity): add reasoning_effort and Gemini 3 thinkingLevel support

* feat(antigravity): inject system prompt into all requests

* feat(antigravity): integrate thinking config and system prompt in fetch layer

* feat(auth): auto-open browser for OAuth login on all platforms

* fix(auth): add alias2ModelName for Antigravity Claude models

Root cause: Antigravity API expects 'claude-sonnet-4-5-thinking' but we were
sending 'gemini-claude-sonnet-4-5-thinking'. Ported alias mapping from
CLIProxyAPI antigravity_executor.go:1328-1347.

Transforms:
- gemini-claude-sonnet-4-5-thinking → claude-sonnet-4-5-thinking
- gemini-claude-opus-4-5-thinking → claude-opus-4-5-thinking
- gemini-3-pro-preview → gemini-3-pro-high
- gemini-3-flash-preview → gemini-3-flash

* fix(auth): add requestType and toolConfig for Antigravity API

Missing required fields from CLIProxyAPI implementation:
- requestType: 'agent'
- request.toolConfig.functionCallingConfig.mode: 'VALIDATED'
- Delete request.safetySettings

Also strip 'antigravity-' prefix before alias transformation.

* fix(auth): remove broken alias2ModelName transformations for Gemini 3

CLIProxyAPI's alias mappings don't work with public Antigravity API:
- gemini-3-pro-preview → gemini-3-pro-high (404!)
- gemini-3-flash-preview → gemini-3-flash (404!)

Tested: -preview suffix names work, transformed names return 404.
Keep only gemini-claude-* prefix stripping for future Claude support.

* fix(auth): implement correct alias2ModelName transformations for Antigravity API

Implements explicit switch-based model name mappings for Antigravity API.
Updates SANDBOX endpoint constants to clarify quota/availability behavior.
Fixes test expectations to match new transformation logic.

🤖 Generated with assistance of OhMyOpenCode

---------

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-01-08 22:37:38 +09:00

756 lines
21 KiB
TypeScript

/**
* Antigravity Thinking Block Handler (Gemini only)
*
* Handles extraction and transformation of thinking/reasoning blocks
* from Gemini responses. Thinking blocks contain the model's internal
* reasoning process, available in `-high` model variants.
*
* Key responsibilities:
* - Extract thinking blocks from Gemini response format
* - Detect thinking-capable model variants (`-high` suffix)
* - Format thinking blocks for OpenAI-compatible output
*
* Note: This is Gemini-only. Claude models are NOT handled by Antigravity.
*/
import {
normalizeModelId,
ANTIGRAVITY_MODEL_CONFIGS,
REASONING_EFFORT_BUDGET_MAP,
type AntigravityModelConfig,
} from "./constants"
/**
* Represents a single thinking/reasoning block extracted from Gemini response
*/
export interface ThinkingBlock {
/** The thinking/reasoning text content */
text: string
/** Optional signature for signed thinking blocks (required for multi-turn) */
signature?: string
/** Index of the thinking block in sequence */
index?: number
}
/**
* Raw part structure from Gemini response candidates
*/
export interface GeminiPart {
/** Text content of the part */
text?: string
/** Whether this part is a thinking/reasoning block */
thought?: boolean
/** Signature for signed thinking blocks */
thoughtSignature?: string
/** Type field for Anthropic-style format */
type?: string
/** Signature field for Anthropic-style format */
signature?: string
}
/**
* Gemini response candidate structure
*/
export interface GeminiCandidate {
/** Content containing parts */
content?: {
/** Role of the content (e.g., "model", "assistant") */
role?: string
/** Array of content parts */
parts?: GeminiPart[]
}
/** Index of the candidate */
index?: number
}
/**
* Gemini response structure for thinking block extraction
*/
export interface GeminiResponse {
/** Response ID */
id?: string
/** Array of response candidates */
candidates?: GeminiCandidate[]
/** Direct content (some responses use this instead of candidates) */
content?: Array<{
type?: string
text?: string
signature?: string
}>
/** Model used for response */
model?: string
}
/**
* Result of thinking block extraction
*/
export interface ThinkingExtractionResult {
/** Extracted thinking blocks */
thinkingBlocks: ThinkingBlock[]
/** Combined thinking text for convenience */
combinedThinking: string
/** Whether any thinking blocks were found */
hasThinking: boolean
}
/**
* Default thinking budget in tokens for thinking-enabled models
*/
export const DEFAULT_THINKING_BUDGET = 16000
/**
* Check if a model variant should include thinking blocks
*
* Returns true for model variants with `-high` suffix, which have
* extended thinking capability enabled.
*
* Examples:
* - `gemini-3-pro-high` → true
* - `gemini-2.5-pro-high` → true
* - `gemini-3-pro-preview` → false
* - `gemini-2.5-pro` → false
*
* @param model - Model identifier string
* @returns True if model should include thinking blocks
*/
export function shouldIncludeThinking(model: string): boolean {
if (!model || typeof model !== "string") {
return false
}
const lowerModel = model.toLowerCase()
// Check for -high suffix (primary indicator of thinking capability)
if (lowerModel.endsWith("-high")) {
return true
}
// Also check for explicit thinking in model name
if (lowerModel.includes("thinking")) {
return true
}
return false
}
/**
* Check if a model is thinking-capable (broader check)
*
* This is a broader check than shouldIncludeThinking - it detects models
* that have thinking capability, even if not explicitly requesting thinking output.
*
* @param model - Model identifier string
* @returns True if model supports thinking/reasoning
*/
export function isThinkingCapableModel(model: string): boolean {
if (!model || typeof model !== "string") {
return false
}
const lowerModel = model.toLowerCase()
return (
lowerModel.includes("thinking") ||
lowerModel.includes("gemini-3") ||
lowerModel.endsWith("-high")
)
}
/**
* Check if a part is a thinking/reasoning block
*
* Detects both Gemini-style (thought: true) and Anthropic-style
* (type: "thinking" or type: "reasoning") formats.
*
* @param part - Content part to check
* @returns True if part is a thinking block
*/
function isThinkingPart(part: GeminiPart): boolean {
// Gemini-style: thought flag
if (part.thought === true) {
return true
}
// Anthropic-style: type field
if (part.type === "thinking" || part.type === "reasoning") {
return true
}
return false
}
/**
* Check if a thinking part has a valid signature
*
* Signatures are required for multi-turn conversations with Claude models.
* Gemini uses `thoughtSignature`, Anthropic uses `signature`.
*
* @param part - Thinking part to check
* @returns True if part has valid signature
*/
function hasValidSignature(part: GeminiPart): boolean {
// Gemini-style signature
if (part.thought === true && part.thoughtSignature) {
return true
}
// Anthropic-style signature
if ((part.type === "thinking" || part.type === "reasoning") && part.signature) {
return true
}
return false
}
/**
* Extract thinking blocks from a Gemini response
*
* Parses the response structure to identify and extract all thinking/reasoning
* content. Supports both Gemini-style (thought: true) and Anthropic-style
* (type: "thinking") formats.
*
* @param response - Gemini response object
* @returns Extraction result with thinking blocks and metadata
*/
export function extractThinkingBlocks(response: GeminiResponse): ThinkingExtractionResult {
const thinkingBlocks: ThinkingBlock[] = []
// Handle candidates array (standard Gemini format)
if (response.candidates && Array.isArray(response.candidates)) {
for (const candidate of response.candidates) {
const parts = candidate.content?.parts
if (!parts || !Array.isArray(parts)) {
continue
}
for (let i = 0; i < parts.length; i++) {
const part = parts[i]
if (!part || typeof part !== "object") {
continue
}
if (isThinkingPart(part)) {
const block: ThinkingBlock = {
text: part.text || "",
index: thinkingBlocks.length,
}
// Extract signature if present
if (part.thought === true && part.thoughtSignature) {
block.signature = part.thoughtSignature
} else if (part.signature) {
block.signature = part.signature
}
thinkingBlocks.push(block)
}
}
}
}
// Handle direct content array (Anthropic-style response)
if (response.content && Array.isArray(response.content)) {
for (let i = 0; i < response.content.length; i++) {
const item = response.content[i]
if (!item || typeof item !== "object") {
continue
}
if (item.type === "thinking" || item.type === "reasoning") {
thinkingBlocks.push({
text: item.text || "",
signature: item.signature,
index: thinkingBlocks.length,
})
}
}
}
// Combine all thinking text
const combinedThinking = thinkingBlocks.map((b) => b.text).join("\n\n")
return {
thinkingBlocks,
combinedThinking,
hasThinking: thinkingBlocks.length > 0,
}
}
/**
* Format thinking blocks for OpenAI-compatible output
*
* Converts Gemini thinking block format to OpenAI's expected structure.
* OpenAI expects thinking content as special message blocks or annotations.
*
* Output format:
* ```
* [
* { type: "reasoning", text: "thinking content...", signature?: "..." },
* ...
* ]
* ```
*
* @param thinking - Array of thinking blocks to format
* @returns OpenAI-compatible formatted array
*/
export function formatThinkingForOpenAI(
thinking: ThinkingBlock[],
): Array<{ type: "reasoning"; text: string; signature?: string }> {
if (!thinking || !Array.isArray(thinking) || thinking.length === 0) {
return []
}
return thinking.map((block) => {
const formatted: { type: "reasoning"; text: string; signature?: string } = {
type: "reasoning",
text: block.text || "",
}
if (block.signature) {
formatted.signature = block.signature
}
return formatted
})
}
/**
* Transform thinking parts in a candidate to OpenAI format
*
* Modifies candidate content parts to use OpenAI-style reasoning format
* while preserving the rest of the response structure.
*
* @param candidate - Gemini candidate to transform
* @returns Transformed candidate with reasoning-formatted thinking
*/
export function transformCandidateThinking(candidate: GeminiCandidate): GeminiCandidate {
if (!candidate || typeof candidate !== "object") {
return candidate
}
const content = candidate.content
if (!content || typeof content !== "object" || !Array.isArray(content.parts)) {
return candidate
}
const thinkingTexts: string[] = []
const transformedParts = content.parts.map((part) => {
if (part && typeof part === "object" && part.thought === true) {
thinkingTexts.push(part.text || "")
// Transform to reasoning format
return {
...part,
type: "reasoning" as const,
thought: undefined, // Remove Gemini-specific field
}
}
return part
})
const result: GeminiCandidate & { reasoning_content?: string } = {
...candidate,
content: { ...content, parts: transformedParts },
}
// Add combined reasoning content for convenience
if (thinkingTexts.length > 0) {
result.reasoning_content = thinkingTexts.join("\n\n")
}
return result
}
/**
* Transform Anthropic-style thinking blocks to reasoning format
*
* Converts `type: "thinking"` blocks to `type: "reasoning"` for consistency.
*
* @param content - Array of content blocks
* @returns Transformed content array
*/
export function transformAnthropicThinking(
content: Array<{ type?: string; text?: string; signature?: string }>,
): Array<{ type?: string; text?: string; signature?: string }> {
if (!content || !Array.isArray(content)) {
return content
}
return content.map((block) => {
if (block && typeof block === "object" && block.type === "thinking") {
return {
type: "reasoning",
text: block.text || "",
...(block.signature ? { signature: block.signature } : {}),
}
}
return block
})
}
/**
* Filter out unsigned thinking blocks
*
* Claude API requires signed thinking blocks for multi-turn conversations.
* This function removes thinking blocks without valid signatures.
*
* @param parts - Array of content parts
* @returns Filtered array without unsigned thinking blocks
*/
export function filterUnsignedThinkingBlocks(parts: GeminiPart[]): GeminiPart[] {
if (!parts || !Array.isArray(parts)) {
return parts
}
return parts.filter((part) => {
if (!part || typeof part !== "object") {
return true
}
// If it's a thinking part, only keep it if signed
if (isThinkingPart(part)) {
return hasValidSignature(part)
}
// Keep all non-thinking parts
return true
})
}
/**
* Transform entire response thinking parts
*
* Main transformation function that handles both Gemini-style and
* Anthropic-style thinking blocks in a response.
*
* @param response - Response object to transform
* @returns Transformed response with standardized reasoning format
*/
export function transformResponseThinking(response: GeminiResponse): GeminiResponse {
if (!response || typeof response !== "object") {
return response
}
const result: GeminiResponse = { ...response }
// Transform candidates (Gemini-style)
if (Array.isArray(result.candidates)) {
result.candidates = result.candidates.map(transformCandidateThinking)
}
// Transform direct content (Anthropic-style)
if (Array.isArray(result.content)) {
result.content = transformAnthropicThinking(result.content)
}
return result
}
/**
* Thinking configuration for requests
*/
export interface ThinkingConfig {
/** Token budget for thinking/reasoning */
thinkingBudget?: number
/** Whether to include thoughts in response */
includeThoughts?: boolean
}
/**
* Normalize thinking configuration
*
* Ensures thinkingConfig is valid: includeThoughts only allowed when budget > 0.
*
* @param config - Raw thinking configuration
* @returns Normalized configuration or undefined
*/
export function normalizeThinkingConfig(config: unknown): ThinkingConfig | undefined {
if (!config || typeof config !== "object") {
return undefined
}
const record = config as Record<string, unknown>
const budgetRaw = record.thinkingBudget ?? record.thinking_budget
const includeRaw = record.includeThoughts ?? record.include_thoughts
const thinkingBudget =
typeof budgetRaw === "number" && Number.isFinite(budgetRaw) ? budgetRaw : undefined
const includeThoughts = typeof includeRaw === "boolean" ? includeRaw : undefined
const enableThinking = thinkingBudget !== undefined && thinkingBudget > 0
const finalInclude = enableThinking ? (includeThoughts ?? false) : false
// Return undefined if no meaningful config
if (
!enableThinking &&
finalInclude === false &&
thinkingBudget === undefined &&
includeThoughts === undefined
) {
return undefined
}
const normalized: ThinkingConfig = {}
if (thinkingBudget !== undefined) {
normalized.thinkingBudget = thinkingBudget
}
if (finalInclude !== undefined) {
normalized.includeThoughts = finalInclude
}
return normalized
}
/**
* Extract thinking configuration from request payload
*
* Supports both Gemini-style thinkingConfig and Anthropic-style thinking options.
* Also supports reasoning_effort parameter which maps to thinking budget/level.
*
* @param requestPayload - Request body
* @param generationConfig - Generation config from request
* @param extraBody - Extra body options
* @returns Extracted thinking configuration or undefined
*/
export function extractThinkingConfig(
requestPayload: Record<string, unknown>,
generationConfig?: Record<string, unknown>,
extraBody?: Record<string, unknown>,
): ThinkingConfig | DeleteThinkingConfig | undefined {
// Check for explicit thinkingConfig
const thinkingConfig =
generationConfig?.thinkingConfig ?? extraBody?.thinkingConfig ?? requestPayload.thinkingConfig
if (thinkingConfig && typeof thinkingConfig === "object") {
const config = thinkingConfig as Record<string, unknown>
return {
includeThoughts: Boolean(config.includeThoughts),
thinkingBudget:
typeof config.thinkingBudget === "number" ? config.thinkingBudget : DEFAULT_THINKING_BUDGET,
}
}
// Convert Anthropic-style "thinking" option: { type: "enabled", budgetTokens: N }
const anthropicThinking = extraBody?.thinking ?? requestPayload.thinking
if (anthropicThinking && typeof anthropicThinking === "object") {
const thinking = anthropicThinking as Record<string, unknown>
if (thinking.type === "enabled" || thinking.budgetTokens) {
return {
includeThoughts: true,
thinkingBudget:
typeof thinking.budgetTokens === "number"
? thinking.budgetTokens
: DEFAULT_THINKING_BUDGET,
}
}
}
// Extract reasoning_effort parameter (maps to thinking budget/level)
const reasoningEffort = requestPayload.reasoning_effort ?? extraBody?.reasoning_effort
if (reasoningEffort && typeof reasoningEffort === "string") {
const budget = REASONING_EFFORT_BUDGET_MAP[reasoningEffort]
if (budget !== undefined) {
if (reasoningEffort === "none") {
// Special marker: delete thinkingConfig entirely
return { deleteThinkingConfig: true }
}
return {
includeThoughts: true,
thinkingBudget: budget,
}
}
}
return undefined
}
/**
* Resolve final thinking configuration based on model and context
*
* Handles special cases like Claude models requiring signed thinking blocks
* for multi-turn conversations.
*
* @param userConfig - User-provided thinking configuration
* @param isThinkingModel - Whether model supports thinking
* @param isClaudeModel - Whether model is Claude (not used in Antigravity, but kept for compatibility)
* @param hasAssistantHistory - Whether conversation has assistant history
* @returns Final thinking configuration
*/
export function resolveThinkingConfig(
userConfig: ThinkingConfig | undefined,
isThinkingModel: boolean,
isClaudeModel: boolean,
hasAssistantHistory: boolean,
): ThinkingConfig | undefined {
// Claude models with history need signed thinking blocks
// Since we can't guarantee signatures, disable thinking
if (isClaudeModel && hasAssistantHistory) {
return { includeThoughts: false, thinkingBudget: 0 }
}
// Enable thinking by default for thinking-capable models
if (isThinkingModel && !userConfig) {
return { includeThoughts: true, thinkingBudget: DEFAULT_THINKING_BUDGET }
}
return userConfig
}
// ============================================================================
// Model Thinking Configuration (Task 2: reasoning_effort and Gemini 3 thinkingLevel)
// ============================================================================
/**
* Get thinking config for a model by normalized ID.
* Uses pattern matching fallback if exact match not found.
*
* @param model - Model identifier string (with or without provider prefix)
* @returns Thinking configuration or undefined if not found
*/
export function getModelThinkingConfig(
model: string,
): AntigravityModelConfig | undefined {
const normalized = normalizeModelId(model)
// Exact match
if (ANTIGRAVITY_MODEL_CONFIGS[normalized]) {
return ANTIGRAVITY_MODEL_CONFIGS[normalized]
}
// Pattern matching fallback for Gemini 3
if (normalized.includes("gemini-3")) {
return {
thinkingType: "levels",
min: 128,
max: 32768,
zeroAllowed: false,
levels: ["low", "high"],
}
}
// Pattern matching fallback for Gemini 2.5
if (normalized.includes("gemini-2.5")) {
return {
thinkingType: "numeric",
min: 0,
max: 24576,
zeroAllowed: true,
}
}
// Pattern matching fallback for Claude via Antigravity
if (normalized.includes("claude")) {
return {
thinkingType: "numeric",
min: 1024,
max: 200000,
zeroAllowed: false,
}
}
return undefined
}
/**
* Type for the delete thinking config marker.
* Used when reasoning_effort is "none" to signal complete removal.
*/
export interface DeleteThinkingConfig {
deleteThinkingConfig: true
}
/**
* Union type for thinking configuration input.
*/
export type ThinkingConfigInput = ThinkingConfig | DeleteThinkingConfig
/**
* Convert thinking budget to closest level string for Gemini 3 models.
*
* @param budget - Thinking budget in tokens
* @param model - Model identifier
* @returns Level string ("low", "high", etc.) or "medium" fallback
*/
export function budgetToLevel(budget: number, model: string): string {
const config = getModelThinkingConfig(model)
// Default fallback
if (!config?.levels) {
return "medium"
}
// Map budgets to levels
const budgetMap: Record<number, string> = {
512: "minimal",
1024: "low",
8192: "medium",
24576: "high",
}
// Return matching level or highest available
if (budgetMap[budget]) {
return budgetMap[budget]
}
return config.levels[config.levels.length - 1] || "high"
}
/**
* Apply thinking config to request body.
*
* CRITICAL: Sets request.generationConfig.thinkingConfig (NOT outer body!)
*
* Handles:
* - Gemini 3: Sets thinkingLevel (string)
* - Gemini 2.5: Sets thinkingBudget (number)
* - Delete marker: Removes thinkingConfig entirely
*
* @param requestBody - Request body to modify (mutates in place)
* @param model - Model identifier
* @param config - Thinking configuration or delete marker
*/
export function applyThinkingConfigToRequest(
requestBody: Record<string, unknown>,
model: string,
config: ThinkingConfigInput,
): void {
// Handle delete marker
if ("deleteThinkingConfig" in config && config.deleteThinkingConfig) {
if (requestBody.request && typeof requestBody.request === "object") {
const req = requestBody.request as Record<string, unknown>
if (req.generationConfig && typeof req.generationConfig === "object") {
const genConfig = req.generationConfig as Record<string, unknown>
delete genConfig.thinkingConfig
}
}
return
}
const modelConfig = getModelThinkingConfig(model)
if (!modelConfig) {
return
}
// Ensure request.generationConfig.thinkingConfig exists
if (!requestBody.request || typeof requestBody.request !== "object") {
return
}
const req = requestBody.request as Record<string, unknown>
if (!req.generationConfig || typeof req.generationConfig !== "object") {
req.generationConfig = {}
}
const genConfig = req.generationConfig as Record<string, unknown>
genConfig.thinkingConfig = {}
const thinkingConfig = genConfig.thinkingConfig as Record<string, unknown>
thinkingConfig.include_thoughts = true
if (modelConfig.thinkingType === "numeric") {
thinkingConfig.thinkingBudget = (config as ThinkingConfig).thinkingBudget
} else if (modelConfig.thinkingType === "levels") {
const budget = (config as ThinkingConfig).thinkingBudget ?? DEFAULT_THINKING_BUDGET
let level = budgetToLevel(budget, model)
// Convert uppercase to lowercase (think-mode hook sends "HIGH")
level = level.toLowerCase()
thinkingConfig.thinkingLevel = level
}
}