roadmap: #221 filed — Message Batches API is structurally absent: zero /v1/messages/batches endpoint, zero /v1/batches endpoint, zero MessageBatch / BatchedRequest / BatchedResult / BatchProcessingStatus / BatchRequestCounts typed taxonomy across rust/crates/api/src/types.rs (zero hits for batches, MessageBatch, BatchedRequest, custom_id, processing_status), zero submit_batch / retrieve_batch / retrieve_batch_results / cancel_batch / list_batches methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request synchronous), zero batch dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi all closed under sync send_message + stream_message), zero BatchSubmittedEvent / BatchInProgressEvent / BatchEndedEvent typed events on the runtime telemetry sink, zero claw batch / claw batches CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /batch slash command in SlashCommandSpec table at rust/crates/commands/src/lib.rs, zero pending_batches field in claw status --json output, zero is_batch_request flag on pricing_for_model cost estimator (so even if the Batch API were wired, the estimator would overstate batch cost by 2x), zero batch_input_tokens_per_million_usd / batch_output_tokens_per_million_usd fields in the Pricing struct (hedged sketches of this absent surface follow this note) — the API has been GA on Anthropic since 2024-10-08 (18 months ago at filing time, with explicit 'anthropic-beta: message-batches-2024-09-24' opt-in header documented) and on OpenAI since 2024-04-15 (24 months ago at filing time), uniformly offers 50% input-and-output token discount, accepts up to 100,000 requests per batch with 256MB total payload (Anthropic) or unlimited via Files API (OpenAI), with a 24-hour completion SLO; combining with #219's also-missing prompt-caching opt-in (90% input savings) gives a compounded ~95% input-cost asymmetry on bulk-ingest scenarios — the single largest cost-reduction lever in the entire API parity audit, missing at the endpoint-family level rather than the per-field level (Jobdori cycle #373 / extends #168c emission-routing audit / sibling-shape cluster grows to nineteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221 / wire-format-parity cluster grows to eleven: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221 / capability-parity cluster grows to three: #218+#220+#221 / cost-parity cluster grows to eight: #204+#207+#209+#210+#213+#216+#219+#221 — #221 compounds with #219 to ~95% bulk-ingest cost asymmetry, the largest cost gap in the cluster / seven-layer-endpoint-family-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + Worker-registry-status-enum + CLI-subcommand-surface + pricing-tier-flag) is the largest single capability absence catalogued, exceeding #220's five-layer-feature-absence / endpoint-family-level absence shape is novel — applies to follow-on candidates 'Files API typed taxonomy is absent' (the OpenAI batch path's prerequisite endpoint, also absent), 'Embeddings API typed taxonomy is absent' (/v1/embeddings cross-cutting), 'Models list endpoint typed taxonomy is absent' (/v1/models / Anthropic Models API) / external validation: Anthropic Message Batches API reference at https://docs.anthropic.com/en/api/messages-batches documenting five operations on /v1/messages/batches + GA 2024-10-08 + 50% discount + 100k-requests-per-batch + 256MB-total-payload + 24-hour-SLO + custom_id correlation field,
Anthropic launch announcement at anthropic.com/news/message-batches-api documenting '50% off both input and output tokens' positioning, Anthropic Pricing page documenting Batch API column with 50% across Sonnet 3.5/4/4.5/4.6 + Opus 3/4/4.6 + Haiku 3.5, Anthropic Python SDK client.messages.batches.create(requests=[...]) first-class typed surface, Anthropic TypeScript SDK parallel surface, AWS Bedrock InvokeModelBatch / batch-inference docs (Bedrock-anthropic-relay path), OpenAI Batch API reference at platform.openai.com/docs/api-reference/batch documenting GA 2024-04-15 + 50% discount + JSONL-via-Files-API + completion_window:'24h', OpenAI launch announcement at openai.com/index/openai-introduces-batch-api documenting 'process batches asynchronously and receive results within 24 hours at a 50% discount', DeepSeek/Moonshot/Alibaba-DashScope/xAI batch-inference parallel surfaces, OpenRouter batch passthrough, simonw/llm --batch flag, Vercel AI SDK generateBatch + provider-specific batch passthrough, LangChain Runnable.batch() + Runnable.abatch() first-class Python+TypeScript parity, LangSmith batch-aware tracing, llmindset.co.uk independent cost-calculus validation, Medium 'process 10,000 queries without breaking the bank' tutorial, Steve Kinney's Anthropic-Batch-with-Temporal workflow-orchestration article, ai.moda Anthropic-Batch+Caching 95%-compounded-savings analysis (proves #219+#221 together close the largest cost gap), VentureBeat industry-press coverage, Reddit r/ClaudeAI launch thread, zed-industries/zed#19945 (peer ecosystem with same gap), RooCodeInc/Roo-Code#8667 (peer ecosystem with same gap), n8n Anthropic-batch-processing workflow, startground.com batch-deals tracker, silicondata.com 2026-pricing per-model batch breakdown, Hacker News batch-mechanics discussions, OpenTelemetry GenAI semconv gen_ai.request.batch_id + gen_ai.batch.processing_status + gen_ai.batch.request_counts documented attributes, IANA application/x-ndjson + application/jsonl MIME-type registrations / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero batch-dispatch capability despite the API being GA on both major providers for 18+ months — parity floor against every other CLI/SDK/coding-agent in 2024-2025, the largest single cost-reduction lever in the entire emission-routing audit, and the largest endpoint-family-level capability gap catalogued so far)
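
For concreteness, a minimal sketch of what the absent surface could look like, assuming the type and method names the note lists (MessageBatch, BatchedRequest, BatchedResult, BatchProcessingStatus, BatchRequestCounts, and the five Provider-trait batch methods). None of this exists in rust/crates/api today; the field layout is an assumption modelled on the public /v1/messages/batches reference, not on any claw-code type:

    // Hypothetical sketch only: names mirror the gap list above; field shapes are
    // assumptions modelled on the documented /v1/messages/batches wire format.
    use serde::{Deserialize, Serialize};

    // Stand-ins for the crate's existing per-request payload/result types.
    type MessageRequest = serde_json::Value;
    type MessageResult = serde_json::Value;
    type ProviderError = Box<dyn std::error::Error + Send + Sync>;

    /// One entry in a batch submission; `custom_id` is the caller-chosen
    /// correlation key used to match results back to requests.
    #[derive(Debug, Clone, Serialize)]
    pub struct BatchedRequest {
        pub custom_id: String,
        pub params: MessageRequest,
    }

    /// Lifecycle of a batch as reported by the provider.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Deserialize)]
    #[serde(rename_all = "snake_case")]
    pub enum BatchProcessingStatus {
        InProgress,
        Canceling,
        Ended,
    }

    /// Per-batch tallies of request outcomes.
    #[derive(Debug, Clone, Default, Deserialize)]
    pub struct BatchRequestCounts {
        pub processing: u64,
        pub succeeded: u64,
        pub errored: u64,
        pub canceled: u64,
        pub expired: u64,
    }

    /// Handle returned by submit/retrieve/list operations.
    #[derive(Debug, Clone, Deserialize)]
    pub struct MessageBatch {
        pub id: String,
        pub processing_status: BatchProcessingStatus,
        pub request_counts: BatchRequestCounts,
        pub results_url: Option<String>,
    }

    /// One line of the results payload, keyed back to the submitted `custom_id`.
    #[derive(Debug, Clone, Deserialize)]
    pub struct BatchedResult {
        pub custom_id: String,
        pub result: MessageResult,
    }

    /// Batch operations the Provider trait would need alongside the existing
    /// send_message / stream_message (native async-fn-in-trait, Rust >= 1.75).
    pub trait BatchProvider {
        async fn submit_batch(&self, requests: Vec<BatchedRequest>) -> Result<MessageBatch, ProviderError>;
        async fn retrieve_batch(&self, batch_id: &str) -> Result<MessageBatch, ProviderError>;
        async fn retrieve_batch_results(&self, batch_id: &str) -> Result<Vec<BatchedResult>, ProviderError>;
        async fn cancel_batch(&self, batch_id: &str) -> Result<MessageBatch, ProviderError>;
        async fn list_batches(&self) -> Result<Vec<MessageBatch>, ProviderError>;
    }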
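
The pricing-tier gap can be made concrete the same way. The batch_* field names and the is_batch_request flag below are exactly the ones the note lists as absent; the dollar figures are illustrative placeholders, not the crate's pricing table. The closing assertions restate the note's arithmetic: a flat 2x over-estimate without the flag, and a 0.5 x 0.1 = 0.05 input-cost multiplier (~95% savings) when the batch discount compounds with #219's cache-read pricing.

    // Hedged sketch of the absent batch pricing tier; numbers are illustrative only.
    #[derive(Debug, Clone, Copy)]
    pub struct Pricing {
        pub input_tokens_per_million_usd: f64,
        pub output_tokens_per_million_usd: f64,
        // Absent today: the 50%-discounted batch tier.
        pub batch_input_tokens_per_million_usd: f64,
        pub batch_output_tokens_per_million_usd: f64,
    }

    /// Without an `is_batch_request` flag the estimator always bills the
    /// synchronous tier, overstating batch spend by 2x.
    pub fn estimate_cost_usd(p: &Pricing, input_tokens: u64, output_tokens: u64, is_batch_request: bool) -> f64 {
        let (inp, outp) = if is_batch_request {
            (p.batch_input_tokens_per_million_usd, p.batch_output_tokens_per_million_usd)
        } else {
            (p.input_tokens_per_million_usd, p.output_tokens_per_million_usd)
        };
        (input_tokens as f64 / 1e6) * inp + (output_tokens as f64 / 1e6) * outp
    }

    fn main() {
        // Illustrative figures: $3 / $15 per million tokens synchronous, 50% off in batch.
        let p = Pricing {
            input_tokens_per_million_usd: 3.0,
            output_tokens_per_million_usd: 15.0,
            batch_input_tokens_per_million_usd: 1.5,
            batch_output_tokens_per_million_usd: 7.5,
        };
        let sync_cost = estimate_cost_usd(&p, 1_000_000, 100_000, false);
        let batch_cost = estimate_cost_usd(&p, 1_000_000, 100_000, true);
        assert!((sync_cost / batch_cost - 2.0).abs() < 1e-9); // the 2x over-charge in the note

        // Compounding with #219 prompt caching on the input side:
        // batch 0.5x * cache-read 0.1x = 0.05x, i.e. the ~95% bulk-ingest asymmetry above.
        let compounded_input_multiplier = 0.5 * 0.1;
        assert!((compounded_input_multiplier - 0.05).abs() < 1e-12);
    }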
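
Finally, the wire shapes the missing taxonomy would have to serialize, per the Anthropic and OpenAI references cited above; field names follow the public docs, model names and values are illustrative only:

    // Wire-shape sketch built with serde_json; illustrative values.
    use serde_json::json;

    fn main() {
        // Anthropic: POST /v1/messages/batches (during the beta window, opt-in via the
        // 'anthropic-beta: message-batches-2024-09-24' header); one custom_id + params
        // pair per batched request.
        let anthropic_body = json!({
            "requests": [{
                "custom_id": "doc-0001",
                "params": {
                    "model": "claude-3-5-haiku-latest",
                    "max_tokens": 1024,
                    "messages": [{ "role": "user", "content": "Summarize this document." }]
                }
            }]
        });

        // OpenAI: one JSONL line per request, uploaded via the Files API, then
        // POST /v1/batches with input_file_id + endpoint + completion_window "24h".
        let openai_jsonl_line = json!({
            "custom_id": "doc-0001",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-mini",
                "messages": [{ "role": "user", "content": "Summarize this document." }]
            }
        });
        let openai_create_batch = json!({
            "input_file_id": "file-abc123",
            "endpoint": "/v1/chat/completions",
            "completion_window": "24h"
        });

        println!("{anthropic_body}\n{openai_jsonl_line}\n{openai_create_batch}");
    }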

YeonGyu-Kim 2026-04-26 01:45:20 +09:00
parent d46c423c1d
commit 9acd4f14da
