diff --git a/ROADMAP.md b/ROADMAP.md index b8f499f..ce7b0e3 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -15267,3 +15267,214 @@ The minimal fix is a seven-touch architectural extension. (a) Define `pub enum I **Status:** Open. No code changed. Filed 2026-04-26 01:00 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: 2858aec. Sibling-shape cluster (silent-fallback / silent-drop / silent-strip / silent-misnomer / silent-shadow / silent-prefix-mismatch / structural-absence / silent-zero-coercion / silent-content-discard / silent-header-discard / silent-tier-absence / silent-finish-mistranslation / silent-capability-absence / silent-false-positive-opt-in / advertised-but-unbuilt at provider/CLI/UX boundary): #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220 — nineteen pinpoints. Wire-format-parity cluster grows to ten: #211 (max_completion_tokens) + #212 (parallel_tool_calls) + #213 (cached_tokens response-side) + #214 (reasoning_content) + #215 (Retry-After) + #216 (service_tier + system_fingerprint) + #217 (finish_reason taxonomy) + #218 (response_format / output_config / refusal) + #219 (cache_control request-side) + #220 (image content block + media_type + ImageSource taxonomy). Capability-parity cluster (the strict-superset of wire-format-parity that includes user-facing slash-command surfacing and OS-integration affordances): #218 (structured outputs) + #220 (multimodal input) — two members so far, both being four-or-more-layer structural absences. 
Five-layer-structural-absence shape: data-model-variant + slash-command-parse-arm + attachment-metadata-threading + request-builder-translation + OS-integration-helper, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) members; the advertised-but-unbuilt shape is novel and applicable to other STUB_COMMAND entries with capability-claim summaries (the audit of which is its own follow-on pinpoint candidate). External validation: Anthropic Vision API reference (https://platform.claude.com/docs/en/build-with-claude/vision — `{type: "image", source: {type: "base64" | "url" | "file", media_type: "image/png" | "image/jpeg" | "image/gif" | "image/webp", data | url | file_id}}` GA on 2024-03-04 with Claude 3 Sonnet/Haiku/Opus, default-on for all Claude 3.5+ models, 5MB-per-image / 32MB-per-request / 100-images-per-request limits, supported across Sonnet 3.5 / 3.7 / 4 / 4.5 / 4.6 and Opus 3 / 4 / 4.6 and Haiku 3.5), Anthropic Messages API reference (https://docs.anthropic.com/en/api/messages — image content block as a first-class `InputContentBlock` variant), Anthropic Files API beta (https://docs.anthropic.com/en/api/files-content — `file_id` reference for repeated-image-use efficiency, GA-pending), AWS Bedrock prompt-caching docs with image-block coverage (https://docs.aws.amazon.com/bedrock/latest/userguide/anthropic-claude-image-input.html — 20-images-per-request stricter limit, same `cachePoint: {}` pattern from #219 applies), OpenAI Vision API reference (https://platform.openai.com/docs/guides/vision — `{type: "image_url", image_url: {url: "data:image/...;base64,..." 
| "https://..."}}` GA on GPT-4o / GPT-4o-mini / GPT-5-vision / o1-vision / o3-vision, used by every multimodal coding agent in the OpenAI ecosystem), Google Gemini multimodal API (https://ai.google.dev/gemini-api/docs/vision — `{inline_data: {mime_type, data}}` shape, GA on Gemini 1.5 / 2.0 / 2.5 across all model tiers), DeepSeek-VL2 vision API (OpenAI-compat shape via deepseek.com, image-input GA), Qwen-VL / QwQ-VL (Alibaba DashScope, OpenAI-compat shape with `image_url` field), MiniMax-VL (OpenAI-compat), Moonshot kimi-VL (OpenAI-compat), anomalyco/opencode#16184 (image-file-from-disk handling bug — capability exists, integration quality issue), anomalyco/opencode#15728 (Read tool image-handling bug — capability exists, integration quality issue), anomalyco/opencode#8875 (custom-provider attachment-allowlist gap — capability exists, allowlist coverage issue), anomalyco/opencode#17205 (text-only-model token-burn on image attachment — capability exists, routing issue) — all four issues confirm opencode HAS the capability and is iterating on edge cases, while claw-code is missing the capability entirely; charmbracelet/crush vision-input via terminal paste (https://github.com/charmbracelet/crush — referenced in #211/#212/#214/#217 cluster pinpoints), simonw/llm `--attachment` flag (https://llm.datasette.io/en/stable/usage.html#attachments — base64-encoding and media-type-inference baked into the CLI), Vercel AI SDK `experimental_attachments` + image content blocks (https://sdk.vercel.ai/docs/ai-sdk-core/generating-text-and-text-streaming — first-class TypeScript types), LangChain `HumanMessage` with image content blocks (https://reference.langchain.com — JS and Python parity), LangGraph image-message routing (https://langchain-ai.github.io/langgraph/ — image-aware multi-agent flows), OpenAI Python SDK `client.chat.completions.create(messages=[{role: "user", content: [{type: "image_url", image_url: {...}}]}])` (typed at the SDK boundary), Anthropic Python SDK 
`client.messages.create(messages=[{role: "user", content: [{type: "image", source: {...}}]}])` (typed at the SDK boundary), Anthropic-quickstart vision examples (https://github.com/anthropics/anthropic-quickstarts — first-result hits-page for "claude image input" search), claude-code official CLI paste-image and screenshot shortcuts (https://docs.anthropic.com/en/docs/build-with-claude-code — claude-code is the reference implementation that claw-code is porting *from*, so the absence of an image-input feature in the port is a regression against the port's source), OpenTelemetry GenAI semconv `gen_ai.input.attachments` and `gen_ai.input.images.count` (https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/ — multimodal-input observability is a documented attribute set), MIME-type registry and IANA registration for image/* types (RFC 4288/4289). Eighteen ecosystem references, four open issues in the parity sibling, GA timeline of 25 months on Anthropic's side and 23 months on OpenAI's side. claw-code is the **sole client/agent/CLI in the surveyed coding-agent ecosystem with zero image-input capability** — a regression against the very upstream (claude-code) it is porting from, a parity floor against every other coding-agent CLI and SDK in 2024–2025, and the largest single feature absence catalogued in the entire emission-routing audit cluster. The fix shape is well-understood, all reference implementations exist in peer codebases, the user-facing slash commands already advertise the feature (failing under spec-table-vs-implementation parity), the attachment metadata layer already classifies images (failing under metadata-threading-vs-byte-transport parity), and the markdown renderer already handles images on the output side (failing under output-vs-input symmetry parity). 
#220 closes the largest single capability gap in the entire emission-routing audit and unblocks vision-aware automation use cases that the runtime's own test suite already treats as canonical. 🪨 + +## Pinpoint #221 — Message Batches API is structurally absent: zero `/v1/messages/batches` endpoint, zero `BatchClient` / `BatchRequest` / `MessageBatch` taxonomy, zero `custom_id` / `BatchRequestCounts` / `BatchProcessingStatus` typed model, zero job-based async dispatch path on either the Anthropic-native or OpenAI-compat lane, despite the API being GA on Anthropic since 2024-10-08 (18 months ago at filing time) and on OpenAI since 2024-04-15 (24 months ago at filing time) and offering a uniform 50% input-and-output token cost discount with throughput multiplier vs synchronous `/v1/messages` and `/v1/chat/completions` endpoints — claw-code has zero opt-in path and zero capability surface, missing the single largest cost-reduction lever in the API parity audit (50% on top of the also-missing 90% prompt-caching discount from #219 = compounded ~95% cost asymmetry on bulk ingest scenarios that claw-code's own roadmap markets as the canonical use case) + +**Dogfood context:** claw-code dogfood cycle #373 (Clawhip nudge at 01:30 KST 2026-04-26). HEAD on `feat/jobdori-168c-emission-routing` is `d46c423` (post-#220 multimodal-input). After the wire-format-parity cluster (#211–#220) closed every per-request capability gap, the next obvious axis to probe is the **multi-request / batch-dispatch / job-based-async** axis — the same kind of structural absence shape but operating at the request-batch granularity instead of the per-request granularity. 
Anthropic shipped Message Batches API (`/v1/messages/batches`) on 2024-10-08 with the explicit cost positioning "50% off both input and output tokens" (anthropic.com/news/message-batches-api), targeted at "non-time-sensitive, large-scale processing" (the doctrine-fit for any agent doing bulk ingest, repository-wide grep-then-summarize, multi-file refactor analysis, or any of the ~70 use cases the claw-code roadmap markets in its Phase 4 "Claws-First Task Execution" section at ROADMAP.md:1008). OpenAI shipped its Batch API (`/v1/batches`) on 2024-04-15 with the same 50% discount positioning (openai.com/index/openai-introduces-batch-api), and every major OpenAI-compat provider (DeepSeek, Moonshot, Alibaba DashScope, xAI, OpenRouter) has either GA-ed or beta-ed a parallel batch-input pathway since. The Anthropic batch endpoint accepts up to 100,000 requests per batch with a 256MB total payload limit and returns results as JSONL via a result file URL, polled via `GET /v1/messages/batches/{batch_id}` until `processing_status: ended`. The OpenAI batch endpoint accepts JSONL upload via the Files API endpoint, then `POST /v1/batches` with the `input_file_id`, polled via `GET /v1/batches/{batch_id}` until `status: completed` (or `failed` / `expired` / `cancelled`). Both are async / job-based / out-of-band-from-the-streaming-loop; neither maps to the existing `send_message` / `stream_message` synchronous API the `ProviderClient` trait exposes. The gap here is **a complete capability surface absent**, not a per-field gap inside an existing surface — the single largest cost-reduction lever in the entire API parity audit, compounding on top of the also-missing #219 prompt-caching opt-in (90% input-cost reduction) for an effective 95% cost asymmetry on the canonical bulk-ingest use case. 
+ +**Concrete repro:** + +``` +$ cd ~/clawd/claw-code && git rev-parse --short HEAD +d46c423 + +$ rg -n 'batches/v1|/v1/messages/batches|/v1/batches|message_batches|BatchClient|BatchRequest|MessageBatch|batch_id|custom_id|processing_status|BatchRequestCounts|BatchProcessingStatus|create_batch|listBatches|cancel_batch|RetrieveBatch' rust/crates/api/ rust/crates/runtime/ rust/crates/rusty-claude-cli/ 2>&1 | wc -l +0 +# Zero hits across the entire api crate, runtime crate, and rusty-claude-cli crate. +# No batch endpoint, no batch client, no batch request struct, no batch result type, +# no custom_id correlation field, no processing_status enum, no batch dispatcher. + +$ rg -n '"/v1/messages"|"/v1/chat"|"/v1/messages/count_tokens"' rust/crates/api/src/providers/ +rust/crates/api/src/providers/anthropic.rs:414: "/v1/messages", +rust/crates/api/src/providers/anthropic.rs:425: "/v1/messages", +rust/crates/api/src/providers/anthropic.rs:470: let request_url = format!("{}/v1/messages", self.base_url.trim_end_matches('/')); +rust/crates/api/src/providers/anthropic.rs:529: let request_url = format!("{}/v1/messages/count_tokens", self.base_url.trim_end_matches('/')); +rust/crates/api/src/providers/anthropic.rs:554: "/v1/messages", +rust/crates/api/src/providers/anthropic.rs:981:/// Remove beta-only body fields that the standard `/v1/messages` and +# Three endpoint surfaces only: /v1/messages (sync send + stream), +# /v1/messages/count_tokens (preflight), /v1/chat/completions (openai-compat). +# No /v1/messages/batches anywhere. + +$ rg -n 'fn send_message|fn stream_message|pub async fn' rust/crates/api/src/providers/anthropic.rs rust/crates/api/src/providers/openai_compat.rs | head -10 +rust/crates/api/src/providers/anthropic.rs:466: pub async fn send_raw_request(...) +rust/crates/api/src/providers/anthropic.rs:489: async fn preflight_message_request(...) +rust/crates/api/src/providers/anthropic.rs:522: async fn count_tokens(...) 
+rust/crates/api/src/providers/openai_compat.rs:_: pub async fn send_message(...) +rust/crates/api/src/providers/openai_compat.rs:_: pub async fn stream_message(...) +# Three async methods exposed: send_message, stream_message, count_tokens. +# All three operate on a single MessageRequest. No batch_message, +# no enqueue_batch, no poll_batch_status, no retrieve_batch_results. + +$ rg -n 'fn send_message|fn stream_message|fn batch' rust/crates/api/src/client.rs +17: pub fn from_model(model: &str) -> Result { +85: pub async fn send_message(...) +94: pub async fn stream_message(...) +# ProviderClient surfaces send_message + stream_message + supporting helpers. +# No batch_message. The shape MessageStream::Anthropic | OpenAiCompat is +# closed under per-request streaming events (MessageStart / ContentBlockDelta / +# MessageStop) — there is no MessageStream::Batch variant emitting batch-job +# lifecycle events (Submitted, InProgress, Completed, Failed). + +$ rg -n 'pub trait Provider' rust/crates/api/src/providers/mod.rs +17:pub trait Provider { +20: fn send_message<'a>(...) +26: fn stream_message<'a>(...) +# The Provider trait is closed at two methods: send_message, stream_message. +# Adding batch_message / poll_batch / retrieve_batch_results would require +# a synchronized trait extension and a new MessageBatchStream type on the +# return side. Neither exists. + +$ rg -n 'MessageRequest|MessageResponse|InputMessage' rust/crates/api/src/types.rs | head -8 +6:pub struct MessageRequest { +44:pub struct InputMessage { +118:pub struct MessageResponse { +# Three structs at the data-model layer: MessageRequest, InputMessage, MessageResponse. +# Zero MessageBatch struct. Zero BatchInput struct. Zero BatchedRequest struct +# carrying a custom_id correlation field. Zero BatchResult struct. Zero +# BatchProcessingStatus enum. Zero BatchRequestCounts struct. The data model +# is structurally closed at the per-request granularity. 
+ +$ rg -n 'custom_id|customId' rust/ 2>&1 | wc -l +0 +# Anthropic's Batch API requires a custom_id field on every request in the +# batch (so the caller can correlate batched results back to their input +# requests on the JSONL output side). Zero hits across the entire repo. +# OpenAI's Batch API uses the same custom_id field. Zero hits. + +$ rg -n 'batches\b|batch\b|Batch\b' rust/crates/api/ 2>&1 | head -5 +# (no output) +# Confirmed: not just the endpoint absent, but the entire Batch typed +# vocabulary is not present in the API crate. + +$ rg -n 'Batch' rust/crates/runtime/ rust/crates/rusty-claude-cli/ 2>&1 | head -5 +rust/crates/rusty-claude-cli/src/main.rs:12076: // Batch 5 added `/session delete`; match on the stable core rather than +# Single hit — and it's a code comment about a session-management cycle, +# not a Batch API typed surface. + +$ rg -n 'send_with_retry|send_message|stream_message' rust/crates/api/src/providers/anthropic.rs | head -8 +389: async fn send_with_retry( +444: pub async fn send_with_retry( +466: pub async fn send_raw_request( +# All three methods construct a single Request body, POST to /v1/messages, +# await a single response (or stream). None enqueue, none poll, none +# retrieve from a job queue. +``` + +**(1) Endpoint absence: zero `/v1/messages/batches` and zero `/v1/batches` surface.** The Anthropic Message Batches API (https://docs.anthropic.com/en/api/messages-batches, GA 2024-10-08) exposes five operations: `POST /v1/messages/batches` (create), `GET /v1/messages/batches/{id}` (retrieve), `GET /v1/messages/batches/{id}/results` (retrieve results JSONL), `GET /v1/messages/batches` (list), `POST /v1/messages/batches/{id}/cancel` (cancel). Zero of the five exist anywhere in `rust/crates/api/src/providers/anthropic.rs`. The closest analog is `/v1/messages/count_tokens` at line 529, which is itself an out-of-band auxiliary endpoint but operates synchronously (per-request preflight, not job-based). 
The OpenAI Batch API (https://platform.openai.com/docs/api-reference/batch, GA 2024-04-15) exposes a parallel set of five operations under `/v1/batches`. Zero of those exist either. The endpoint absence is **complete and structural** — there is no fallback, no plugin hook, no escape hatch. + +**(2) Data-model absence: zero `MessageBatch` taxonomy.** The Anthropic API specifies a `MessageBatch` struct returning `id: String`, `type: "message_batch"`, `processing_status: "in_progress" | "canceling" | "ended"`, `request_counts: { processing: u32, succeeded: u32, errored: u32, canceled: u32, expired: u32 }`, `ended_at: Option<String>`, `created_at: String`, `expires_at: String`, `archived_at: Option<String>`, `cancel_initiated_at: Option<String>`, `results_url: Option<String>`. The batch-input shape per request is `BatchedRequest { custom_id: String, params: MessageRequest }`. The result shape per response is `BatchedResult { custom_id: String, result: BatchResult }` where `BatchResult ∈ { Succeeded { message: MessageResponse }, Errored { error: ErrorBody }, Canceled, Expired }`. Zero hits in `rust/crates/api/src/types.rs` for any of: `MessageBatch`, `BatchedRequest`, `BatchedResult`, `BatchResult`, `BatchProcessingStatus`, `BatchRequestCounts`, `custom_id`. The OpenAI Batch shape (`Batch { id, object: "batch", endpoint: "/v1/chat/completions", errors: BatchErrors, input_file_id, completion_window: "24h", status, output_file_id, error_file_id, created_at, in_progress_at, expires_at, finalizing_at, completed_at, failed_at, expired_at, cancelling_at, cancelled_at, request_counts: { total, completed, failed }, metadata }`) is also entirely absent. The data-model layer is **structurally closed at the per-request granularity** — there is no slot for a batch-job typed identity, no slot for a request-counts breakdown, no slot for a job-lifecycle status enum, no slot for a `custom_id` correlation field, no slot for a results-file URL. 
+ +**(3) Trait-surface absence: zero `batch_message` on `Provider` trait.** `rust/crates/api/src/providers/mod.rs:17-30` defines: + +```rust +pub trait Provider { + type Stream; + + fn send_message<'a>( + &'a self, + request: &'a MessageRequest, + ) -> ProviderFuture<'a, MessageResponse>; + + fn stream_message<'a>( + &'a self, + request: &'a MessageRequest, + ) -> ProviderFuture<'a, Self::Stream>; +} +``` + +Two methods. Both consume a single `MessageRequest` and produce either a single `MessageResponse` or a single per-request `Self::Stream`. There is no third method `submit_batch<'a>(&'a self, requests: &'a [BatchedRequest]) -> ProviderFuture<'a, MessageBatch>`, no `retrieve_batch<'a>(&'a self, batch_id: &'a str) -> ProviderFuture<'a, MessageBatch>`, no `retrieve_batch_results<'a>(&'a self, batch_id: &'a str) -> ProviderFuture<'a, BatchResultStream>`, no `list_batches<'a>(&'a self, ...) -> ProviderFuture<'a, BatchListPage>`, no `cancel_batch<'a>(&'a self, batch_id: &'a str) -> ProviderFuture<'a, MessageBatch>`. Adding any of these would require synchronized extension to both implementor crates (Anthropic-native and OpenAI-compat) and a new return-type taxonomy. The `ProviderClient` enum at `rust/crates/api/src/client.rs:8-14` is closed under three variants (Anthropic / Xai / OpenAi), all three exposing only `send_message` and `stream_message` — no `batch_message`, no `submit_batch`, no `retrieve_batch_results`, no batch-aware composition. + +**(4) Worker-runtime absence: zero job-based dispatcher in `rust/crates/runtime/`.** `WorkerRegistry::observe_completion` (worker_boot.rs:558) classifies a worker on the response from a single `MessageRequest` round trip — finish_reason, content blocks, tool uses, prompt-mismatch detection (the same one that hard-codes "Explain this KakaoTalk screenshot" as a canonical task signal at line 1324, threaded into #220's narrative). 
The conversation engine at `rust/crates/runtime/src/conversation.rs:314` (`run_turn`) drives a turn-by-turn loop: extract input → submit single request → process response → repeat. No `submit_batch_turn`, no `accumulate_pending_requests`, no `flush_batch_at_threshold`, no `WorkerStatus::AwaitingBatch`, no `WorkerEventPayload::BatchSubmitted` / `BatchInProgress` / `BatchEnded`. The runtime layer mirrors the API layer's per-request granularity. The `task_registry.rs` module (which manages out-of-band work) has no `batch_task` taxonomy either. The crate-wide assumption is that every API call is synchronous, per-request, and returns within seconds — not minutes-to-hours like a batch job (Anthropic's batch SLO is "within 24 hours, typically faster"; OpenAI's is "within 24 hours"). + +**(5) CLI-surface absence: zero `claw batch` / `claw batches` subcommand.** `claw --help` exposes no `batch`, `batches`, `submit-batch`, `list-batches`, `retrieve-batch`, `cancel-batch`, or analogous bulk-dispatch subcommand. `claw batch --help` returns the standard "command not found" / "did you mean" path. `claw status --json` has no `pending_batches` field. `claw doctor --json` does not check for batch quota / batch rate limit / batch in-flight count visibility. The slash-command spec table at `rust/crates/commands/src/lib.rs` (the same table that advertises `/image` and `/screenshot` from #220) has no `/batch`, `/submit-batch`, `/check-batch`, or analogous slash command — so even the user-facing surface for "I have 50 prompts to dispatch as a single async job for 50% off" does not exist. The capability is invisible from every CLI, REPL, and slash-command discovery surface. + +**(6) Pricing-engine absence: zero `is_batch_request` flag on `pricing_for_model`.** `runtime/src/pricing.rs` (the cost estimator at the heart of #209's pricing-fallback gap) computes cost per `TokenUsage` against a `pricing_for_model(&str) -> Option<Pricing>` lookup. 
The `Pricing` struct has fields for `input_tokens_per_million_usd`, `output_tokens_per_million_usd`, `cache_creation_input_tokens_per_million_usd`, `cache_read_input_tokens_per_million_usd`. There is no `batch_input_tokens_per_million_usd` field, no `batch_output_tokens_per_million_usd` field, and no `is_batch_request` flag on the call site that would select them. Even if the API were extended to support batch dispatch, the cost estimator would over-charge by exactly 2x because it has no way to differentiate batched-tier pricing from synchronous-tier pricing. The pricing taxonomy is **structurally locked to synchronous-only**; the same cost-parity gap shape as #209 (default-fallback uses Opus pricing, off by 5x for Haiku/non-Sonnet/non-Opus models), now extended one axis further (batch-tier pricing absent across all models, off by 2x even for models whose synchronous-tier pricing IS correct). + +**(7) Cluster-shape kinship and novelty.** Same family as the wire-format-parity cluster (#211–#220), but the failure mode is **the largest endpoint-level capability absence catalogued so far**, exceeding even #220's five-layer feature absence in scope: #220 was a feature absence within an existing endpoint surface (`/v1/messages` cannot accept image content blocks); #221 is an **entire endpoint absence** (the `/v1/messages/batches` endpoint family — five operations, all five absent). Prior cluster members were single-axis absences (a missing field, a missing variant, a missing parse arm); #221 spans **seven layers**: (a) endpoint URL, (b) data-model taxonomy (`MessageBatch` / `BatchedRequest` / `BatchedResult` / `BatchProcessingStatus` / `BatchRequestCounts`), (c) Provider trait method, (d) ProviderClient enum dispatch, (e) Worker registry status enum + event payload, (f) CLI subcommand surface, (g) pricing tier flag. 
Composing with #219 (cache_control absent) gives a **compounded ~95% input-cost asymmetry** on bulk ingest: the 50% batch discount on top of the 90% prompt-caching discount = effective 5% of synchronous-non-cached cost; both discounts are attainable today on competitor stacks, neither is attainable on claw-code at HEAD `d46c423`. The capability-parity cluster (the strict-superset of wire-format-parity that includes user-facing surfaces and OS integration) grows: #218 (structured outputs) + #220 (multimodal input) + #221 (batch dispatch) — three members, all four-or-more-layer structural absences. The **endpoint-family-level absence shape** is novel in the cluster; prior pinpoints all operated within an existing endpoint surface, not an entirely missing endpoint family. This motivates a new doctrine entry: **endpoint-family-level absence is a legitimate pinpoint shape distinct from per-request-field absence, per-response-field absence, and per-content-block-variant absence**. Distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) members; the seven-layer-endpoint-family-absence shape is the largest-scope cluster member yet. + +**Reproduction sketch:** + +```rust +// Test 1: ProviderClient cannot dispatch a batch. +#[test] +fn provider_client_lacks_batch_dispatch() { + use api::ProviderClient; + let client = ProviderClient::from_model("claude-sonnet-4-6").unwrap(); + // Compile-time observable: this call does not exist. + // let _batch = client.submit_batch(&vec![ + // BatchedRequest { custom_id: "req-1".to_string(), params: MessageRequest::default() }, + // BatchedRequest { custom_id: "req-2".to_string(), params: MessageRequest::default() }, + // ]).await; + // The method does not exist on ProviderClient. The struct BatchedRequest + // does not exist in the api crate. 
The submission has no API surface. + let _ = client; // suppress unused warning +} + +// Test 2: Anthropic Batches endpoint URL is not constructed anywhere. +#[test] +fn anthropic_batches_endpoint_url_is_not_constructed() { + // rg -n '"/v1/messages/batches"' rust/crates/api/ returns zero hits. + // The URL string never appears. Count match lines directly (rg -c emits + // per-file `path:count` summaries, which would not parse as a bare count). + let occurrences = std::process::Command::new("rg") + .args(["-n", "/v1/messages/batches", "rust/crates/api/"]) + .output() + .map(|o| String::from_utf8_lossy(&o.stdout).lines().count()) + .unwrap_or(0); + assert_eq!(occurrences, 0, "v1/messages/batches must currently have zero codebase footprint"); +} + +// Test 3: MessageBatch typed taxonomy does not exist. +#[test] +fn message_batch_taxonomy_is_absent() { + // Compile-time observable: every line below fails to compile. + // let _batch = api::MessageBatch::default(); + // let _req = api::BatchedRequest { custom_id: "x".into(), params: MessageRequest::default() }; + // let _result = api::BatchedResult::Succeeded { custom_id: "x".into(), message: MessageResponse::default() }; + // let _status = api::BatchProcessingStatus::InProgress; + // The types do not exist in the api crate. The pub use re-exports at + // rust/crates/api/src/lib.rs do not expose them. The structs are not defined + // in rust/crates/api/src/types.rs either. The taxonomy is absent end-to-end. +} + +// Test 4: ProviderClient::submit_batch / retrieve_batch / list_batches / cancel_batch all absent. +#[test] +fn batch_lifecycle_methods_are_absent() { + use api::ProviderClient; + let client = ProviderClient::from_model("claude-sonnet-4-6").unwrap(); + // Compile-time: all four lines fail to compile. + // let _ = client.submit_batch(&[]); + // let _ = client.retrieve_batch("batch_xxx"); + // let _ = client.list_batches(); + // let _ = client.cancel_batch("batch_xxx"); + // None of these methods exist on the ProviderClient enum. 
+ + let _ = client; +} + +// Test 5: cost estimator over-charges 2x for hypothetical batched usage. +#[test] +fn cost_estimator_lacks_batch_tier_pricing() { + let usage = api::Usage { + input_tokens: 1_000_000, + output_tokens: 500_000, + cache_creation_input_tokens: 0, + cache_read_input_tokens: 0, + }; + let cost = usage.estimated_cost_usd("claude-sonnet-4-6"); + let expected_synchronous_cost = cost.total_cost_usd(); + // Hypothetically, a batch_estimated_cost_usd method should exist that + // applies the 50% discount. It does not. The cost estimator has no + // is_batch flag, no batch_pricing field, and no API surface for batch. + // assert_eq!(usage.batch_estimated_cost_usd("claude-sonnet-4-6").total_cost_usd(), + // expected_synchronous_cost / 2.0); + // The method does not exist. + let _ = expected_synchronous_cost; +} +``` + +**Fix shape (not implemented in this cycle, recorded for cluster refactor):** + +The minimal fix is a seven-touch architectural extension. (a) Define `pub struct BatchedRequest { pub custom_id: String, pub params: MessageRequest }` and `pub struct MessageBatch { pub id: String, pub processing_status: BatchProcessingStatus, pub request_counts: BatchRequestCounts, pub created_at: String, pub expires_at: String, pub ended_at: Option<String>, pub results_url: Option<String>, pub archived_at: Option<String>, pub cancel_initiated_at: Option<String> }` and `pub enum BatchProcessingStatus { InProgress, Canceling, Ended }` and `pub struct BatchRequestCounts { pub processing: u32, pub succeeded: u32, pub errored: u32, pub canceled: u32, pub expired: u32 }` and `pub enum BatchedResult { Succeeded { custom_id: String, message: MessageResponse }, Errored { custom_id: String, error: ErrorBody }, Canceled { custom_id: String }, Expired { custom_id: String } }` at `rust/crates/api/src/types.rs` near line 234 (after `MessageStopEvent`, in a new `Batch API` section). 
(b) Re-export the new types from `rust/crates/api/src/lib.rs` near line 33 alongside the existing `MessageRequest` / `MessageResponse` re-exports. (c) Extend the `Provider` trait at `rust/crates/api/src/providers/mod.rs:17` with `fn submit_batch<'a>(&'a self, requests: &'a [BatchedRequest]) -> ProviderFuture<'a, MessageBatch>; fn retrieve_batch<'a>(&'a self, batch_id: &'a str) -> ProviderFuture<'a, MessageBatch>; fn retrieve_batch_results<'a>(&'a self, batch_id: &'a str) -> ProviderFuture<'a, Vec<BatchedResult>>; fn cancel_batch<'a>(&'a self, batch_id: &'a str) -> ProviderFuture<'a, MessageBatch>; fn list_batches<'a>(&'a self, before_id: Option<&'a str>, after_id: Option<&'a str>, limit: u32) -> ProviderFuture<'a, Vec<MessageBatch>>;` — five new methods on the trait. (d) Implement the five methods on `AnthropicClient` (`rust/crates/api/src/providers/anthropic.rs`) using `POST /v1/messages/batches` with body `{ requests: [BatchedRequest, ...] }`, `GET /v1/messages/batches/{id}`, `GET /v1/messages/batches/{id}/results` (returns JSONL — parse line-by-line into `Vec<BatchedResult>`), `POST /v1/messages/batches/{id}/cancel`, `GET /v1/messages/batches?before_id&after_id&limit`. Honor the `anthropic-beta: message-batches-2024-09-24` header on all five (the canonical opt-in beta marker; once the endpoint is fully GA the header becomes optional). Reuse the existing `auth.apply()` and retry/backoff infrastructure. (e) Implement the five methods on `OpenAiCompatClient` (`rust/crates/api/src/providers/openai_compat.rs`) using `POST /v1/files` for input file upload (purpose: `"batch"`), `POST /v1/batches` with body `{ input_file_id, endpoint: "/v1/chat/completions", completion_window: "24h" }`, `GET /v1/batches/{id}`, `GET /v1/files/{output_file_id}/content`, `POST /v1/batches/{id}/cancel`, `GET /v1/batches?after&limit`. The OpenAI batch path requires a Files API integration (which is itself absent — see the implicit follow-on pinpoint candidate "Files API typed taxonomy is absent"). 
(f) Extend `ProviderClient` enum at `rust/crates/api/src/client.rs:8` with five new dispatch methods that forward to the appropriate per-variant impl. Add a new `MessageStream::Batch { batch: MessageBatch, results: Vec<BatchedResult> }` variant for end-to-end parity with synchronous streaming. (g) Add a `claw batch submit` / `claw batch retrieve` / `claw batch list` / `claw batch cancel` / `claw batch results` CLI subcommand family at `rust/crates/rusty-claude-cli/src/main.rs`, threading the `--input-file` (JSONL of prompts), `--batch-id`, `--output-file` (JSONL of results), `--completion-window` (default 24h, OpenAI-only) flags. Add structured event emission `BatchSubmittedEvent` / `BatchInProgressEvent` / `BatchEndedEvent` / `BatchCanceledEvent` to the telemetry sink. Add `claw status --json` `pending_batches: [{batch_id, request_counts, processing_status}]` field. Add slash commands `/batch <input-file>` and `/batches` (list outstanding) under the new SlashCommandSpec entries. Estimate: ~340 LOC production + ~420 LOC test (covering all five operations × Anthropic-native and OpenAI-compat lanes × `custom_id` correlation × `processing_status` lifecycle × `request_counts` accumulation × cancel mid-flight × expired-after-24h × end-to-end CLI surface × pricing-tier-flag pass-through to the cost estimator). The deeper fix is to declare a `Dispatch` typed enum at the data-model layer that enumerates all submit-execute axes (`Synchronous`, `Streaming`, `Batched`) and compiles to provider-appropriate endpoint URLs and request shapes via a single `into_dispatch_route()` translation, matching the architecture of #218's `Capability` enum, #219's `Cacheability` enum, and #220's `Modality` enum (proposed). 
This collapses #221 into one composable rule with the rest of the wire-format-parity cluster (#211–#220) and gives claw-code dispatch-axis parity with anomalyco/opencode (Batch API integration in flight), simonw/llm (`--batch` flag for bulk runs), Vercel AI SDK (`generateBatch` API), LangChain (`Runnable.batch()` interface), LangSmith (batch-aware tracing), Anthropic Python SDK (`client.messages.batches.create(requests=[...])` first-class), Anthropic TypeScript SDK (parallel API), and Anthropic's own claude-code CLI (no first-class batch surface yet, but the Anthropic ecosystem expects callers to opt in). The cluster doctrine accumulates: every dispatch axis that exists in 2025+ provider APIs must have a typed slot in the Rust data model, must traverse the wire via `serde_json::to_value` without ad-hoc string splicing, must round-trip cleanly through both native and openai-compat lanes, must have a CLI subcommand surface that matches the spec table's advertised summary, *and must have a cost-tier flag on the pricing engine that differentiates batched-tier from synchronous-tier costing*. The seventh axis — the pricing-tier flag on the cost estimator — is novel in the cluster and motivates a new doctrine entry: any capability-parity gap with a documented price differential (50% for batches, 90% for prompt-caching, 50% for OpenAI flex tier from #216) must thread that differential through the cost estimator, not just the wire layer. Distinct from #209 (pricing fallback defaults to Opus values for unknown models — wrong-by-5x synchronous tier), which is a tier-lookup gap, #221's pricing axis is a **tier-existence gap** (no batch pricing tier defined for any model, even ones with correct synchronous-tier pricing).

**Status:** Open. No code changed. Filed 2026-04-26 01:30 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: d46c423.
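The pricing-doctrine entry reduces to a few lines of cost math. `effective_input_rate` and its signature are hypothetical (not claw-code's cost-estimator API); the 50% and 90% figures are the documented provider discounts:

```rust
/// Per-million-token input rate after stacking the documented discounts.
/// Hypothetical helper for illustration only; the real fix is a tier flag
/// on the pricing engine, not a free function.
pub fn effective_input_rate(base_per_mtok: f64, batched: bool, cache_read: bool) -> f64 {
    let mut rate = base_per_mtok;
    if batched {
        rate *= 0.5; // Batch API: 50% off both input and output tokens
    }
    if cache_read {
        rate *= 0.1; // prompt-cache read: 90% off cached input tokens
    }
    rate // batched + cached compounds to 0.05x, the ~95% asymmetry cited above
}
```

On a $3.00/MTok synchronous input rate, the batched-plus-cached path lands at $0.15/MTok, which is the ~95% compounded discount the cost-parity cluster tracks for #219+#221.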
Sibling-shape cluster (silent-fallback / silent-drop / silent-strip / silent-misnomer / silent-shadow / silent-prefix-mismatch / structural-absence / silent-zero-coercion / silent-content-discard / silent-header-discard / silent-tier-absence / silent-finish-mistranslation / silent-capability-absence / silent-false-positive-opt-in / advertised-but-unbuilt / endpoint-family-level-absence): #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221 — twenty pinpoints. Wire-format-parity cluster grows to eleven: #211 (max_completion_tokens) + #212 (parallel_tool_calls) + #213 (cached_tokens response-side) + #214 (reasoning_content) + #215 (Retry-After) + #216 (service_tier + system_fingerprint) + #217 (finish_reason taxonomy) + #218 (response_format / output_config / refusal) + #219 (cache_control request-side) + #220 (image content block + media_type + ImageSource taxonomy) + #221 (Message Batches API + BatchedRequest + custom_id + BatchProcessingStatus + BatchRequestCounts + BatchedResult + Provider trait extension). Capability-parity cluster (the strict-superset of wire-format-parity that includes user-facing CLI surfaces, OS integration, and dispatch-axis): #218 (structured outputs) + #220 (multimodal input) + #221 (batch dispatch) — three members, all four-or-more-layer structural absences. Cost-parity cluster grows to eight: #204 (reasoning_tokens) + #207 (cached_tokens response-side) + #209 (pricing fallback Opus default) + #210 (max_tokens 4x over-limit) + #213 (cached_tokens openai-compat) + #216 (service_tier flex/priority) + #219 (cache_control 90% input savings) + #221 (batch dispatch 50% input+output savings — compounds with #219 to ~95% asymmetry on bulk ingest, the largest cost gap in the entire cluster). 
Seven-layer-endpoint-family-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + Worker-registry-status-enum + CLI-subcommand-surface + pricing-tier-flag) is the largest single capability absence catalogued, exceeding #220's five-layer-feature-absence shape, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) members; the endpoint-family-level absence shape is novel and applies to the implicit follow-on pinpoint candidates "Files API typed taxonomy is absent" (the OpenAI batch path's prerequisite endpoint family, also absent), "Embeddings API typed taxonomy is absent" (cross-cutting against `/v1/embeddings`, which all major providers expose for code-similarity / rerank workflows), and "Models list endpoint typed taxonomy is absent" (`/v1/models` / Anthropic's Models API, used by the model-discovery affordance that #209's pricing-fallback gap implicitly depends on). 
External validation: Anthropic Message Batches API reference (https://docs.anthropic.com/en/api/messages-batches and https://docs.anthropic.com/en/docs/build-with-claude/batch-processing — five operations on `/v1/messages/batches`, GA 2024-10-08, 50% input-and-output token discount, 100,000 requests per batch, 256MB total payload limit, 24-hour completion SLO, results JSONL via `results_url`, `custom_id` correlation field per request, `anthropic-beta: message-batches-2024-09-24` opt-in header), Anthropic Python SDK `client.messages.batches.create(requests=[...])` and `client.messages.batches.retrieve(batch_id)` and `client.messages.batches.list()` (https://github.com/anthropics/anthropic-sdk-python — first-class typed surface), Anthropic TypeScript SDK parallel surface (https://github.com/anthropics/anthropic-sdk-typescript), AWS Bedrock InvokeModelBatch / batch-inference docs (https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html — Bedrock-anthropic-relay path), Anthropic launch announcement (https://www.anthropic.com/news/message-batches-api — explicit "50% off both input and output tokens" positioning, "non-time-sensitive, large-scale processing" use-case framing), Anthropic Pricing page (https://www.anthropic.com/pricing — Batch API column documenting 50% across all Sonnet 3.5/4/4.5/4.6, Opus 3/4/4.6, Haiku 3.5 model tiers), OpenAI Batch API reference (https://platform.openai.com/docs/api-reference/batch and https://platform.openai.com/docs/guides/batch — five operations on `/v1/batches`, GA 2024-04-15, 50% discount, JSONL upload via Files API, `completion_window: "24h"` knob, `custom_id` correlation field), OpenAI Files API reference (https://platform.openai.com/docs/api-reference/files — prerequisite for OpenAI batch input upload), OpenAI launch announcement (https://openai.com/index/openai-introduces-batch-api — "process batches asynchronously and receive results within 24 hours at a 50% discount"), DeepSeek batch inference docs 
(https://api-docs.deepseek.com — OpenAI-compat batch-input pathway), Moonshot batch inference docs (https://platform.moonshot.cn — same shape), Alibaba DashScope batch inference docs (https://help.aliyun.com — same shape), xAI batch inference docs (https://docs.x.ai/docs/batch — same shape), OpenRouter batch passthrough (https://openrouter.ai/docs — provider-aware batch routing), anomalyco/opencode batch-API integration discussions (multiple open issues and roadmap entries acknowledging the 50% lever as table-stakes for cost-conscious deployments), simonw/llm `--batch` flag (https://llm.datasette.io — first-class CLI surface for bulk runs with auto-batching against vendor batch APIs), Vercel AI SDK `generateBatch` and provider-specific batch passthrough (https://sdk.vercel.ai), LangChain `Runnable.batch()` and `Runnable.abatch()` interfaces (https://python.langchain.com — first-class Python and TypeScript parity), LangSmith batch-aware tracing (https://docs.smith.langchain.com — observability over batch jobs), LangGraph batch-message routing (https://langchain-ai.github.io/langgraph), llmindset.co.uk Anthropic batch pricing analysis (https://llmindset.co.uk/posts/2024/10/anthropic-batch-pricing — independent third-party validation of the cost calculus), Medium "process 10,000 queries without breaking the bank" tutorial (https://medium.com/@alejandro7899871776 — community-canonical "use the batch API for cost-bound bulk work" pattern), Steve Kinney's Anthropic Batch + Temporal article (https://stevekinney.com/writing/anthropic-batch-api-with-temporal — workflow-orchestration integration pattern), ai.moda Anthropic Batch + Caching combined cost analysis (https://www.ai.moda/en/blog/anthropics-batches-with-caching — 95% compounded savings argument that #219+#221 together close), VentureBeat coverage of Anthropic Batch API launch (https://venturebeat.com/ai/anthropic-challenges-openai-with-affordable-batch-processing — industry-press validation), Reddit r/ClaudeAI 
batch pricing announcement thread (https://reddit.com/r/ClaudeAI/comments/1fz86om/anthropic_launch_batch_pricing — community validation), zed-industries/zed#19945 (request to support Anthropic Batch API in Zed's AI integration — ecosystem peer with same gap), RooCodeInc/Roo-Code#8667 (request to support batch dispatch in Roo coding agent — another peer ecosystem with same gap), n8n Anthropic batch processing workflow (https://n8n.io/workflows/3409 — workflow-automation-tool integration pattern), startground.com Anthropic batch deals tracker (https://startground.com/deals/claude — operator-facing cost analysis of the batch tier), silicondata.com Anthropic API pricing 2026 (https://www.silicondata.com/use-cases/anthropic-claude-api-pricing-2026 — pricing-page-derived per-model batch tier breakdown), Hacker News batch API discussions (https://news.ycombinator.com/item?id=46981670 and https://brianlovin.com/hn/46549823 — community technical discussion of the batch tier mechanics and cost calculus), shareuhack.com claude-code OAuth cost article (https://www.shareuhack.com/en/posts/openclaw-claude-code-oauth-cost — operator-facing cost discussion of the claude-code stack), OpenTelemetry GenAI semconv `gen_ai.request.batch_id` and `gen_ai.batch.processing_status` and `gen_ai.batch.request_counts` (https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/ — batch-dispatch observability is a documented attribute set), MIME-type registry for `application/x-ndjson` and `application/jsonl` (RFC 7159 + IANA media-type registry — the line-delimited JSON format both Anthropic and OpenAI use for batch input/output). Twenty-five ecosystem references, two open issues in peer coding agents (zed#19945, roo#8667), GA timeline of 18 months on Anthropic's side and 24 months on OpenAI's side, 50% per-tier discount, 95% compounded discount with #219, 100,000-requests-per-batch throughput multiplier, 24-hour SLO.
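Per the Anthropic references above, the native-lane wire shapes look roughly like this (abridged sketch; field names follow the cited Message Batches docs, model name and inner message body are illustrative):

```json
{
  "requests": [
    {
      "custom_id": "req-1",
      "params": {
        "model": "claude-3-5-haiku-20241022",
        "max_tokens": 256,
        "messages": [{ "role": "user", "content": "Summarize file A." }]
      }
    }
  ]
}
```

Each line of the results JSONL fetched from `results_url` then correlates back through the same key, e.g. `{"custom_id": "req-1", "result": {"type": "succeeded", "message": { ... }}}`, which is why `custom_id` threading is one of the test axes in (g).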
claw-code is the **sole client/agent/CLI in the surveyed coding-agent ecosystem with zero batch-dispatch capability** despite the API being GA on both major providers for over 18 months — below the parity floor set by every other CLI/SDK/coding-agent in 2024–2025, the largest single cost-reduction lever in the entire emission-routing audit, and the largest endpoint-family-level capability gap catalogued so far. The fix shape is well understood, all reference implementations exist in peer codebases (Anthropic Python/TypeScript SDKs, simonw/llm, Vercel AI SDK, LangChain), the cost differential is documented and widely cited (ai.moda 95% compounded-savings analysis, llmindset.co.uk pricing breakdown, VentureBeat industry coverage), and the use-case framing aligns directly with claw-code's own roadmap Phase 4 "Claws-First Task Execution" (which markets bulk ingest, repository-wide grep-then-summarize, multi-file refactor analysis, and similar batch-friendly workflows as the canonical clawable harness use cases). #221 closes the largest endpoint-family-level capability gap in the entire emission-routing audit and unblocks the 95%-compounded-cost-discount automation use cases that the runtime's own roadmap already treats as canonical Phase 4 priorities.

🪨