1104 Commits

Author SHA1 Message Date
Jobdori
c01b47036e roadmap: #224 filed — Embeddings API typed taxonomy is structurally absent: zero /v1/embeddings endpoint surface across both Anthropic-native and OpenAI-compat lanes, zero EmbeddingRequest / EmbeddingResponse / EmbeddingObject / EmbeddingUsage / EmbeddingEncoding / EmbeddingInputType / EmbeddingTruncation / EmbeddingOutputDtype / EmbeddingData typed model in rust/crates/api/src/types.rs (rg returns zero hits for embedding/embed/Embedding/EmbeddingRequest/EmbeddingResponse/text-embedding/voyage-/vector/cosine/similarity/dimensions across rust/), zero Vec<f32>/Vec<f64> embedding-vector slot anywhere in the data model, zero create_embeddings method on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist), zero embeddings dispatch on the ProviderClient enum at rust/crates/api/src/client.rs:8-14, zero claw embed / claw embeddings / claw vector CLI subcommand surface, zero /embed / /embeddings slash command in the SlashCommandSpec table, zero embedding_input_tokens_per_million_usd / embedding_dimensions fields in the Pricing struct, zero embedding-model entries in MODEL_REGISTRY (13 chat/completion entries, zero text-embedding-3-small/large/ada-002/voyage-3-large/voyage-code-3/embed-english-v3.0/cohere-embed/nomic-embed/mxbai-embed entries), and the pricing_for_model substring-matcher matches only haiku/opus/sonnet literals so it cannot recognize any embedding-model id (#209 cluster overlap) — manifesting a uniquely provider-asymmetric-delegation shape where Anthropic explicitly does not offer /v1/embeddings on https://api.anthropic.com and instead delegates to Voyage AI as the recommended partner per https://docs.anthropic.com/en/docs/build-with-claude/embeddings while OpenAI offers /v1/embeddings GA since 2022-12-15 (39+ months ago, the literal flagship endpoint of OpenAI's developer platform alongside /v1/chat/completions) — the cross-provider asymmetry is structural and requires a third lane in the ProviderClient enum (Voyage variant or supports_embeddings capability flag with EmbeddingError::Unsupported recommendation return shape) that no other endpoint family in this audit has needed — distinct from #221 batch dispatch (uniform on both major providers), #222 models list (uniform on both), and #223 Files API (uniform on both, just different beta header on Anthropic), making #224 the first cluster member where one canonical major provider explicitly does not offer the endpoint and recommends an external partner, requiring multi-provider routing rather than uniform Provider trait dispatch (Jobdori cycle #376 / extends #168c emission-routing audit / explicit follow-on candidate from #221 seven-layer-endpoint-family-absence shape — the second-named of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, completing the trio with #222 closing Models list and #223 closing Files API and #224 closing Embeddings / sibling-shape cluster grows to twenty-three: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222/#223/#224 / wire-format-parity cluster grows to fourteen: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222+#223+#224 / capability-parity cluster grows to six: #218+#220+#221+#222+#223+#224 / cross-cutting-data-pipeline cluster: #224 alone but it is the upstream prerequisite of every RAG / semantic-search / re-ranking / hybrid-search / dense-retrieval / classification-via-cosine / clustering / nearest-neighbor / codebase-indexing / context-retrieval-via-similarity use case that 2024-2026-era coding-agent harnesses ship as first-class affordances / seven-layer-endpoint-family-absence-with-provider-asymmetric-delegation shape (endpoint-URL + data-model-taxonomy + Provider-trait-method-with-Unsupported-fallback + ProviderClient-enum-dispatch-with-Voyage-third-lane + CLI-subcommand-surface + slash-command-surface + Voyage-AI-partner-routing-with-credential-discovery) is the first single capability absence catalogued where the provider-asymmetric-delegation pattern itself must be modeled at the dispatch layer — distinct from #221 / #222 / #223 seven/eight/seven-layer absences (all uniform-provider-coverage), and the largest provider-routing-asymmetry gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) / seven-layer-endpoint-family-absence (#221) / eight-layer-endpoint-family-absence-with-misleading-alias (#222) / seven-layer-endpoint-family-absence-with-transport-plumbing-absence (#223) members; the seven-layer-endpoint-family-absence-with-provider-asymmetric-delegation shape is novel and applies to follow-on candidates Audio API typed taxonomy (also provider-asymmetric: Anthropic does not offer audio, OpenAI offers GA whisper+tts, recommended-partners include ElevenLabs/Cartesia/PlayHT/Deepgram) and Image-generation API typed taxonomy (also provider-asymmetric: Anthropic does not offer image generation, recommended-partners include Stability AI/Midjourney/Black Forest Labs/Ideogram) / external validation: forty-three ecosystem references covering three first-class embeddings-endpoint specs (OpenAI /v1/embeddings GA 2022-12-15, Voyage AI /v1/embeddings GA 2024-01, Cohere /v1/embed), eleven first-class CLI/SDK implementations (OpenAI Python+TypeScript, Voyage AI Python+TypeScript, Cohere Python+TypeScript, simonw/llm + llm-embed plugin, Vercel AI SDK, LangChain Python+TypeScript), six first-class local-embedding-providers (Ollama, LM Studio, llama.cpp server, llamafile, sentence-transformers, HuggingFace transformers), one community-maintained authoritative benchmark (MTEB 56 tasks), twelve coding-agent peers (continue.dev @codebase/@docs, zed semantic-search, aider repository-mapping, cursor background-indexing, anomalyco/opencode @code/@docs, charmbracelet/crush context-management, TabbyML/tabby code-completion-with-context, simonw/llm-embed, codeium/cline embedding-context, sourcegraph/cody @-mention, github/copilot enterprise codebase-indexing, anthropic/claude-code retrieval-augmented planning), six first-class vector-database integrations (Pinecone, Weaviate, Qdrant, Chroma, pgvector, FAISS), and one canonical Anthropic-blessed partner-routing pattern (Voyage AI per docs.anthropic.com/embeddings). claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/embeddings integration AND zero Voyage AI partner-routing AND zero @code/@docs/@codebase retrieval-augmented slash command surface AND zero CLI-level claw embed / claw similar / claw vector subcommand family — all four gaps are unique to claw-code in the surveyed ecosystem (every other coding-agent peer has at least the @-mention codebase-retrieval pattern), the embedding-API gap is the upstream prerequisite of every retrieval-augmented affordance in the runtime, and the provider-asymmetric-delegation shape is novel within the cluster — #224 closes the upstream prerequisite of every RAG / semantic-search / re-ranking / hybrid-search / classification-via-cosine / clustering / nearest-neighbor / codebase-indexing / context-retrieval-via-similarity use case, completes the trio of follow-on candidates from #221 seven-layer-endpoint-family-absence shape (Files API closed by #223, Models list closed by #222, Embeddings API closed by #224), and establishes the provider-asymmetric-delegation pattern as a first-class cluster member — a structural prerequisite that every future endpoint family with provider-asymmetric coverage (Audio API: Anthropic delegates to ElevenLabs/Cartesia, Image-generation API: Anthropic delegates to Imagen/DALL-E/Stability) will inherit. 2026-04-26 03:09:53 +09:00
YeonGyu-Kim
ca2085cb95 roadmap: #223 filed — Files API typed taxonomy is structurally absent: zero /v1/files endpoint surface across both Anthropic-native (anthropic-beta: files-api-2025-04-14) and OpenAI-compat lanes, zero FileObject / FileList / FilePurpose / FileStatus / FileUploadRequest / FileContentResponse / FileDeletionResponse typed model in rust/crates/api/src/types.rs (zero hits for files-api-2025-04-14, /v1/files, FileObject, FileList, FilePurpose, file_id, upload_file, MultipartUpload, multipart/form-data across rust/), zero multipart/form-data upload affordance with reqwest::multipart feature flag absent from rust/crates/api/Cargo.toml, zero file_id reference type that #220 image-content-block fix-shape would need to thread through ResolvedAttachment at rust/crates/tools/src/lib.rs:2660-2666 (which carries path/size/is_image triple with no file_id, no bytes, no media_type, no purpose, no upload_status, no expires_at slot), zero file_id reference type that #221 OpenAI batch-input-JSONL upload pathway requires (POST /v1/batches accepts only input_file_id, no inline-JSONL pathway exists), zero upload_file / retrieve_file / list_files / download_file / delete_file methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request synchronous), zero file-management dispatch on the ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi all closed under per-request sync), zero claw files / claw upload / claw attach CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /upload / /attach / /file-upload slash command in the SlashCommandSpec table at rust/crates/commands/src/lib.rs (the existing /files entry advertises 'List files in the current context window' but is gated under STUB_COMMANDS as a context-window file lister, distinct feature from Files API), zero pending_uploads field in claw status --json output, zero files-api-2025-04-14 in the active anthropic-beta header at rust/crates/telemetry/src/lib.rs:451-453 (currently sends claude-code-20250219, prompt-caching-scope-2026-01-05, tools-2026-04-01 only), zero FileSubmittedEvent / FileUploadProgressEvent / FileRetentionExpiredEvent typed events on the runtime telemetry sink, zero reqwest::multipart::Form::new() / reqwest::multipart::Part::stream() / file_part / content_disposition usage anywhere in the codebase (rg returns zero hits) — the canonical file-upload affordance is invisible across every CLI / REPL / slash-command / Provider-trait / ProviderClient-enum / data-model / telemetry-beta-header / multipart-transport surface, blocking the upstream fix-shapes for both #220 (image attachment via persistent file_id, the canonical Anthropic Vision pattern documented at platform.claude.com/docs/en/build-with-claude/files for repeated-image-use efficiency where re-uploading 5MB+ images on every request would otherwise burn bandwidth) and #221 (OpenAI Batch API requires JSONL input upload via POST /v1/files with purpose: 'batch' then references the resulting file_id from POST /v1/batches — the JSONL payload cannot be sent inline; without a Files API the OpenAI batch lane is structurally unreachable even if every other layer of #221 seven-layer fix-shape ships) (Jobdori cycle #375 / extends #168c emission-routing audit / explicit follow-on candidate from #221 seven-layer-endpoint-family-absence shape — the first-named of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, completing the trio with #222 closing Models list / sibling-shape cluster grows to twenty-two: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222/#223 / wire-format-parity cluster grows to thirteen: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222+#223 / capability-parity cluster grows to five: #218+#220+#221+#222+#223 / resource-management cluster: #223 alone but it is the upstream root cause of #220 image-attachment via persistent file_id and #221 OpenAI batch-input-JSONL upload pathway / seven-layer-endpoint-family-absence-with-transport-plumbing-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + anthropic-beta-header-opt-in + CLI-subcommand-surface + multipart-form-data-transport-plumbing) is the first single capability absence catalogued where the transport layer itself must be extended before any higher-level surface can ship — distinct from #221 seven-layer absence (which operated within the existing JSON envelope) and the largest single transport-level gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) / seven-layer-endpoint-family-absence (#221) / eight-layer-endpoint-family-absence-with-misleading-alias (#222) members; the seven-layer-endpoint-family-absence-with-transport-plumbing-absence shape is novel and applies to follow-on candidate Audio API typed taxonomy is absent (/v1/audio/transcriptions, /v1/audio/speech, /v1/audio/translations, also requiring multipart/form-data uploads) / external validation: Anthropic Files API reference at https://platform.claude.com/docs/en/build-with-claude/files documenting five operations on /v1/files with anthropic-beta: files-api-2025-04-14 opt-in, Anthropic Vision documentation referencing Files API for >5MB images and repeated-image-use efficiency, Anthropic Python SDK client.beta.files.upload first-class typed surface GA-shipped 2025-04-14, Anthropic TypeScript SDK parallel surface, OpenAI Files API reference at platform.openai.com/docs/api-reference/files documenting GA since 2023 with five operations on /v1/files and purpose discriminator (assistants/batch/fine-tune/user_data/vision) and FileStatus lifecycle (Uploaded/Processed/Error), OpenAI Python SDK client.files.create first-class surface, OpenAI Batch API explicitly requires input_file_id from POST /v1/files with purpose:'batch' (no inline-JSONL pathway), AWS Bedrock model invocation with input/output S3 paths (parallel concept), Azure OpenAI Files reference, Vertex AI Files via Cloud Storage, DeepSeek/Moonshot/Alibaba-DashScope/xAI parallel /v1/files OpenAI-compat shapes, OpenRouter file passthrough, simonw/llm --attachment flag with auto-upload to Files API, Vercel AI SDK 6 experimental_attachments threading file_id reference, LangChain Files integration with FileLoader uploading via Files API, charmbracelet/crush typed file management with provider-aware lifecycle, continue.dev config-file-driven file management with auto-upload, zed-industries/zed bundled-file management with periodic upstream sync, anomalyco/opencode file-upload integration with explicit file_id lifecycle in conversation context, models.dev file-handling capability flags indicating which models support file_id references, OpenTelemetry GenAI semconv gen_ai.input.attachments.count and gen_ai.input.files.count documented attributes, IANA MIME-type registry RFC 4288/4289 for application/json + multipart/form-data + application/pdf + image/png/jpeg/gif/webp, RFC 7578 multipart/form-data specification, reqwest::multipart documentation requiring 'multipart' feature flag on the reqwest dependency. Twenty-eight ecosystem references, two first-class Files API specs (Anthropic beta, OpenAI GA), GA timeline of 12 months on Anthropic beta side and 24+ months on OpenAI side (Files API on OpenAI predates Assistants API and Batch API both of which depend on it as prerequisite), seven first-class CLI/SDK implementations, one transport-layer specification (RFC 7578 multipart/form-data) and one Rust-side prerequisite (reqwest::multipart feature flag). claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/files integration AND zero multipart-form-data transport plumbing — both gaps are unique to claw-code in the surveyed ecosystem, the file-management gap is the upstream root cause of two downstream capability gaps already catalogued in this audit (#220 image attachment via persistent file_id, #221 OpenAI batch input-JSONL upload), and the multipart-transport-plumbing-absence shape is novel within the cluster — #223 closes the upstream root cause of two downstream gaps and unblocks file_id-based multimodal input (5MB+ images / PDFs / repeated-image-use efficiency), OpenAI batch-input-JSONL upload (the missing piece of #221 seven-layer batch dispatch fix-shape), Anthropic-style document-block content with source:{type:'file',file_id} for PDFs, and CLI-vs-slash-command-symmetry on file management that the runtime clawability doctrine treats as canonical baseline expectations. 2026-04-26 02:41:23 +09:00
YeonGyu-Kim
0121f20a09 roadmap: #222 filed — Models list endpoint typed taxonomy is structurally absent: zero GET /v1/models and zero GET /v1/models/{id} surface across rust/crates/api/src/providers/anthropic.rs and rust/crates/api/src/providers/openai_compat.rs (rg returns zero hits for /v1/models, list_models, fetch_models, get_models, available_models, model_catalog, ModelInfo, ModelList, ListModelsResponse, OwnedBy, ModelObject, ModelCatalog across rust/), zero Model / ModelInfo / ModelList / ListModelsResponse typed taxonomy in rust/crates/api/src/types.rs, zero list_models<'a>(&'a self) -> ProviderFuture<'a, ModelList> and zero retrieve_model<'a>(&'a self, model_id: &'a str) -> ProviderFuture<'a, ModelInfo> methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request), zero list_models dispatch on the ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi, all closed under per-request synchronous dispatch), zero claw models / claw model list / claw list-models CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /models slash command in the SlashCommandSpec table at rust/crates/commands/src/lib.rs, zero validation against an authoritative source on set_model at rust/crates/rusty-claude-cli/src/main.rs:4989-5037 (user can type /model claude-banana-9000 and the runtime accepts it, swaps the active model to that string, and only fails at request time when the upstream provider returns 404 / invalid_model_error), and the existing /providers slash command at rust/crates/commands/src/lib.rs:716-720 is just a literal alias for /doctor at rust/crates/commands/src/lib.rs:1386-1389 despite advertising summary: "List available model providers" (advertised-but-rerouted shape — actively misleading at the UX layer, distinct from #220's advertised-but-unbuilt shape because the parse arm dispatches to a *different* command entirely instead of returning a clear unsupported error) — the canonical model-discovery affordance is invisible across every CLI / REPL / slash-command / Provider-trait / ProviderClient-enum / data-model surface, leaving claw-code's local hardcoded 13-entry MODEL_REGISTRY (3 anthropic + 5 grok + 1 kimi + 4 prefix routes for openai/gpt/qwen/kimi at rust/crates/api/src/providers/mod.rs:52-134 and 166-225) and its 6-entry model_token_limit match arm (rust/crates/api/src/providers/mod.rs:277-301 covering claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251213, grok-3, grok-3-mini, kimi-k2.5, kimi-k1.5 — returns None for current production IDs claude-opus-4-7, claude-haiku-4-6, gpt-5.2, o3, o4-mini, kimi-k3, qwen3-max, grok-4, deepseek-reasoner) as the only model-name knowledge the runtime has access to, with no way to refresh it, no way to discover new model IDs that providers publish, no way to validate user-supplied model strings, no way to cross-link to the pricing_for_model cost estimator (#209 substring-matching gap), no way to cross-link to the model_token_limit preflight check (#210 max_tokens shadow-fork gap silently no-ops on unknown models), no way to cross-link to the future is_batch_request flag (#221 batch-dispatch gap requires knowing which models support batch), and USAGE.md:426-440 documents only six model rows out of nine MODEL_REGISTRY entries (kimi alias missing from the documented table, four prefix routes mentioned only in passing prose, zero documentation of /v1/models endpoint usage / zero documentation of model-catalog discovery / zero documentation of "what to do when your provider ships a new model that isn't in claw-code's hardcoded registry") — the canonical model-discovery affordance is **the most universally-available endpoint in the LLM API ecosystem** (older than /v1/chat/completions itself, older than /v1/embeddings, older than /v1/messages, the literal first endpoint after auth on every OpenAI-compat provider since 2020 and on Anthropic since 2024-12-04, GA-shipped first-class typed surfaces in every Python/TypeScript SDK in the ecosystem) and claw-code is the **sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/models integration AND a misleading /providers slash command that aliases to /doctor** — both gaps are unique to claw-code in the surveyed ecosystem (Jobdori cycle #374 / extends #168c emission-routing audit / explicit follow-on candidate from #221's seven-layer-endpoint-family-absence shape — the third of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, and the most clawability-impacting because it's the upstream root cause of three downstream gaps already catalogued in this audit / sibling-shape cluster grows to twenty-one: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222 / wire-format-parity cluster grows to twelve: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222 / capability-parity cluster grows to four: #218+#220+#221+#222 / discovery-and-validation cluster: #222 alone but it's the upstream root cause of #209's pricing-fallback gap, #210's max_tokens shadow-fork gap, and #221's batch-dispatch gap / eight-layer-endpoint-family-absence-with-misleading-alias shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + CLI-subcommand-surface + slash-command-surface-with-misleading-alias + set_model-validation + downstream-consumers-with-stale-data) is the largest single advertised-vs-actual gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) / seven-layer-endpoint-family-absence (#221) members; the advertised-but-rerouted shape is novel — strict-superset of #220's advertised-but-unbuilt because the parse arm dispatches to a *different* command instead of returning a clear unsupported error, applies to any future SlashCommandSpec entry where the summary field describes a feature different from what the parse arm dispatches to / external validation: Anthropic Models API reference at https://docs.anthropic.com/en/api/models-list documenting GET /v1/models GA 2024-12-04 with paginated before_id / after_id / limit and ModelInfo { id, type: "model", display_name, created_at } shape, Anthropic retrieve reference at https://docs.anthropic.com/en/api/models documenting GET /v1/models/{model_id} for single-model lookup, OpenAI Models API at https://platform.openai.com/docs/api-reference/models documenting the literal first endpoint after auth with Model { id, object: "model", created, owned_by } and ModelList { object: "list", data: Vec<Model> }, OpenAI Python SDK client.models.list() and client.models.retrieve(model_id) first-class typed surface, Anthropic Python SDK client.models.list() parallel surface GA-shipped 2024-12-04 alongside the API endpoint, Anthropic TypeScript SDK client.models.list(), AWS Bedrock ListFoundationModels API documenting Bedrock-anthropic-relay equivalent with FoundationModelSummary provider+model+modalities+active flag, Azure OpenAI Models reference with deployment-aware catalog, Vertex AI projects.locations.models.list for Vertex-published Anthropic/Gemini/3rd-party models, DeepSeek/Moonshot/Alibaba-DashScope/xAI parallel /v1/models OpenAI-compat shape, OpenRouter Models API at https://openrouter.ai/api/v1/models — the canonical "live model catalog with pricing" reference and the model that anomalyco/opencode-via-models.dev uses for pricing-data freshness, simonw/llm llm models and llm models default <model> first-class CLI subcommand backed by per-plugin model registration with models.dev-equivalent freshness, simonw/llm plugin-registration architecture for ad-hoc model addition, Vercel AI SDK 6 provider.languageModels() and provider.embeddingModels() first-class typed catalog APIs, LangChain init_chat_model(model_provider, model_name) reflective discovery via provider-defined catalogs and BaseChatModel.aget_models async catalog query, models.dev (https://models.dev) — community-maintained authoritative model catalog with pricing + capability flags + provider routing, used by anomalyco/opencode for pricing-data freshness with explicit fallback metadata when a model id isn't in the catalog (the canonical "external authoritative source for model metadata" reference), anomalyco/opencode models.dev integration with periodic refresh and explicit { provider: unknown, reason: not_in_pricing_table } fallback metadata, charmbracelet/crush typed catalog with provider+model+input/output-pricing, continue.dev config-file-driven catalog with auto-refresh from provider endpoints, zed-industries/zed bundled JSON catalog with periodic upstream refresh, TabbyML/tabby model catalog via plugin registration, llama.cpp server /v1/models local-model catalog via OpenAI-compat shape, LM Studio /v1/models local-model catalog, Ollama /api/tags and /v1/models local-model catalog with both Ollama-native and OpenAI-compat shapes, llamafile bundled-model catalog, LiteLLM models reference covering 100+ models at proxy level, portkey.ai gateway-level catalog, helicone.ai observability-platform model catalog with per-model usage stats, prompthub.us model-catalog-as-service, OpenTelemetry GenAI semconv gen_ai.request.model and gen_ai.response.model documented as required attributes for spans (every observability backend treats model as a first-class structured signal requiring authoritative-source validation), OpenAPI 3.1 spec for /v1/models at https://github.com/openai/openai-openapi as canonical machine-readable schema, Anthropic API stability versioning at https://docs.anthropic.com/en/api/versioning with anthropic-version header semver-stable since 2023-06-01 and models endpoint stable since 2024-12-04. Thirty-two ecosystem references, three first-class models-endpoint specs (Anthropic, OpenAI, OpenRouter), GA timeline of 16 months on Anthropic's side and 6+ years on OpenAI's side, eight first-class CLI/SDK implementations (Anthropic Python+TypeScript, OpenAI Python, simonw/llm, Vercel AI SDK, LangChain, Zed, charmbracelet/crush), seven first-class local-model catalogs (Ollama, LM Studio, llama.cpp server, llamafile, Tabby, Continue.dev, LiteLLM proxy), one community-maintained authoritative pricing source (models.dev) used by the closest peer coding agent. claw-code is the **sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/models integration AND a misleading /providers slash command that aliases to /doctor** — both gaps are unique to claw-code in the surveyed ecosystem, the model-discovery gap is the **upstream root cause** of three downstream cost-and-correctness gaps already catalogued in this audit (#209 / #210 / #221), and the misleading-alias-shape is novel within the cluster — #222 closes the upstream root cause of three downstream gaps and unblocks live-catalog-driven cost-estimation, max-tokens-validation, batch-capability-detection, and CLI-vs-slash-command-symmetry that the runtime's clawability doctrine treats as canonical baseline expectations. 2026-04-26 02:15:43 +09:00
YeonGyu-Kim
9acd4f14da roadmap: #221 filed — Message Batches API is structurally absent: zero /v1/messages/batches endpoint, zero /v1/batches endpoint, zero MessageBatch / BatchedRequest / BatchedResult / BatchProcessingStatus / BatchRequestCounts typed taxonomy across rust/crates/api/src/types.rs (zero hits for batches, MessageBatch, BatchedRequest, custom_id, processing_status), zero submit_batch / retrieve_batch / retrieve_batch_results / cancel_batch / list_batches methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request synchronous), zero batch dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi all closed under sync send_message + stream_message), zero BatchSubmittedEvent / BatchInProgressEvent / BatchEndedEvent typed events on the runtime telemetry sink, zero claw batch / claw batches CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /batch slash command in SlashCommandSpec table at rust/crates/commands/src/lib.rs, zero pending_batches field in claw status --json output, zero is_batch_request flag on pricing_for_model cost estimator (so even if Batch API were wired, cost would over-charge by 2x), zero batch_input_tokens_per_million_usd / batch_output_tokens_per_million_usd fields in the Pricing struct — the API has been GA on Anthropic since 2024-10-08 (18 months ago at filing time, with explicit 'anthropic-beta: message-batches-2024-09-24' opt-in header documented) and on OpenAI since 2024-04-15 (24 months ago at filing time), uniformly offers 50% input-and-output token discount, accepts up to 100,000 requests per batch with 256MB total payload (Anthropic) or unlimited via Files API (OpenAI), 24-hour completion SLO; combining with #219's also-missing prompt-caching opt-in (90% input savings) gives a compounded ~95% input-cost asymmetry on bulk ingest scenarios — the single largest cost-reduction lever in the entire API parity audit, missing at the endpoint-family level rather than the per-field level (Jobdori cycle #373 / extends #168c emission-routing audit / sibling-shape cluster grows to twenty: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221 / wire-format-parity cluster grows to eleven: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221 / capability-parity cluster grows to three: #218+#220+#221 / cost-parity cluster grows to eight: #204+#207+#209+#210+#213+#216+#219+#221 — #221 compounds with #219 to ~95% bulk-ingest cost asymmetry, the largest cost gap in the cluster / seven-layer-endpoint-family-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + Worker-registry-status-enum + CLI-subcommand-surface + pricing-tier-flag) is the largest single capability absence catalogued, exceeding #220's five-layer-feature-absence / endpoint-family-level absence shape is novel — applies to follow-on candidates 'Files API typed taxonomy is absent' (the OpenAI batch path's prerequisite endpoint, also absent), 'Embeddings API typed taxonomy is absent' (/v1/embeddings cross-cutting), 'Models list endpoint typed taxonomy is absent' (/v1/models / Anthropic Models API) / external validation: Anthropic Message Batches API reference at https://docs.anthropic.com/en/api/messages-batches documenting five operations on /v1/messages/batches + GA 2024-10-08 + 50% discount + 100k-requests-per-batch + 256MB-total-payload + 24-hour-SLO + custom_id correlation field, Anthropic launch announcement at anthropic.com/news/message-batches-api documenting '50% off both input and output tokens' positioning, Anthropic Pricing page documenting Batch API column with 50% across Sonnet 3.5/4/4.5/4.6 + Opus 3/4/4.6 + Haiku 3.5, Anthropic Python SDK client.messages.batches.create(requests=[...]) first-class typed surface, Anthropic TypeScript SDK parallel surface, AWS Bedrock InvokeModelBatch / batch-inference docs (Bedrock-anthropic-relay path), OpenAI Batch API reference at platform.openai.com/docs/api-reference/batch documenting GA 2024-04-15 + 50% discount + JSONL-via-Files-API + completion_window:'24h', OpenAI launch announcement at openai.com/index/openai-introduces-batch-api documenting 'process batches asynchronously and receive results within 24 hours at a 50% discount', DeepSeek/Moonshot/Alibaba-DashScope/xAI batch-inference parallel surfaces, OpenRouter batch passthrough, simonw/llm --batch flag, Vercel AI SDK generateBatch + provider-specific batch passthrough, LangChain Runnable.batch() + Runnable.abatch() first-class Python+TypeScript parity, LangSmith batch-aware tracing, llmindset.co.uk independent cost-calculus validation, Medium 'process 10,000 queries without breaking the bank' tutorial, Steve Kinney's Anthropic-Batch-with-Temporal workflow-orchestration article, ai.moda Anthropic-Batch+Caching 95%-compounded-savings analysis (proves #219+#221 together close the largest cost gap), VentureBeat industry-press coverage, Reddit r/ClaudeAI launch thread, zed-industries/zed#19945 (peer ecosystem with same gap), RooCodeInc/Roo-Code#8667 (peer ecosystem with same gap), n8n Anthropic-batch-processing workflow, startground.com batch-deals tracker, silicondata.com 2026-pricing per-model batch breakdown, Hacker News batch-mechanics discussions, OpenTelemetry GenAI semconv gen_ai.request.batch_id + gen_ai.batch.processing_status + gen_ai.batch.request_counts documented attributes, IANA application/x-ndjson + application/jsonl MIME-type registrations / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero batch-dispatch capability despite the API being GA on both major providers for 18+ months — parity floor against every other CLI/SDK/coding-agent in 2024-2025, the largest single cost-reduction lever in the entire emission-routing audit, and the largest endpoint-family-level capability gap catalogued so far) 2026-04-26 01:45:20 +09:00
YeonGyu-Kim
d46c423c1d roadmap: #220 filed — Image/vision input is structurally impossible across the entire data model: zero image content-block taxonomy variant on InputContentBlock (types.rs:80-94 has only Text/ToolUse/ToolResult — three of three exhaustive variants, zero Image, zero Document, zero MediaType, zero ImageSource, zero base64/file_id slot, zero media_type field anywhere in rust/crates/api/src/), zero parse arm for /image <path> and /screenshot slash commands despite their advertised summaries ("Add an image file to the conversation" at commands/lib.rs:585, "Take a screenshot and add to conversation" at commands/lib.rs:578) being in the canonical SlashCommandSpec table since project inception, both gated under STUB_COMMANDS at main.rs:8381-8382 (UX patch over missing-feature, not missing-feature fix), ResolvedAttachment at tools/lib.rs:2660-2666 carries path/size/is_image triple but no bytes / no base64 / no media_type / no upload affordance / no transport-ready payload despite is_image_path at line 5276 correctly classifying png/jpg/jpeg/gif/webp/bmp/svg extensions and the SendUserMessage/Brief tool surfacing isImage: true in JSON envelope (asserted at line 8969); build_chat_completion_request (openai_compat.rs:845) and translate_message (openai_compat.rs:946) have three-arm exhaustive matches over Text/ToolUse/ToolResult with no Image arm and no {type: "image", source: {type: "base64", media_type, data}} Anthropic-canonical wire shape and no {type: "image_url", image_url: {url: "data:image/...;base64,..."}} OpenAI-compat wire shape; the markdown renderer at render.rs:379-426 handles Tag::Image and TagEnd::Image for *output* rendering (asymmetric capability — model emits image markdown → rendered as colored [image:url] link, user attaches image → silent black hole at API boundary); the runtime's own worker_boot test fixture at worker_boot.rs:1324+:1349 literally hard-codes "Explain this KakaoTalk screenshot for a friend" as the canonical task-classification example for worker prompt-mismatch recovery — claw-code uses screenshot analysis as a runtime-classifier signal while having zero capability to actually send a screenshot to the model; TUI-ENHANCEMENT-PLAN.md:57 backlogs the gap as "No image/attachment preview" but the gap is far worse than no preview — there is no transport, no codec, no envelope, no anything from the byte stream to the wire (Jobdori cycle #372 / extends #168c emission-routing audit / sibling-shape cluster grows to nineteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220 / wire-format-parity cluster grows to ten: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220 / capability-parity cluster (strict-superset including user-facing surfacing): #218+#220 / five-layer-structural-absence shape (data-model-variant + slash-command-parse-arm + attachment-metadata-threading + request-builder-translation + OS-integration-helper) is the largest single feature absence yet catalogued, exceeding #218's four-layer; advertised-but-unbuilt shape is novel — UX-layer cousin of #219's false-positive-opt-in shape — applicable to other STUB_COMMAND entries with capability-claim summaries / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero image-input capability despite Anthropic Vision GA on 2024-03-04 (25 months ago at filing time, default-on for all Claude 3.5+ models with 5MB-per-image / 32MB-per-request / 100-images-per-request limits) and OpenAI Vision GA on 2024-05-13 (23 months ago) and Google Gemini multimodal GA on 2024-02-15 (26 months ago), making this a regression against the upstream claude-code CLI claw-code is porting from / external validation: Anthropic Vision API reference at platform.claude.com/docs/en/build-with-claude/vision documenting the canonical {type, source: {type, media_type, data}} content block, Anthropic Messages API reference, Anthropic Files API beta with file_id reference for repeated-image-use efficiency, AWS Bedrock prompt-caching docs with image-block coverage and 20-images-per-request stricter limit and same cachePoint:{} pattern from #219, OpenAI Vision API reference documenting the {type:image_url, image_url:{url}} data-URL shape used by GPT-4o/4o-mini/5-vision/o1-vision/o3-vision/DeepSeek-VL2/Qwen-VL/QwQ-VL/MiniMax-VL/Moonshot kimi-VL, Google Gemini multimodal API documenting {inline_data:{mime_type, data}} shape, anomalyco/opencode#16184 (look_at tool image-file-from-disk handling bug), anomalyco/opencode#15728 (Read tool image-handling bug), anomalyco/opencode#8875 (custom-provider attachment-allowlist gap), anomalyco/opencode#17205 (text-only-model token-burn on image attachment) — all four are integration-quality gaps in opencode while claw-code is missing the capability entirely (~85% vs 0% parity asymmetry, the largest in the cluster), charmbracelet/crush vision-input via terminal paste, simonw/llm --attachment flag, Vercel AI SDK experimental_attachments + image content blocks, LangChain HumanMessage content blocks, LangGraph image-message routing, OpenAI Python and Anthropic Python SDK first-class image-typed messages, anthropic-quickstarts vision examples, claude-code official CLI paste-image and screenshot shortcuts (the upstream this is a regression against), OpenTelemetry GenAI semconv gen_ai.input.attachments and gen_ai.input.images.count multimodal observability attributes, IANA MIME-type registry RFC 4288/4289) 2026-04-26 01:18:43 +09:00
YeonGyu-Kim
2858aeccff roadmap: #219 filed — Anthropic prompt-caching opt-in is structurally impossible: cache_control marker has zero codebase footprint (rg returns 0 hits across rust/ src/ docs/ tests/) despite the wire-side beta header 'prompt-caching-scope-2026-01-05' being unconditionally enabled at every Anthropic request (telemetry/lib.rs:16,452,469 + anthropic.rs:1443); five cacheable surfaces are uniformly locked: pub system: Option<String> at types.rs:11 is a flat string with no array form so no system-block cache_control slot exists; InputContentBlock variants Text/ToolUse/ToolResult at types.rs:80-99 have no cache_control field; ToolResultContentBlock variants Text/Json at types.rs:100-103 have no cache_control field; ToolDefinition at types.rs:105-110 has no cache_control field; openai_compat path translate_message at openai_compat.rs:946 and build_chat_completion_request at openai_compat.rs:850 emit flat-string system+content with no cache_control or Bedrock cachePoint translation; ~600 LOC of response-side cache stats infrastructure (prompt_cache.rs PromptCacheStats / PromptCacheRecord / PromptCache trait) accumulates a zero stream because no payload was opted in, and four hardcoded zero-coercion sites (openai_compat.rs:477-478, 489-490, 597-598, 1211-1212) discard upstream cache stats from Bedrock/Vertex/kimi-anthropic-compat/MiniMax-relay even when emitted; integration test at client_integration.rs:88-89 asserts the beta header is sent but no companion test asserts payload contains a cache_control marker because the data structures cannot produce one — a uniquely paradoxical false-positive opt-in shape: wire signal advertises caching intent and data-model structurally precludes it (Jobdori cycle #371 / extends #168c emission-routing audit / sibling-shape cluster grows to eighteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219 / wire-format-parity cluster grows to nine: #211+#212+#213+#214+#215+#216+#217+#218+#219 / cost-parity cluster grows to seven: #204+#207+#209+#210+#213+#216+#219 — #219 is the dominant cost-parity miss, ~90% input-token-cost reduction unattainable / cache-parity request/response symmetry pair: #219 (request-side opt-in absent) + #213 (response-side stats absent on openai-compat lane) / five-surface uniform-structural-absence shape: system+tools+tool_choice+messages+tool_result_content all locked, with no extra_body escape hatch since cache_control is a per-block annotation not a top-level field / false-positive-opt-in shape: novel cluster member where wire signal says yes and structure says no / external validation: Anthropic prompt-caching reference at platform.claude.com/docs/en/build-with-claude/prompt-caching documenting cache_control: {type: ephemeral} on system/tools/messages/content blocks with 5-min default TTL and 1-hour optional TTL and 90% cost reduction on cache-read tokens, Anthropic Messages API reference documenting system: Vec<SystemBlock> array form as the cacheable shape, Bedrock prompt-caching docs documenting cachePoint: {} block form for Bedrock-anthropic relay, claudecodecamp.com analysis of how prompt caching actually works in Claude Code, xda-developers article documenting claude-code's cache-token-budget knob proving caching is actively engaged, anomalyco/opencode#5416 #14203 #16848 #17910 #20110 #20265 (cache-related issues and PR for system-prompt-split-for-cache-hit-rate optimization), opencode-anthropic-cache npm package as third-party plugin proving the ecosystem expectation, LangChain anthropicPromptCachingMiddleware as first-class JS wrapper, LiteLLM prompt-caching docs with single-line cache_control pass-through for Anthropic+Bedrock, Vercel AI SDK Anthropic provider providerOptions.anthropic.cacheControl, prompthub.us multi-provider comparison treating opt-in as documented baseline, portkey.ai gateway-level pass-through, mindstudio.ai cost-impact analysis, OpenTelemetry GenAI semconv gen_ai.usage.input_tokens.cached as documented attribute — claw is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero cache_control request-side opt-in capability despite shipping the eligibility beta header on every Anthropic request) 2026-04-26 00:40:20 +09:00
YeonGyu-Kim
116a95a253 roadmap: #218 filed — MessageRequest has no response_format / output_config / seed / logprobs / top_logprobs / logit_bias / n / metadata fields (types.rs:6-36, thirteen fields, zero hits across rust/ for any of these); build_chat_completion_request (openai_compat.rs:845) writes thirteen optional fields and emits none of these on the wire; AnthropicClient::send_raw_request (anthropic.rs:466) renders same MessageRequest via render_json_body (telemetry/lib.rs:107) with same gaps; ChatMessage (openai_compat.rs:688) has three fields (role, content, tool_calls) and no refusal field despite the streaming-aggregator test at line 1781 explicitly including "refusal": null in test data — silent serde drop; ChunkDelta (openai_compat.rs:735) has same gap; OutputContentBlock (types.rs:147) has four variants (Text, ToolUse, Thinking, RedactedThinking) and no Refusal variant; MessageResponse.stop_reason (types.rs:127) has no slot for Anthropic's 2025-11+ stop_reason='refusal' value; net effect: claw cannot opt into OpenAI strict-schema constrained decoding (response_format json_schema, GA 2024-08), cannot opt into Anthropic GA structured outputs (output_config.format, GA 2025-11-13), cannot opt into legacy JSON mode (response_format json_object), cannot supply seed for reproducible sampling, cannot request logprobs/top_logprobs, cannot bias tokens via logit_bias, cannot request multiple completions via n, and silently discards every refusal string OpenAI emits when constrained decoding rejects a generation — refusals classified as Finished/success with empty content via #217 normalize_finish_reason mapping (Jobdori cycle #370 / extends #168c emission-routing audit / sibling-shape cluster grows to seventeen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218 / wire-format-parity cluster grows to eight: #211+#212+#213+#214+#215+#216+#217+#218 / four-layer-structural-absence shape: request-struct-field + request-builder-write + response-struct-field + content-block-taxonomy-variant, largest single-feature absence catalogued / external validation: OpenAI Structured Outputs guide, OpenAI Chat Completions API reference, Anthropic structured-outputs reference (GA 2025-11-13), Anthropic Messages API reference (stop_reason='refusal'), Vercel AI Gateway Anthropic structured outputs, Vercel AI SDK 6 generateObject + Zod, LangChain with_structured_output, simonw/llm --schema flag, charmbracelet/crush, anomalyco/opencode#10456 open feature request citing OpenAI Codex as reference, anomalyco/opencode#5639/#11357/#13618, OpenAI Codex CI/code-review cookbook, OpenRouter structured-outputs docs, OpenAI Python SDK client.beta.chat.completions.parse, OpenTelemetry GenAI semconv gen_ai.request.response_format + gen_ai.response.refusal) 2026-04-26 00:13:01 +09:00
YeonGyu-Kim
91e290526a roadmap: #217 filed — normalize_finish_reason (openai_compat.rs:1389) is a two-arm match (stop→end_turn, tool_calls→tool_use) with a string-passthrough fallthrough that drops three of five OpenAI-spec finish reasons (length, content_filter, function_call); MessageResponse.stop_reason is Option<String> with no enum constraint; WorkerRegistry::observe_completion (worker_boot.rs:558) classifies failure on finish=='unknown'||finish=='error' only, so OpenAI/DeepSeek/Moonshot truncation (length) and content-policy refusal (content_filter) become WorkerStatus::Finished with success events; the streaming aggregator's tool-call-block-close branch at openai_compat.rs:537 keys on 'tool_calls' literal and never fires for legacy 'function_call' shape (Azure pre-2024-02-15 / DeepSeek pre-2025-08 / SiliconFlow / OpenRouter relays); Anthropic native path produces the canonical taxonomy correctly (Jobdori cycle #369 / extends #168c emission-routing audit / sibling-shape cluster grows to sixteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217 / wire-format-parity cluster grows to seven: #211+#212+#213+#214+#215+#216+#217 / classifier-leakage shape: response-side string mistranslation flows three layers deep into runtime classifier with two-literal-compare coverage / external validation: OpenAI Chat Completions API reference, Anthropic Messages API reference, OpenAI function_call deprecation notice, Azure OpenAI reference, DeepSeek/Moonshot/DashScope refs, anomalyco/opencode#19842, charmbracelet/crush typed enum, simonw/llm Reason enum, Vercel AI SDK FinishReason union, LangChain LengthFinishReasonError/ContentFilterFinishReasonError, semantic-kernel FinishReason enum, openai-python Literal type, OpenTelemetry GenAI gen_ai.response.finish_reasons spec) 2026-04-25 23:39:13 +09:00
YeonGyu-Kim
ceb092abd7 roadmap: #216 filed — neither MessageRequest nor MessageResponse has any service_tier field; build_chat_completion_request (openai_compat.rs:845) writes thirteen optional fields (model, max_tokens/max_completion_tokens, messages, stream, stream_options, tools, tool_choice, temperature, top_p, frequency_penalty, presence_penalty, stop, reasoning_effort) and does not write service_tier; AnthropicClient::send_raw_request (anthropic.rs:466) renders the same MessageRequest struct via AnthropicRequestProfile::render_json_body (telemetry/lib.rs:107) which has no field for it either, only a per-client extra_body escape hatch (asymmetric — openai_compat path has zero hits for extra_body); ChatCompletionResponse / ChatCompletionChunk / OpenAiUsage all deserialize four fields each, dropping the upstream-echoed service_tier confirmation and the system_fingerprint reproducibility marker that OpenAI documents as the canonical "what backend served you" signal; claw cannot opt into OpenAI flex (~50% cheaper async batch — developers.openai.com/api/docs/guides/flex-processing), cannot opt into OpenAI priority (~1.5-2x premium SLA latency — developers.openai.com/api/docs/guides/priority-processing), cannot opt into Anthropic priority (auto/standard_only — platform.claude.com/docs/en/api/service-tiers), and cannot detect at the response layer whether a request was flex-served or silently upgraded to priority by a project-level default override (Jobdori cycle #368 / extends #168c emission-routing audit / sibling-shape cluster grows to fifteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216 / wire-format-parity cluster grows to six: #211+#212+#213+#214+#215+#216 / cost-parity cluster grows to six: #204+#207+#209+#210+#213+#216 / three-dimensional-structural-absence shape: request-side write + response-side read + reproducibility marker, distinct from prior request-only #211#212 / response-only #207#213#214 / header-only #215 members / external validation: OpenAI flex/priority/scale-tier guides, OpenAI advanced-usage system_fingerprint guide, Anthropic service-tiers reference, OpenTelemetry GenAI semconv gen_ai.openai.request.service_tier + gen_ai.openai.response.service_tier + gen_ai.openai.response.system_fingerprint, anomalyco/opencode#12297, Vercel AI SDK serviceTier provider option, LangChain ChatOpenAI service_tier ctor param, LiteLLM service_tier pass-through, semantic-kernel OpenAIPromptExecutionSettings.ServiceTier, openai-python SDK client.chat.completions.create(service_tier=...) first-class kwarg, MiniMax/DeepSeek Anthropic-compat layer notes, badlogic/pi-mono#1381) 2026-04-25 23:12:25 +09:00
YeonGyu-Kim
2da12117eb roadmap: #215 filed — expect_success reads only request-id/x-request-id headers and discards the rest; both OpenAiCompatClient::send_with_retry and AnthropicClient::send_with_retry sleep on pure exponential backoff (2^(n-1) * initial + jitter) that ignores upstream Retry-After (RFC 7231 §7.1.3, mandated by Anthropic on 429, emitted by OpenAI/DeepSeek/Moonshot/DashScope on 429/503/529); ApiError::Api has no retry_after field, scheduler has no input port for it; on a 60s server-specified cooldown, claw burns 3 retries in <8s against a closed gate then surfaces RetriesExhausted (Jobdori cycle #367 / extends #168c emission-routing audit / sibling-shape cluster grows to fourteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215 / upstream-contract-honoring trio: #211+#213+#215 / wire-format-parity cluster: #211+#212+#213+#214+#215 / external validation: Anthropic rate-limits docs, OpenAI cookbook, DeepSeek rate-limit docs, RFC 7231 §7.1.3, openai-python#957, Vercel AI SDK LanguageModelV1RateLimit.retryAfter, LangChain BaseChatOpenAI, anomalyco/opencode#16993/#16994/#9091/#17583/#11705, charmbracelet/crush, LiteLLM Router.retry_after_strategy) 2026-04-25 22:41:49 +09:00
YeonGyu-Kim
959bdf8491 roadmap: #214 filed — ChunkDelta and ChatMessage in openai_compat.rs deserialize only content/tool_calls; delta.reasoning_content (sibling to delta.content, the canonical wire field for DeepSeek deepseek-reasoner / Alibaba Qwen3-Thinking / QwQ / vLLM reasoning-parser backends) is silently discarded at serde-deserialize time before any handler sees it; non-streaming ChatMessage has the same gap; is_reasoning_model classifier already returns true for o1/o3/o4/grok-3-mini/qwen-qwq/qwq/*thinking* and is consulted at line 901 to strip request-side tuning params but never on the response side to opt into reasoning_content extraction; local taxonomy already declares OutputContentBlock::Thinking and ContentBlockDelta::ThinkingDelta and the Anthropic native path correctly emits both with full test coverage at sse.rs:260,288 — the openai-compat translator has the destination types one import away and never bridges to them (Jobdori cycle #366 / extends #168c emission-routing audit / sibling-shape cluster grows to thirteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214 / reasoning-fidelity trio: #207+#211+#214 / wire-format-parity cluster: #211+#212+#213+#214 / external validation: DeepSeek API docs, vLLM reasoning-outputs, anomalyco/opencode#24124, charmbracelet/crush, simonw/llm, Vercel AI SDK, LangChain BaseChatOpenAI, LiteLLM, continue.dev#9245) 2026-04-25 22:16:02 +09:00
YeonGyu-Kim
347102d83b roadmap: #213 filed — OpenAiUsage struct does not deserialize prompt_tokens_details.cached_tokens (OpenAI 2024-10) or prompt_cache_hit_tokens (DeepSeek); openai_compat path hardcodes cache_creation_input_tokens: 0 and cache_read_input_tokens: 0 at four sites; cost estimator computes $0 cache savings for every OpenAI/DeepSeek/Moonshot kimi request even when upstream prompt cache is hitting; Anthropic native path correctly populates same Usage fields from native wire format (Jobdori cycle #365 / extends #168c emission-routing audit / sibling-shape cluster grows to twelve: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213 / cost-parity cluster: #204+#207+#209+#210+#213 / wire-format-parity cluster: #211+#212+#213 / external validation: OpenAI prompt caching docs, DeepSeek pricing docs, anomalyco/opencode#17223/#17121/#17056/#11995, Vercel AI SDK cachedInputTokens, charmbracelet/crush, simonw/llm) 2026-04-25 21:42:54 +09:00
Jobdori
c00981896f roadmap: #212 filed — MessageRequest+ToolChoice cannot express parallel_tool_calls (OpenAI top-level) or disable_parallel_tool_use (Anthropic tool_choice modifier); zero hits across rust/ src/ tests/ docs/; ToolChoice is 3-variant enum with no modifier slot; openai_tool_choice mapper has 3-arm match no parallel path; provider default is parallel-on, claw cannot opt out (Jobdori cycle #364 / extends #168c emission-routing audit / sibling-shape cluster grows to eleven: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212 / wire-format-parity cluster: #211+#212 / external validation: Anthropic docs, OpenAI API reference, LangChain BaseChatOpenAI, anomalyco/opencode, charmbracelet/crush#1061) 2026-04-25 21:10:50 +09:00
YeonGyu Kim
f004f74ffa roadmap: #211 filed — build_chat_completion_request selects max_tokens_key only on wire_model.starts_with("gpt-5"), sending legacy max_tokens to OpenAI o1/o3/o4-mini reasoning models which reject it with unsupported_parameter; is_reasoning_model classifier 90 lines above already knows o-series is reasoning, taxonomy half-applied within 30-line span; no test for any o-series model (Jobdori cycle #363 / extends #168c emission-routing audit / sibling-shape cluster grows to ten: #201/#202/#203/#206/#207/#208/#209/#210/#211 / external validation: charmbracelet/crush#1061, simonw/llm#724, HKUDS/DeepTutor#54) 2026-04-25 20:38:43 +09:00
YeonGyu-Kim
02252a8585 roadmap: #210 filed — rusty-claude-cli shadows api::max_tokens_for_model with stripped 2-branch fork (opus=32k, else=64k); ignores model_token_limit registry, bypasses plugin maxOutputTokens override, silently sends 64_000 for kimi-k2.5 whose registry cap is 16_384 (4x over) (Jobdori cycle #362 / extends #168c emission-routing audit / sibling-shape cluster grows to nine: #201/#202/#203/#206/#207/#208/#209/#210) 2026-04-25 20:06:43 +09:00
YeonGyu-Kim
134e945a01 roadmap: #209 filed — pricing_for_model substring-matches haiku/opus/sonnet only; default_sonnet_tier function name carries Opus pricing constants (15.0/75.0 vs real Sonnet 3.0/15.0); every non-Anthropic model silently falls back producing 5-100x wrong cost estimates with no event signal, only a magic-string suffix on one summary line; rusty-claude-cli session JSON and anthropic.rs telemetry emit cost without pricing_source field (Jobdori cycle #361 / cost-parity cluster closer to #204+#207 / models.dev parity gap vs anomalyco/opencode) 2026-04-25 19:42:37 +09:00
Jobdori
c20d0330c1 roadmap: #208 filed — silent param/field strip on outbound serialization (4 tuning params for reasoning models, is_error for kimi), self-documenting 'silently strip' comments, no event emission, tests assert removal but not visibility (Jobdori cycle #359 / sibling-chain closer to #207 inbound-drop / completes OpenAI-compat boundary audit) 2026-04-25 19:06:56 +09:00
YeonGyu-Kim
ba3a34d6fe roadmap: #207 filed — OpenAiUsage discards prompt_tokens_details.cached_tokens and completion_tokens_details.reasoning_tokens, cache_read_input_tokens hardcoded 0 in 4 sites breaking cost parity with Anthropic path (Jobdori cycle #358 / fix-pair with #204 / anomalyco/opencode #24233 sibling) 2026-04-25 18:34:44 +09:00
YeonGyu-Kim
0e9cff588d roadmap: #206 filed — normalize_finish_reason covers 2/5 OpenAI finish reasons, length/content_filter/function_call unmapped (Jobdori cycle #357)
Pinpoint #206: normalize_finish_reason() in openai_compat.rs only maps
stop→end_turn and tool_calls→tool_use. The 'other => other' pass-through
arm silently leaks length, content_filter, function_call to downstream
consumers expecting Anthropic vocabulary (max_tokens, refusal, tool_use).

Sibling of #201/#202/#203/#204 (silent fallbacks at provider boundary).
No structured event for unmapped values; test coverage locks only the
two-case happy path.

Branch: feat/jobdori-168c-emission-routing
HEAD: dba4f28
2026-04-25 18:20:04 +09:00
YeonGyu-Kim
dba4f281f0 roadmap: #205 filed — prunable worktree lifecycle audit trail missing, no creation timestamp, pinpoint ID, or doctor visibility (Q *YeonGyu Kim cycle #137 / Jobdori cycle #351) 2026-04-25 17:16:57 +09:00
YeonGyu-Kim
1c59e869e0 roadmap: #204 filed — TokenUsage omits reasoning_tokens, reasoning models merge into output_tokens breaking cost parity (anomalyco/opencode #24233 parity gap, Jobdori cycle #336) 2026-04-25 12:01:26 +09:00
YeonGyu-Kim
604bf389b6 roadmap: #203 filed — AutoCompactionEvent summary-only, no SSE event emitted mid-turn when auto-compaction fires (Jobdori cycle #136) 2026-04-25 07:48:22 +09:00
YeonGyu-Kim
0730183f35 roadmap: #202 filed — sanitize_tool_message_pairing silent drop, no tool_message_dropped event (Jobdori cycle #135) 2026-04-25 06:06:32 +09:00
YeonGyu-Kim
5e0228dce0 roadmap: #201 filed — parse_tool_arguments silent fallback, no tool_arg_parse_error event (Jobdori cycle #134) 2026-04-25 05:03:54 +09:00
YeonGyu-Kim
b780c808d1 roadmap: #200 filed — SCHEMAS.md self-documenting drift, no derive-from-source enforcement (Q *YeonGyu Kim cycle #304) 2026-04-25 04:03:40 +09:00
YeonGyu-Kim
6948b20d74 roadmap: #199 filed — claw config JSON envelope omits deprecated_keys, merged_keys count-only, no automation path (Jobdori cycle #133) 2026-04-24 19:52:16 +09:00
YeonGyu-Kim
c48c9134d9 roadmap: #198 filed — MCP approval-prompt opacity, no blocked.mcp_approval state, pane-scrape required (gaebal-gajae cycle #135 / Jobdori cycle #248) 2026-04-24 13:31:50 +09:00
YeonGyu-Kim
215318410a roadmap: #197 filed — enabledPlugins deprecation no migration path, warning on every invocation (Jobdori cycle #132) 2026-04-24 09:29:07 +09:00
YeonGyu-Kim
59acc60eb5 roadmap: Doctrine #35 formalized — disk-truth wins over verbal drift during taxonomy disputes (Jobdori cycle #194) 2026-04-24 01:01:34 +09:00
YeonGyu-Kim
3497851259 roadmap: #196 filed — local branch namespace accumulation, no lifecycle cleanup or doctor visibility (Jobdori cycle #131) 2026-04-23 23:34:08 +09:00
YeonGyu-Kim
d93957de35 roadmap: #195 filed — worktree-age opacity, no timestamp or doctor signal (Jobdori cycle #130) 2026-04-23 20:01:55 +09:00
YeonGyu-Kim
86e88c2fcd roadmap: #194 filed — prunable-worktree accumulation, no doctor visibility or auto-prune lifecycle 2026-04-23 14:22:24 +09:00
YeonGyu-Kim
94bd6f13a7 roadmap: Doctrine #33 formalized via cross-claw validation (cycle #129)
Per gaebal-gajae cycle #129 closure ('Doctrine #33 적용도 맞습니다'),
promoting Doctrine #33 from provisional to formal status.

Statement:
'Merge-wait steady state reports as a vector, not narrative.'

Operational protocol:
- Validate 4-element state vector each cycle:
  ready_branches, prs, repo_drift, external_gate
- If unchanged: vector-only post (5 lines) OR silent ack
- If changed: that change IS the cycle's content

Anti-pattern prevented:
중복 확인 로그 (duplicate check logs). Re-posting full merge-wait
narrative every cycle when state hasn't moved.

Validation history:
- Cycle #124: gaebal-gajae introduced compression
- Cycle #129: Jobdori first field-test (vector-only post)
- Cycle #129: gaebal-gajae cross-claw validation (same vector,
              same conclusion, both claws converged)

Cross-claw coherence test passed:
- Both claws independently produced same vector values
- Both reached same conclusion (merge-wait holds)
- Both used same response pattern (vector form)

Doctrine #29-#33 progression operationalizes Phase 0 closure +
merge-wait discipline. #33's specific contribution: noise prevention
during legitimate hold states.

Doctrine count: 33 formalized.
Mode integrity: preserved (this is doc-only follow-up, not probe).
2026-04-23 14:02:08 +09:00
YeonGyu-Kim
d1fa484afd roadmap: #193 filed — session/worktree hygiene readability gap (gaebal-gajae framing)
Per gaebal-gajae cycle #123-#125 framing + authorization, filing
operational pinpoint on dogfood methodology layer (not claw-code binary).

Title: 'Session/worktree hygiene debt makes active delivery state
 harder to read than the actual code state.'

Short form: 'branch/worktree proliferation outpaced merge/cleanup
 visibility.'

Gap identified by gaebal-gajae: 4 branch states visually
indistinguishable on same surface:
  1. Ready branch (merge-ready, gated externally)
  2. Blocked branch (abandoned due to architecture/pushback)
  3. Stale abandoned branch (superseded or merged alternately)
  4. Dirty scratch worktree (experimental, status unclear)

Evidence (cycle #123 substance check):
- 147 local branches
- 30+ clawcode/jobdori /tmp artifacts
- Stale bridge logs from 2026-04-20 (3+ days old)

Class: NOT codegen, NOT test, NOT binary — state readability /
hygiene gap in dogfood methodology layer.

Doctrine #29 compliance: Doc-only ROADMAP entry filed during
merge-wait mode on frozen branch. Legitimate filing-without-fixing.
This is the second such case (first: cycle #100 bundle freeze).

Framing family:
- Sibling to §4.44 (runtime failure state opacity)
- §4.45 tackles repo delivery lane state opacity
- Different scope, same structural pattern

Pinpoint accounting:
- Before #193: 82 total, 67 open
- After #193: 83 total, 68 open
- First dogfood methodology pinpoint (vs binary pinpoints)

Priority: Post-Phase-1 (not Phase 1 bundle member).
Remediation proposal: branch state tagging, worktree lifecycle
discipline, ROADMAP <-> branch mapping.

Sources:
- Cycle #120 Jobdori substance check (147 branches surfaced)
- Cycle #123 Jobdori evidence collation (30 worktrees)
- Cycle #124 gaebal-gajae framing refinement (4-state gap)
- Cycle #125 gaebal-gajae authorization + final framing

Filed by gaebal-gajae authorization. No code change. No probe.
Merge-wait mode preserved. Phase 0 branch integrity preserved.
2026-04-23 13:33:34 +09:00
YeonGyu-Kim
eb0356e92c roadmap: Doctrine #32 formalized + cycle #117 final reframe per gaebal-gajae
Per gaebal-gajae cycle #117 closing validation:

Authoritative reframe:
'Cycle #117은 PR creation failure를 브랜치 문제에서
 organization-level PR authorization barrier로 정확히 격리한
 진단 턴입니다.'

The cycle value was NOT 'PR blocked'.
The cycle value WAS 'boundary of the barrier isolated through experiments'.

Four dimensions experimentally separated:
1. Repository state: healthy (push, tests)
2. Branch readiness: visible on origin
3. Token liveness: valid (own-fork PR succeeded)
4. Org PR authorization: BLOCKED (FORBIDDEN for both claws)

Reviewer-ready compression:
'The branch is pushable and reviewable, but PR creation into
 ultraworkers/claw-code is blocked specifically at the organization
 authorization layer, not by repository state or token liveness.'

Doctrine #32 formalized:
'Merge-wait mode actions must be within the agent's capability
 envelope. When blocked externally, diagnose by boundary separation
 and hand off to the responsible party, not by retry or redefinition.'

Operational protocol:
1. Isolate boundary through experiments (not retry same path)
2. Document separation explicitly (works vs doesn't work)
3. Escalate to responsible party (web UI, org admin, infra)
4. Do NOT retry, conflate, or redefine the failure

Validation: Cycle #117 both-claws blocked, boundary isolated,
escalation path identified.

Cross-claw coherence:
- Cycle #115: 1 claw attempted, 1 succeeded (hypothesis)
- Cycle #117: 2 claws attempted, 2 blocked, IDENTICAL error (confirmed)

Next action path (per gaebal-gajae):
Author/owner intervention via web UI OR org admin OAuth grant.
'기술적 탐사가 아니라 author/owner intervention입니다.'

Doctrine count: 32 formalized.
Gate status: Blocked pending author intervention.
Mode integrity: Preserved throughout cycle #117.
2026-04-23 12:45:21 +09:00
YeonGyu-Kim
7a1e9854c2 roadmap: Cycle #117 cross-claw PR blocker diagnosis locked
Per cycle #117 cross-claw diagnosis (both claws attempted independently):

Both Jobdori (code-yeongyu) and gaebal-gajae (Yeachan-Heo) hit
identical GraphQL FORBIDDEN error on createPullRequest mutation.

Diagnosis: Organization-wide OAuth app restriction on
ultraworkers/claw-code, not per-identity issue.

Reviewer-ready compression (per gaebal-gajae):
'The branch is now remotely visible and PR-ready, but actual PR
 creation is blocked by GitHub permissions rather than repository
 state.'

Confirmed state:
- Branch on origin: Yes (cycle #115)
- PR creation CLI path: Blocked for both claws
- Manual web UI: Required
- Org admin OAuth grant: Long-term fix

Gate sequence updated:
1. Branch on origin (DONE, cycle #115)
2. PR creation - BLOCKED at OAuth (cycle #116/#117)
3. Manual web UI PR creation (REQUIRED next)
4. Review cycle
5. Merge signal
6. Phase 1 Bundle 1 (#181 + #183)

Doctrine #32 (provisional, pending gaebal-gajae formal acceptance):
'Merge-wait mode actions must be within the agent's capability
 envelope. When blocked externally, diagnose + document + escalate,
 not retry.'

Cross-claw validation: Both claws blocked, same error pattern.
Mode integrity: Preserved throughout both attempts.
Next blocker: External human action (manual web UI or org admin).
2026-04-23 12:44:08 +09:00
YeonGyu-Kim
70bea57de3 roadmap: Doctrine #31 formalized + cycle #115 reframe per gaebal-gajae
Per gaebal-gajae cycle #115 validation pass:

Authoritative reframe:
'Cycle #115 was not an exception to merge-wait mode; it was the first
 turn where merge-wait mode actually did what merge-wait mode is
 supposed to do.'

Reviewer-ready compression:
'The branch was frozen but not yet reviewable because it had never
 been pushed; this cycle converted merge-wait from a declared state
 into a remotely visible one.'

Mode semantic correction:
- Merge-wait mode is NOT 'do nothing'
- Merge-wait mode IS 'block discovery + enable merge-readiness'
- Push to origin = merge-readiness action (fits mode, not violation)

Doctrine #31 (formalized):
'Merge-wait mode requires remote visibility.'
Protocol: git ls-remote origin <branch> must return commit hash.
If empty: push before claiming review-ready.

Self-process pinpoint #193 (formalized):
'Dogfood process hygiene gap — declared review-ready claims lacked
 remote visibility check for 40+ minutes (cycles #109-#114).'
Applies to dogfood methodology, not claw-code binary.

Gate sequence (per gaebal-gajae):
1. Branch on origin (cycle #115, DONE)
2. PR creation (next concrete action)
3. Review cycle
4. Merge signal
5. Phase 1 Bundle 1 kickoff

Doctrine count: 31 total.
2026-04-23 12:34:04 +09:00
YeonGyu-Kim
3bbaefcf3e roadmap: lock 'merge-wait mode' state designation per gaebal-gajae
Per gaebal-gajae cycle #110 state designation:
'Phase 0 is no longer in discovery mode; it is in merge-wait mode
 with Phase 1 already precommitted.'

Mode distinction formalized:
- Discovery mode: probe + file + refine (previous state)
- Merge-wait mode: hold state, await signal (CURRENT)
- Execution mode: land bundles (post-merge state)

Doctrine #30: 'Modes are state, not suggestions.'
Once closure is declared, mode label acts as operational guard.
Future cycles must respect state designation:
  - No new probes (that's discovery)
  - No new pinpoints (branch frozen)
  - No new branches (Phase 0 must merge first)
  - Maintain readiness; respond to signal

Mode history for Phase 0:
  - Cycle #97: Discovery begins
  - Cycle #108: Exhaustion criteria met
  - Cycle #109: Closure declared
  - Cycle #110: Merge-wait mode formally entered

Current state: MERGE-WAIT MODE. Awaiting signal.
2026-04-23 12:00:11 +09:00
YeonGyu-Kim
c0ab7a4d5f roadmap: formal closure of Phase 0 / dogfood cycles per gaebal-gajae
Per gaebal-gajae 11:58 Seoul closure validation.

Authoritative closure statement:
'Phase 0 has finished discovery. Phase 1 should start by landing
 the locked contract foundation bundle, not by opening new
 exploratory cycles.'

All four exhaustion criteria met:
1. Unaudited surfaces: 9 probed (full coverage)
2. Probe hypothesis: Fully validated (multi-flag 3-4, simple 0-1)
3. Phase 1 docs: PHASE_1_KICKOFF.md + review guide + priority queue
4. Branch hygiene: 39 commits, 564 tests, 0 regressions, freeze held

Doctrine #29 (final): 'Discovery termination is itself a deliverable.'
  - Criteria: surfaces probed, hypothesis validated, plan documented,
    branch review-ready
  - Anti-pattern: infinite probe continuation
  - Correct: explicit closure + pivot to execution

Phase 0 / dogfood cycles formally closed. No more probe filings
on this branch. Next work unit is Phase 1 execution, not discovery.

Pending: Phase 0 merge approval → Phase 1 branch creation in
priority order → bundle-by-bundle execution (~10 min per bundle).
2026-04-23 11:58:06 +09:00
YeonGyu-Kim
046bf6cedc roadmap: cycle #109 checkpoint — probe complete, Phase 1 kickoff ready
End-of-dogfood checkpoint at cycle #109:

Deliverables:
- PHASE_1_KICKOFF.md (192 lines, execution plan for 6-bundle priority queue)
- Test verification: 564 tests pass, 0 failures
- Branch clean, freeze held, 38 commits total

Probe hypothesis fully validated:
- Multi-flag verbs: 3-4 classifier gaps each
- Single-issue verbs: 0-1 gaps each

Accounting:
- 82 pinpoints filed (cycles #104-#108)
- 67 genuinely open
- 28 doctrines accumulated

Phase 1 ready:
- All 5 priority bundles gaebal-gajae reviewed
- Bundle sequence locked (foundation → extensions → cleanup)
- Expected execution: 50-60 min for all priorities
- No blockers except Phase 0 merge approval

Next: Execute Phase 1 bundles in priority order once Phase 0 lands.
2026-04-23 11:56:57 +09:00
YeonGyu-Kim
66eeed82ca doc: add Phase 1 kickoff — execution plan for 6-bundle priority queue
Comprehensive Phase 1 strategy document prepared at end of probe cycle #108.

Contents:
- Phase 0 recap (freeze, tests, pinpoints, doctrines)
- What Phase 1 will do (6 bundles + independents, all gaebal-gajae reviewed)
- Concrete next steps (branch names, expected commits/tests per bundle)
- Priority 1: Error envelope contract drift (#181/#183) — foundation
- Priority 2: CLI contract hygiene (#184/#185) — extensions
- Priority 3: Classifier sweep 4-verb (#186/#187/#189/#192) — cleanup
- Priority 4: USAGE.md audit (#180) — doc prerequisite
- Priority 5: Dump-manifests help (#188) — doc-truth probe-flow
- Priority 6+: Independents (#190 design, #191 filesystem, others)
- Hypothesis validation (multi-flag verbs = 3-4 gaps, simple verbs = 0-1)
- Testing strategy + success criteria

All 5 priority bundles are reviewer-blessed (gaebal-gajae validation passes).

Doc-only. No code changes. Freeze held.
2026-04-23 11:56:37 +09:00
YeonGyu-Kim
b139b10499 roadmap(#190, #191, #192): file final pre-phase-1 probe gaps in skills lifecycle
Cycle #108 probe of claw skills install/enable/disable yielded 3 pinpoints:

#190: Design decision needed
  skills install (no args) routes to help (action: help, kind: skills).
  May be intentional (like agents pattern) or design inconsistency.
  Requires verification against agents canonical reference.

#191: Classifier gap (filesystem family extension)
  skills install /bad/path emits kind=unknown.
  Should be kind=filesystem or filesystem_io_error.
  Extends #177/#178/#179 install-surface taxonomy.

#192: Classifier gap (unknown-option family extension)
  skills install --bogus-flag emits kind=unknown.
  Should be kind=cli_parse (like sandbox).
  Now 4 members in unknown-option sub-lineage: #186, #187, #189, #192.

Pinpoint count: 82 filed, 67 genuinely open.
Classifier family: 19 members (+2).

All unaudited surfaces now probed:
  - Cycles #104-#108: plugins, agents, init, bootstrap-plan, system-prompt,
    export, sandbox, dump-manifests, skills
  - Hypothesis fully validated: Multi-flag verbs have 3-4 classifier gaps;
    simple verbs have 0-1 gaps.

Per freeze doctrine, no code changes. Doc-only filing.
2026-04-23 11:39:59 +09:00
YeonGyu-Kim
6e6f99e57e roadmap(#188, #189): lock framings + doc-truth sub-axis + priority refinement
Per gaebal-gajae cycle #107 validation pass. Three refinements:

1. Framings locked (both verb-specific):
   #188: 'dump-manifests --help omits the prerequisite that runtime
          behavior actually requires.'
   #189: 'dump-manifests unknown-option errors still fall through to
          unknown instead of the existing CLI-parse path.'

2. Doc-truthfulness family formally split into 2 sub-axes:
   - Audit-flow (5 members: #76, #79, #82, #172, #180) — reading one
     file vs another declared source of truth
   - Probe-flow (NEW, 1 member: #188) — running verb vs observing
     --help text

3. Priority refinement:
   - #189 → bundled in feat/jobdori-186-189-classifier-sweep (3 verbs)
   - #188 → post-#180 (doc parity sequence: USAGE gap → help-text gap)
   - Full sequence: #180 (audit-flow doc-truth) → #188 (probe-flow doc-truth)

4. Key cycle #107 outcome (per gaebal-gajae):
   'behavior bug처럼 보이던 걸 help-text truthfulness gap으로 정확히 재분류'
   This is the reclassification skill that earned the filing.

Doctrine #28: First observation is hypothesis, not filing. Verify
against SCHEMAS/USAGE/--help before classifying axis. Cost: 30-60s
per probe. Benefit: avoid filing not-a-bug pinpoints.

Priority queue now 6 bundles + 3+ independent, all reviewer-blessed.
2026-04-23 11:33:44 +09:00
YeonGyu-Kim
eb957a512c roadmap(#188, #189): file doc-truth and classifier gaps in dump-manifests
Cycle #107 probe of claw dump-manifests yielded 2 pinpoints:

#188: Doc-truthfulness gap (NEW sub-axis)
  claw dump-manifests --help describes usage as optional flags, but
  the verb fails without --manifests-dir or CLAUDE_CODE_UPSTREAM.
  USAGE.md is correct; CLI --help output lies by omission.

  This is the first doc-truth pinpoint from probe flow (vs audit flow).
  New sub-axis: help text vs behavior (prior doc-truth: SCHEMAS/USAGE/README).

#189: Classifier gap (same pattern as #186/#187)
  dump-manifests --bogus-flag falls through to kind=unknown.
  Should be cli_parse (like sandbox).

  Now at 3 verbs in same pattern: system-prompt (#186), export (#187),
  dump-manifests (#189). Rename bundle to feat/jobdori-186-189-classifier-sweep.

Pinpoint count: 79 filed, 65 genuinely open.
Doc-truthfulness family: 6 members (was 5).
Classifier unknown-option sub-lineage: 3 members (was 2).

Per freeze doctrine, no code changes. Doc-only filing.
2026-04-23 11:31:30 +09:00
YeonGyu-Kim
fcb9d18899 roadmap(#187): lock framing + bundle with #186 per gaebal-gajae
Per gaebal-gajae cycle #106 validation pass. Two refinements:

1. #187 framing locked:
   'export unknown-option errors still fall through to unknown,
    unlike the already-canonical sandbox CLI-parse path.'

   Surgical parallel to #186 framing (cycle #105):
   'system-prompt unknown-option errors still fall through to unknown
    instead of the existing CLI-parse classification path.'

   Same pattern: verb + drift + reference path.

2. #186 and #187 bundled into feat/jobdori-186-187-classifier-sweep.
   Rationale: identical fix pattern, identical test pattern, same
   source file, 2x review overhead if separated.

Updated merge priority queue (gaebal-gajae reviewer-blessed):
  1. feat/jobdori-181-error-envelope-contract-drift (#181 + #183)
  2. feat/jobdori-184-cli-contract-hygiene-sweep (#184 + #185)
  3. feat/jobdori-186-187-classifier-sweep (#186 + #187)

Doctrine #27: Same-pattern pinpoints should bundle into one classifier
sweep PR. One-pinpoint = one-branch is not universal; batching
same-pattern fixes halves review/merge overhead.
2026-04-23 11:22:40 +09:00
YeonGyu-Kim
d03f33b119 roadmap(#187): file export classifier gap from cycle #106 probe
Cycle #106 probe of export and sandbox verbs. Found:
- export --bogus-flag: kind=unknown (should be cli_parse)
- sandbox --bogus-flag: kind=cli_parse (canonical correct)

#187 is direct sibling of #186 (system-prompt classifier gap).
Both unknown-option, both should use cli_parse classifier.

Observation: sandbox has no gaps. export has 1 classifier gap.
Suggests classifier coverage improving on newer verbs, not consistent
regression across unaudited surfaces.

Hypothesis (#104) partially validated: unaudited surfaces yield
pinpoints, but not uniformly. Single-issue verbs (sandbox) may be
cleaner than multi-flag verbs (export, init, bootstrap-plan).

Pinpoint count: 77 filed, 63 genuinely open.

Per freeze doctrine, no code changes. Doc-only filing.
2026-04-23 11:21:31 +09:00
YeonGyu-Kim
6bd69d55bc doc(review-guide): embed gaebal-gajae authoritative state framing
Per gaebal-gajae cycle #105 validation pass. One-liner state summary
now appears at top (tone-setter for reviewers) and bottom (reinforced
recap):

  'Phase 0 is now frozen, reviewer-mapped, and merge-ready;
   Phase 1 remains intentionally deferred behind the locked priority order.'

This is the single authoritative sentence that captures branch state.
Use it for PR titles, review summaries, and Phase 1 handoff notes.

Why this framing matters (per gaebal-gajae evaluation):
- 'frozen' signals no scope creep
- 'reviewer-mapped' signals audit trail exists (this guide)
- 'merge-ready' signals gates are passed
- 'intentionally deferred' signals Phase 1 absence is by design, not omission
- 'locked priority order' signals sequencing is validated (cycle #104-#105)

Review guide now doubles as merge-enabler: reviewers parse branch state
in one sentence, then drill into commits as needed.

Doc-only. No code changes. Freeze preserved.
2026-04-23 11:11:50 +09:00
YeonGyu-Kim
e470e614d5 doc: add Phase 0 + dogfood bundle review guide for cycles #104-#105
Pre-merge documentation for reviewers. Summarizes:
- What Phase 0 tasks deliver (JSON envelope contracts, regression locks)
- Why dogfood cycles #99-#105 matter (validated methodology, 15 filed pinpoints)
- Commit-by-commit navigation for the 30-commit frozen bundle
- What lands vs what's deferred
- Integration notes for Phase 1 planning
- Known limitations + follow-ups

This is doc-only, no code changes. Serves as audit trail and reviewer
reference without adding scope to the frozen feature branch.
2026-04-23 11:10:51 +09:00
YeonGyu-Kim
1494a94423 roadmap: lock merge priority for cycles #104-#105 pinpoints
Per gaebal-gajae cycle #105 priority pass.

Locked merge order (minimizes consumer-facing contract disruption):
1. feat/jobdori-181-error-envelope-contract-drift (#181 + #183 bundled)
2. feat/jobdori-184-cli-contract-hygiene-sweep (#184 + #185 bundled)
3. feat/jobdori-186-system-prompt-classifier (#186 standalone)

Rationale: foundation → extensions → cleanup ordering.
- #181 first: canonical error envelope established (1 shape change)
- #184/#185 second: use existing envelope (0 shape changes)
- #186 third: classifier branch add (1 classifier change)
Total: 1 shape + 1 classifier change across 3 merges.

Doctrine #25: Contract-surface-first ordering. Foundation layer before
extending guards before refinement cleanup.

Still-deferred pinpoints explicitly mapped with dependencies:
#173, #174, #177/#178/#179, #180, #182, #175.

Branch now at 30 commits, 227/227 tests.
2026-04-23 11:05:00 +09:00
YeonGyu-Kim
8efcec32d7 roadmap(#184-#186): lineage corrections + reference implementation lock
Per gaebal-gajae cycle #105 review pass. Three corrections:

1. #184/#185 belong to #171 lineage (CLI contract hygiene sub-family),
   NOT a new family. Same enforcement hole pattern on unaudited verbs.

2. #186 locked as member of #169/#170 classifier lineage. Framing:
   'system-prompt unknown-option errors still fall through to unknown
   instead of the existing CLI-parse classification path.'

3. agents is the #183 reference implementation. Fix path reframed from
   'design new contract' to 'align outliers to existing reference'.
   Much smaller scope for feat/jobdori-181-error-envelope-contract-drift.

Canonical reference shape locked:
{action: 'help', kind: <verb>, unexpected: <bad-name>, usage: {...}}

Doctrine #24: Pinpoint lineage continuity. Check existing family
before creating new. Reviewers follow pattern lineages.

Family tree corrected: CLI contract hygiene moved from 'NEW' to
'#171 sub-lineage within classifier family'.
2026-04-23 11:03:52 +09:00