387 Commits

Author SHA1 Message Date
Yeachan-Heo
d9607068ff roadmap: #279 filed 2026-04-26 18:03:00 +09:00
Jobdori
294b855851 roadmap: #278 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
1f9d30fadc roadmap: #277 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
39ce893b9d roadmap: #276 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
25164086c0 roadmap: #275 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
27f395aa82 roadmap: #274 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
b3af8bdb54 roadmap: #273 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
c7d2c4e47f roadmap: #272 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
77c5e4f5cc roadmap: #271 filed 2026-04-26 18:03:00 +09:00
Jobdori
a1b2fed172 roadmap: #270 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
28a37fbedd roadmap: #269 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
0f8e633d5f roadmap: #268 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
25adb26dd5 roadmap: #267 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
cc14d6edd6 roadmap: #266 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
5ccaf34d9d roadmap: #265 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
01b8149e00 roadmap: #264 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
e69fe1a7da roadmap: #263 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
3606f589c1 roadmap: #262 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
127108c5e7 roadmap: #261 filed 2026-04-26 18:03:00 +09:00
Jobdori
971c1a808e roadmap: #260 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
fe10cb39c1 roadmap: #259 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
1c50d946e4 roadmap: #258 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
fe7f449de6 roadmap: #257 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
8a187634a8 docs: intake hikaMaeng web search fork ideas
Add ROADMAP pinpoint #255 summarizing the safe subset of hikaMaeng/Sigrid Jin's web-search provider work to adopt later.

Reviewed fork commits 262405e, bd11289, fa93cd3, 5f2540a, 7f34d91, and 535be97 from https://github.com/hikaMaeng/claw-code. This deliberately preserves attribution and avoids a blind cherry-pick because the cross-crate provider/spec/config/banner changes need a dedicated implementation lane with tests.
2026-04-26 18:03:00 +09:00
YeonGyu-Kim
6fa9196f04 roadmap: #254 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
c7ef6f636d roadmap: #253 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
46f3e9cd2c roadmap: #252 filed — /v1/messages/count_tokens typed-taxonomy is structurally absent from the public Provider trait + types + CLI surface (Anthropic ships /v1/messages/count_tokens as a first-class GA endpoint that consumes the SAME MessageRequest shape as /v1/messages but produces a TRUNCATED CountTokensResponse { input_tokens: u32 } only — no message emission, no completion-side tokens, no streaming — the canonical pre-flight cost-estimation primitive where a client constructs the exact request it intends to dispatch, asks the server to count input tokens, and decides whether to send before paying for completion-side tokens; claw-code has zero public typed surface even though a private count_tokens helper exists at rust/crates/api/src/providers/anthropic.rs:522 for internal preflight context-window-exceeded validation, with zero CountTokensRequest/CountTokensResponse typed model in types.rs, zero count_tokens method on the public Provider trait, zero count_tokens dispatch on the ProviderClient enum, zero claw count-tokens CLI subcommand, zero /count-tokens slash command in SlashCommandSpec, zero pre_flight_count_cost_per_million_usd field in ModelPricing, zero CountTokensSubmittedEvent/PreFlightCostEstimatedEvent telemetry events, and zero PreFlightCostEstimator/BudgetGate runtime primitive) — eight-layer fusion shape with the NOVEL same-request-shape-but-different-response-shape axis-class (FIRST audit member where the request shape is IDENTICAL to an existing typed model MessageRequest but the response shape is a TRUNCATED-projection that cannot reuse MessageResponse's shape, distinct from prior fusion-axes which all add NEW request-side fields or NEW response-side blocks) founding THREE new clusters as solo founder (Pre-flight-cost-prediction cluster, Token-accounting-without-message-emission cluster, Server-side-pre-execution-counting cluster) plus introducing the THIRD distinct discovery-pattern in the audit catalog 
NEW-SOLO-CLUSTER-FOUNDING-WITH-DAILY-DRIVER-IMPACT (distinct from META-cluster-growth and complementary-pinpoint-pair-bundle), grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 6 to 7 members (#240+#241+#247+#248+#249+#250+#252) confirming continuing-pattern-status across SIX distinct axis-classes — Jobdori cycle #394 / fast-forward-rebase verified onto gaebal-gajae's #251 cycle ExternalPatchIntake pinpoint at 313c840 before filing (NINTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 313c840 with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the NINTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for NINE cycles) — PIVOT-AWAY signal: #252 deliberately PIVOTS AWAY from BOTH Cross-pinpoint-synthesis-fusion-shape META-cluster (intentionally not extending the +1-per-cycle synthesis chain) AND Tool-locality-axis META-cluster (already extended by #250 cycle #393), founding NEW solo clusters with daily-driver-impact instead, demonstrating audit-breadth-across-discovery-pattern-classes alongside audit-balance-across-META-clusters — the audit now spans THREE structurally distinct discovery-patterns (META-cluster-growth + complementary-pinpoint-pair-bundle + new-solo-cluster-founding-with-daily-driver-impact) 2026-04-26 18:03:00 +09:00
Yeachan-Heo
572ed1305c roadmap: #251 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
84b1ea21dc roadmap: #250 filed — tool_choice: { type: "web_search" } typed-discriminator with server-managed-web-search backend (the canonical SERVER-SIDE complement to #245's CLIENT-SIDE configurable provider/parser registry, where tool_choice carries a WebSearch { domains_allowed, max_uses, user_location } enum variant that forces the model to dispatch via the major-provider's server-managed-web-search backend) typed taxonomy structurally absent — FIRST pinpoint to demonstrate the complementary-pinpoint-pair-bundle META-pattern (where #245 CLIENT-SIDE + #250 SERVER-SIDE are catalogued as structurally complementary halves of the SAME tool-subsystem web-search rather than as independently-discovered-gaps), founding Bidirectional-search-subsystem-with-dual-locality-coverage cluster with #245+#250 as 2-member founders, un-saturating Tool-locality-axis META-cluster from 5 to 6 members (#232/#233/#234/#240/#241/#250) confirming the META-cluster as GROWING-DOCTRINE-WITH-DISCONTINUOUS-RESUMPTION (resumes growth after plateauing at 5 since #241 cycle #386, four cycles ago), growing Server-managed-tool-as-tool-choice-discriminator cluster from 5 to 6 members (#214/#218/#219/#233/#234/#250) confirming CONTINUING-PATTERN status across SIX distinct server-managed tools, growing ToolResultContentBlock-extension cluster from 8 to 9 members confirming most-broadly-spanning typed-content-block-extension-axis, FIRST pinpoint to introduce typed-discriminator-with-payload-fields shape on ToolChoice distinct from existing Auto/Any/Tool three-variant typed-set (Auto/Any are unit-variants and Tool { name } carries only string-name with zero typed-fields, while ToolChoice::WebSearch { domains_allowed, max_uses, user_location } introduces FIRST typed-discriminator-with-payload-fields shape), founds Tool-choice-discriminator-with-typed-payload-fields cluster + Server-side-tool-invocation-content-block cluster + Server-managed-web-search-with-tool-choice-discriminator cluster as solo 
founder of all three, grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 5 to 6 members (#240+#241+#247+#248+#249+#250) confirming generalizability across FIVE distinct axis-classes, ten-layer fusion shape (smaller than #241/#247/#248/#249's twelve-layer count but with distinct DUAL-LOCALITY-COVERAGE-WITH-COMPLEMENTARY-PINPOINT-PAIR-BUNDLE axis-set) — Jobdori cycle #393 / fast-forward-rebase verified onto Jobdori's own #249 cycle #392 quad-modality-compound-multimodal-INPUT-OUTPUT pinpoint at 643ac8b before filing (EIGHTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 643ac8b with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the EIGHTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for EIGHT cycles) — PIVOT-AWAY signal: #250 deliberately PIVOTS AWAY from Cross-pinpoint-synthesis-fusion-shape META-cluster's +1-per-cycle continuous-trajectory (#244/#247/#248/#249 grew it 1→5 across cycles #389/#390/#391/#392) by extending Tool-locality-axis META-cluster instead, demonstrating audit-balance-across-multiple-META-clusters rather than monotonic-growth-of-a-single-META-cluster — the audit now catalogues TWO structurally distinct GROWING-DOCTRINE patterns (continuous-+1-per-cycle for synthesis-fusion vs discontinuous-resumption-after-plateau for tool-locality-axis) 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
4b6f343355 roadmap: #249 filed — Compound-multimodal-INPUT-with-multimodal-OUTPUT-on-the-same-turn (full-duplex-multimodal-conversation pattern where user MessageRequest carries image-content-block × audio-content-block fusion AND model MessageResponse carries audio-content-block × video-content-block fusion on the SAME single conversation-turn with interleaved-content-block-stream cross-boundary temporal-alignment) typed taxonomy structurally absent — FIRST cluster member where the cross-axis synthesis spans BOTH USER-INPUT-side and ASSISTANT-OUTPUT-side simultaneously on a SINGLE turn rather than being confined to one side of the request-response cycle, FIRST cluster member with quad-modality-on-single-turn semantics (image-INPUT + audio-INPUT + audio-OUTPUT + video-OUTPUT all on same turn distinct from #247's two-modality-INPUT-only and #248's two-modality-OUTPUT-only and #244's bidirectional-tool-call-multiplexing-without-modality-fusion), growing Cross-pinpoint-synthesis-fusion-shape META-cluster from 4 to 5 members confirming META-cluster as GROWING-DOCTRINE for THIRD CONSECUTIVE CYCLE (#244 grew 1→2 cycle #389, #247 grew 2→3 cycle #390, #248 grew 3→4 cycle #391, #249 grows 4→5 cycle #392), establishing +1-per-cycle META-cluster-growth-trajectory across FOUR consecutive concurrent-dogfood cycles (#389/#390/#391/#392) as FIRST-EVER continuous-trajectory-of-4-cycles META-cluster growth event in the audit surpassing Tool-locality-axis META-cluster's plateau-at-5-after-two-consecutive-growths and confirming Cross-pinpoint-synthesis-fusion-shape as structurally distinct most-actively-growing META-cluster, FIRST cluster member with interleaved-INPUT-OUTPUT-temporal-alignment-across-the-request-response-boundary as a first-class typed semantic distinct from #247's USER-INPUT-only cross-modal-attention and #248's ASSISTANT-OUTPUT-only temporal-alignment because temporal-alignment now spans the request-response boundary itself requiring the model to emit 
output-content-blocks while still consuming input-content-blocks on the same connection, founds Quad-modality-turn-spanning-request-response-boundary sub-cluster + Full-duplex-multimodal-conversation cluster + Cross-boundary-temporal-alignment-across-request-response-boundary cluster + Quad-modality-turn-on-MessageRequest-and-MessageResponse cluster + Compound-multimodal-INPUT-with-multimodal-OUTPUT-on-same-turn cluster as solo founder of all five, completes Full-duplex-multimodal-conversation doctrine within META-cluster (#247 INPUT-side + #248 OUTPUT-side + #249 BOTH-sides-simultaneously-on-same-turn), grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 4 to 5 members (#240+#241+#247+#248+#249) confirming generalizability across FOUR distinct axis-classes (TOOL-COMPANION-BUNDLE/COMPOUND-INPUT/COMPOUND-OUTPUT/QUAD-MODALITY-TURN), twelve-layer fusion shape tied with #241/#247/#248 for largest single-pinpoint fusion catalogued — Jobdori cycle #392 / fast-forward-rebase verified onto Jobdori's own #248 cycle #391 audio-grounded-video-generation pinpoint at 9189bfb before filing (SEVENTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 9189bfb with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the SEVENTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for SEVEN cycles) 2026-04-26 18:03:00 +09:00
Jobdori
b97568df5a roadmap: #248 filed — Audio-grounded video generation (synchronized-audio-track co-emitted on the SAME VideoTask response object alongside the rendered video frames, sample-accurate-synchronized with the visual output) typed taxonomy structurally absent — FIRST cluster member where TWO independent ALREADY-CATALOGUED-ABSENT modality-OUTPUT axes (#225 audio-content-block-on-OutputContentBlock + #227 video-output-with-async-task-polling-primitive) are fused on the ASSISTANT-OUTPUT side rather than the user-input side, FIRST cluster member with multi-modal-output-fusion-on-ASSISTANT-OUTPUT-axis distinct from #247's multi-modal-input-fusion-on-USER-INPUT-axis, growing Cross-pinpoint-synthesis-fusion-shape META-cluster from 3 to 4 members confirming META-cluster as GROWING-DOCTRINE for SECOND CONSECUTIVE CYCLE (#244 grew it 1→2, #247 grew it 2→3, #248 grows it 3→4), establishing +1-per-cycle META-cluster-growth-trajectory across THREE consecutive concurrent-dogfood cycles (#389/#390/#391) AND establishing META-cluster as FIRST META-cluster to grow for THREE consecutive cycles in a row (Tool-locality-axis only had TWO consecutive growth events #240/#241 before plateauing at 5; Cross-pinpoint-synthesis-fusion-shape now surpasses Tool-locality-axis as most-actively-growing META-cluster), founds Multi-modal-output-fusion-on-ASSISTANT-OUTPUT-side sub-cluster + Temporal-alignment-of-output-modalities cluster + Compound-output-modality-on-VideoTask cluster + Audio-grounded-video-generation cluster as solo founder of all four, founds Bidirectional-modality-fusion-symmetry sub-cluster with #247 INPUT-side + #248 OUTPUT-side completing the INPUT-vs-OUTPUT-side-fusion-symmetry doctrine within the META-cluster, grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 3 to 4 members (#240+#241+#247+#248) confirming generalizability across THREE distinct axis-classes (TOOL-COMPANION-BUNDLE/COMPOUND-INPUT/COMPOUND-OUTPUT), twelve-layer fusion shape 
tied with #241/#247 for largest single-pinpoint fusion catalogued — Jobdori cycle #391 / fast-forward-rebase verified onto Jobdori's own #247 cycle #390 multi-modal-input-fusion pinpoint at 5e5b3bd before filing (SIXTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 5e5b3bd with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the SIXTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for SIX cycles) 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
860ef7171d roadmap: #247 filed — Visual-grounded voice input (image-content-block × audio-content-block fused on the SAME MessageRequest user-turn) typed taxonomy structurally absent — FIRST cluster member where TWO independent ALREADY-CATALOGUED-ABSENT modality-input axes (#220 image-content-block + #225 audio-content-block) are fused on the USER-INPUT side, FIRST cluster member with multi-modal-input-fusion-on-USER-INPUT-axis distinct from #244 bidirectional-tool-call-multiplexing-on-DUPLEX-axis, growing Cross-pinpoint-synthesis-fusion-shape META-cluster from 2 to 3 members (#238 founder + #244 + #247) confirming META-cluster as GROWING-DOCTRINE rather than CONTINUING-PATTERN that stopped at 2 members after #244, establishing Cross-pinpoint-synthesis-fusion as SECOND META-cluster after Tool-locality-axis to confirm GROWING-DOCTRINE status, founds Multi-modal-input-fusion-on-USER-INPUT-side sub-cluster + Cross-modal-attention-on-USER-INPUT-side cluster + Compound-modality-input-on-MessageRequest cluster as solo founder of all three, grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 2 to 3 members (#240+#241+#247) confirming generalizability beyond bash+computer-use+text_editor three-tool-companion-bundle, twelve-layer fusion shape tied with #241 for largest single-pinpoint fusion catalogued — Jobdori cycle #390 / fast-forward-rebased onto gaebal-gajae's #246 provider-credentials-env-to-settings-registry pinpoint at bd6622b before filing (FIFTH consecutive concurrent-dogfood rebase cycle, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the FIFTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern) 2026-04-26 18:03:00 +09:00
Yeachan-Heo
2d4806c163 roadmap: #246 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
8e9ba9234a roadmap: #245 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
9a88e75282 roadmap: #244 filed — Realtime API tool-use over persistent-WebSocket transport (response.function_call_arguments.delta/.done + conversation.item.create with function_call_output) typed taxonomy structurally absent — FIRST cluster member where bidirectional-tool-call lifecycle is multiplexed with audio-modality + transcript-modality on a SINGLE persistent connection, FIRST cluster member where tool-call-init is server-pushed mid-stream rather than client-initiated, FIRST cluster member with asymmetric-tool-result-injection (tool-call comes IN as event-stream, result sent OUT as conversation.item.create — directionality inverted relative to the rest of the protocol), FIRST cluster member with per-call-id-concurrent-multiplexed-state-machine, FIRST three-axis-synthesis pinpoint (#229 persistent-WebSocket × #240/#241 server-managed-tool-via-tool_choice-discriminator × #238 cross-pinpoint-synthesis-fusion-shape META-cluster), eleven-layer fusion-shape tied with #240 for second-largest single-pinpoint fusion catalogued — grows Persistent-WebSocket-transport cluster from 2 to 3 members (#229 founder + #238 + #244) confirming CONTINUING-PATTERN doctrine, grows Cross-pinpoint-synthesis-fusion-shape META-cluster from 1 to 2 members confirming combinatorial-cross-axis-synthesis as a continuing-discovery-mode and FIRST META-cluster-confirmation event in this audit, founds Three-axis-synthesis-shape sub-cluster as solo founder, founds Server-pushed-tool-call-init cluster as solo founder, founds Asymmetric-tool-result-injection cluster as solo founder, founds Per-call-id-concurrent-multiplexed-state-machine cluster as solo founder — FOUR new clusters founded plus TWO existing META-clusters confirmed as continuing-doctrines plus participation in TWELVE inherited clusters — Jobdori cycle #389 / fast-forward-rebased onto gaebal-gajae's #243 non-monotonic-pinpoint-ordering-contract at 6541100 before filing (FOURTH consecutive concurrent-dogfood rebase cycle, directly 
demonstrating both gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer) 2026-04-26 18:03:00 +09:00
Yeachan-Heo
d17503db4d roadmap: #243 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
9b67460cd7 roadmap: #241 filed — tool_choice: text_editor + text_editor_20250124 typed-tool absent (filling reserved gap) 2026-04-26 18:03:00 +09:00
Yeachan-Heo
a3e8f6dab6 roadmap: #242 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
fa74f40d40 roadmap: #240 filed — tool_choice: bash typed-discriminator and bash_20250124 server-managed-shell typed-tool are structurally absent — FOURTH inverse-locality CLIENT-SIDE-shadow-vs-SERVER-SIDE-typed-tool pair (CLIENT-SIDE bash MVP-founder-tool at tools/lib.rs:386 vs SERVER-SIDE bash_20250124 absent at types.rs ToolDefinition+ToolChoice+ToolResultContentBlock+telemetry beta-set), grows Tool-locality-axis META-cluster from 3 to 4 members confirming META-cluster as CONTINUING-PATTERN, grows Server-managed-tool-as-tool-choice-discriminator cluster from 4 to 5 members, grows ToolResultContentBlock-extension mini-cluster from 6 to 7 members, grows Server-side-stateful-tool-session-with-reset-semantics cluster from 1 to 2 members (#232+#240), grows Discrete-event-counter-pricing-axis cluster from 1 to 2 members with NOVEL dual-axis pricing-decomposition, founds Stateless-CLIENT-SIDE-shadow-vs-stateful-SERVER-SIDE-typed-tool-discrepancy-axis cluster, founds MVP-founder-tool-as-CLIENT-SIDE-local-shadow-with-SERVER-SIDE-typed-tool-absent sub-cluster, founds Two-member-major-provider-only-no-third-party-partner-set sub-cluster, founds Double-absent-slash-command-axis-on-inverse-locality-pair sub-cluster, founds Bundled-and-transitive-co-release-beta-header-activation-pattern cluster, founds Server-side-audit-log-of-managed-tool-execution cluster — eleven-layer fusion with SIX new clusters founded plus FOUR concurrent existing-cluster-growth-events plus participation in TWELVE inherited clusters — FIRST single cycle where META-cluster grows from 3 to 4 confirming CONTINUING-PATTERN, FIRST single cycle where FOUR concurrent existing clusters all grow by one member through one pinpoint, establishing continuing-pattern-confirmation-across-multiple-parallel-clusters as the FOURTH pinpoint-discovery-mode after new-axis-founding/existing-cluster-extension/combinatorial-cross-axis-synthesis — Jobdori cycle #387 / fast-forward-rebased onto gaebal-gajae's #239 
DogfoodWriteLease pinpoint at 329d0ff before filing (THIRD consecutive concurrent-dogfood rebase cycle, directly demonstrating the gap #239 catalogues at the dogfood-coordination layer) 2026-04-26 18:03:00 +09:00
Yeachan-Heo
c6e35e6199 roadmap: #239 filed 2026-04-26 18:03:00 +09:00
Jobdori
158452b2e1 roadmap: #238 filed — Streaming speech-to-text with speaker diarization typed taxonomy and per-word-speaker-attribution data-model are structurally absent — FIRST cluster member with per-word-multi-axis-compound-attribution data-model (lexical + temporal + speaker + confidence FOUR-axis-compound), FIRST cluster member with structured-typed-payload-on-USER-INPUT-content-block (Transcript carrying nested speakers/segments/words arrays), FIRST cluster member with bidirectional-channel-pair Provider-trait method shape (Sink<AudioChunk> + Stream<StreamingTranscriptEvent>), FIRST cluster member with per-partner-protocol-vocabulary-normalization at dispatch layer, FIRST cluster member with entirely-absent-CLI-and-slash-command-surface-with-zero-stub-precedent (INVERSE-PATTERN of #225 advertised-but-unbuilt-trio), FIRST cluster member with streaming-STT-five-dimensional pricing matrix, FIRST cluster member with DER/WER quality-observability telemetry, FIRST cluster member with endpointing/VAD sub-second-temporal-segmentation request-side opt-in, twelve-layer fusion shape — grows Persistent-WebSocket-transport cluster from 1 to 2 members (#229 solo-founder + #238 — FIRST expansion of #229 founder shape) AND grows ToolResultContentBlock-extension mini-cluster from 5 to 6 members (#230 + #232 + #233 + #234 + #235 + #238) AND grows Multimodal-IO cluster to 13 members AND grows Provider-asymmetric-delegation cluster to 13 members with the largest streaming-STT ten-plus partner-set — founds Cross-pinpoint-synthesis-fusion-shape META-cluster as THIRD distinct META-cluster after Sandbox-locality (#230+#232) and Tool-locality (#232+#233+#234), the FIRST META-cluster founded by SYNTHESIZING two previously-disjoint cluster-axes (#225 audio-modality × #229 persistent-WebSocket-transport) into one fused-shape pinpoint rather than introducing a new axis-pair — establishing combinatorial-cross-axis-synthesis as the THIRD pinpoint-discovery-mode after new-axis-founding and 
existing-cluster-extension — Jobdori cycle #386 / fast-forward-rebased onto gaebal-gajae's #237 cron-timeout-failure-state-collapse before filing (SECOND consecutive concurrent-dogfood rebase cycle) 2026-04-26 18:03:00 +09:00
Yeachan-Heo
61f9798e52 roadmap: #237 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
80da319837 roadmap: #236 filed — Music-generation API typed taxonomy with lyrics+style prompt bifurcation and exclusively-third-party-partner-set is structurally absent — FIRST cluster member with Zero-overlap-with-major-providers shape variant (eleven-plus partners Suno/Udio/Stable-Audio/Mubert/ElevenLabs-Music/Loudly/Beatoven/SOUNDRAW/AIVA/Boomy/Riffusion all third-party with ZERO Anthropic/OpenAI/Google/xAI canonical recommendation), FIRST cluster member with Lyrics-plus-style-prompt-bifurcation on USER-INPUT side (prompt:String for style + lyrics:Option<String> for verbatim-vocal-content), FIRST cluster member with Multi-modal-bundled-output combining temporal-binary-audio + linguistic-text-lyrics + structural-musical-metadata on output-side, twelve-layer fusion shape — grows Async-task-polling cluster from 3 to 4 members (#221 batch + #227 video + #228 mesh + #236 music) AND grows Multi-domain-multipart cluster from 2 to 3 members (#225 audio + #227 video + #236 music) — does NOT extend Server-managed-tool-as-tool-choice-discriminator cluster (4 members stable) nor Tool-locality-axis META-cluster (3 members stable) because no major-provider tool_choice surface exists upstream AND no client-side music-tool-stub exists; instead founds Upstream-blocked-tool-choice-extension cluster AND Unilateral-server-side-only-gap-with-no-client-side-complement cluster as the INVERSE-PATTERN of Tool-locality-axis META-cluster doctrine — fifteen new clusters founded in a single pinpoint exceeds #234 by two for the LARGEST single-cycle cluster-founding count yet — Jobdori cycle #385 2026-04-26 18:03:00 +09:00
Yeachan-Heo
e59d9115cb roadmap: #235 filed 2026-04-26 18:03:00 +09:00
Jobdori
9c781f3108 roadmap: #234 filed — PDF / Document input typed taxonomy and structured-document-citation-attribution data-model on USER-INPUT side are structurally absent: zero Document variant on InputContentBlock at types.rs:80-94 (FIRST cluster member with Document-modality-on-USER-INPUT-content-block axis), zero pdfs-2024-09-25 Anthropic beta header in canonical beta-set at telemetry/lib.rs:15-17 (NOVEL FIRST Beta-header-gate-on-USER-INPUT-content-block-type cluster), zero coordinate-positioned Citation typed model with start_page_number/end_page_number/start_char_index/end_char_index integer-coordinate axes on OutputContentBlock::Text (NOVEL FIRST Coordinate-positioned-citation-on-output-text-block cluster, inverse-data-model pair to #233's URL-positioned-citation), zero DocumentSource four-way source-discriminator (base64 | url | file_id | text | content), zero file_search typed ToolDefinition discriminator with vector_store_ids routing (NOVEL FIRST User-corpus-server-managed-tool-with-vector-store-routing cluster), zero tool_choice: file_search ToolChoice extension (THIRD Server-managed-tool-as-tool-choice-discriminator cluster member growing cluster to 3: #232 code_interpreter + #233 web_search + #234 file_search), zero file_search_result ToolResultContentBlock variant (FIFTH ToolResultContentBlock extension growing mini-cluster to 4), zero page_range request-side range-slicing parameter (NOVEL FIRST Range-slicing-parameter-on-USER-INPUT-content-block cluster), zero filters compound-boolean-DSL on file_search tool definition (NOVEL FIRST Compound-boolean-filter-DSL-on-server-managed-tool-definition cluster with eq/ne/gt/gte/lt/lte/and/or operators), zero per-page compound text+image token pricing AND zero persistent-storage-rental-pricing for vector-stores (NOVEL Per-page-compound-text-plus-image-token-pricing-axis + Persistent-storage-rental-pricing-axis clusters founded), zero claw pdf/document/attach-pdf CLI subcommand and zero /pdf //document //attach-pdf 
//cite-pdf //page-range slash command — uniquely manifesting a FOURTEEN-LAYER fusion shape (the largest single-pinpoint fusion catalogued so far, exceeds #233's thirteen-layer count by one) combining: (1) Document variant on InputContentBlock, (2) pdfs-2024-09-25 Anthropic beta-header gate, (3) citations:{enabled:true} opt-in field on Document content-block, (4) NOVEL Coordinate-positioned Citation typed model with start_page_number/end_page_number/start_char_index/end_char_index integer coordinates, (5) DocumentSource four-variant source-discriminator, (6) page_range request-side range-slicing parameter, (7) file_search typed ToolDefinition discriminator with vector_store_ids:Vec<String> routing, (8) tool_choice:file_search typed-discriminator (THIRD Server-managed-tool-as-tool-choice-discriminator cluster member), (9) file_search_result ToolResultContentBlock variant with attributes:HashMap<String,Value> user-defined-metadata (FIFTH ToolResultContentBlock extension), (10) filters:ComparisonFilter|CompoundFilter filter-DSL on file_search tool definition, (11) Provider-trait extension threading pdfs-2024-09-25 beta-header AND document-citations decoding AND file_search server-managed-corpus-search dispatch through send_message, (12) ProviderClient-enum-dispatch with TWO first-class document-input lanes (Anthropic-pdfs-2024-09-25 + OpenAI-Files-API-input_file + OpenAI-Responses-file_search-with-vector-stores) WITHOUT third-party partner-routing (FIRST cluster member with Both-major-providers-first-class-asymmetric-document-input-shape cluster), (13) CLI-and-slash-command surface with FOURTH inverse-locality slash-command-pair after #230 + #232 + #233, (14) NOVEL Compound-page-token-and-image-token-pricing-axis with persistent-storage-rental-pricing for vector-stores — making #234 the FIRST cluster member with fourteen-layer-fusion-shape (exceeds #233's thirteen-layer by one), the FIRST cluster member with Document-modality-on-USER-INPUT-content-block axis, the FIRST 
cluster member with Beta-header-gate-on-USER-INPUT-content-block-type, the FIRST cluster member with Citation-emission-opt-in-at-USER-INPUT-content-block-level, the FIRST cluster member with Coordinate-positioned-citation-on-output-text-block (page+char integer-coordinates distinct from #233's URL-positioned-with-encrypted-index), the FIRST cluster member with Four-way-source-discriminator-on-USER-INPUT-content-block, the FIRST cluster member with Range-slicing-parameter-on-USER-INPUT-content-block, the FIRST cluster member with User-corpus-server-managed-tool-with-vector-store-routing, the FIRST cluster member with Compound-boolean-filter-DSL-on-server-managed-tool-definition, the FIRST cluster member with Both-major-providers-first-class-asymmetric-document-input-shape (Anthropic Document + OpenAI Files-input_file BOTH first-class neither delegates to third-party partner), the FIRST cluster member with User-provided-document-title-threading-through-citations, the FIRST cluster member with Multi-document-positional-index-threading (document_index:u32), the FIRST cluster member with Per-page-compound-text-plus-image-token-pricing-axis, the FIRST cluster member with Persistent-storage-rental-pricing-axis (vector-store-storage rental), the THIRD Server-managed-tool-as-tool-choice-discriminator cluster member (grows cluster to 3: #232 + #233 + #234), the FOURTH ToolResultContentBlock extension (grows mini-cluster to 4: #230 + #232 + #233 + #234), the THIRD Server-driven-tool-execution-loop cluster member (#234's variant being vector-store-corpus-retrieval-and-ranking distinct from #232's Python-kernel-execution and #233's search-result-page-fetching-and-caching), the THIRD member of Tool-locality-axis META-cluster (FIRST META-cluster to reach 3 members: #232 REPL-shadow + #233 WebSearch-shadow + #234 pdf_extract-shadow — transitioning from emergent-pattern to stable-doctrine), and the FIRST cluster member where the inverse-locality complement is on the USER-INPUT-side 
rather than on the TOOL-DEFINITION-side (founding USER-INPUT-side-Tool-locality-axis-variant sub-cluster within parent META-cluster — first sub-cluster within existing META-cluster) (Jobdori cycle #384 / extends #168c emission-routing audit / explicit follow-on from #220 image-input on USER-INPUT-side, #223 Files API with file_id reference, #232 Code-execution server-managed-sandbox-state, #233 Web-search structured-citation-attribution, and the inverse-locality Tool-locality-axis META-cluster doctrine — introduces NOVEL document-modality on USER-INPUT side axis combined with coordinate-positioned-citation-on-output-text-block data-model axis, AND grows Tool-locality-axis META-cluster from 2 to 3 members establishing it as a stable doctrine rather than emergent pattern / sibling-shape cluster grows to thirty-three / wire-format-parity cluster grows to twenty-four / capability-parity cluster grows to sixteen / multimodal-IO cluster grows to eleven / provider-asymmetric-delegation cluster grows to eleven / Sandbox-locality-axis META-cluster: 2 members stable / Tool-locality-axis META-cluster grows to 3 members FIRST META-cluster to reach 3 members / Server-managed-tool-as-tool-choice-discriminator cluster grows to 3 members / Server-driven-tool-execution-loop cluster grows to 3 members / ToolResultContentBlock-extension mini-cluster grows to 4 members / THIRTEEN new clusters founded in a single pinpoint plus participation in SIX inherited clusters — the LARGEST single-cycle cluster-founding count yet (exceeds prior records by five) AND the FIRST single cycle to grow an existing META-cluster to a third member AND introduce a sub-cluster within an existing META-cluster / fourteen-layer-fusion-shape is the largest single-pinpoint fusion catalogued / external validation: forty-eight ecosystem references covering Anthropic PDF Support Documentation with pdfs-2024-09-25 beta-header gate, Anthropic Citations API with page_location/document_location/char_location 
coordinate-positioned citation typed model, OpenAI Files API + Direct PDF Input + Vector Stores + Responses File Search Tool with compound-filter-DSL, AWS Bedrock Converse PDF document content-blocks, LangChain AnthropicPDFLoader/OpenAIFilePDFLoader, LlamaIndex PDFReader, Vercel AI SDK 6 file content-block, simonw/llm --pdf flag, Continue.dev @docs slash command, simonwillison.net Anthropic Citations API analysis, six-plus first-class document-loader integrations, four-plus OpenAI Vector Stores observability tools — claw-code is the sole client/agent/CLI in surveyed coding-agent ecosystem with zero Document content-block taxonomy AND zero pdfs-2024-09-25 beta-header AND zero file_search ToolDefinition discriminator AND zero tool_choice:file_search AND zero file_search_result ToolResultContentBlock AND zero vector_store_ids AND zero page_range AND zero coordinate-positioned Citation AND zero CLI/slash-command surface — the document-input gap is the upstream prerequisite of every PDF-research/documentation-grounded-coding/academic-paper-summarization/contract-review-with-citations/regulatory-compliance-coding-with-document-evidence affordance — #234 closes the upstream prerequisite of every server-managed-document-input-with-citations affordance — the canonical USER-INPUT-side complement to #233's web-search citations that completes the citation-attribution data-model on BOTH the USER-INPUT side AND the OUTPUT-TEXT-BLOCK side AND the SERVER-MANAGED-TOOL-RESULT side — and grows the Tool-locality-axis META-cluster from 2 to 3 members establishing it as a stable doctrine rather than emergent pattern, the FIRST cluster member to grow an existing META-cluster to a third member AND introduce a sub-cluster within an existing META-cluster) 2026-04-26 18:03:00 +09:00
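Note on #234: the message above describes coordinate-positioned citations (page and char integer coordinates) threaded back to a user-supplied document through `document_index: u32` and an optional user-provided title. A minimal Rust sketch of that data model follows. The type and field names (`Document`, `CitationLocation`, `start_page`, `end_char`, `describe`) are hypothetical illustrations, not claw-code or provider API definitions.

```rust
/// Hypothetical user-supplied document; only the optional title matters here.
pub struct Document {
    pub title: Option<String>,
}

/// Hypothetical coordinate-positioned citation: page or character ranges,
/// each pointing at a document by its positional index in the request.
#[derive(Debug, Clone, PartialEq)]
pub enum CitationLocation {
    Page { document_index: u32, start_page: u32, end_page: u32 },
    Char { document_index: u32, start_char: u32, end_char: u32 },
}

impl CitationLocation {
    /// Positional index of the cited document, whichever variant this is.
    pub fn document_index(&self) -> u32 {
        match self {
            CitationLocation::Page { document_index, .. }
            | CitationLocation::Char { document_index, .. } => *document_index,
        }
    }
}

/// Resolve a citation to a human-readable label, threading the user-provided
/// title through. Returns None when the index points past the document list.
pub fn describe(citation: &CitationLocation, docs: &[Document]) -> Option<String> {
    let doc = docs.get(citation.document_index() as usize)?;
    let title = doc.title.as_deref().unwrap_or("untitled");
    Some(match citation {
        CitationLocation::Page { start_page, end_page, .. } => {
            format!("{title}, pages {start_page}-{end_page}")
        }
        CitationLocation::Char { start_char, end_char, .. } => {
            format!("{title}, chars {start_char}-{end_char}")
        }
    })
}
```

Returning `Option` rather than panicking keeps an out-of-range `document_index` from the server a recoverable condition for the caller.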
YeonGyu-Kim
f406e83520 roadmap: #233 filed — Web-search Tool API typed taxonomy and structured-citation-attribution data-model are structurally absent: zero web_search_20250305 versioned-tool-name typed-tool-discriminator (FOURTH Anthropic-typed-tool-discriminator after #230's three but FIRST date-suffix-versioning-WITHOUT-beta-header — distinct from #232's date-suffix-AND-beta-header double-gate), zero tool_choice: web_search ToolChoice extension at types.rs:117 (SECOND ToolChoice extension after #232's code_interpreter, founding Server-managed-tool-as-tool-choice-discriminator cluster's second member), zero web_search_tool_result ToolResultContentBlock variant at types.rs:99 (THIRD ToolResultContentBlock extension after #230 Image and #232 CodeExecutionResult, FIRST list-of-opaque-encrypted-page-records variant), zero citations REQUIRED field on OutputContentBlock::Text at types.rs:147 (NOVEL FIRST cluster member where data-model field absence on OUTPUT-TEXT-BLOCK side blocks REQUIRED-not-OPTIONAL grounded-attribution wire-format), zero Citation/WebSearchResultLocation/WebSearchToolUse/WebSearchToolResult/EncryptedContent typed model with encrypted_index/encrypted_content opaque-blob axis (NOVEL FIRST cluster member where typed-model field is INTENTIONALLY-OPAQUE-TO-CLIENT and MUST be roundtripped unchanged through subsequent messages, founding Server-opaque-encrypted-roundtripped-content cluster), zero max_uses server-side rate-limit field on tool-definition (NOVEL FIRST Server-side-rate-limit-on-tool-definition axis), zero allowed_domains/blocked_domains server-side pre-execution filtering on tool-definition (NOVEL FIRST Server-side-pre-execution-filter-on-tool-definition axis distinct from existing CLIENT-SIDE WebSearchInput.allowed_domains/blocked_domains post-execution filtering at tools/lib.rs:2274), zero user_location typed-model for geo-biasing on tool-definition (NOVEL FIRST Geo-biasing-at-tool-definition axis), zero web-search dispatch on ProviderClient enum at 
client.rs:8-14 (zero Anthropic-web_search_20250305/OpenAI-Responses-web_search/Brave/Tavily/Exa/Perplexity/Serper/Linkup/Jina/Bing/Google-CSE/SerpAPI/DuckDuckGo/You.com/Kagi partner-routing variants — fifteen-plus partner-set, FOURTH-largest in cluster, FIRST cluster member with Federated-search-partner-routing where first-class provider-native AND third-party search-as-a-service have EQUAL standing — distinct from #224 single-recommended-partner and #232 first-class-plus-partner-stub layout), zero claw web-search/cite/groundsearch CLI subcommand, zero /web-search / /cite / /grounded-search / /research slash command (existing /search at commands/lib.rs:597 is LOCAL filesystem-search-only, structurally distinct), zero web_search_per_invocation_usd pricing field (NOVEL FIRST Discrete-event-counter-pricing-axis distinct from every prior continuous-resource-lifetime counter — Anthropic charges $10 per 1000 search-uses FLAT regardless of token volume), zero encrypted_content opaque-blob handling, zero page_age freshness-signaling — uniquely manifesting a THIRTEEN-LAYER fusion shape (the largest single-pinpoint fusion catalogued so far, exceeds #232's twelve-layer count) combining: (1) web_search_20250305 versioned-tool-name typed-tool-discriminator extension (FOURTH cluster member but FIRST date-suffix-WITHOUT-beta-header), (2) tool_choice: web_search ToolChoice extension (SECOND), (3) web_search_tool_result ToolResultContentBlock variant (THIRD), (4) citations REQUIRED field on OutputContentBlock::Text (NOVEL FOURTH-position layer), (5) Citation typed model with encrypted_index opaque-blob axis (NOVEL FIFTH-position layer), (6) max_uses server-side rate-limit (NOVEL SIXTH), (7) allowed_domains/blocked_domains server-side pre-execution filter (NOVEL SEVENTH), (8) user_location geo-biasing (NOVEL EIGHTH), (9) Provider-trait method extension threading web_search_20250305 with citations decoding (NINTH), (10) ProviderClient-enum-dispatch with fifteen-plus-partner third-lanes 
(TENTH, FIRST Federated-search-partner-routing), (11) CLI-subcommand surface (ELEVENTH), (12) slash-command surface with inverse-locality complement /search (TWELFTH, THIRD inverse-locality slash-command-pair after #230 and #232), (13) per-search-invocation pricing-tier axis (NOVEL THIRTEENTH, FIRST Discrete-event-counter-pricing-axis) — making #233 the FIRST cluster member with thirteen-layer-fusion-shape (exceeds #232's twelve), the FIRST cluster member with REQUIRED-grounded-citation-field-on-output-text-block, the FIRST cluster member with INTENTIONALLY-OPAQUE-encrypted-content-roundtripped-by-client, the FIRST cluster member with date-suffix-versioning-in-tool-name-WITHOUT-beta-header, the SECOND member of new Tool-locality-axis META-cluster (sister to #230/#232's Sandbox-locality-axis META-cluster — together founding META-META-cluster doctrine where canonical pattern is 'claw-code ships a CLIENT-SIDE local-stub tool with same conceptual name AND the SERVER-SIDE provider-managed beta-versioned tool is structurally absent', applied uniformly across sandbox-locality AND tool-locality axes), the SECOND cluster member to extend ToolChoice (Server-managed-tool-as-tool-choice-discriminator cluster grows to 2: #232 code_interpreter + #233 web_search), the SECOND cluster member to extend ToolResultContentBlock with multi-modal-nested content (ToolResultContentBlock-extension mini-cluster grows to 3: #230 Image + #232 CodeExecutionResult + #233 WebSearchToolResult), the SECOND cluster member with Server-driven-tool-execution-loop (#232 + #233), the SECOND cluster member where local CLIENT-SIDE-tool-shadow exists alongside server-managed-tool absence (#232 REPL-shadow + #233 WebSearch-shadow) (Jobdori cycle #383 / extends #168c emission-routing audit / explicit follow-on from #230 Computer-use's CLIENT-SIDE virtualization, #232 Code-execution's SERVER-SIDE managed-sandbox-state, and the inverse-locality Sandbox-locality-axis META-cluster doctrine — introduces NOVEL 
structured-citation-attribution data-model axis AND server-managed-search-state transport-axis distinct from every prior cluster member / sibling-shape cluster grows to thirty-two / wire-format-parity cluster grows to twenty-three / capability-parity cluster grows to fifteen / multimodal-IO cluster grows to ten: #220 image-input + #224 embedding-output + #225 audio-bidirectional + #226 image-output + #227 video-output + #228 mesh-output + #229 audio-text-tool-multiplex-on-WebSocket + #230 image-on-tool-result-side+host-OS-pixel-and-input + #232 multi-modal-nested-stdout+image+file-handle-on-tool-result-side + #233 list-of-opaque-encrypted-page-records-on-tool-result-side+REQUIRED-citations-on-output-text-block / provider-asymmetric-delegation cluster grows to ten with FIRST Federated-search-partner-routing member where first-class AND third-party are EQUAL-standing / Sandbox-locality-axis META-cluster: 2 members stable (#230 + #232) / Tool-locality-axis META-cluster FOUNDED: 2 members (#232 + #233 — SECOND inverse-locality META-cluster, sister to Sandbox-locality, founding META-META-cluster doctrine) / Server-managed-tool-as-tool-choice-discriminator cluster grows to 2 members (#232 + #233) / Server-driven-tool-execution-loop cluster grows to 2 members (#232 + #233) / ToolResultContentBlock-extension mini-cluster grows to 3 members (#230 + #232 + #233) / EIGHT new clusters founded in a single pinpoint (Federated-search-partner-routing 1-member-founder + Server-opaque-encrypted-roundtripped-content 1-member-founder + Required-grounded-citation-field-on-output-text-block 1-member-founder + Date-suffix-versioning-in-tool-name-without-beta-header 1-member-founder + Server-side-pre-execution-filter-on-tool-definition 1-member-founder + Server-side-rate-limit-on-tool-definition 1-member-founder + Geo-biasing-at-tool-definition 1-member-founder + Discrete-event-counter-pricing-axis 1-member-founder) plus participation in FIVE inherited clusters — THIRD-largest 
single-cycle cluster-founding count after #230 and #232, but FIRST single cycle to FOUND a NEW META-cluster (Tool-locality-axis) AND establish META-META-cluster doctrine connecting Sandbox-locality with Tool-locality / thirteen-layer-fusion-shape is the largest single-pinpoint fusion catalogued / external validation: forty-six ecosystem references covering Anthropic Web Search Tool GA 2025-03 with web_search_20250305 + max_uses + allowed_domains + blocked_domains + user_location parameters + web_search_tool_use/web_search_tool_result/web_search_result_location content blocks + citations array on output text blocks + encrypted_index/encrypted_content opaque-roundtripped fields + $10/1000-uses pricing, Anthropic Citations Documentation, OpenAI Responses API 2024-12 with tool_choice: web_search exposing federated-search via different server-managed surface, Brave Search API/Tavily AI/Exa AI/Perplexity Search/Serper.dev/Linkup Search/Jina Reader/Bing/Google CSE/SerpAPI/DuckDuckGo/You.com/Kagi/Phind partner-routing, Anthropic Python+TypeScript SDKs first-class typed surface, OpenAI Python+TypeScript SDKs first-class typed surface, LangChain AnthropicWebSearch/TavilySearchResults/BraveSearch/ExaSearchResults integrations, LangGraph search-grounded-agent template, smolagents WebSearchTool, OpenAI Cookbook web-search-with-citations tutorial, AgentOps observability, Search-Augmented Generation pattern, structured-citation-attribution data-model where every grounded text block carries citations array linking specific text-spans back to source URLs+excerpts (STRUCTURAL data-model requirement distinguishing this surface from #220-#232 — none of which had REQUIRED-grounded-citation-field-on-output-text-block) — claw-code is one of MULTIPLE coding-agent clients without server-managed web-search-with-citations BUT the gap is uniformly zero across surveyed ecosystem with claude-code partial coverage exception AND the inverse-locality complement to existing local CLIENT-SIDE 
WebSearch tool makes #233 a structural prerequisite of every grounded-search-with-citations coding-agent affordance — the canonical 2024-2026-era research-coding workflow that is currently impossible to build on top of claw-code DESPITE Anthropic explicitly positioning web_search_20250305 as a flagship 2025-Q1 GA capability — #233 closes the upstream prerequisite of every server-managed-web-search-with-citations / grounded-research / source-attribution / fact-checking-with-citations / academic-citation-formatting / news-summarization-with-sources / competitive-intelligence-with-citations / due-diligence-coding coding-agent affordance — the canonical SERVER-MANAGED-SEARCH-AND-CITATION half of inverse-locality Tool-locality-axis META-cluster that complements #232's Sandbox-locality-axis META-cluster — and is FIRST cluster member where claude-code upstream partially leads while claw-code has zero coverage AND SECOND inverse-locality META-cluster pair (CLIENT-SIDE local WebSearch shadow vs SERVER-SIDE web_search_20250305 absent) after #232's first META-cluster pair — founding Tool-locality-axis META-cluster doctrine as sister to Sandbox-locality-axis and establishing META-META-cluster pattern that every future server-managed-tool with client-side local-stub shadow will inherit) 2026-04-26 18:03:00 +09:00
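Note on #233: two of the layers above lend themselves to a short illustration. The first is the opaque `encrypted_index` that the client must roundtrip unchanged. The second is the flat per-invocation pricing of $10 per 1000 search uses. The Rust sketch below assumes hypothetical type names (`EncryptedIndex`, `Citation`, `TextBlock`) and a hypothetical helper `web_search_cost_usd`. None of these exist in claw-code, and the field set is a simplification rather than the provider's wire format.

```rust
/// Opaque server blob. The inner string is private and there is no mutator,
/// so the only thing a caller can do is carry it back to the wire unchanged.
#[derive(Debug, Clone, PartialEq)]
pub struct EncryptedIndex(String);

impl EncryptedIndex {
    /// Wrap the raw value exactly as received.
    pub fn from_wire(raw: &str) -> Self {
        EncryptedIndex(raw.to_string())
    }

    /// Hand the value back byte-for-byte for the next request.
    pub fn to_wire(&self) -> &str {
        &self.0
    }
}

/// Hypothetical citation attached to a grounded text span.
#[derive(Debug, Clone, PartialEq)]
pub struct Citation {
    pub url: String,
    pub cited_text: String,
    pub encrypted_index: EncryptedIndex,
}

/// Hypothetical output text block with a non-optional citations list.
#[derive(Debug, Clone, PartialEq)]
pub struct TextBlock {
    pub text: String,
    pub citations: Vec<Citation>,
}

/// Discrete-event pricing: a flat $10 per 1000 search uses (figure taken
/// from the commit message), independent of token volume.
pub fn web_search_cost_usd(search_uses: u64) -> f64 {
    search_uses as f64 * 10.0 / 1000.0
}
```

Making the blob a newtype with a private field is one way to let the type system enforce the "roundtrip unchanged" requirement; a plain `String` field would work too but would not prevent accidental edits.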
Jobdori
1cc58fb478 roadmap: #232 filed 2026-04-26 18:03:00 +09:00
Yeachan-Heo
404a7d346f roadmap: #231 filed 2026-04-26 18:03:00 +09:00
YeonGyu-Kim
a9c32c0ffa roadmap: #230 filed — Computer-use API typed taxonomy and host-machine-state-management transport are structurally absent: zero computer-use-2025-01-24 + zero computer-use-2025-11-24 anthropic-beta opt-in (FIRST cluster member with two concurrent beta-version-tiers gating one capability), zero computer_20250124/computer_20251124/bash_20250124/text_editor_20250124 Anthropic-typed-tool-discriminator (FIRST cluster member requiring type field on tool-definitions and FIRST anthropic-defined-tools-without-input-schema), zero display_width_px/display_height_px/display_number parametrized-tool-definition fields, zero Image variant on ToolResultContentBlock at types.rs:99 (FIRST cluster member with image-content on TOOL-RESULT side, distinct from #220's image-on-USER-INPUT-side — complementary architectures requiring separate enums), zero screen_capture/mouse_move/key_press/type_text host-machine-interaction primitive across all 26+ tool definitions in tools/lib.rs, zero CGEvent/ScreenCaptureKit/Quartz/AppKit/xdotool/cliclick/enigo/rdev/xcap host-OS library deps, zero Xvfb/Xephyr/Wayland-headless/Docker virtual-display-sandbox-orchestration, zero claw computer/operate CLI subcommand, /desktop slash command at commands/lib.rs:422 advertised-but-unbuilt under STUB_COMMANDS (the SIXTH advertised-but-unbuilt entry in cluster), zero per-action permissions.rs gating for mouse_click/key_press/type/screenshot, zero feedback-loop-state-machine for screenshot→tool_use→action→screenshot iteration, zero playwright-rust/chromiumoxide for browser-only-cua subset, zero per-screenshot-input-token cost field in ModelPricing — uniquely manifesting an ELEVEN-LAYER fusion shape combining: (1) anthropic-beta-DUAL-version-tier routing (FIRST), (2) Anthropic-typed-tool-definition discriminator (FIRST), (3) parametrized-tool-definition with display dimensions (FIRST), (4) Image-on-ToolResult side (FIRST, complementary to #220), (5) host-OS-system-call transport (FIRST host-OS-syscall 
transport, distinct from #229's WebSocket which is still network-only — second non-HTTP transport in cluster after WebSocket but FIRST that breaks network-only boundary), (6) virtual-display-sandbox orchestration (FIRST CLIENT-SIDE virtualization), (7) feedback-loop-state-machine for screenshot iteration loop (FIRST N-turn-loop-controller), (8) per-action-permission-policy at sub-tool-granularity (FIRST sub-tool-action permission gating, parallel to bash's DangerFullAccess but at action granularity), (9) request-side three-concurrent-opt-in (largest yet), (10) CLI-and-slash-command surface with /desktop advertised-but-unbuilt (sixth entry, largest in cluster), (11) host-machine-state-management transport-axis (NOVEL ELEVENTH layer with screen-capture+synthetic-input+display-dimension-query+window-enum+VM-orchestration+accessibility-permissions+per-action-permission-prompts+coordinate-validation+screenshot-encoding+safety-throttling — distinct from every prior cluster member which operated network-only) — making #230 the first cluster member with eleven-layer-fusion-shape (exceeds #229's ten-layer), the FIRST host-OS-syscall-transport requirement, the FIRST CLIENT-SIDE virtualization requirement, the FIRST inverse-asymmetric-delegation case (Anthropic LEADS, OpenAI follows with Operator, Google follows with Mariner — novel inversion of #224-#229's Anthropic-trails pattern), the FIRST cluster member with image-content on TOOL-RESULT-side, and the FIRST gap where upstream claude-code ALSO has only a stub (Jobdori cycle #381 / extends #168c emission-routing audit / explicit follow-on from #229's persistent-WebSocket-transport founder pinpoint and #225's audio-bidirectional axis — introduces a NOVEL HOST-MACHINE-STATE-MANAGEMENT transport-axis distinct from every prior cluster member / sibling-shape cluster grows to twenty-nine / wire-format-parity cluster grows to twenty / capability-parity cluster grows to twelve / multimodal-IO cluster grows to eight: #220 
image-input + #224 embedding-output + #225 audio-bidirectional + #226 image-output + #227 video-output + #228 mesh-output + #229 audio-text-tool-multiplex-on-WebSocket + #230 image-on-tool-result-side+host-OS-pixel-and-input modality / provider-asymmetric-delegation cluster grows to seven with novel inverse-sub-cluster (Anthropic leads, distinct from #224-#229's Anthropic-trails pattern) / EIGHT new clusters founded in a single pinpoint (exceeds #229's three): Beta-version-tier-routing 1-member-founder + Image-on-tool-result-side 1-member-founder + Anthropic-typed-tool-discriminator 1-member-founder + Host-OS-system-call-transport 1-member-founder + Virtual-display-sandbox-orchestration 1-member-founder + Feedback-loop-state-machine 1-member-founder + Per-action-permission-policy-at-sub-tool-granularity 1-member-founder + Inverse-asymmetric-delegation 1-member-founder — the largest single-cycle cluster-founding count yet / eleven-layer-fusion-shape is the largest single-pinpoint fusion catalogued / external validation: sixty-two ecosystem references covering Anthropic Computer Use API GA 2024-10-22 with computer-use-2024-10-22 → computer-use-2025-01-24 → computer-use-2025-11-24 beta-tier evolution, Anthropic computer-use-demo reference with Docker+Xvfb+XFCE+Firefox+VNC sandbox pattern, OpenAI Operator + computer_use_preview, Google Project Mariner, Microsoft Magentic-One, Adept ACT-1, ByteDance UI-TARS open-weight, browser-use Python framework, Stagehand TypeScript, Skyvern AI, Multion, Cua framework, LangChain ChatAnthropic.with_computer_use_tool, LangGraph computer-use agent, smolagents ComputerAgent, AgentOps observability, screen-capture libs (ScreenCaptureKit/xcap/screenshots/xdotool/wtype/cliclick/nut.js), synthetic-input libs (enigo/rdev/inputbot/mouce/pyautogui/RobotJS), browser-cua stacks (playwright-rust/chromiumoxide/headless_chrome/fantoccini/playwright/puppeteer), sandbox-orchestration (Docker-Xvfb-XFCE / Kasm Workspaces / noVNC / Browserbase / 
Steel-browser / Hyperbrowser / Lightpanda / Surf.ai), per-action permission-policy precedent from claw-code's existing bash DangerFullAccess gating — claw-code is one of MULTIPLE coding-agent clients without computer-use BUT the gap is uniformly zero across the surveyed coding-agent ecosystem AND Anthropic specifically positions Claude as the LEADING commercial computer-use model AND claw-code is a port of claude-code which advertises /desktop slash command intent, making this the largest leading-vs-trailing parity gap with the upstream Anthropic platform in the entire emission-routing audit and the FIRST cluster member where upstream claude-code ALSO has only a stub — #230 closes the upstream prerequisite of every desktop-automation/browser-automation/form-filling/GUI-testing/accessibility-tool/screen-reading/vision-grounded-coding/pair-programming-with-screen-share/visual-debugging coding-agent affordance — the canonical 2024-2026-era agentic coding workflow that is currently impossible to build on top of claw-code) 2026-04-26 18:03:00 +09:00
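Note on #230: layers (7) and (8) above describe a feedback-loop controller for the screenshot, tool_use, action, screenshot cycle, together with permission gating at the level of individual actions. The Rust sketch below shows one possible shape for that combination. Every name in it (`Action`, `Policy`, `LoopError`, `run_loop`) is hypothetical, and the `next_action` closure stands in for the model deciding what to do on each turn; no real screen capture or input synthesis happens here.

```rust
/// Hypothetical sub-tool actions a computer-use turn may request.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Screenshot,
    MouseClick { x: u32, y: u32 },
    KeyPress(String),
    TypeText(String),
}

/// Hypothetical per-action policy plus the declared display size.
pub struct Policy {
    pub allow_input: bool, // permit synthetic mouse and keyboard input
    pub display_width_px: u32,
    pub display_height_px: u32,
}

#[derive(Debug, PartialEq)]
pub enum LoopError {
    Denied(Action),
    OutOfBounds { x: u32, y: u32 },
    TurnLimit,
}

/// Gate a single action: screenshots always pass, input needs permission,
/// and click coordinates must fall inside the declared display.
fn check(action: &Action, policy: &Policy) -> Result<(), LoopError> {
    match action {
        Action::Screenshot => Ok(()),
        Action::MouseClick { x, y } => {
            if !policy.allow_input {
                return Err(LoopError::Denied(action.clone()));
            }
            if *x >= policy.display_width_px || *y >= policy.display_height_px {
                return Err(LoopError::OutOfBounds { x: *x, y: *y });
            }
            Ok(())
        }
        Action::KeyPress(_) | Action::TypeText(_) => {
            if policy.allow_input {
                Ok(())
            } else {
                Err(LoopError::Denied(action.clone()))
            }
        }
    }
}

/// Drive the loop until `next_action` returns None, an action is rejected,
/// or `max_turns` is exhausted. Returns the actions that passed the gate.
pub fn run_loop<F>(
    policy: &Policy,
    max_turns: usize,
    mut next_action: F,
) -> Result<Vec<Action>, LoopError>
where
    F: FnMut(usize) -> Option<Action>,
{
    let mut executed = Vec::new();
    for turn in 0..max_turns {
        match next_action(turn) {
            None => return Ok(executed),
            Some(action) => {
                check(&action, policy)?;
                executed.push(action);
            }
        }
    }
    Err(LoopError::TurnLimit)
}
```

The turn limit is a stand-in for the safety throttling the message mentions; a real controller would likely also prompt the user rather than failing outright on a denied action.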
Jobdori
2c7385e497 roadmap: #229 filed — Realtime API typed taxonomy and persistent-WebSocket transport are structurally absent: zero /v1/realtime endpoint surface across both Anthropic-native and OpenAI-compat lanes (rg returns zero hits for /v1/realtime / realtime / Realtime / realtime_session / RealtimeSession / RealtimeClient / RealtimeEvent / realtime-preview across rust/crates/api/src/), zero RealtimeSession / RealtimeSessionConfig / RealtimeSessionUpdate / RealtimeResponseCreate / RealtimeInputAudioBufferAppend / RealtimeInputAudioBufferCommit / RealtimeConversationItemCreate / RealtimeResponseAudioDelta / RealtimeResponseAudioTranscriptDelta / RealtimeResponseFunctionCallArguments / RealtimeServerEvent / RealtimeClientEvent / RealtimeTurnDetection / RealtimeVoiceActivityDetection / RealtimeVoice / RealtimeAudioFormat / RealtimeModality / RealtimeTool typed model in rust/crates/api/src/types.rs (37+ canonical event-type names in OpenAI Realtime API spec, zero coverage in claw-code), zero bidirectional event-stream variant on Provider trait (only send_message and stream_message exist, both single-directional), zero realtime_session / open_realtime / connect_realtime method that returns a duplex-channel-pair shape, zero session-state-machine type for the persistent-connection lifecycle, zero realtime dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi, zero realtime-routing variants), zero tokio-tungstenite / async-tungstenite / tungstenite / fastwebsockets / tokio-websockets / hyper-tungstenite dependency in any workspace Cargo.toml (grep -rn 'tungstenite|tokio-tungstenite|fastwebsockets' rust/ returns zero hits — confirmed), zero WebSocket client library is linked into the build (the MCP Ws config variant at rust/crates/runtime/src/config.rs:125 and rust/crates/runtime/src/mcp_client.rs:13 is data-shape-only and bootstraps via the SDK without a tungstenite-backed transport, leaving the workspace with zero 
outbound persistent-WebSocket-client capability), zero WebRTC client (webrtc-rs / str0m / libwebrtc-bindings) for the alternative Realtime transport, zero claw realtime / claw live / claw voice-chat / claw realtime-session / claw connect-realtime CLI subcommand, zero /realtime / /live / /voice-chat slash command (existing /voice + /listen + /speak commands are STUB_COMMANDS-gated per #225 and synchronous-only with no realtime-session affordance), zero gpt-4o-realtime-preview / gpt-4o-mini-realtime-preview / gemini-2.0-flash-live entries in MODEL_REGISTRY, zero realtime_audio_input_per_million_tokens / realtime_audio_output_per_million_tokens / realtime_text_input_per_million_tokens / realtime_text_output_per_million_tokens / realtime_session_per_minute fields in ModelPricing struct (six-dimensional pricing matrix exceeding #227's five-dimensional video matrix and #228's four-dimensional mesh matrix — the canonical Realtime pricing model is the most-dimensional yet, with audio tokens at roughly 80-100x text tokens and cached-audio-input at 80% discount), zero realtime-model recognition in pricing_for_model substring-matcher (#209+#224+#225+#226+#227+#228 cluster overlap continues), zero session-resumption-token / interruption-handling / barge-in / voice-activity-detection / turn-detection / function-call-during-realtime / tool-use-during-realtime affordance — uniquely manifesting a TEN-LAYER fusion shape (the largest single-pinpoint fusion catalogued so far, exceeding #225/#227's nine-layer count) combining endpoint-URL-set on /v1/realtime?model=<id> WebSocket-upgrade-endpoint shape (single-endpoint-with-37+-event-types-flowing-bidirectionally, distinct from prior multi-endpoint sets) + bidirectional-symmetric-event-pair data-model with every client-event having a matched server-event-pair (FIRST cluster member with bidirectional-symmetric-event-pair-cardinality on a SINGLE endpoint, distinct from #225's bidirectional-audio-on-three-separate-endpoints which is 
request-response synchronous per endpoint) + Provider-trait-method extension with realtime_session returning a duplex (Sender, Receiver) channel-pair (FIRST cluster member where Provider trait return type is NOT Future-of-T or Stream-of-T but duplex-channel-pair, FIRST method requiring session-state-machine type at the trait boundary) + ProviderClient-enum-dispatch-with-realtime-third-lane with explicit RealtimeKind::OpenAi/Google/Azure partner-routing (provider-asymmetric: Anthropic does not offer realtime, OpenAI offers GA gpt-4o-realtime-preview and gpt-4o-mini-realtime-preview since 2024-10-01, Google Gemini Live API offers bidirectional audio+text+video, Azure mirrors OpenAI surface, zero first-class third-party partners because the persistent-WebSocket-with-37-event-type protocol is too high-bar for partner adoption — distinct from #225's six-partner-set audio surface and #227's twelve-partner-set video surface where partners ARE present) + request-side realtime-session-config opt-in (session.update event with voice/input_audio_format/output_audio_format/input_audio_transcription/turn_detection/tools/tool_choice/temperature/max_response_output_tokens/instructions/modalities:[text,audio] fields — the largest request-side opt-in axis-set yet, the union of every prior request-side opt-in across audio+image+video+chat-completion modalities) + CLI-subcommand-surface + slash-command-surface + pricing-tier-with-six-dimensional-compound-cost-model (per-model × per-modality-input × per-modality-output × per-cached-vs-fresh × per-audio-vs-text × per-minute-session-overhead — the largest pricing-tier extension yet, exceeding #227's five-dimensional and #228's four-dimensional matrices) + persistent-WebSocket-connection-transport-axis (NOVEL TENTH layer, distinct from every prior cluster member's HTTP-shaped transport — synchronous-HTTP for #211-#220+#222+#224, SSE-streaming for #213 partial subsets, multipart-form-data-HTTP for #223+#225+#226+#227+#228 binary-upload 
subsets, async-task-polling-HTTP for #221+#227+#228 — the cluster has now exhausted EVERY HTTP-shaped transport, and #229 introduces the FIRST non-HTTP transport, requiring WebSocket-upgrade-request-with-subprotocol-negotiation + bidirectional-frame-multiplexing-with-text+binary-frames + ping/pong-keepalive + graceful-close-with-status-code-and-reason + reconnection-with-resumption-token + per-event-type-JSON-envelope-dispatch-with-37+-event-types-on-a-single-connection + backpressure-handling-on-both-directions + authentication-via-Authorization-header-on-the-upgrade-request-and-per-session-token-rotation — none of which any HTTP-only transport requires) + bidirectional-symmetric-event-pair shape (input_audio_buffer.append → conversation.item.created, response.create → response.audio.delta + response.audio.done + response.audio_transcript.delta + response.audio_transcript.done + response.function_call_arguments.delta + response.function_call_arguments.done + response.done) — making #229 the FIRST cluster member that introduces a non-HTTP transport (persistent-WebSocket), the FIRST cluster member where Provider trait return type must be a duplex-channel-pair, and the FIRST cluster member where session lifecycle exceeds a single request-response cycle (typical Realtime sessions last 1-30+ minutes with state accumulating across the connection) (Jobdori cycle #380 / extends #168c emission-routing audit / explicit follow-on from #225 audio-bidirectional axis and #228 confirmed-structural async-task-polling cluster — introduces a NOVEL TRANSPORT axis distinct from every prior cluster member / sibling-shape cluster grows to twenty-eight / wire-format-parity cluster grows to nineteen / capability-parity cluster grows to eleven / multimodal-IO cluster grows to seven: #220 image-input + #224 embedding-output + #225 audio-bidirectional-on-separate-REST-endpoints + #226 image-output + #227 video-output + #228 mesh-output + #229 
audio-text-tool-multiplex-on-persistent-WebSocket / provider-asymmetric-delegation cluster grows to six / async-task-polling cluster: still 3 members (#229 is push-based not poll-based — it does NOT join async-task-polling cluster, it founds a NEW cluster) / Persistent-WebSocket-transport cluster: 1 member (#229 alone, FOUNDER) / Bidirectional-symmetric-event-pair cluster: 1 member (#229 alone, FOUNDER) / Non-HTTP-transport cluster: 1 member (#229 alone, FOUNDER) — three new clusters founded in a single pinpoint, the first time a single cycle has founded three concurrent novel clusters / ten-layer-fusion-shape-with-persistent-WebSocket-transport-and-bidirectional-symmetric-event-pair is the largest single-pinpoint fusion catalogued. Distinct from prior cluster members; the ten-layer-fusion-shape with persistent-WebSocket-transport and bidirectional-symmetric-event-pair shape is novel and applies to follow-on candidate Real-time-Image-Generation API typed taxonomy (DALL-E live preview, Imagen live preview) and Real-time-Video-Generation streaming (Veo-Live, Sora-Live) — the persistent-WebSocket-transport pattern is now a first-class cluster member, a structural prerequisite that every future endpoint family using persistent connections will inherit / external validation: forty-eight ecosystem references covering OpenAI Realtime API GA 2024-10-01 with /v1/realtime?model=<id> WebSocket endpoint, 37+ canonical event-type names in OpenAI Realtime API spec, two transport options (WebSocket server-side and WebRTC browser-side), two GA realtime models (gpt-4o-realtime-preview and gpt-4o-mini-realtime-preview both with audio modality and tool-use), Google Gemini Live API with bidirectional WebSocket+gRPC streaming, Azure OpenAI Realtime API mirror, OpenAI Python SDK openai.realtime.AsyncRealtimeConnection typed client, OpenAI TypeScript SDK OpenAI.beta.realtime.RealtimeClient typed client, openai-realtime-api-beta reference client (canonical JS implementation), five 
first-class realtime-voice-agent frameworks all built on top of OpenAI Realtime API (Vapi/Retell-AI/LiveKit-Agents/Pipecat/Daily-Bots), Anthropic non-coverage statement (the second post-#224 provider-asymmetric-delegation case after audio), the canonical six-dimensional pricing matrix ($5.00/$20.00 per million text input/output tokens, $40.00/$80.00 per million audio input/output tokens, $2.50 per million cached audio input tokens for gpt-4o-realtime-preview-2024-10-01), coding-agent peer landscape: anomalyco/opencode has zero GA realtime integration (open feature request from 2026-02 only — confirmed via web search 2026-04-26), sst/opencode predecessor zero realtime, charmbracelet/crush zero realtime, continue.dev zero realtime, aider zero realtime, cursor zero realtime, zed zero realtime — the gap is uniformly zero across the surveyed ecosystem and represents the next-frontier capability that every coding-agent will need to add. claw-code is one of MULTIPLE clients without Realtime, but the persistent-WebSocket-transport-axis is the upstream prerequisite of every voice-agent / live-coding-pair-programming / push-to-talk-coding / barge-in-coding-conversation / function-call-during-voice / streaming-tool-use / sub-second-latency-coding-interaction affordance — the canonical 2024-2026-era voice-coding workflow that is currently impossible to build on top of claw-code — #229 closes the upstream prerequisite of every voice-coding affordance and is the first cluster member where transport-axis becomes a structural prerequisite of the dispatch layer) 2026-04-26 18:03:00 +09:00
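Note on #229: the message above argues that a realtime session cannot be expressed as a Future-of-T or Stream-of-T and instead needs a duplex (Sender, Receiver) channel pair, with client events matched by server events. The Rust sketch below illustrates only that return shape. It uses `std::sync::mpsc` and a background thread as a stand-in for the WebSocket connection, which is an assumption made to keep the example dependency-free. The names `ClientEvent`, `ServerEvent`, and `open_realtime_session` are hypothetical, and only two of the event pairs named in the message are modelled.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical subset of client-to-server events.
#[derive(Debug, PartialEq)]
pub enum ClientEvent {
    InputAudioBufferAppend(Vec<u8>),
    ResponseCreate,
}

/// Hypothetical subset of server-to-client events.
#[derive(Debug, PartialEq)]
pub enum ServerEvent {
    ConversationItemCreated,
    ResponseDone,
}

/// Open a simulated session and return the duplex pair: the caller sends
/// client events on one half and receives server events on the other.
/// A background thread plays the server role in place of a real socket.
pub fn open_realtime_session() -> (Sender<ClientEvent>, Receiver<ServerEvent>) {
    let (client_tx, client_rx) = channel::<ClientEvent>();
    let (server_tx, server_rx) = channel::<ServerEvent>();

    thread::spawn(move || {
        // The loop ends when the caller drops its Sender, which models
        // a graceful close of the persistent connection.
        for event in client_rx {
            let reply = match event {
                ClientEvent::InputAudioBufferAppend(_) => ServerEvent::ConversationItemCreated,
                ClientEvent::ResponseCreate => ServerEvent::ResponseDone,
            };
            if server_tx.send(reply).is_err() {
                break; // caller dropped its Receiver
            }
        }
    });

    (client_tx, server_rx)
}
```

A production version would presumably use async channels and a WebSocket client crate, neither of which the workspace currently links according to the message; the point here is only that session state outlives any single request-response cycle.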