diff --git a/ROADMAP.md b/ROADMAP.md
index 6f52a47..8c92467 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -16407,3 +16407,13 @@ Required fix shape: (a) extend `InputContentBlock` enum at `rust/crates/api/src/
 **Status:** Open. No source code changed. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 06:30 KST clawhip nudge after rebasing on top of #233. Filed 2026-04-26 06:30 KST. HEAD: 2f428e2 (post-#233). Branch: feat/jobdori-168c-emission-routing. Sibling-shape cluster: 33 pinpoints. Multimodal-IO cluster: 11 members. Provider-asymmetric-delegation cluster: 11 members. **Sandbox-locality-axis META-cluster: 2 members stable (#230 + #232).** **Tool-locality-axis META-cluster grows to 3 members (#232 + #233 + #234) — the FIRST META-cluster to reach 3 members, transitioning from emergent-pattern to stable-doctrine.** **USER-INPUT-side-Tool-locality-axis-variant sub-cluster FOUNDED: 1 member (#234, FOUNDER) — first sub-cluster within an existing META-cluster.** **Server-managed-tool-as-tool-choice-discriminator cluster: 3 members (#232 + #233 + #234).** **Server-driven-tool-execution-loop cluster: 3 members (#232 + #233 + #234 — three canonical-pattern variants: Python-kernel-execution, search-result-page-fetching-and-caching, vector-store-corpus-retrieval-and-ranking).** **ToolResultContentBlock-extension mini-cluster: 4 members (#230 + #232 + #233 + #234).** **Both-major-providers-first-class-asymmetric-document-input-shape cluster: 1 member (founder).** **Coordinate-positioned-citation-on-output-text-block cluster: 1 member (founder, inverse-data-model pair to #233's URL-positioned-citation).** **Beta-header-gate-on-USER-INPUT-content-block-type cluster: 1 member (founder).** **Citation-emission-opt-in-at-USER-INPUT-content-block-level cluster: 1 member (founder).** **Four-way-source-discriminator-on-USER-INPUT-content-block cluster: 1 member (founder).** **Range-slicing-parameter-on-USER-INPUT-content-block cluster: 1 member (founder).**
**User-corpus-server-managed-tool-with-vector-store-routing cluster: 1 member (founder).** **Compound-boolean-filter-DSL-on-server-managed-tool-definition cluster: 1 member (founder).** **User-provided-document-title-threading-through-citations cluster: 1 member (founder).** **Multi-document-positional-index-threading cluster: 1 member (founder).** **Per-page-compound-text-plus-image-token-pricing-axis cluster: 1 member (founder).** **Persistent-storage-rental-pricing-axis cluster: 1 member (founder).** **Client-visible-cited-text-extracted-from-source-document cluster: 1 member (founder, inverse pair to #233's Server-opaque-encrypted-roundtripped-content).** **User-defined-metadata-on-tool-result-record cluster: 1 member (founder).** Thirteen new clusters founded in a single pinpoint plus participation in SIX inherited clusters — the LARGEST single-cycle cluster-founding count yet (exceeds prior records held by #230 and #232 and #233 by five), AND the FIRST single cycle to grow an existing META-cluster to a third member (Tool-locality-axis evolves from 2-member emergent-pattern to 3-member stable-doctrine) AND the FIRST single cycle to introduce a sub-cluster within an existing META-cluster (USER-INPUT-side-Tool-locality-axis-variant within Tool-locality-axis META-cluster). Fourteen-layer-fusion-shape is the largest single-pinpoint fusion catalogued (exceeds #233's thirteen-layer by one). 
Distinct from prior cluster members; the fourteen-layer-fusion-shape-with-document-modality-on-USER-INPUT-side-and-coordinate-positioned-citation-on-output-text-block-and-vector-store-id-routing-on-server-managed-tool is novel and applies to follow-on candidate **Image-generation Tool-as-server-managed-tool typed taxonomy** (the OpenAI Responses `tool_choice: image_generation` server-managed image-gen surface that #226 covered as a standalone endpoint but does NOT yet cover as a server-managed-tool-as-tool-choice-discriminator extension — the natural #235 candidate that grows the `Server-managed-tool-as-tool-choice-discriminator` cluster from 3 to 4 members) and **Computer-use Tool typed-discriminator on tool_choice** (the missing `tool_choice: computer` extension that #230 covered as a typed-tool-discriminator on ToolDefinition but does NOT cover as a `tool_choice` discriminator-value — the natural follow-on that grows the `Server-managed-tool-as-tool-choice-discriminator` cluster further). #234 closes the upstream prerequisite of every server-managed-document-input-with-citations / grounded-research-on-user-corpus / source-attribution-by-page-number / academic-citation-formatting-with-page-references / multi-document-comparison-with-positional-attribution / regulatory-compliance-coding-with-document-evidence coding-agent affordance — the canonical USER-INPUT-side complement to #233's web-search citations that completes the citation-attribution data-model on BOTH the USER-INPUT side AND the OUTPUT-TEXT-BLOCK side AND the SERVER-MANAGED-TOOL-RESULT side — and grows the `Tool-locality-axis` META-cluster from 2 members to 3 members establishing it as a stable doctrine rather than emergent pattern, the FIRST cluster member to grow an existing META-cluster to a third member AND introduce a sub-cluster within an existing META-cluster. 
🪨
+
+## Pinpoint #235 — Server-managed image-generation tool-choice taxonomy is structurally absent
+
+Dogfooded 2026-04-26 06:50 KST on `feat/jobdori-168c-emission-routing` after #234 made `tool_choice:file_search` the third server-managed tool-choice member and explicitly named image-generation-as-tool as the strongest next clean follow-on. This is intentionally distinct from #226: #226 covers standalone image-generation endpoints (`/v1/images/generations` / edits / variations). #235 covers the conversational/server-managed tool surface where the model chooses or is forced to call an image-generation tool inside a response turn and returns generated-image tool outputs with attribution/provenance.
+
+Verified absences: zero `tool_choice: image_generation` / `image_generation_call` / `image_generation_tool_result` typed discriminator; zero `ImageGenerationToolDefinition` with prompt/style/size/quality/output_format/safety/watermark fields; zero server-managed image artifact result variant on `ToolResultContentBlock`; zero generated-image citation/provenance slot on `OutputContentBlock::Text`; zero Provider trait path that lets chat/completion responses request server-side image generation as a tool rather than as a separate endpoint family; zero ProviderClient dispatch for OpenAI Responses image-generation tool / Gemini image-generation tool / partner-managed tool lanes; zero `claw image-tool` / `claw generate-image --as-tool` CLI surface; zero `/image-generate` / `/image-tool` slash-command surface; zero pricing axis for per-tool-image-generation event + output-image-size/quality matrix; and zero artifact ledger tying generated image ids/URLs/base64 payloads back to the conversational turn that requested them.
+
+Cluster shape: this grows `Server-managed-tool-as-tool-choice-discriminator` to four members (#232 `code_interpreter`, #233 `web_search`, #234 `file_search`, #235 `image_generation`) and is the first member where the server-managed tool output is a generated media artifact whose lifecycle overlaps with but is not reducible to standalone endpoint output. It also extends the Tool-locality-axis META-cluster: claw-code already has local/user-facing image-adjacent stubs from #220/#226 (`/image`, `/screenshot`, standalone image-gen endpoint candidate), but the server-managed conversational image-generation tool path is absent. This creates a dual-surface contract: direct endpoint generation for explicit CLI calls (#226) and model-mediated tool generation during ordinary chat turns (#235) must share artifact provenance, pricing, safety, and output-content-block handling without duplicating routing logic.
+
+Required fix shape: (a) add `ToolChoice::ImageGeneration` and `ToolDefinition::ImageGeneration` typed discriminators; (b) add `ImageGenerationToolResult` / generated-image artifact structs with URL/base64/file_id variants, size/quality/style/safety metadata, and provenance linking to the assistant response/tool-call id; (c) thread server-managed image-generation tool calls through Provider trait and ProviderClient dispatch separately from #226 standalone endpoint calls; (d) add CLI/slash affordances that make the distinction explicit (`generate image now` vs `allow model to use image generation tool`); (e) add pricing and usage accounting at the tool-invocation and artifact dimension; (f) add tests proving `tool_choice:image_generation` survives request serialization, result decoding, artifact ledgering, and unsupported-provider guidance. **Status:** Open. No source code changed. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-25 21:30 UTC claw-code nudge.
Cluster delta: sibling-shape +1, wire-format parity +1, capability parity +1, server-managed-tool-choice +1 (now 4), Tool-locality-axis +1, generated-media-artifact-provenance subcluster founded.
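
As an illustration of fix-shape items (a), (b), and (f) for #235, the following is a minimal, dependency-free Rust sketch. It is not the existing claw-code API: the enum below is a simplified stand-in for `ToolChoice`, and `ImagePayload`, `ArtifactLedger`, `wire_name`, `from_wire`, `check_supported`, and all field names are hypothetical. Only the four wire discriminators (`code_interpreter`, `web_search`, `file_search`, `image_generation`) and the `ImageGenerationToolResult` name come from the entry itself.

```rust
// Hypothetical sketch only: simplified stand-ins, not the real claw-code types.

/// Server-managed tool-choice discriminators named in #232 through #235.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToolChoice {
    CodeInterpreter,
    WebSearch,
    FileSearch,
    ImageGeneration, // proposed by #235
}

impl ToolChoice {
    /// Discriminator string as it would appear in a serialized request.
    fn wire_name(self) -> &'static str {
        match self {
            ToolChoice::CodeInterpreter => "code_interpreter",
            ToolChoice::WebSearch => "web_search",
            ToolChoice::FileSearch => "file_search",
            ToolChoice::ImageGeneration => "image_generation",
        }
    }

    /// Inverse of `wire_name`; unknown discriminators decode to `None`.
    fn from_wire(name: &str) -> Option<Self> {
        match name {
            "code_interpreter" => Some(ToolChoice::CodeInterpreter),
            "web_search" => Some(ToolChoice::WebSearch),
            "file_search" => Some(ToolChoice::FileSearch),
            "image_generation" => Some(ToolChoice::ImageGeneration),
            _ => None,
        }
    }
}

/// The URL / base64 / file_id payload variants from fix-shape item (b).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum ImagePayload {
    Url(String),
    Base64(String),
    FileId(String),
}

/// One generated-image artifact, linked back to the tool call that made it.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
struct ImageGenerationToolResult {
    tool_call_id: String, // provenance link (assumed field name)
    payload: ImagePayload,
    size: Option<String>,
    quality: Option<String>,
}

/// In-memory ledger tying artifacts to the turn that requested them.
#[derive(Debug, Default)]
struct ArtifactLedger {
    entries: Vec<ImageGenerationToolResult>,
}

impl ArtifactLedger {
    fn record(&mut self, result: ImageGenerationToolResult) {
        self.entries.push(result);
    }

    fn for_tool_call(&self, tool_call_id: &str) -> Vec<&ImageGenerationToolResult> {
        self.entries
            .iter()
            .filter(|e| e.tool_call_id == tool_call_id)
            .collect()
    }
}

/// Unsupported-provider guidance: reject a choice the provider cannot serve.
fn check_supported(choice: ToolChoice, supported: &[ToolChoice]) -> Result<(), String> {
    if supported.contains(&choice) {
        Ok(())
    } else {
        Err(format!("provider does not support tool_choice:{}", choice.wire_name()))
    }
}

fn main() {
    let choice = ToolChoice::ImageGeneration;
    assert_eq!(ToolChoice::from_wire(choice.wire_name()), Some(choice));

    let mut ledger = ArtifactLedger::default();
    ledger.record(ImageGenerationToolResult {
        tool_call_id: "call_1".to_string(),
        payload: ImagePayload::FileId("file_abc".to_string()),
        size: Some("1024x1024".to_string()),
        quality: None,
    });
    println!("artifacts for call_1: {}", ledger.for_tool_call("call_1").len());

    if let Err(msg) = check_supported(choice, &[ToolChoice::WebSearch]) {
        println!("{msg}");
    }
}
```

Keeping the discriminator mapping in one `match` pair means a fifth server-managed member only needs a new variant and two arms. In the real crates this mapping would presumably be derived through whatever serialization layer `ToolChoice` already uses rather than written by hand.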