mirror of
https://github.com/openai/codex.git
synced 2026-05-03 04:42:20 +03:00
feat(app-server, core): allow text + image content items for dynamic tool outputs (#10567)
Took over the work that @aaronl-openai started here: https://github.com/openai/codex/pull/10397

Now that app-server clients are able to set up custom tools (called `dynamic_tools` in app-server), we should expose a way for clients to pass in not just text, but also image outputs. This is something the Responses API already supports for function call outputs, where you can pass in either a string or an array of content outputs (text, image, file): https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image

So let's just plumb it through in Codex (with the caveat that we only support text and image for now). This is implemented end-to-end across app-server v2 protocol types and core tool handling.

## Breaking API change

NOTE: This introduces a breaking change with dynamic tools, but I think it's ok since this concept was only recently introduced (https://github.com/openai/codex/pull/9539) and it's better to get the API contract correct. I don't think there are any real consumers of this yet (not even the Codex App).

Old shape:

`{ "output": "dynamic-ok", "success": true }`

New shape:

```
{
  "contentItems": [
    { "type": "inputText", "text": "dynamic-ok" },
    { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" }
  ],
  "success": true
}
```
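For a client that still produces the old shape, a small adapter might ease the migration. This is a minimal sketch, not part of the PR: the helper name `migrate_tool_result` is hypothetical, and it assumes the old `output` string maps to a single `inputText` item.

```python
from typing import Any


def migrate_tool_result(old: dict[str, Any]) -> dict[str, Any]:
    """Convert a legacy {"output", "success"} result to the contentItems shape.

    Hypothetical helper: assumes the legacy `output` string becomes one
    `inputText` content item and `success` is carried over unchanged.
    """
    if "contentItems" in old:
        # Already in the new shape; return it untouched.
        return old
    return {
        "contentItems": [{"type": "inputText", "text": old["output"]}],
        "success": old["success"],
    }
```

Called on `{"output": "dynamic-ok", "success": True}`, this returns a result with one `inputText` item whose text is `dynamic-ok`.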
@@ -122,6 +122,7 @@ Start a fresh thread when you need a new Codex conversation.
    "approvalPolicy": "never",
    "sandbox": "workspaceWrite",
    "personality": "friendly",
    // Experimental: requires opt-in
    "dynamicTools": [
        {
            "name": "lookup_ticket",
@@ -556,6 +557,41 @@ Order of messages:

UI guidance for IDEs: surface an approval dialog as soon as the request arrives. The turn will proceed after the server receives a response to the approval request. The terminal `item/completed` notification will be sent with the appropriate status.

### Dynamic tool calls (experimental)

`dynamicTools` on `thread/start` and the corresponding `item/tool/call` request/response flow are experimental APIs. To enable them, set `initialize.params.capabilities.experimentalApi = true`.
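
As an illustration of the opt-in, a client could build its `initialize` request along these lines. This is a sketch under assumptions: the function name `build_initialize_request` is hypothetical, and any other `initialize` params a real server expects are deliberately left out because they are not described here.

```python
import json


def build_initialize_request(request_id: int) -> str:
    """Serialize an `initialize` request that opts in to experimental APIs.

    Hypothetical helper: only the capability flag named in the text is set;
    other params a real server may require are omitted.
    """
    request = {
        "method": "initialize",
        "id": request_id,
        "params": {
            "capabilities": {"experimentalApi": True},
        },
    }
    return json.dumps(request)
```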

When a dynamic tool is invoked during a turn, the server sends an `item/tool/call` JSON-RPC request to the client:

```json
{
  "method": "item/tool/call",
  "id": 60,
  "params": {
    "threadId": "thr_123",
    "turnId": "turn_123",
    "callId": "call_123",
    "tool": "lookup_ticket",
    "arguments": { "id": "ABC-123" }
  }
}
```
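
On the client side, handling such a request could be sketched as a lookup from tool name to a local function. The names `TOOLS`, `lookup_ticket`, and `handle_tool_call` are hypothetical, and reporting unknown tools or raised exceptions as `"success": false` with an explanatory text item is an assumption about reasonable client behavior, not something the protocol text specifies.

```python
from typing import Any, Callable


def lookup_ticket(arguments: dict[str, Any]) -> str:
    # Stand-in implementation; a real client would query its ticket system.
    return f"Ticket {arguments['id']} is open."


# Hypothetical registry mapping dynamic tool names to local callables.
TOOLS: dict[str, Callable[[dict[str, Any]], str]] = {
    "lookup_ticket": lookup_ticket,
}


def handle_tool_call(request: dict[str, Any]) -> dict[str, Any]:
    """Build a response for an `item/tool/call` request, echoing its id."""
    params = request["params"]
    tool = TOOLS.get(params["tool"])
    if tool is None:
        # Assumption: unknown tools are reported as a failed call.
        text, success = f"Unknown tool: {params['tool']}", False
    else:
        try:
            text, success = tool(params["arguments"]), True
        except Exception as exc:
            # Assumption: tool errors are surfaced as text with success=false.
            text, success = f"Tool failed: {exc}", False
    return {
        "id": request["id"],
        "result": {
            "contentItems": [{"type": "inputText", "text": text}],
            "success": success,
        },
    }
```

The response reuses the request's `id` so the server can match it to the pending call.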

The client must respond with content items. Use `inputText` for text and `inputImage` for image URLs/data URLs:

```json
{
  "id": 60,
  "result": {
    "contentItems": [
      { "type": "inputText", "text": "Ticket ABC-123 is open." },
      { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" }
    ],
    "success": true
  }
}
```
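
To produce an `inputImage` item from raw image bytes, a client could base64-encode them into a data URL. The sketch below uses hypothetical helper names (`text_item`, `image_item`, `tool_result`) and assumes PNG as the default MIME type.

```python
import base64
from typing import Any


def text_item(text: str) -> dict[str, Any]:
    """Build an `inputText` content item."""
    return {"type": "inputText", "text": text}


def image_item(data: bytes, mime_type: str = "image/png") -> dict[str, Any]:
    """Build an `inputImage` content item carrying the bytes as a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "inputImage", "imageUrl": f"data:{mime_type};base64,{encoded}"}


def tool_result(request_id: int, items: list[dict[str, Any]], success: bool = True) -> dict[str, Any]:
    """Wrap content items in a response for the given request id."""
    return {"id": request_id, "result": {"contentItems": items, "success": success}}
```

A response like the one above could then be assembled with `tool_result(60, [text_item("Ticket ABC-123 is open."), image_item(png_bytes)])`, where `png_bytes` holds the image data.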

## Skills

Invoke a skill by including `$<skill-name>` in the text input. Add a `skill` input item (recommended) so the backend injects full skill instructions instead of relying on the model to resolve the name.