codex

mirror of https://github.com/openai/codex.git synced 2026-04-29 02:41:12 +03:00

Author	SHA1	Message	Date
marina-oai	a40f62d647	Merge branch 'main' into codex/add-process-id-to-logging	2026-01-20 17:02:55 +09:00
charley-oai	eb90e20c0b	Persist text elements through TUI input and history (#9393 ) Continuation of breaking up this PR https://github.com/openai/codex/pull/9116 ## Summary - Thread user text element ranges through TUI/TUI2 input, submission, queueing, and history so placeholders survive resume/edit flows. - Preserve local image attachments alongside text elements and rehydrate placeholders when restoring drafts. - Keep model-facing content shapes clean by attaching UI metadata only to user input/events (no API content changes). ## Key Changes - TUI/TUI2 composer now captures text element ranges, trims them with text edits, and restores them when submission is suppressed. - User history cells render styled spans for text elements and keep local image paths for future rehydration. - Initial chat widget bootstraps accept empty `initial_text_elements` to keep initialization uniform. - Protocol/core helpers updated to tolerate the new InputText field shape without changing payloads sent to the API.	2026-01-19 23:49:34 -08:00
Dylan Hurd	675f165c56	fix(core) Preserve base_instructions in SessionMeta (#9427 ) ## Summary This PR consolidates base_instructions onto SessionMeta / SessionConfiguration, so we ensure `base_instructions` is set once per session and should be (mostly) immutable, unless: - overridden by config on resume / fork - sub-agent tasks, like review or collab In a future PR, we should convert all references to `base_instructions` to consistently used the typed struct, so it's less likely that we put other strings there. See #9423. However, this PR is already quite complex, so I'm deferring that to a follow-up. ## Testing - [x] Added a resume test to assert that instructions are preserved. In particular, `resume_switches_models_preserves_base_instructions` fails against main. Existing test coverage thats assert base instructions are preserved across multiple requests in a session: - Manual compact keeps baseline instructions: core/tests/suite/compact.rs:199 - Auto-compact keeps baseline instructions: core/tests/suite/compact.rs:1142 - Prompt caching reuses the same instructions across two requests: core/tests/suite/prompt_caching.rs:150 and core/tests/suite/prompt_caching.rs:157 - Prompt caching with explicit expected string across two requests: core/tests/suite/prompt_caching.rs:213 and core/tests/suite/prompt_caching.rs:222 - Resume with model switch keeps original instructions: core/tests/suite/resume.rs:136 - Compact/resume/fork uses request 0 instructions for later expected payloads: core/tests/suite/compact_resume_fork.rs:215	2026-01-19 21:59:36 -08:00
Ahmed Ibrahim	b11e96fb04	Act on reasoning-included per turn (#9402 ) - Reset reasoning-included flag each turn and update compaction test	2026-01-19 11:23:25 -08:00
Shijie Rao	57ec3a8277	Feat: request user input tool (#9472 ) ### Summary * Add `requestUserInput` tool that the model can use for gather feedback/asking question mid turn. ### Tool input schema ``` { "$schema": "http://json-schema.org/draft-07/schema#", "title": "requestUserInput input", "type": "object", "additionalProperties": false, "required": ["questions"], "properties": { "questions": { "type": "array", "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.", "minItems": 1, "maxItems": 3, "items": { "type": "object", "additionalProperties": false, "required": ["id", "header", "question"], "properties": { "id": { "type": "string", "description": "Stable identifier for mapping answers (snake_case)." }, "header": { "type": "string", "description": "Short header label shown in the UI (12 or fewer chars)." }, "question": { "type": "string", "description": "Single-sentence prompt shown to the user." }, "options": { "type": "array", "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.", "minItems": 2, "maxItems": 3, "items": { "type": "object", "additionalProperties": false, "required": ["value", "label", "description"], "properties": { "value": { "type": "string", "description": "Machine-readable value (snake_case)." }, "label": { "type": "string", "description": "User-facing label (1-5 words)." }, "description": { "type": "string", "description": "One short sentence explaining impact/tradeoff if selected." } } } } } } } } } ``` ### Tool output schema ``` { "$schema": "http://json-schema.org/draft-07/schema#", "title": "requestUserInput output", "type": "object", "additionalProperties": false, "required": ["answers"], "properties": { "answers": { "type": "object", "description": "Map of question id to user answer.", "additionalProperties": { "type": "object", "additionalProperties": false, "required": ["selected"], "properties": { "selected": { "type": "array", "items": { "type": "string" } }, "other": { "type": ["string", "null"] } } } } } } ```	2026-01-19 10:17:30 -08:00
jif-oai	92cf2a1c3a	chore: warning metric (#9487 )	2026-01-19 17:07:04 +00:00
jif-oai	7ebe13f692	feat: timer total turn metrics (#9382 )	2026-01-19 10:44:31 +00:00
marina-oai	3807353071	Merge branch 'main' into codex/add-process-id-to-logging	2026-01-19 11:34:57 +09:00
Ahmed Ibrahim	1478a88eb0	Add collaboration developer instructions (#9424 ) - Add additional instructions when they are available - Make sure to update them on change either UserInput or UserTurn	2026-01-18 01:31:14 +00:00
Dylan Hurd	80d7a5d7fe	chore(instructions) Remove unread SessionMeta.instructions field (#9423 ) ### Description - Remove the now-unused `instructions` field from the session metadata to simplify SessionMeta and stop propagating transient instruction text through the rollout recorder API. This was only saving user_instructions, and was never being read. - Stop passing user instructions into the rollout writer at session creation so the rollout header only contains canonical session metadata. ### Testing - Ran `just fmt` which completed successfully. - Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix -p codex-app-server`, `just fix -p codex-tui`, and `just fix -p codex-tui2` which completed (Clippy fixes applied) as part of verification. - Ran `cargo test -p codex-protocol` which passed (28 tests). - Ran `cargo test -p codex-core` which showed failures in a small set of tests (not caused by the protocol type change directly): `default_client::tests::test_create_client_sets_default_headers`, several `models_manager::manager::tests::refresh_available_models_`, and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these tests failed in this CI run). - Ran `cargo test -p codex-app-server` which reported several failing integration tests (including `suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`, `suite::output_schema::send_user_turn_`, and `suite::user_agent::get_user_agent_returns_current_codex_user_agent`). - `cargo test -p codex-tui` and `cargo test -p codex-tui2` were attempted but aborted due to disk space exhaustion (`No space left on device`). ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)	2026-01-17 16:02:28 -08:00
Dylan Hurd	bffe9b33e9	chore(core) Create instructions module (#9422 ) ## Summary We have a variety of things we refer to as instructions in the code base: our current canonical terms are: - base instructions (raw string) - developer instructions (has a type in protocol) - user instructions We also have `instructions` floating around in various places. We should standardize on the above, and start using types to prevent them from ending up in the wrong place. There will be additional PRs, but I'm going to keep these small so we can easily follow them! ## Testing - [x] Tests pass, this is purely a file move	2026-01-17 16:01:26 -08:00
Ahmed Ibrahim	146d54cede	Add collaboration_mode override to turns (#9408 )	2026-01-16 21:51:25 -08:00
xl-openai	ad8bf59cbf	Support enable/disable skill via config/api. (#9328 ) In config.toml: ``` [[skills.config]] path = "/Users/xl/.codex/skills/my_skill/SKILL.md" enabled = false ``` API: skills/list, skills/config/write	2026-01-16 20:22:05 -08:00
Ahmed Ibrahim	246f506551	Introduce collaboration modes (#9340 ) - Merge `model` and `reasoning_effort` under collaboration modes. - Add additional instructions for custom collaboration mode - Default to Custom to not change behavior	2026-01-17 00:28:22 +00:00
Anton Panasenko	c26fe64539	feat: show forked from session id in /status (#9330 ) Summary: - Add forked_from to SessionMeta/SessionConfiguredEvent and persist it for forked sessions. - Surface forked_from in /status for tui + tui2 and add snapshots.	2026-01-16 13:41:46 -08:00
Ahmed Ibrahim	0cce6ebd83	rename model turn to sampling request (#9336 ) We have two type of turns now: model and user turns. It's always confusing to refer to either. Model turn is basically a sampling request.	2026-01-16 10:06:24 +01:00
marina-oai	3bc5129982	Merge branch 'main' into codex/add-process-id-to-logging	2026-01-16 11:11:23 +09:00
charley-oai	1fa8350ae7	Add text element metadata to protocol, app server, and core (#9331 ) The second part of breaking up PR https://github.com/openai/codex/pull/9116 Summary: - Add `TextElement` / `ByteRange` to protocol user inputs and user message events with defaults. - Thread `text_elements` through app-server v1/v2 request handling and history rebuild. - Preserve UI metadata only in user input/events (not `ContentItem`) while keeping local image attachments in user events for rehydration. Details: - Protocol: `UserInput::Text` carries `text_elements`; `UserMessageEvent` carries `text_elements` + `local_images`. Serialization includes empty vectors for backward compatibility. - app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in camelCase with conversions; v2 uses its own camelCase wrapper. - app-server: v1/v2 input mapping includes `text_elements`; thread history rebuilds include them. - Core: user event emission preserves UI metadata while model history stays clean; history replay round-trips the metadata.	2026-01-15 17:26:41 -08:00
Yuvraj Angad Singh	004a74940a	fix: send non-null content on elicitation Accept (#9196 ) ## Summary - When a user accepts an MCP elicitation request, send `content: Some(json!({}))` instead of `None` - MCP servers that use elicitation expect content to be present when action is Accept - This matches the expected behavior shown in tests at `exec-server/tests/common/lib.rs:171` ## Root Cause In `codex-rs/core/src/codex.rs`, the `resolve_elicitation` function always sent `content: None`: ```rust let response = ElicitationResponse { action, content: None, // Always None, even for Accept }; ``` ## Fix Send an empty object when accepting: ```rust let content = match action { ElicitationAction::Accept => Some(serde_json::json!({})), ElicitationAction::Decline \| ElicitationAction::Cancel => None, }; ``` ## Test plan - [x] Code compiles with `cargo check -p codex-core` - [x] Formatted with `just fmt` - [ ] Integration test `accept_elicitation_for_prompt_rule` (requires MCP server binary) Fixes #9053	2026-01-15 14:20:57 -08:00
pap-openai	d886a8646c	remove needs_follow_up error log (#9272 )	2026-01-15 21:20:54 +00:00
sayan-oai	169201b1b5	[search] allow explicitly disabling web search (#9249 ) moving `web_search` rollout serverside, so need a way to explicitly disable search + signal eligibility from the client. - Add `x‑oai‑web‑search‑eligible` header that signifies whether the request can have web search. - Only attach the `web_search` tool when the resolved `WebSearchMode` is `Live` or `Cached`.	2026-01-15 11:28:57 -08:00
xl-openai	42fa4c237f	Support SKILL.toml file. (#9125 ) We’re introducing a new SKILL.toml to hold skill metadata so Codex can deliver a richer Skills experience. Initial focus is the interface block: ``` [interface] display_name = "Optional user-facing name" short_description = "Optional user-facing description" icon_small = "./assets/small-400px.png" icon_large = "./assets/large-logo.svg" brand_color = "#3B82F6" default_prompt = "Optional surrounding prompt to use the skill with" ``` All fields are exposed via the app server API. display_name and short_description are consumed by the TUI.	2026-01-15 11:20:04 -08:00
jif-oai	05b960671d	feat: add agent roles to collab tools (#9275 ) Add `agent_type` parameter to the collab tool `spawn_agent` that contains a preset to apply on the config when spawning this agent	2026-01-15 13:33:52 +00:00
marina-oai	ca47abc69a	Record exec pids in tool call spans	2026-01-15 17:16:24 +09:00
charley-oai	4a9c2bcc5a	Add text element metadata to types (#9235 ) Initial type tweaking PR to make the diff of https://github.com/openai/codex/pull/9116 smaller This should not change any behavior, just adds some fields to types	2026-01-14 16:41:50 -08:00
sayan-oai	5e426ac270	add WebSearchMode enum (#9216 ) ### What Add `WebSearchMode` enum (disabled, cached live, defaults to cached) to config + V2 protocol. This enum takes precedence over legacy flags: `web_search_cached`, `web_search_request`, and `tools.web_search`. Keep `--search` as live. ### Tests Added tests	2026-01-14 12:51:42 -08:00
jif-oai	3d322fa9d8	feat: add collab prompt (#9208 ) Adding a prompt for collab tools. This is only for internal use and the prompt won't be gated for now as it is not stable yet. The goal of this PR is to provide the tool required to iterate on the prompt	2026-01-14 09:58:15 -08:00
jif-oai	6a939ed7a4	feat: emit events around collab tools (#9095 ) Emit the following events around the collab tools. On the `app-server` this will be under `item/started` and `item/completed` ``` #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabAgentSpawnBeginEvent { /// Identifier for the collab tool call. pub call_id: String, /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the /// beginning. pub prompt: String, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabAgentSpawnEndEvent { /// Identifier for the collab tool call. pub call_id: String, /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the newly spawned agent, if it was created. pub new_thread_id: Option<ThreadId>, /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the /// beginning. pub prompt: String, /// Last known status of the new agent reported to the sender agent. pub status: AgentStatus, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabAgentInteractionBeginEvent { /// Identifier for the collab tool call. pub call_id: String, /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the receiver. pub receiver_thread_id: ThreadId, /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT /// leaking at the beginning. pub prompt: String, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabAgentInteractionEndEvent { /// Identifier for the collab tool call. pub call_id: String, /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the receiver. pub receiver_thread_id: ThreadId, /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT /// leaking at the beginning. pub prompt: String, /// Last known status of the receiver agent reported to the sender agent. pub status: AgentStatus, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabWaitingBeginEvent { /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the receiver. pub receiver_thread_id: ThreadId, /// ID of the waiting call. pub call_id: String, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabWaitingEndEvent { /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the receiver. pub receiver_thread_id: ThreadId, /// ID of the waiting call. pub call_id: String, /// Last known status of the receiver agent reported to the sender agent. pub status: AgentStatus, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabCloseBeginEvent { /// Identifier for the collab tool call. pub call_id: String, /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the receiver. pub receiver_thread_id: ThreadId, } #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)] pub struct CollabCloseEndEvent { /// Identifier for the collab tool call. pub call_id: String, /// Thread ID of the sender. pub sender_thread_id: ThreadId, /// Thread ID of the receiver. pub receiver_thread_id: ThreadId, /// Last known status of the receiver agent reported to the sender agent before /// the close. pub status: AgentStatus, } ```	2026-01-14 17:55:57 +00:00
pakrym-oai	92472e7baa	Use current model for review (#9179 ) Instead of having a hard-coded default review model, use the current model for running `/review` unless one is specified in the config. Also inherit current reasoning effort	2026-01-14 08:59:41 -08:00
jif-oai	dc3deaa3e7	feat: return an error if the image sent by the user is a bad image (#9146 ) ## Before When we detect an `InvalidImageRequest`, we replace the image by a placeholder and keep going ## Now In such `InvalidImageRequest`, we check if the image is due to a user message or a tool call output. For tool call output we still replace it with a placeholder to avoid breaking the agentic loop bu tif this is because of a user message, we send an error to the user	2026-01-14 09:07:45 +00:00
jif-oai	6fbb89e858	fix: shell snapshot clean-up (#9155 ) Clean all shell snapshot files corresponding to sessions that have not been updated in 7 days Those files should never leak. The only known cases were it can leak are during non graceful interrupt of the process (`kill -9, `panic`, OS crash, ...)	2026-01-14 09:05:46 +00:00
Eric Traut	31d9b6f4d2	Improve handling of config and rules errors for app server clients (#9182 ) When an invalid config.toml key or value is detected, the CLI currently just quits. This leaves the VSCE in a dead state. This PR changes the behavior to not quit and bubble up the config error to users to make it actionable. It also surfaces errors related to "rules" parsing. This allows us to surface these errors to users in the VSCE, like this: <img width="342" height="129" alt="Screenshot 2026-01-13 at 4 29 22 PM" src="https://github.com/user-attachments/assets/a79ffbe7-7604-400c-a304-c5165b6eebc4" /> <img width="346" height="244" alt="Screenshot 2026-01-13 at 4 45 06 PM" src="https://github.com/user-attachments/assets/de874f7c-16a2-4a95-8c6d-15f10482e67b" />	2026-01-13 17:57:09 -08:00
Ahmed Ibrahim	7e33ac7eb6	clean models manager (#9168 ) Have only the following Methods: - `list_models`: getting current available models - `try_list_models`: sync version no refresh for tui use - `get_default_model`: get the default model (should be tightened to core and received on session configuration) - `get_model_info`: get `ModelInfo` for a specific model (should be tightened to core but used in tests) - `refresh_if_new_etag`: trigger refresh on different etags Also move the cache to its own struct	2026-01-13 16:55:33 -08:00
gt-oai	2651980bdf	Restrict MCP servers from `requirements.toml` (#9101 ) Enterprises want to restrict the MCP servers their users can use. Admins can now specify an allowlist of MCPs in `requirements.toml`. The MCP servers are matched on both Name and Transport (local path or HTTP URL) -- both must match to allow the MCP server. This prevents circumventing the allowlist by renaming MCP servers in user config. (It is still possible to replace the local path e.g. rewrite say `/usr/local/github-mcp` with a nefarious MCP. We could allow hash pinning in the future, but that would break updates. I also think this represents a broader, out-of-scope problem.) We introduce a new field to Constrained: "normalizer". In general, it is a fn(T) -> T and applies when `Constrained<T>.set()` is called. In this particular case, it disables MCP servers which do not match the allowlist. An alternative solution would remove this and instead throw a ConstraintError. That would stop Codex launching if any MCP server was configured which didn't match. I think this is bad. We currently reuse the enabled flag on MCP servers to disable them, but don't propagate any information about why they are disabled. I'd like to add that in a follow up PR, possibly by switching out enabled with an enum. In action: ``` # MCP server config has two MCPs. We are going to allowlist one of them. ➜ codex git:(gt/restrict-mcps) ✗ cat ~/.codex/config.toml \| grep mcp_servers -A1 [mcp_servers.hello_world] command = "hello-world-mcp" -- [mcp_servers.docs] command = "docs-mcp" # Restrict the MCPs to the hello_world MCP. ➜ codex git:(gt/restrict-mcps) ✗ defaults read com.openai.codex requirements_toml_base64 \| base64 -d [mcp_server_allowlist.hello_world] command = "hello-world-mcp" # List the MCPs, observe hello_world is enabled and docs is disabled. ➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list cargo run --bin codex -- "$@" Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s Running `target/debug/codex mcp list` Name Command Args Env Cwd Status Auth docs docs-mcp - - - disabled Unsupported hello_world hello-world-mcp - - - enabled Unsupported # Remove the restrictions. ➜ codex git:(gt/restrict-mcps) ✗ defaults delete com.openai.codex requirements_toml_base64 # Observe both MCPs are enabled. ➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list cargo run --bin codex -- "$@" Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s Running `target/debug/codex mcp list` Name Command Args Env Cwd Status Auth docs docs-mcp - - - enabled Unsupported hello_world hello-world-mcp - - - enabled Unsupported # A new requirements that updates the command to one that does not match. ➜ codex git:(gt/restrict-mcps) ✗ cat ~/requirements.toml [mcp_server_allowlist.hello_world] command = "hello-world-mcp-v2" # Use those requirements. ➜ codex git:(gt/restrict-mcps) ✗ defaults write com.openai.codex requirements_toml_base64 "$(base64 -i /Users/gt/requirements.toml)" # Observe both MCPs are disabled. ➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list cargo run --bin codex -- "$@" Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.75s Running `target/debug/codex mcp list` Name Command Args Env Cwd Status Auth docs docs-mcp - - - disabled Unsupported hello_world hello-world-mcp - - - disabled Unsupported ```	2026-01-13 19:45:00 +00:00
pakrym-oai	2d56519ecd	Support response.done and add integration tests (#9129 ) The agent loop using a persistent incremental web socket connection.	2026-01-13 16:12:30 +00:00
Ahmed Ibrahim	cbca43d57a	Send message by default mid turn. queue messages by tab (#9077 ) https://github.com/user-attachments/assets/03838730-4ddc-44df-a2c7-cb8ecda78660	2026-01-12 23:06:35 -08:00
pakrym-oai	d75626ad99	Reuse websocket connection (#9127 ) Reuses the connection but still sends full requests.	2026-01-13 03:30:09 +00:00
pakrym-oai	490c1c1fdd	Add model client sessions (#9102 ) Maintain a long-running session.	2026-01-13 01:15:56 +00:00
Ahmed Ibrahim	87f7226cca	Assemble sandbox/approval/network prompts dynamically (#8961 ) - Add a single builder for developer permissions messaging that accepts SandboxPolicy and approval policy. This builder now drives the developer “permissions” message that’s injected at session start and any time sandbox/approval settings change. - Trim EnvironmentContext to only include cwd, writable roots, and shell; removed sandbox/approval/network duplication and adjusted XML serialization and tests accordingly. Follow-up: adding a config value to replace the developer permissions message for custom sandboxes.	2026-01-12 23:12:59 +00:00
Shijie Rao	3e91a95ce1	feat: hot reload mcp servers (#8957 ) ### Summary * Added `mcpServer/refresh` command to inform app servers and active threads to refresh mcpServer on next turn event. * Added `pending_mcp_server_refresh_config` to codex core so that if the value is populated, we reinitialize the mcp server manager on the thread level. * The config is updated on `mcpServer/refresh` command which we iterate through threads and provide with the latest config value after last write.	2026-01-12 11:17:50 -08:00
jif-oai	623707ab58	feat: add wait tool implementation for collab (#9088 ) Add implementation for the `wait` tool. For this we consider all status different from `PendingInit` and `Running` as terminal. The `wait` tool call will return either after a given timeout or when the tool reaches a non-terminal status. A few points to note: * The usage of a channel is preferred to prevent some races (just looping on `get_status()` could "miss" a terminal status) * The order of operations is very important, we need to first subscribe and then check the last known status to prevent race conditions * If the channel gets dropped, we return an error on purpose	2026-01-12 12:16:24 +00:00
charley-oai	6709ad8975	Label attached images so agent can understand in-message labels (#8950 ) Agent wouldn't "see" attached images and would instead try to use the view_file tool: <img width="1516" height="504" alt="image" src="https://github.com/user-attachments/assets/68a705bb-f962-4fc1-9087-e932a6859b12" /> In this PR, we wrap image content items in XML tags with the name of each image (now just a numbered name like `[Image #1]`), so that the model can understand inline image references (based on name). We also put the image content items above the user message which the model seems to prefer (maybe it's more used to definitions being before references). We also tweak the view_file tool description which seemed to help a bit Results on a simple eval set of images: Before <img width="980" height="310" alt="image" src="https://github.com/user-attachments/assets/ba838651-2565-4684-a12e-81a36641bf86" /> After <img width="918" height="322" alt="image" src="https://github.com/user-attachments/assets/10a81951-7ee6-415e-a27e-e7a3fd0aee6f" /> ```json [ { "id": "single_describe", "prompt": "Describe the attached image in one sentence.", "images": ["image_a.png"] }, { "id": "single_color", "prompt": "What is the dominant color in the image? Answer with a single color word.", "images": ["image_b.png"] }, { "id": "orientation_check", "prompt": "Is the image portrait or landscape? Answer in one sentence.", "images": ["image_c.png"] }, { "id": "detail_request", "prompt": "Look closely at the image and call out any small details you notice.", "images": ["image_d.png"] }, { "id": "two_images_compare", "prompt": "I attached two images. Are they the same or different? Briefly explain.", "images": ["image_a.png", "image_b.png"] }, { "id": "two_images_captions", "prompt": "Provide a short caption for each image (Image 1, Image 2).", "images": ["image_c.png", "image_d.png"] }, { "id": "multi_image_rank", "prompt": "Rank the attached images from most colorful to least colorful.", "images": ["image_a.png", "image_b.png", "image_c.png"] }, { "id": "multi_image_choice", "prompt": "Which image looks more vibrant? Answer with 'Image 1' or 'Image 2'.", "images": ["image_b.png", "image_d.png"] } ] ```	2026-01-09 21:33:45 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
jif-oai	fceae86581	nit: rename session metric (#8966 )	2026-01-09 11:55:57 +00:00
jif-oai	568b938c80	feat: first pass on clb tool (#8930 )	2026-01-09 11:54:05 +00:00
pakrym-oai	634764ece9	Immutable CodexAuth (#8857 ) Historically we started with a CodexAuth that knew how to refresh it's own tokens and then added AuthManager that did a different kind of refresh (re-reading from disk). I don't think it makes sense for both `CodexAuth` and `AuthManager` to be mutable and contain behaviors. Move all refresh logic into `AuthManager` and keep `CodexAuth` as a data object.	2026-01-08 11:43:56 -08:00
jif-oai	69898e3dba	clean: all history cloning (#8916 )	2026-01-08 18:17:18 +00:00
jif-oai	5522663f92	feat: add a few metrics (#8910 )	2026-01-08 15:39:57 +00:00
jif-oai	634650dd25	feat: metrics capabilities (#8318 ) Add metrics capabilities to Codex. The `README.md` is up to date. This will not be merged with the metrics before this PR of course: https://github.com/openai/codex/pull/8350	2026-01-08 11:47:36 +00:00
Ahmed Ibrahim	a9b5e8a136	Simplify error managment in `run_turn` (#8849 )	2026-01-07 13:15:46 -08:00

1 2 3 4 5 ...

406 Commits