rad/codex

mirror of https://github.com/openai/codex.git synced 2026-03-05 21:45:28 +03:00

Author	SHA1	Message	Date
Owen Lin	926b2f19e8	feat(app-server): support mcp elicitations in v2 api (#13425 ) This adds a first-class server request for MCP server elicitations: `mcpServer/elicitation/request`. Until now, MCP elicitation requests only showed up as a raw `codex/event/elicitation_request` event from core. That made it hard for v2 clients to handle elicitations using the same request/response flow as other server-driven interactions (like shell and `apply_patch` tools). This also updates the underlying MCP elicitation request handling in core to pass through the full MCP request (including URL and form data) so we can expose it properly in app-server. ### Why not `item/mcpToolCall/elicitationRequest`? This is because MCP elicitations are related to MCP servers first, and only optionally to a specific MCP tool call. In the MCP protocol, elicitation is a server-to-client capability: the server sends `elicitation/create`, and the client replies with an elicitation result. RMCP models it that way as well. In practice an elicitation is often triggered by an MCP tool call, but not always. ### What changed - add `mcpServer/elicitation/request` to the v2 app-server API - translate core `codex/event/elicitation_request` events into the new v2 server request - map client responses back into `Op::ResolveElicitation` so the MCP server can continue - update app-server docs and generated protocol schema - add an end-to-end app-server test that covers the full round trip through a real RMCP elicitation flow - The new test exercises a realistic case where an MCP tool call triggers an elicitation, the app-server emits mcpServer/elicitation/request, the client accepts it, and the tool call resumes and completes successfully. ### app-server API flow - Client starts a thread with `thread/start`. - Client starts a turn with `turn/start`. - App-server sends `item/started` for the `mcpToolCall`. - While that tool call is in progress, app-server sends `mcpServer/elicitation/request`. 
- Client responds to that request with `{ action: "accept" \| "decline" \| "cancel" }`. - App-server sends `serverRequest/resolved`. - App-server sends `item/completed` for the mcpToolCall. - App-server sends `turn/completed`. - If the turn is interrupted while the elicitation is pending, app-server still sends `serverRequest/resolved` before the turn finishes.	2026-03-05 07:20:20 -08:00
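A client-side view of the response step above can be sketched as follows. This is a minimal illustration only: the `ElicitationAction` enum and its `FromStr` implementation are hypothetical names, not the types generated by the app-server protocol crate; only the three action strings come from the commit message.

```rust
use std::str::FromStr;

/// Hypothetical client-side model of the `action` field in the response to
/// `mcpServer/elicitation/request`. The three values come from the commit
/// message; the type name is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ElicitationAction {
    Accept,
    Decline,
    Cancel,
}

impl FromStr for ElicitationAction {
    type Err = String;

    /// Parse the wire string, rejecting anything outside the three known values.
    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw {
            "accept" => Ok(Self::Accept),
            "decline" => Ok(Self::Decline),
            "cancel" => Ok(Self::Cancel),
            other => Err(format!("unknown elicitation action: {other}")),
        }
    }
}

impl ElicitationAction {
    /// Wire representation sent back to the app-server.
    pub fn as_wire(self) -> &'static str {
        match self {
            Self::Accept => "accept",
            Self::Decline => "decline",
            Self::Cancel => "cancel",
        }
    }
}
```

Parsing into a closed enum keeps unknown strings from silently reaching the MCP server; whether the real client does this is not stated in the commit.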
Won Park	229e6d0347	image-gen-event/client_processing (#13512 ) Enables client-side processing of image-generation capabilities (app-server setting).	2026-03-04 16:54:38 -08:00
xl-openai	1e877ccdd2	plugin: support local-based marketplace.json + install endpoint. (#13422 ) Support marketplace.json that points to a local file, with ``` "source": { "source": "local", "path": "./plugin-1" }, ``` Add a new plugin/install endpoint which adds the plugin to the cache folder and enables it in config.toml.	2026-03-04 19:08:18 -05:00
Owen Lin	8dfd654196	feat(app-server-test-client): OTEL setup for tracing (#13493 ) ### Overview This PR: - Updates `app-server-test-client` to load OTEL settings from `$CODEX_HOME/config.toml` and initializes its own OTEL provider. - Add real client root spans to app-server test client traces. This updates `codex-app-server-test-client` so its Datadog traces reflect the full client-driven flow instead of a set of server spans stitched together under a synthetic parent. Before this change, the test client generated a fake `traceparent` once and reused it for every JSON-RPC request. That kept the requests in one trace, but there was no real client span at the top, so Datadog ended up showing the sequence in a slightly misleading way, where all RPCs were anchored under `initialize`. Now the test client: - loads OTEL settings from the normal Codex config path, including `$CODEX_HOME/config.toml` and existing --config overrides - initializes tracing the same way other Codex binaries do when trace export is enabled - creates a real client root span for each scripted command - creates per-request client spans for JSON-RPC methods like `initialize`, `thread/start`, and `turn/start` - injects W3C trace context from the current client span into request.trace instead of reusing a fabricated carrier This gives us a cleaner trace shape in Datadog: - one trace URL for the whole scripted flow - a visible client root span - proper client/server parent-child relationships for each app-server request	2026-03-04 13:30:09 -08:00
iceweasel-oai	54a1c81d73	allow apps to specify cwd for sandbox setup. (#13484 ) The Electron app doesn't start up the app-server in a particular workspace directory, so sandbox setup happens in the app-installed directory instead of the project workspace. This allows the app to specify the workspace cwd so that the sandbox setup actually sets up the ACLs instead of exiting fast and then having the first shell command be slow.	2026-03-04 10:54:30 -08:00
Won Park	fa2306b303	image-gen-core (#13290 ) Core tool-calling for image-gen, handles requesting and receiving logic for images using response API	2026-03-03 23:11:28 -08:00
Val Kharitonov	4f6c4bb143	support 'flex' tier in app-server in addition to 'fast' (#13391 )	2026-03-03 22:46:05 -08:00
Celia Chen	d622bff384	chore: Nest skill and protocol network permissions under `network.enabled` (#13427 ) ## Summary Changes the permission profile shape from a bare network boolean to a nested object. Before: ```yaml permissions: network: true ``` After: ```yaml permissions: network: enabled: true ``` This also updates the shared Rust and app-server protocol types so `PermissionProfile.network` is no longer `Option<bool>`, but `Option<NetworkPermissions>` with `enabled: Option<bool>`. ## What Changed - Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`: - `pub network: Option<bool>` -> `pub network: Option<NetworkPermissions>` - Added `NetworkPermissions` with: - `pub enabled: Option<bool>` - Changed emptiness semantics so `network` is only considered empty when `enabled` is `None` - Updated skill metadata parsing to accept `permissions.network.enabled` - Updated core permission consumers to read `network.enabled.unwrap_or(false)` where a concrete boolean is needed - Updated app-server v2 protocol types and regenerated schema/TypeScript outputs - Updated docs to mention `additionalPermissions.network.enabled`	2026-03-03 20:57:29 -08:00
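The nested shape described in this commit can be sketched roughly as below. The struct and field names follow the commit message, but the helper methods (`network_enabled`, `network_is_empty`) are hypothetical, and the real definitions in `codex-rs/protocol/src/models.rs` carry more fields and derive serde traits.

```rust
/// Simplified mirror of the type named in the commit message.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct NetworkPermissions {
    pub enabled: Option<bool>,
}

/// Simplified: the real `PermissionProfile` has additional fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PermissionProfile {
    pub network: Option<NetworkPermissions>,
}

impl PermissionProfile {
    /// Hypothetical helper: collapse the nested options into a concrete
    /// boolean, treating "unset" at either level as "network disabled".
    pub fn network_enabled(&self) -> bool {
        self.network
            .as_ref()
            .and_then(|n| n.enabled)
            .unwrap_or(false)
    }

    /// Hypothetical helper: `network` counts as empty only when there is no
    /// explicit `enabled` value, so `enabled: false` is NOT empty.
    pub fn network_is_empty(&self) -> bool {
        self.network.as_ref().and_then(|n| n.enabled).is_none()
    }
}
```

The distinction between "unset" and "explicitly false" is the reason for keeping `Option<bool>` inside the nested object rather than a plain `bool`.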
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388 ) ## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. 
- Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. 
- Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
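The constraint-oriented setter shape described above (`can_set`, result-returning `set_enabled`) might look something like the following. This is a sketch under stated assumptions: `PinnedFeatures` and `ConstraintViolation` are hypothetical stand-ins, not the actual `ManagedFeatures` or `ConstraintResult` types, and dependency normalization is left out entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type: the requested value conflicts with a pinned one.
#[derive(Debug, PartialEq)]
pub struct ConstraintViolation {
    pub feature: String,
    pub required: bool,
}

/// Hypothetical wrapper: `pinned` plays the role of the requirements-side
/// `[features]` table, `values` the effective feature set.
pub struct PinnedFeatures {
    pinned: BTreeMap<String, bool>,
    values: BTreeMap<String, bool>,
}

impl PinnedFeatures {
    /// Start from the pinned values so the effective set already satisfies them.
    pub fn new(pinned: BTreeMap<String, bool>) -> Self {
        let values = pinned.clone();
        Self { pinned, values }
    }

    /// Check a change without applying it; omitted keys are unconstrained.
    pub fn can_set(&self, feature: &str, enabled: bool) -> Result<(), ConstraintViolation> {
        match self.pinned.get(feature) {
            Some(&required) if required != enabled => Err(ConstraintViolation {
                feature: feature.to_string(),
                required,
            }),
            _ => Ok(()),
        }
    }

    /// Apply a change only if it passes the constraint check.
    pub fn set_enabled(&mut self, feature: &str, enabled: bool) -> Result<(), ConstraintViolation> {
        self.can_set(feature, enabled)?;
        self.values.insert(feature.to_string(), enabled);
        Ok(())
    }

    /// Unknown features are treated as disabled in this sketch.
    pub fn is_enabled(&self, feature: &str) -> bool {
        self.values.get(feature).copied().unwrap_or(false)
    }
}
```

Returning a `Result` from every mutation, rather than panicking, is what lets callers report the effective post-constraint value back to the user.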
Celia Chen	e6773f856c	Feat: Preserve network access on read-only sandbox policies (#13409 ) ## Summary `PermissionProfile.network` could not be preserved when additional or compiled permissions resolved to `SandboxPolicy::ReadOnly`, because `ReadOnly` had no `network_access` field. This change makes read-only + network enabled representable directly and threads that through the protocol, app-server v2 mirror, and permission-merging logic. ## What changed - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core protocol and app-server v2 protocol. - Kept backward compatibility by defaulting the new field to false, so legacy read-only payloads still deserialize unchanged. - Updated `has_full_network_access()` and sandbox summaries to respect read-only network access. - Preserved `PermissionProfile.network` when: - compiling skill permission profiles into sandbox policies - normalizing additional permissions - merging additional permissions into existing sandbox policies - Updated the approval overlay to show network in the rendered permission rule when requested. - Regenerated app-server schema fixtures for the new v2 wire shape.	2026-03-04 02:41:57 +00:00
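As a rough illustration of why the new field matters, the sketch below models a read-only policy that can still carry network access. The enum is deliberately simplified (the real `SandboxPolicy` has more variants and fields), and `with_requested_network` is a hypothetical helper standing in for the permission-merging logic.

```rust
/// Simplified stand-in for the real `SandboxPolicy`; only the variants needed
/// for the illustration are modelled here.
#[derive(Debug, Clone, PartialEq)]
pub enum SandboxPolicy {
    DangerFullAccess,
    ReadOnly { network_access: bool },
}

impl SandboxPolicy {
    /// Read-only policies now report network access from their own field.
    pub fn has_full_network_access(&self) -> bool {
        match self {
            SandboxPolicy::DangerFullAccess => true,
            SandboxPolicy::ReadOnly { network_access } => *network_access,
        }
    }

    /// Hypothetical merge step: a requested network permission is preserved
    /// on a read-only policy, and existing access is never downgraded.
    pub fn with_requested_network(self, requested: bool) -> Self {
        match self {
            SandboxPolicy::ReadOnly { network_access } => SandboxPolicy::ReadOnly {
                network_access: network_access || requested,
            },
            other => other,
        }
    }
}
```

Before the change, the equivalent of `with_requested_network(true)` on a read-only policy had nowhere to store the result, which is the loss the commit describes.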
Owen Lin	0fbd84081b	feat(app-server): add a skills/changed v2 notification (#13414 ) This adds a first-class app-server v2 `skills/changed` notification for the existing skills live-reload signal. Before this change, clients only had the legacy raw `codex/event/skills_update_available` event. With this PR, v2 clients can listen for a typed JSON-RPC notification instead of depending on the legacy `codex/event/*` stream, which we want to remove soon.	2026-03-03 17:01:00 -08:00
Curtis 'Fjord' Hawthorne	b92146d48b	Add under-development original-resolution view_image support (#13050 ) ## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. 
- History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 15:56:54 -08:00
joeytrasatti-openai	935754baa3	Add thread metadata update endpoint to app server (#13280 ) ## Summary - add the v2 `thread/metadata/update` API, including protocol/schema/TypeScript exports and app-server docs - patch stored thread `gitInfo` in sqlite without resuming the thread, with validation plus support for explicit `null` clears - repair missing sqlite thread rows from rollout data before patching, and make those repairs safe by inserting only when absent and updating only git columns so newer metadata is not clobbered - keep sqlite authoritative for mutable thread git metadata by preserving existing sqlite git fields during reconcile/backfill and only using rollout `SessionMeta` git fields to fill gaps - add regression coverage for the endpoint, repair paths, concurrent sqlite writes, clearing git fields, and rollout/backfill reconciliation - fix the login server shutdown race so cancelling before the waiter starts still terminates `block_until_done()` correctly ## Testing - `cargo test -p codex-state apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-state update_thread_git_info_preserves_newer_non_git_metadata` - `cargo test -p codex-core backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-app-server thread_metadata_update` - `cargo test` - currently fails in existing `codex-core` grep-files tests with `unsupported call: grep_files`: - `suite::grep_files::grep_files_tool_collects_matches` - `suite::grep_files::grep_files_tool_reports_empty_results`	2026-03-03 15:56:11 -08:00
Owen Lin	d7eb195b62	chore(app-server): restore EventMsg TS types (#13397 ) Realized EventMsg generated types were unintentionally removed as part of this PR: https://github.com/openai/codex/pull/13375 Turns out our TypeScript export pipeline relied on transitively reaching `EventMsg`. We should still export `EventMsg` explicitly since we're still emitting `codex/event/*` events (for now, but getting dropped soon as well).	2026-03-03 13:37:40 -08:00
Owen Lin	167158f93c	chore(app-server): delete v1 RPC methods and notifications (#13375 ) ## Summary This removes the old app-server v1 methods and notifications we no longer need, while keeping the small set the main codex app client still depends on for now. The remaining legacy surface is: - `initialize` - `getConversationSummary` - `getAuthStatus` - `gitDiffToRemote` - `fuzzyFileSearch` - `fuzzyFileSearch/sessionStart` - `fuzzyFileSearch/sessionUpdate` - `fuzzyFileSearch/sessionStop` And the raw `codex/event/*` notifications emitted from core. These notifications will be removed in a followup PR. ## What changed - removed deprecated v1 request variants from the protocol and app-server dispatcher - removed deprecated typed notifications: `authStatusChange`, `loginChatGptComplete`, and `sessionConfigured` - updated the app-server test client to use v2 flows instead of deleted v1 flows - deleted legacy-only app-server test suites and added focused coverage for `getConversationSummary` - regenerated app-server schema fixtures and updated the MCP interface docs to match the remaining compatibility surface ## Testing - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server`	2026-03-03 13:18:25 -08:00
Ahmed Ibrahim	72d368e03a	fix (#13389 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-03-03 12:48:16 -08:00
Anton Panasenko	8da7e4bdae	app-server-protocol: export flat v2 schema bundle (#13324 ) ## Summary - add an `--experimental` flag to the export binary and thread the option through TypeScript and JSON schema generation - flatten the v2 schema bundle into a datamodel-code-generator-friendly `codex_app_server_protocol.v2.schemas.json` export - retarget shared helper refs to namespaced v2 definitions, add coverage for the new export behavior, and vendor the generated schema fixtures ## Validation - `cargo test -p codex-app-server-protocol` (71 unit tests and bin targets passed locally; the final schema fixture integration target was revalidated via fresh schema regeneration and a tree diff) - `./target/debug/write_schema_fixtures --schema-root <tmpdir>` - `diff -rq app-server-protocol/schema <tmpdir>` ## Tickets - None	2026-03-03 10:25:51 -08:00
pash-openai	07e532dcb9	app-server service tier plumbing (plus some cleanup) (#13334 ) followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None \| "fast", so "standard" is not needed.	2026-03-03 02:35:09 -08:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00
Owen Lin	d473e8d56d	feat(app-server): add tracing to all app-server APIs (#13285 ) ### Overview This PR adds the first piece of tracing for app-server JSON-RPC requests. There are two main changes: - JSON-RPC requests can now take an optional W3C trace context at the top level via a `trace` field (`traceparent` / `tracestate`). - app-server now creates a dedicated request span for every inbound JSON-RPC request in `MessageProcessor`, and uses the request-level trace context as the parent when present. For compatibility with existing flows, app-server still falls back to the TRACEPARENT env var when there is no request-level traceparent. This PR is intentionally scoped to the app-server boundary. In a followup, we'll actually propagate trace context through the async handoff into core execution spans like run_turn, which will make app-server traces much more useful. ### Spans A few details on the app-server span shape: - each inbound request gets its own server span - span/resource names are based on the JSON-RPC method (`initialize`, `thread/start`, `turn/start`, etc.) - spans record transport (stdio vs websocket), request id, connection id, and client name/version when available - `initialize` stores client metadata in session state so later requests on the same connection can reuse it	2026-03-02 16:01:41 -08:00
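For readers unfamiliar with the `traceparent` value carried in the `trace` field, here is a minimal parser for the W3C Trace Context version `00` layout (`version-traceid-parentid-flags`). It is an illustrative sketch, not the app-server's code: the `TraceParent` struct and `parse_traceparent` function are hypothetical names, and versions other than `00` are simply rejected.

```rust
/// Hypothetical borrowed view over a parsed `traceparent` value.
#[derive(Debug, PartialEq)]
pub struct TraceParent<'a> {
    pub trace_id: &'a str,
    pub parent_id: &'a str,
    pub sampled: bool,
}

/// True when `s` is exactly `len` lowercase hexadecimal characters.
fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parse a version-00 `traceparent`; returns `None` on any malformed input.
pub fn parse_traceparent(header: &str) -> Option<TraceParent<'_>> {
    let mut parts = header.split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let parent_id = parts.next()?;
    let flags = parts.next()?;
    // Version 00 has exactly four fields.
    if parts.next().is_some() || version != "00" {
        return None;
    }
    if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
        return None;
    }
    // All-zero trace and parent ids are invalid per the specification.
    if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
        return None;
    }
    let flag_bits = u8::from_str_radix(flags, 16).ok()?;
    Some(TraceParent {
        trace_id,
        parent_id,
        sampled: flag_bits & 0x01 == 0x01,
    })
}
```

A server that receives an unparseable value would typically start a fresh trace rather than fail the request; the commit does not say which behaviour app-server chose.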
Thibault Sottiaux	c9cef6ba9e	[codex] include plan type in account updates (#13181 ) This change fixes a Codex app account-state sync bug where clients could know the user was signed in but still miss the ChatGPT subscription tier, which could lead to incorrect upgrade messaging for paid users. The root cause was that `account/updated` only carried `authMode` while plan information was available separately via `account/read` and rate-limit snapshots, so this update adds `planType` to `account/updated` and populates it consistently across login and refresh paths.	2026-03-01 13:43:37 -08:00
Ruslan Nigmatullin	8c1e3f3e64	app-server: Add `ephemeral` field to `Thread` object (#13084 ) Currently there is no other way to know that a thread is ephemeral; only the client that created it has that knowledge.	2026-02-27 17:42:25 -08:00
Ruslan Nigmatullin	69d7a456bb	app-server: Replay pending item requests on `thread/resume` (#12560 ) Replay pending client requests after `thread/resume` and emit resolved notifications when those requests clear so approval/input UI state stays in sync after reconnects and across subscribed clients. Affected RPCs: - `item/commandExecution/requestApproval` - `item/fileChange/requestApproval` - `item/tool/requestUserInput` Motivation: - Resumed clients need to see pending approval/input requests that were already outstanding before the reconnect. - Clients also need an explicit signal when a pending request resolves or is cleared so stale UI can be removed on turn start, completion, or interruption. Implementation notes: - Use pending client requests from `OutgoingMessageSender` in order to replay them after `thread/resume` attaches the connection, using original request ids. - Emit `serverRequest/resolved` when pending requests are answered or cleared by lifecycle cleanup. - Update the app-server protocol schema, generated TypeScript bindings, and README docs for the replay/resolution flow. High-level test plan: - Added automated coverage for replaying pending command execution and file change approval requests on `thread/resume`. - Added automated coverage for resolved notifications in command approval, file change approval, request_user_input, turn start, and turn interrupt flows. - Verified schema/docs updates in the relevant protocol and app-server tests. Manual testing: - Tested reconnect/resume with multiple connections. - Confirmed state stayed in sync between connections.	2026-02-27 12:45:59 -08:00
Michael Bolin	d09a7535ed	fix: use AbsolutePathBuf for permission profile file roots (#12970 ) ## Why `PermissionProfile` should describe filesystem roots as absolute paths at the type level. Using `PathBuf` in `FileSystemPermissions` made the shared type too permissive and blurred together three different deserialization cases: - skill metadata in `agents/openai.yaml`, where relative paths should resolve against the skill directory - app-server API payloads, where callers should have to send absolute paths - local tool-call payloads for commands like `shell_command` and `exec_command`, where `additional_permissions.file_system` may legitimately be relative to the command `workdir` This change tightens the shared model without regressing the existing local command flow. ## What Changed - changed `protocol::models::FileSystemPermissions` and the app-server `AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf` - wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so relative permission roots in `agents/openai.yaml` resolve against the containing skill directory - kept app-server/API deserialization strict, so relative `additionalPermissions.fileSystem.*` paths are rejected at the boundary - restored cwd/workdir-relative deserialization for local tool-call payloads by parsing `shell`, `shell_command`, and `exec_command` arguments under an `AbsolutePathBufGuard` rooted at the resolved command working directory - simplified runtime additional-permission normalization so it only canonicalizes and deduplicates absolute roots instead of trying to recover relative ones later - updated the app-server schema fixtures, `app-server/README.md`, and the affected transport/TUI tests to match the final behavior	2026-02-27 17:42:52 +00:00
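The three deserialization cases above differ only in whether a base directory is available for resolving relative paths. The sketch below captures that idea with plain `std::path` types; `resolve_root` is a hypothetical function and does not reproduce `AbsolutePathBuf` or `AbsolutePathBufGuard`. The examples in the comments assume Unix-style paths.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolver. `base` is:
///   - the skill directory for skill metadata,
///   - the command working directory for local tool-call payloads,
///   - `None` for app-server API payloads, where relative paths are rejected.
pub fn resolve_root(base: Option<&Path>, raw: &str) -> Result<PathBuf, String> {
    let path = Path::new(raw);
    if path.is_absolute() {
        // Absolute roots are accepted in every context.
        return Ok(path.to_path_buf());
    }
    match base {
        // Relative roots resolve against the context's base directory.
        Some(dir) => Ok(dir.join(path)),
        // Strict boundary: no base directory means no relative paths.
        None => Err(format!("relative path rejected: {raw}")),
    }
}
```

Resolving at deserialization time means later normalization only has to canonicalize and deduplicate, since every stored root is already absolute.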
Ahmed Ibrahim	4d180ae428	Add model availability NUX metadata (#12972 ) - replace show_nux with structured availability_nux model metadata - expose availability NUX data through the app-server model API - update shared fixtures and tests for the new field	2026-02-26 22:02:57 -08:00
Curtis 'Fjord' Hawthorne	7e980d7db6	Support multimodal custom tool outputs (#12948 ) ## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. 
Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948	2026-02-26 18:17:46 -08:00
Shijie Rao	8715a6ef84	Feat: cxa-1833 update model/list (#12958 ) ### Summary Update `model/list` in app server to include more upgrade information.	2026-02-26 17:02:24 -08:00
Eric Traut	28bfbb8f2b	Enforce user input length cap (#12823 ) Currently there is no bound on the length of a user message submitted in the TUI or through the app server interface. That means users can paste many megabytes of text, which can lead to bad performance, hangs, and crashes. In extreme cases, it can lead to a [kernel panic](https://github.com/openai/codex/issues/12323). This PR limits the length of a user input to 2**20 (about 1M) characters. This value was chosen because it fills the entire context window on the latest models, so accepting longer inputs wouldn't make sense anyway. Summary - add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol and surface it in TUI and app server code - block oversized submissions in the TUI submit flow and emit error history cells when validation fails - reject heavy app-server requests with JSON-RPC `-32602` and structured `input_too_large` data, plus document the behavior Testing - ran the IDE extension with this change and verified that when I attempt to paste a user message that's several MB long, it correctly reports an error instead of crashing or making my computer hot.	2026-02-25 22:23:51 -08:00
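The validation described here can be sketched as a single check against the shared constant. The constant name, the `-32602` code, and the `input_too_large` label come from the commit message; the `InputTooLarge` struct, its field names, and the choice to count Unicode scalar values (rather than bytes) are assumptions made for illustration.

```rust
/// 2**20 characters, as stated in the commit message.
pub const MAX_USER_INPUT_TEXT_CHARS: usize = 1 << 20;

/// JSON-RPC "Invalid params" error code.
pub const INVALID_PARAMS: i64 = -32602;

/// Hypothetical structured error payload; the real field names may differ.
#[derive(Debug, PartialEq)]
pub struct InputTooLarge {
    pub code: i64,
    pub reason: &'static str,
    pub max_chars: usize,
    pub actual_chars: usize,
}

/// Reject input longer than the cap. Assumption: length is measured in
/// Unicode scalar values via `chars().count()`, not in bytes.
pub fn validate_user_input(text: &str) -> Result<(), InputTooLarge> {
    let actual_chars = text.chars().count();
    if actual_chars > MAX_USER_INPUT_TEXT_CHARS {
        return Err(InputTooLarge {
            code: INVALID_PARAMS,
            reason: "input_too_large",
            max_chars: MAX_USER_INPUT_TEXT_CHARS,
            actual_chars,
        });
    }
    Ok(())
}
```

Keeping the limit in one shared constant lets the TUI and the app server reject the same inputs, which is the design the commit summary describes.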
Michael Bolin	14116ade8d	feat: include available decisions in command approval requests (#12758 ) Command-approval clients currently infer which choices to show from side-channel fields like `networkApprovalContext`, `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes the request shape harder to evolve, and it forces each client to replicate the server's heuristics instead of receiving the exact decision list for the prompt. This PR introduces a mapping between `CommandExecutionApprovalDecision` and `codex_protocol::protocol::ReviewDecision`: ```rust impl From<CoreReviewDecision> for CommandExecutionApprovalDecision { fn from(value: CoreReviewDecision) -> Self { match value { CoreReviewDecision::Approved => Self::Accept, CoreReviewDecision::ApprovedExecpolicyAmendment { proposed_execpolicy_amendment, } => Self::AcceptWithExecpolicyAmendment { execpolicy_amendment: proposed_execpolicy_amendment.into(), }, CoreReviewDecision::ApprovedForSession => Self::AcceptForSession, CoreReviewDecision::NetworkPolicyAmendment { network_policy_amendment, } => Self::ApplyNetworkPolicyAmendment { network_policy_amendment: network_policy_amendment.into(), }, CoreReviewDecision::Abort => Self::Cancel, CoreReviewDecision::Denied => Self::Decline, } } } ``` And updates `CommandExecutionRequestApprovalParams` to have a new field: ```rust available_decisions: Option<Vec<CommandExecutionApprovalDecision>> ``` which, if specified, should make it easier for clients to display an appropriate list of options in the UI. This makes it possible for `CoreShellActionProvider::prompt()` in `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly, adding support for `ApprovedForSession` when approving a skill script, which was previously missing in the TUI. Note this results in a significant change to `exec_options()` in `approval_overlay.rs`, as the displayed options are now derived from `available_decisions: &[ReviewDecision]`. 
## What Changed - Add `available_decisions` to [`ExecApprovalRequestEvent`](`de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)`), including helpers to derive the legacy default choices when older senders omit the field. - Map `codex_protocol::protocol::ReviewDecision` to app-server `CommandExecutionApprovalDecision` and expose the ordered list as experimental `availableDecisions` in [`CommandExecutionRequestApprovalParams`](`de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)`). - Thread optional `available_decisions` through the core approval path so Unix shell escalation can explicitly request `ApprovedForSession` for session-scoped approvals instead of relying on client heuristics. [`unix_escalation.rs`](`de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214)`) - Update the TUI approval overlay to build its buttons from the ordered decision list, while preserving the legacy fallback when `available_decisions` is missing. - Update the app-server README, test client output, and generated schema artifacts to document and surface the new field. ## Testing - Add `approval_overlay.rs` coverage for explicit decision lists, including the generic `ApprovedForSession` path and network approval options. - Update `chatwidget/tests.rs` and app-server protocol tests to populate the new optional field and keep older event shapes working. ## Developers Docs - If we document `item/commandExecution/requestApproval` on [developers.openai.com/codex](https://developers.openai.com/codex), add experimental `availableDecisions` as the preferred source of approval choices and note that older servers may omit it.	2026-02-26 01:10:46 +00:00
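The fallback behaviour described in #12758 (use the ordered list when present, otherwise derive legacy defaults) could look roughly like the sketch below. The `Decision` enum is a simplified stand-in that omits the payload-carrying variants, and both `legacy_default_decisions` and its contents are hypothetical placeholders, not the actual legacy choices.

```rust
/// Simplified stand-in for the approval decision enum. Variant names follow
/// the mapping quoted in the commit message; payload-carrying variants are
/// omitted to keep the sketch small.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    Accept,
    AcceptForSession,
    Decline,
    Cancel,
}

/// Hypothetical legacy default list; the real defaults may differ.
pub fn legacy_default_decisions() -> Vec<Decision> {
    vec![Decision::Accept, Decision::Decline, Decision::Cancel]
}

/// Prefer the explicit, ordered list sent with the request; fall back to the
/// legacy defaults when an older sender omits the field.
pub fn decisions_to_display(available: Option<Vec<Decision>>) -> Vec<Decision> {
    available.unwrap_or_else(legacy_default_decisions)
}
```

Keeping the fallback in one helper means the UI only ever iterates over a single ordered list, whichever kind of sender produced the request.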
Celia Chen	4f45668106	Revert "Add skill approval event/response (#12633 )" (#12811 ) This reverts https://github.com/openai/codex/pull/12633. That PR is no longer needed: we now prefer sending a normal exec command approval server request whose `additional_permissions` carry the skill permissions.	2026-02-26 01:02:42 +00:00
Charley Cunningham	2f4d6ded1d	Enable request_user_input in Default mode (#12735 ) ## Summary - allow `request_user_input` in Default collaboration mode as well as Plan - update the Default-mode instructions to prefer assumptions first and use `request_user_input` only when a question is unavoidable - update request_user_input and app-server tests to match the new Default-mode behavior - refactor collaboration-mode availability plumbing into `CollaborationModesConfig` for future mode-related flags ## Codex author `codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`	2026-02-25 15:20:46 -08:00
Owen Lin	21f7032dbb	feat(app-server): thread/unsubscribe API (#10954 ) Adds a new v2 app-server API that lets a client unsubscribe from a thread: - New RPC method: `thread/unsubscribe` - New server notification: `thread/closed` Today clients can start/resume/archive threads, but there wasn’t a way to explicitly unload a live thread from memory without archiving it. With `thread/unsubscribe`, a client can indicate it is no longer actively working with a live Thread. If this is the only client subscribed to that thread, app-server closes the thread automatically, at which point the server sends `thread/closed` and `thread/status/changed` (with `status: notLoaded`) notifications. This gives clients a way to prevent long-running app-server processes from accumulating too many thread (and related) objects in memory. Closed threads are also removed from `thread/loaded/list`.	2026-02-25 13:14:30 -08:00
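The "close when the last subscriber leaves" rule from #10954 can be sketched as simple bookkeeping. The `Subscriptions` type, its methods, and the use of `u64` client ids are all hypothetical; the real app-server state is not shown in the commit message.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical bookkeeping of which clients are subscribed to which thread.
#[derive(Default)]
pub struct Subscriptions {
    by_thread: HashMap<String, HashSet<u64>>,
}

impl Subscriptions {
    pub fn subscribe(&mut self, thread_id: &str, client_id: u64) {
        self.by_thread
            .entry(thread_id.to_string())
            .or_default()
            .insert(client_id);
    }

    /// Removes the client from the thread. Returns `true` when that was the
    /// last subscriber, i.e. when the caller should close the thread and
    /// emit the `thread/closed` notification.
    pub fn unsubscribe(&mut self, thread_id: &str, client_id: u64) -> bool {
        let Some(clients) = self.by_thread.get_mut(thread_id) else {
            return false; // thread is not loaded; nothing to close
        };
        clients.remove(&client_id);
        if clients.is_empty() {
            self.by_thread.remove(thread_id);
            true
        } else {
            false
        }
    }

    /// Thread ids that still have at least one subscriber, sorted for
    /// deterministic output.
    pub fn loaded_threads(&self) -> Vec<&str> {
        let mut ids: Vec<&str> = self.by_thread.keys().map(String::as_str).collect();
        ids.sort_unstable();
        ids
    }
}
```

Once `unsubscribe` returns `true`, the thread no longer appears in `loaded_threads`, mirroring its removal from `thread/loaded/list`.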
Owen Lin	a0fd94bde6	feat(app-server): add ThreadItem::DynamicToolCall (#12732 ) Previously, clients would call `thread/start` with dynamic_tools set, and when a model invokes a dynamic tool, it would just make the server->client `item/tool/call` request and wait for the client's response to complete the tool call. This works, but it doesn't have an `item/started` or `item/completed` event. Now we are doing this: - [new] emit `item/started` with `DynamicToolCall` populated with the call arguments - send an `item/tool/call` server request - [new] once the client responds, emit `item/completed` with `DynamicToolCall` populated with the response. Also, with `persistExtendedHistory: true`, dynamic tool calls are now reconstructable in `thread/read` and `thread/resume` as `ThreadItem::DynamicToolCall`.	2026-02-25 12:00:10 -08:00
Ahmed Ibrahim	947092283a	Add app-server v2 thread realtime API (#12715 ) Add experimental `thread/realtime/*` v2 requests and notifications, then route app-server realtime events through that thread-scoped surface with integration coverage. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 09:59:10 -08:00
alexsong-oai	6d6570d89d	Support external agent config detect and import (#12660 ) Migration Behavior * Config * Migrates settings.json into config.toml * Only adds fields when config.toml is missing, or when those fields are missing from the existing file * Supported mappings: `env` -> `shell_environment_policy`; `sandbox.enabled = true` -> `sandbox_mode = "workspace-write"` * Skills * Copies home and repo .claude/skills into .agents/skills * Existing skill directories are not overwritten * SKILL.md content is rewritten from Claude-related terms to Codex * AgentsMd * Repo only * Migrates CLAUDE.md into AGENTS.md * Detect/import only proceed when AGENTS.md is missing or present but empty * Content is rewritten from Claude-related terms to Codex	2026-02-25 02:11:51 -08:00
jif-oai	f46b767b7e	feat: add search term to thread list (#12578 ) Add `searchTerm` to `thread/list` that will search for a match in the titles (the condition being `searchTerm` $$\in$$ `title`)	2026-02-25 09:59:41 +00:00
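As an illustration of the `searchTerm` condition in #12578, the sketch below keeps only titles that contain the term. The function name is hypothetical, and treating the match as a case-sensitive substring test is an assumption; the commit message only says the term must match within the title.

```rust
/// Keeps the titles that contain `search_term`. When no term is given, every
/// title is returned unchanged.
/// Assumption: the match is a plain, case-sensitive substring test.
pub fn filter_by_search_term<'a>(titles: &[&'a str], search_term: Option<&str>) -> Vec<&'a str> {
    match search_term {
        Some(term) => titles
            .iter()
            .copied()
            .filter(|title| title.contains(term))
            .collect(),
        None => titles.to_vec(),
    }
}
```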
jif-oai	10c04e11b8	feat: add service name to app-server (#12319 ) Add a service name to the app-server so that the app can use its own service name. This is at the thread level because we may later make the app-server a singleton on the computer.	2026-02-25 09:51:42 +00:00
viyatb-oai	c086b36b58	feat(ui): add network approval persistence plumbing (#12358 ) ## Summary - add TUI approval options for persistent network host rules - add app-server v2 approval payload plumbing for network approval context + proposed network policy amendments - add app-server handling to translate `applyNetworkPolicyAmendment` decisions back into core review decisions - update docs/test client output and generated app-server schemas/types	2026-02-25 07:06:19 +00:00
Celia Chen	1151972fb2	feat: add experimental additionalPermissions to v2 command execution approval requests (#12737 ) This adds additionalPermissions to the app-server v2 item/commandExecution/requestApproval payload as an experimental field. The field is now exposed on CommandExecutionRequestApprovalParams and is populated from the existing core approval event when a command requests additional sandbox permissions. This PR also contains changes that make server requests support the experimental API. Sample payloads from a real app-server test client run, with the experimental flag off:

```
{
  "id": 0,
  "method": "item/commandExecution/requestApproval",
  "params": {
    "command": "/bin/zsh -lc 'mkdir -p ~/some/test && touch ~/some/test/file'",
    "commandActions": [
      {
        "command": "mkdir -p '~/some/test'",
        "type": "unknown"
      },
      {
        "command": "touch '~/some/test/file'",
        "type": "unknown"
      }
    ],
    "cwd": "/Users/celia/code/codex/codex-rs",
    "itemId": "call_QLp0LWkQ1XkU6VW9T2vUZFWB",
    "proposedExecpolicyAmendment": [
      "mkdir",
      "-p",
      "~/some/test"
    ],
    "reason": "Do you want to allow creating ~/some/test/file outside the workspace?",
    "threadId": "019c9309-e209-7d82-a01b-dcf9556a354d",
    "turnId": "019c9309-e27a-7f33-834f-6011e795c2d6"
  }
}
```

with the experimental flag on:

```
{
  "id": 0,
  "method": "item/commandExecution/requestApproval",
  "params": {
    "additionalPermissions": {
      "fileSystem": null,
      "macos": null,
      "network": true
    },
    "command": "/bin/zsh -lc 'install -D /dev/null ~/some/test/file'",
    "commandActions": [
      {
        "command": "install -D /dev/null '~/some/test/file'",
        "type": "unknown"
      }
    ],
    "cwd": "/Users/celia/code/codex/codex-rs",
    "itemId": "call_K3U4b3dRbj3eMCqslmncbGsq",
    "proposedExecpolicyAmendment": [
      "install",
      "-D"
    ],
    "reason": "Do you want to allow creating the file at ~/some/test/file outside the workspace sandbox?",
    "threadId": "019c9303-3a8e-76e1-81bf-d67ac446d892",
    "turnId": "019c9303-3af1-7143-88a1-73132f771234"
  }
}
```

2026-02-25 05:16:35 +00:00
Celia Chen	16ca527c80	chore: migrate additional permissions to PermissionProfile (#12731 ) This PR replaces the old `additional_permissions.fs_read/fs_write` shape with a shared `PermissionProfile` model and wires it through the command approval, sandboxing, protocol, and TUI layers. The schema is adopted from the `SkillManifestPermissions`, which is also refactored to use this unified struct. This helps us easily expose permission profiles in app server/core as a follow-up.	2026-02-25 03:35:28 +00:00
Dylan Hurd	f6053fdfb3	feat(core) Introduce Feature::RequestPermissions (#11871 ) ## Summary Introduces the initial implementation of Feature::RequestPermissions. RequestPermissions allows the model to request that a command be run inside the sandbox, with additional permissions, like writing to a specific folder. Eventually this will include other rules as well, and the ability to persist these permissions, but this PR is already quite large - let's get the core flow working and go from there! ## Testing - [x] Added tests - [x] Tested locally - [x] Feature	2026-02-24 09:48:57 -08:00
pakrym-oai	58763afa0f	Add skill approval event/response (#12633 ) Set the stage for skill-level permission approval in addition to command-level. Behind a feature flag.	2026-02-23 22:28:58 -08:00
viyatb-oai	c3048ff90a	feat(core): persist network approvals in execpolicy (#12357 ) ## Summary Persist network approval allow/deny decisions as `network_rule(...)` entries in execpolicy (not proxy config). It adds `network_rule` parsing + append support in `codex-execpolicy`, including `decision="prompt"` (parse-only; not compiled into proxy allow/deny lists). - compile execpolicy network rules into proxy allow/deny lists and update the live proxy state on approval - preserve requirements execpolicy `network_rule(...)` entries when merging with file-based execpolicy - reject broad wildcard hosts (for example `*`) for persisted `network_rule(...)`	2026-02-23 21:37:46 -08:00
Michael Bolin	1af2a37ada	chore: remove codex-core public protocol/shell re-exports (#12432 ) ## Why `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules from `codex-protocol` and `codex-shell-command`. That made it easy for workspace crates to import those APIs through `codex-core`, which in turn hides dependency edges and makes it harder to reduce compile-time coupling over time. This change removes those public re-exports so call sites must import from the source crates directly. Even when a crate still depends on `codex-core` today, this makes dependency boundaries explicit and unblocks future work to drop `codex-core` dependencies where possible. ## What Changed - Removed public re-exports from `codex-rs/core/src/lib.rs` for: - `codex_protocol::protocol` and related protocol/model types (including `InitialHistory`) - `codex_protocol::config_types` (`protocol_config_types`) - `codex_shell_command::{bash, is_dangerous_command, is_safe_command, parse_command, powershell}` - Migrated workspace Rust call sites to import directly from: - `codex_protocol::protocol` - `codex_protocol::config_types` - `codex_protocol::models` - `codex_shell_command` - Added explicit `Cargo.toml` dependencies (`codex-protocol` / `codex-shell-command`) in crates that now import those crates directly. - Kept `codex-core` internal modules compiling by using `pub(crate)` aliases in `core/src/lib.rs` (internal-only, not part of the public API). - Updated the two utility crates that can already drop a `codex-core` dependency edge entirely: - `codex-utils-approval-presets` - `codex-utils-cli` ## Verification - `cargo test -p codex-utils-approval-presets` - `cargo test -p codex-utils-cli` - `cargo check --workspace --all-targets` - `just clippy`	2026-02-20 23:45:35 -08:00
Michael Bolin	1779feb6a7	ignore v1 in JSON schema codegen (#12408 ) ## Why The generated unnamespaced JSON envelope schemas (`ClientRequest` and `ServerNotification`) still contained both v1 and v2 variants, which pulled legacy v1/core types and v2 types into the same `definitions` graph. That caused `schemars` to produce numeric suffix names (for example `AskForApproval2`, `ByteRange2`, `MessagePhase2`). This PR moves JSON codegen toward v2-only output while preserving the unnamespaced envelope artifacts, and avoids reintroducing numeric-suffix tolerance by removing the v1/internal-only variants that caused the collisions in those envelope schemas. ## What Changed - In `codex-rs/app-server-protocol/src/export.rs`, JSON generation now excludes v1 schema artifacts (`v1/`) while continuing to emit unnamespaced/root JSON schemas and the JSON bundle. - Added a narrow JSON v1 allowlist (`JSON_V1_ALLOWLIST`) so `InitializeParams` and `InitializeResponse` are still emitted. - Added JSON-only post-processing for the mixed envelope schemas before collision checks run: - `ClientRequest`: strips v1 request variants from the generated `oneOf` using the temporary `V1_CLIENT_REQUEST_METHODS` list - `ServerNotification`: strips v1 notifications plus the internal-only `rawResponseItem/completed` notification using the temporary `EXCLUDED_SERVER_NOTIFICATION_METHODS_FOR_JSON` list - Added a temporary local-definition pruning pass for those envelope schemas so now-unreferenced v1/core definitions are removed from `definitions` after method filtering. - Updated the variant-title naming heuristic for single-property literal object variants to use the literal value (when available), avoiding collisions like multiple `state`-only variants all deriving the same title. - Collision handling remains fail-fast (no numeric suffix fallback map in this PR path). 
## Verification - `just write-app-server-schema` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12408). __->__ #12408 * #12406	2026-02-20 21:36:12 -08:00
Michael Bolin	48af93399e	feat: use OAI Responses API MessagePhase type directly in App Server v2 (#12422 ) https://github.com/openai/codex/pull/10455 introduced the `phase` field, and then https://github.com/openai/codex/pull/12072 introduced a `MessagePhase` type in `v2.rs` that paralleled the `MessagePhase` type in `codex-rs/protocol/src/models.rs`. The app server protocol prefers `camelCase` while the Responses API uses `snake_case`, so this meant we had two versions of `MessagePhase` with different serialization rules. When the app server protocol refers to types from the Responses API, we use the wire format of the Responses API even though it is inconsistent with the app server API. This PR deletes `MessagePhase` from `v2.rs` and consolidates on the Responses API version to eliminate confusion.	2026-02-20 20:43:36 -08:00
Ahmed Ibrahim	6817f0be8a	Wire realtime api to core (#12268 ) - Introduce `RealtimeConversationManager` for realtime API management - Add `op::conversation` to start a conversation, insert audio, insert text, and close the conversation. - Emit conversation lifecycle and realtime events. - Move shared realtime payload types into codex-protocol and add core e2e websocket tests for start/replace/transport-close paths. Things to consider: - Should we use the same `op::` and `Events` channel to carry audio? I think we should try this simple approach first and create a separate channel later if the shared one gets congested. - Sending text updates to the client: we can start simple and restrict that later. - Provider auth is intentionally not wired up for now.	2026-02-20 19:06:35 -08:00
natea-oai	936e744c93	Add field to Thread object for the latest rename set for a given thread (#12301 ) Exposes the updated name set for a thread through the app server. This enables other surfaces to use the core as the source of truth for thread naming. `threadName` is gathered using the helper functions used to interact with `session_index.jsonl`, and is hydrated in: - `thread/list` - `thread/read` - `thread/resume` - `thread/unarchive` - `thread/rollback` We don't do this for `thread/start` and `thread/fork`.	2026-02-20 18:26:57 -08:00
Michael Bolin	53bcfaf42d	fix: explicitly list name collisions in JSON schema generation (#12406 ) ## Why JSON schema codegen was silently resolving naming collisions by appending numeric suffixes (for example `...2`, `...3`). That makes the generated schema names unstable: removing an earlier colliding type can cause a later type to be renumbered, which is a breaking change for consumers that referenced the old generated name. This PR makes those collisions explicit and reviewable. Though note that once we remove `v1` from the codegen, we will no longer support naming collisions. Or rather, naming collisions will have to be handled explicitly rather than the numeric suffix approach. ## What Changed - In `codex-rs/app-server-protocol/src/export.rs`, replaced implicit numeric suffix collision handling for generated variant titles with explicit special-case maps. - Added a panic when a collision occurs without an entry in the map, so new collisions fail loudly instead of silently renaming generated schema types. - Added the currently required special cases so existing generated names remain stable. - Extended the same approach to numbered `definitions` / `$defs` collisions (for example `MessagePhase2`-style names) so those are also explicitly tracked. ## Verification - Ran targeted generator-path test: - `cargo test -p codex-app-server-protocol generate_json_filters_experimental_fields_and_methods -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12406). * #12408 * __->__ #12406	2026-02-20 17:51:53 -08:00
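The fail-fast collision handling described in #12406 can be sketched as follows. The function name, the `type_path` keys, and the shape of the special-case map are hypothetical; the actual maps in `export.rs` are not shown in the commit message.

```rust
use std::collections::{HashMap, HashSet};

/// Picks the generated schema name for a type. A colliding candidate must
/// have an explicit entry in `special_cases` (keyed here by a hypothetical
/// type path); otherwise the generator panics instead of silently appending
/// a numeric suffix.
pub fn assign_schema_name(
    candidate: &str,
    type_path: &str,
    used: &mut HashSet<String>,
    special_cases: &HashMap<&str, &str>,
) -> String {
    let name = if used.contains(candidate) {
        match special_cases.get(type_path) {
            Some(explicit) => (*explicit).to_string(),
            None => panic!(
                "schema name collision for `{candidate}` ({type_path}); add an explicit special case"
            ),
        }
    } else {
        candidate.to_string()
    };
    used.insert(name.clone());
    name
}
```

Because every renamed type is listed explicitly, removing an earlier colliding type cannot renumber a later one, which is the instability the commit message calls out.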
pakrym-oai	1bb7989b20	Add ability to attach extra files to feedback (#12370 ) Allow clients to provide extra files.	2026-02-20 22:26:14 +00:00

282 Commits