codex

mirror of https://github.com/openai/codex.git synced 2026-04-28 02:11:08 +03:00

Author	SHA1	Message	Date
starr-openai	ddbe2536be	Support multiple managed environments (#18401 ) ## Summary - refactor EnvironmentManager to own keyed environments with default/local lookup helpers - keep remote exec-server client creation lazy until exec/fs use - preserve disabled agent environment access separately from internal local environment access ## Validation - not run (per Codex worktree instruction to avoid tests/builds unless requested) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 15:29:35 -07:00
cassirer-openai	27d9673273	[rollout_trace] Add rollout trace crate (#18876 ) ## Summary Adds the standalone `codex-rollout-trace` crate, which defines the raw trace event format, replay/reduction model, writer, and reducer logic for reconstructing model-visible conversation/runtime state from recorded rollout data. The crate-level design is documented in [`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md). ## Stack This is PR 1/5 in the rollout trace stack. - [#18876](https://github.com/openai/codex/pull/18876): Add rollout trace crate - [#18877](https://github.com/openai/codex/pull/18877): Record core session rollout traces - [#18878](https://github.com/openai/codex/pull/18878): Trace tool and code-mode boundaries - [#18879](https://github.com/openai/codex/pull/18879): Trace sessions and multi-agent edges - [#18880](https://github.com/openai/codex/pull/18880): Add debug trace reduction command ## Review Notes This PR intentionally does not wire tracing into live Codex execution. It establishes the data model and reducer contract first, with crate-local tests covering conversation reconstruction, compaction boundaries, tool/session edges, and code-cell lifecycle reduction. Later PRs emit into this model. The README is the best entry point for reviewing the intended trace format and reduction semantics before diving into the reducer modules.	2026-04-21 21:54:05 +00:00
Shijie Rao	c5e9c6f71f	Preserve Cloudflare HTTP cookies in codex (#17783) ## Summary - Adds a process-local, in-memory cookie store for ChatGPT HTTP clients. - Limits cookie storage and replay to a shared ChatGPT host allowlist. - Wires the shared store into the default Codex reqwest client and backend client. - Shares the ChatGPT host allowlist with remote-control URL validation to avoid drift. - Enables reqwest cookie support and updates lockfiles.	2026-04-21 14:40:15 -07:00
efrazer-oai	be75785504	fix: fully revert agent identity runtime wiring (#18757 ) ## Summary This PR fully reverts the previously merged Agent Identity runtime integration from the old stack: https://github.com/openai/codex/pull/17387/changes It removes the Codex-side task lifecycle wiring, rollout/session persistence, feature flag plumbing, lazy `auth.json` mutation, background task auth paths, and request callsite changes introduced by that stack. This leaves the repo in a clean pre-AgentIdentity integration state so the follow-up PRs can reintroduce the pieces in smaller reviewable layers. ## Stack 1. This PR: full revert 2. https://github.com/openai/codex/pull/18871: move Agent Identity business logic into a crate 3. https://github.com/openai/codex/pull/18785: add explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate auth callsites through AuthProvider ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 14:30:55 -07:00
Ruslan Nigmatullin	69c3d12274	app-server: implement device key v2 methods (#18430 ) ## Why The device-key protocol needs an app-server implementation that keeps local key operations behind the same request-processing boundary as other v2 APIs. app-server owns request dispatch, transport policy, documentation, and JSON-RPC error shaping. `codex-device-key` owns key binding, validation, platform provider selection, and signing mechanics. Keeping the adapter thin makes the boundary easier to review and avoids moving local key-management details into thread orchestration code. ## What changed - Added `DeviceKeyApi` as the app-server adapter around `DeviceKeyStore`. - Converted protocol protection policies, payload variants, algorithms, and protection classes to and from the device-key crate types. - Encoded SPKI public keys and DER signatures as base64 protocol fields. - Routed `device/key/create`, `device/key/public`, and `device/key/sign` through `MessageProcessor`. - Rejected remote transports before provider access while allowing local `stdio` and in-process callers to reach the device-key API. - Added stdio, in-process, and websocket tests for device-key validation and transport policy. - Documented the device-key methods in the app-server v2 method list. ## Test coverage - `device_key_create_rejects_empty_account_user_id` - `in_process_allows_device_key_requests_to_reach_device_key_api` - `device_key_methods_are_rejected_over_websocket` ## Stack This is PR 3 of 4 in the device-key app-server stack. It is stacked on #18429. ## Validation - `cargo test -p codex-app-server device_key` - `just fix -p codex-app-server`	2026-04-21 14:07:08 -07:00
Felipe Coury	e502f0b52d	feat(tui): shortcuts to change reasoning level temporarily (#18866 ) ## Summary Adds main-chat shortcuts for changing reasoning effort one step at a time: - `Alt+,` lowers reasoning (has the `<` arrow on the key) - `Alt+.` raises reasoning (similarly, has the `>` arrow) The shortcut updates the active session only. It does not persist the selected reasoning level as the default for future sessions. In Plan mode, it applies temporarily to Plan mode without opening the global-vs-Plan scope prompt. ## Details The shortcut uses the active model preset to decide which reasoning levels are valid. If the current session has no explicit reasoning effort, it starts from the model default. Each keypress moves to the next supported level in the requested direction. The shortcut only runs from the main chat surface. If a popup or modal is open, input remains owned by that UI. In Plan mode, the shortcut updates the in-memory Plan reasoning override directly. The model/reasoning picker still keeps the existing scope prompt for explicit picker changes. ## Notes Ctrl-plus and Ctrl-minus were considered, but terminals do not deliver those combinations consistently, so this PR uses Alt shortcuts instead. If the current effort is unsupported by the selected model, the shortcut skips to the nearest supported level in the requested direction. If there is no valid step, it shows the existing boundary message. ## Tests - `cargo test -p codex-tui reasoning_shortcuts` - `cargo test -p codex-tui reasoning_effort` - `cargo test -p codex-tui reasoning_shortcut` - `cargo test -p codex-tui footer_snapshots` - `cargo test -p codex-tui` - `just fix -p codex-tui` - `./tools/argument-comment-lint/run.py -p codex-tui -- --tests` --------- Co-authored-by: Eric Traut <etraut@openai.com>	2026-04-21 18:04:03 -03:00
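The stepping rule described in e502f0b52d (move to the next supported level in the requested direction, skip ahead when the current effort is unsupported, report a boundary when no step exists) can be sketched as below. This is a minimal illustration, not the TUI code: the `Effort` variants, `Direction`, and `step_effort` are hypothetical names, and the real implementation reads the supported levels from the active model preset.

```rust
/// Reasoning effort levels, ordered from lowest to highest.
/// Hypothetical variants for illustration; not the actual Codex enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Effort {
    Minimal,
    Low,
    Medium,
    High,
}

/// Which way the shortcut moves the reasoning level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    Lower,
    Raise,
}

/// Returns the nearest supported level strictly above or below `current`.
/// `None` means there is no valid step, i.e. the boundary case.
/// Because only `supported` levels are candidates, an unsupported
/// `current` value naturally skips to the nearest supported neighbour.
pub fn step_effort(current: Effort, supported: &[Effort], dir: Direction) -> Option<Effort> {
    match dir {
        Direction::Raise => supported.iter().copied().filter(|e| *e > current).min(),
        Direction::Lower => supported.iter().copied().filter(|e| *e < current).max(),
    }
}
```

A caller with no explicit session effort would pass the model default as `current`, matching the behaviour the commit message describes.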
pakrym-oai	ffa6944587	Load app-server config through ConfigManager (#18870 ) ## Summary - Load app-server startup config through `ConfigManager` instead of direct `ConfigBuilder` calls. - Move `ConfigManager` constructor-owned state (`cli_overrides`, runtime feature map, cloud requirements loader) behind internal manager fields. - Pass `ConfigManager` into `MessageProcessor` directly instead of reconstructing it from raw args. ## Tests - `cargo check -p codex-app-server` - `cargo test -p codex-app-server` - `just fix -p codex-app-server` - `just fmt`	2026-04-21 14:01:02 -07:00
jif-oai	15b8cde2a4	chore: default multi-agent v2 fork to all (#18873 ) Default sub-agents v2 to `all` for the fork mode	2026-04-21 21:54:58 +01:00
iceweasel-oai	6f6997758a	skip busted tests while I fix them (#18885 )	2026-04-21 13:40:34 -07:00
Ruslan Nigmatullin	56375712e3	app-server: fix Bazel clippy in tracing tests (#18872 ) ## Why PR #18431 exposed a Bazel clippy failure in the app-server unit-test target across Linux, macOS, and Windows. The failing lint was `clippy::await_holding_invalid_type`: two tracing tests serialized access to global tracing state by holding a `tokio::sync::MutexGuard` across awaited test work. That serialization is still needed because the tests share process-global tracing setup and exporter state, but it should not require holding an async mutex guard through the whole test body. ## What changed - Replaced the bespoke async `tracing_test_guard` helper with `serial_test` on the two tracing tests that need global tracing serialization. - Removed the `#[expect(clippy::await_holding_invalid_type)]` annotations and the lock guard callsites that Bazel clippy rejected. ## Validation - `cargo test -p codex-app-server jsonrpc_span` - `just fix -p codex-app-server` - `git diff --check` I also attempted the exact failing Bazel clippy target locally with BuildBuddy disabled: `bazel --noexperimental_remote_repo_contents_cache build --config=clippy --bes_backend= --remote_cache= --experimental_remote_downloader= -- //codex-rs/app-server:app-server-unit-tests-bin`. That run did not reach clippy because Bazel timed out downloading `libcap-2.27.tar.gz` from `kernel.org`.	2026-04-21 13:10:36 -07:00
Ruslan Nigmatullin	5bab04dcd7	app-server: add codex-device-key crate (#18429 ) ## Why Device-key storage and signing are local security-sensitive operations with platform-specific behavior. Keeping the core API in `codex-device-key` keeps app-server focused on routing and business logic instead of owning key-management details. The crate keeps the signing surface intentionally narrow: callers can create a bound key, fetch its public key, or sign one of the structured payloads accepted by the crate. It does not expose a generic arbitrary-byte signing API. Key IDs cross into platform-specific labels, tags, and metadata paths, so externally supplied IDs are constrained to the same auditable namespace created by the crate: `dk_` followed by unpadded base64url for 32 bytes. Remote-control target paths are also tied to each signed payload shape so connection proofs cannot be reused for enrollment endpoints, or vice versa. ## What changed - Added the `codex-device-key` workspace crate. - Added account/client-bound key creation with stable `dk_` key IDs. - Added strict `key_id` validation before public-key lookup or signing reaches a provider. - Added public-key lookup and structured signing APIs. - Split remote-control client endpoint allowlists by connection vs enrollment payload shape. - Added validation for key bindings, accepted payload fields, token expiration, and payload/key binding mismatches. - Added flow-oriented docs on the validation helpers that gate provider signing. - Added protection policy and protection-class types without wiring a platform provider yet. - Added an unsupported default provider so platforms without an implementation fail explicitly instead of silently falling back to software-backed keys. - Updated Cargo and Bazel lock metadata for the new crate and non-platform-specific dependencies. ## Stack This is stacked on #18428. 
## Validation - `cargo test -p codex-device-key` - Added unit coverage for strict `key_id` validation before provider use. - Added unit coverage that rejects remote-control paths from the wrong signed payload shape. - `just bazel-lock-update` - `just bazel-lock-check`	2026-04-21 17:57:00 +00:00
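The key ID shape described in 5bab04dcd7 (`dk_` followed by unpadded base64url for 32 bytes) implies a fixed length: 32 bytes encode to 43 base64 characters without padding. A shape check along those lines might look like the sketch below. The function name and constants are hypothetical and are not taken from `codex-device-key`; the sketch also checks only prefix, length, and alphabet, not whether the trailing bits of the final character are canonical.

```rust
/// Prefix named in the commit message.
const KEY_ID_PREFIX: &str = "dk_";
/// 32 bytes encode to ceil(32 * 8 / 6) = 43 characters in unpadded base64.
const ENCODED_LEN: usize = 43;

/// Returns true when `key_id` has the shape `dk_` + 43 base64url characters.
/// Hypothetical helper for illustration only.
pub fn is_well_formed_key_id(key_id: &str) -> bool {
    let Some(encoded) = key_id.strip_prefix(KEY_ID_PREFIX) else {
        return false;
    };
    // The base64url alphabet is A-Z, a-z, 0-9, '-' and '_'; '=' padding,
    // '+' and '/' are all rejected.
    encoded.len() == ENCODED_LEN
        && encoded
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}
```

Rejecting anything outside this namespace before a provider is consulted keeps externally supplied IDs from reaching platform-specific labels or metadata paths, which is the motivation the commit gives for strict validation.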
iceweasel-oai	8612714aa6	Add Windows sandbox unified exec runtime support (#15578 ) ## Summary This is the runtime/foundation half of the Windows sandbox unified-exec work. - add Windows sandbox `unified_exec` session support in `windows-sandbox-rs` for both: - the legacy restricted-token backend - the elevated runner backend - extend the PTY/process runtime so driver-backed sessions can support: - stdin streaming - stdout/stderr separation - exit propagation - PTY resize hooks - add Windows sandbox runtime coverage in `codex-windows-sandbox` / `codex-utils-pty` This PR does not enable Windows sandbox `UnifiedExec` for product callers yet because hooking this up to app-server comes in the next PR. Windows sandbox advertising is intentionally kept aligned with `main`, so sandboxed Windows callers still fall back to `ShellCommand`. This PR isolates the runtime/session layer so it can be reviewed independently from product-surface enablement. --------- Co-authored-by: jif-oai <jif@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-21 10:44:49 -07:00
Michael Bolin	f8562bd47b	sandboxing: intersect permission profiles semantically (#18275 ) ## Why Permission approval responses must not be able to grant more access than the tool requested. Moving this flow to `PermissionProfile` means the comparison must be profile-shaped instead of `SandboxPolicy`-shaped, and cwd-relative special paths such as `:cwd` and `:project_roots` must stay anchored to the turn that produced the request. ## What changed This implements semantic `PermissionProfile` intersection in `codex-sandboxing` for file-system and network permissions. The intersection accepts narrower path grants, rejects broader grants, preserves deny-read carve-outs and glob scan depth, and materializes cwd-dependent special-path grants to absolute paths before they can be recorded for reuse. The request-permissions response paths now use that intersection consistently. App-server captures the request turn cwd before waiting for the client response, includes that cwd in the v2 approval params, and core stores the requested profile plus cwd for direct TUI/client responses and Guardian decisions before recording turn- or session-scoped grants. The TUI app-server bridge now preserves the app-server request cwd when converting permission approval params into core events. ## Verification - `cargo test -p codex-sandboxing intersect_permission_profiles -- --nocapture` - `cargo test -p codex-app-server request_permissions_response -- --nocapture` - `cargo test -p codex-core request_permissions_response_materializes_session_cwd_grants_before_recording -- --nocapture` - `cargo check -p codex-tui --tests` - `cargo check --tests` - `cargo test -p codex-tui app_server_request_permissions_preserves_file_system_permissions`	2026-04-21 10:23:01 -07:00
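The core rule in f8562bd47b, that an approval response may narrow but never widen what was requested, can be illustrated for plain path grants as follows. This is a deliberately simplified sketch: `intersect_paths` is a hypothetical name, it silently drops broader grants where the real code may reject them, and it ignores deny-read carve-outs, glob scan depth, network permissions, and `:cwd` materialization, all of which the actual `PermissionProfile` intersection handles.

```rust
use std::path::PathBuf;

/// Keeps only the granted paths that sit at or below some requested root.
/// A grant broader than (or unrelated to) every requested root is dropped,
/// so the result never exceeds the original request.
/// Assumes both inputs are already absolute, normalized paths.
pub fn intersect_paths(requested: &[PathBuf], granted: &[PathBuf]) -> Vec<PathBuf> {
    granted
        .iter()
        // `Path::starts_with` compares whole components, so
        // `/work/project2` does not count as being under `/work/project`.
        .filter(|g| requested.iter().any(|r| g.starts_with(r)))
        .cloned()
        .collect()
}
```

Comparing by path component rather than by string prefix is the important design choice here: a string prefix test would wrongly treat a sibling directory with a longer name as a narrower grant.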
pakrym-oai	2a226096f6	Split DeveloperInstructions into individual fragments. (#18813 ) Split DeveloperInstructions into individual fragments.	2026-04-21 10:22:36 -07:00
pakrym-oai	5fe767e8e1	Refactor app-server config loading into ConfigManager (#18442 ) Localize app-server configuration loading in one place.	2026-04-21 10:22:26 -07:00
Eric Traut	4ed722ab8d	Move TUI app tests to modules they cover (#18799 ) ## Summary The TUI app refactor in #18753 moved the old `app.rs` tests into a single `app/tests.rs` file. That kept the split mechanically simple, but it left several focused unit tests far from the modules they exercise. This PR is a follow-up that moves tests next to the code they cover. It also adds `tui/src/app/test_support.rs` for shared fixture construction. This is just a mechanical refactoring (no functional changes) and does not affect any production code.	2026-04-21 10:16:51 -07:00
jif-oai	10e1659d4f	Stabilize debug clear memories integration test (#18858 ) ## Why `debug_clear_memories_resets_state_and_removes_memory_dir` can be flaky because the test drops its `sqlx::SqlitePool` immediately before invoking `codex debug clear-memories`. Dropping the pool does not wait for all SQLite connections to close, so the CLI can race with still-open test connections. ## What changed - Await `pool.close()` before spawning `codex debug clear-memories`. - Close the reopened verification pool before the temp `CODEX_HOME` is torn down. ## Verification - `cargo test -p codex-cli --test debug_clear_memories debug_clear_memories_resets_state_and_removes_memory_dir`	2026-04-21 18:15:37 +01:00
Eric Traut	b7fec54354	Queue follow-up input during user shell commands (#18820 ) Fixes #17954. ## Why When a manual shell command like `!sleep 10` is running, submitting plain text such as `hi` currently sends that text as a steer for the active shell turn. User shell turns are not steerable like model turns, so the TUI can remain stuck in `Working` after the shell command finishes. ## What Changed - Detect when the only active work is one or more `ExecCommandSource::UserShell` commands. - Queue plain submitted input in that state so it drains after the shell command and shell turn complete. - Preserve `!cmd` submissions during running work so explicit shell commands keep their existing behavior. - Add regression coverage for the `!sleep 10` plus `hi` flow in `chatwidget::tests::exec_flow::user_message_during_user_shell_command_is_queued_not_steered`. ## Verification - Manually confirmed hang before the fix and no hang after the fix	2026-04-21 10:13:13 -07:00
Casey Chow	41652665f5	[codex] Add tmux-aware OSC 9 notifications (#17836 ) ## Summary - wrap OSC 9 notifications in tmux's DCS passthrough so terminal notifications make it through tmux - use codex-terminal-detection for OSC 9 auto-selection so tmux sessions inherit the underlying client terminal support - add focused notification backend tests for plain OSC 9 and tmux-wrapped output ## Stack - base PR: #18479 - review order: #18479, then this PR ## Why Tmux does not forward OSC 9 notifications directly; the sequence has to be wrapped in tmux's DCS passthrough envelope. Codex also had local notification heuristics that could miss supported terminals when running under tmux, even though codex-terminal-detection already knows how to attribute tmux sessions to the client terminal. ## Validation - `just fmt` - `cargo test -p codex-tui` (currently blocked by an unrelated existing compile error in `app-server/src/message_processor.rs:754` referencing `connection_id` out of scope; not caused by this change) Co-authored-by: Codex <noreply@openai.com>	2026-04-21 17:10:36 +00:00
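The wrapping that 41652665f5 refers to follows tmux's passthrough convention: the inner sequence is placed inside `ESC P tmux; ... ESC \`, with every ESC byte in the inner sequence doubled. A minimal sketch is shown below; `osc9` and `wrap_for_tmux` are hypothetical helper names rather than the functions added by the PR, and the message is assumed to contain no control characters of its own.

```rust
/// Builds a plain OSC 9 notification: ESC ] 9 ; <message> BEL.
/// Assumes `message` contains no ESC or BEL bytes; this sketch does not sanitize it.
pub fn osc9(message: &str) -> String {
    format!("\x1b]9;{}\x07", message)
}

/// Wraps an escape sequence in tmux's DCS passthrough envelope:
/// ESC P tmux; <sequence with every ESC doubled> ESC \
pub fn wrap_for_tmux(sequence: &str) -> String {
    // tmux expects each ESC inside the payload to be doubled.
    let escaped = sequence.replace('\x1b', "\x1b\x1b");
    format!("\x1bPtmux;{}\x1b\\", escaped)
}
```

Writing `wrap_for_tmux(&osc9("build finished"))` to the terminal would then hand the notification to the outer terminal emulator. Recent tmux versions may additionally require the `allow-passthrough` option to be enabled for this to take effect.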
Rennie	3a9df58d06	Propagate thread id in MCP tool metadata (#18093) ## Summary - attach the authoritative Codex thread id to MCP tool request `_meta.threadId` for model-initiated tool calls - attach the same thread id for manual `mcpServer/tool/call` requests before invoking the MCP server - cover both metadata helper behavior and the manual app-server MCP path in tests. The change is needed because the Rust app-server is the last place that still has authoritative knowledge of “this model-generated MCP tool call belongs to conversation/thread X” before the request leaves Codex and reaches Hoopa. It adds `threadId` to MCP request metadata in the model-generated tool-call path, using `sess.conversation_id`, and does the same for the manual `mcpServer/tool/call` path. ## Test plan - `cargo test -p codex-core mcp_tool_call_thread_id_meta_is_added_to_request_meta --lib` - `cargo test -p codex-app-server mcp_server_tool_call_returns_tool_result` Paired Hoopa consumer PR: https://github.com/openai/openai/pull/833263	2026-04-21 10:09:46 -07:00
Ruslan Nigmatullin	48f82ca7c5	app-server: define device key v2 protocol (#18428 ) ## Why Clients need a stable app-server protocol surface for enrolling a local device key, retrieving its public key, and producing a device-bound proof. The protocol reports `protectionClass` explicitly so clients can distinguish hardware-backed keys from an explicitly allowed OS-protected fallback. Signing uses a tagged `DeviceKeySignPayload` enum rather than arbitrary bytes so each signed statement is auditable at the API boundary. ## What changed - Added v2 JSON-RPC methods for `device/key/create`, `device/key/public`, and `device/key/sign`. - Added request/response types for device-key metadata, SPKI public keys, protection classes, and ECDSA signatures. - Added `DeviceKeyProtectionPolicy` with hardware-only default behavior and an explicit `allow_os_protected_nonextractable` option. - Added the initial `remoteControlClientConnection` signing payload variant. - Regenerated JSON Schema and TypeScript fixtures for app-server clients. ## Stack This is PR 1 of 4 in the device-key app-server stack. ## Validation - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol`	2026-04-21 10:08:42 -07:00
Michael Bolin	b06fc8bd0d	core: make test-log a dev dependency (#18846 ) The `test-log` crate is only used by `codex-core` tests, so it does not need to be part of the normal `codex-core` dependency graph. Keeping `test-log` in `dev-dependencies` removes it from normal `codex-core` builds and keeps the production dependency set a little smaller. Verification: - `cargo tree -p codex-core --edges normal --invert test-log` - `cargo check -p codex-core --lib` - `cargo test -p codex-core --lib`	2026-04-21 09:48:31 -07:00
jif-oai	bf2a34b4b2	feat: baseline lib (#18848) This adds a library with two entry points: * `reset_git_repository`, which takes a directory and sets it as a new git root * `diff_since_latest_init`, which returns the diff for a given directory since the last `reset_git_repository`	2026-04-21 17:24:30 +01:00
Michael Bolin	53cf12cd52	build: reduce Rust dev debuginfo (#18844 ) ## What changed This PR makes the default Cargo dev profile use line-tables-only debug info: ```toml [profile.dev] debug = 1 ``` That keeps useful backtraces while avoiding the cost of full variable debug info in normal local dev builds. This also makes the Bazel CI setting explicit with `-Cdebuginfo=0` for target and exec-configuration Rust actions. Bazel/rules_rust does not read Cargo profiles for this setting, and the current fastbuild action already emitted `--codegen=debuginfo=0`; the Bazel part of this PR makes that choice direct in our build configuration. ## Why The slow codex-core rebuilds are dominated by debug-info codegen, not parsing or type checking. On a warm-dependency package rebuild, the baseline codex-core compile was about 39.5s wall / 38.9s rustc total, with codegen_crate around 14.0s and LLVM_passes around 13.4s. Setting codex-core to line-tables-only debug info brought that to about 27.2s wall / 26.7s rustc total, with codegen_crate around 3.1s and LLVM_passes around 2.8s. `debug = 0` was only about another 0.7s faster than `debug = 1` in the codex-core measurement, so `debug = 1` is the better default dev tradeoff: it captures nearly all of the compile-time win while preserving basic debuggability. I also sampled other first-party crates instead of keeping a codex-core-only package override. codex-app-server showed the same pattern: rustc total dropped from 15.85s to 10.48s, while codegen_crate plus LLVM_passes dropped from about 13.47s to 3.23s. codex-app-server-protocol had a smaller but still real improvement, 16.05s to 14.58s total, and smaller crates showed modest wins. That points to a workspace dev-profile policy rather than a hand-maintained list of large crates. ## Relationship to #18612 [#18612](https://github.com/openai/codex/pull/18612) added the `dev-small` profile. 
That remains useful when someone wants a working local build quickly and is willing to opt in with `cargo build --profile dev-small`. This PR is deliberately less aggressive: it changes the common default dev profile while preserving line tables/backtraces. `dev-small` remains the explicit "build quickly, no debuggability concern" path. ## Other investigation I looked for another structural win comparable to [#16631](https://github.com/openai/codex/pull/16631) and [#16630](https://github.com/openai/codex/pull/16630), but did not find one. The attempted TOML monomorphization changes were noisy or worse in measurement, and the async task changes reduced some instantiations but only translated to roughly a one-second improvement while being much more disruptive. The debug-info setting was the one repeatable, material win that survived measurement. ## Verification - `just bazel-lock-update` - `just bazel-lock-check` - `cargo check -p codex-core --lib` - `cargo test -p codex-core --lib` - Bazel `aquery --config=ci-linux` confirmed `--codegen=debuginfo=0` and `-Cdebuginfo=0` for `//codex-rs/core:core` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18844). * #18846 * __->__ #18844	2026-04-21 09:00:40 -07:00
pakrym-oai	833212115e	Move external agent config out of core (#18850 ) ## Summary - Move external agent config migration logic and tests from `codex-core` into `app-server/src/config`. - Keep the migration service crate-private to app-server and update the API adapter imports. - Remove stale core re-exports and expose only the needed marketplace source helper. ## Testing - `cargo test -p codex-app-server config::external_agent_config` - `just fmt` - `just fix -p codex-app-server` - `just fix -p codex-core` - `git diff --check`	2026-04-21 08:33:58 -07:00
Felipe Coury	1101dec9ae	fix(tui): disable enhanced keys for VS Code WSL (#18741 ) Fixes https://github.com/openai/codex/issues/13638 ## Why VS Code's integrated terminal can run a Linux shell through WSL without exposing `TERM_PROGRAM` to the Linux process, and with crossterm keyboard enhancement flags enabled that environment can turn dead-key composition into malformed key events instead of composed Unicode input. Codex already handles composed Unicode correctly, so the fix is to avoid enabling the terminal mode that breaks this path for the affected terminal combination. ## What Changed - Automatically skip crossterm keyboard enhancement flags when Codex detects WSL plus VS Code, including a Windows-side `TERM_PROGRAM` probe through WSL interop. - Add `CODEX_TUI_DISABLE_KEYBOARD_ENHANCEMENT` so users can force-disable or force-enable the keyboard enhancement policy for diagnosis. ## Verification - Added unit coverage for env parsing, VS Code detection, and the WSL/VS Code auto-disable policy. - `cargo check -p codex-tui` passed. - `./tools/argument-comment-lint/run.py -p codex-tui -- --tests` passed. - `cargo test -p codex-tui` was attempted locally, but the checkout failed during linking before tests executed because V8 symbols from `codex-code-mode` were unresolved for `arm64`.	2026-04-21 09:57:51 -03:00
Abhinav	ef071cf816	show bash mode in the TUI (#18271 ) ## What - Explicitly show our "bash mode" by changing the color and adding a callout similar to how we do for `Plan mode (shift + tab to cycle)` - Also replace our `›` composer prefix with a bang `!` ![](https://github.com/user-attachments/assets/f5549c75-3a03-433d-aa57-e4c6d0682c49) ## Why - It was unclear that we had a Bash mode - This feels more responsive - It looks cool! --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 00:15:49 -07:00
pash-openai	dc1a8f2190	[tool search] support namespaced deferred dynamic tools (#18413 ) Deferred dynamic tools need to round-trip a namespace so a tool returned by `tool_search` can be called through the same registry key that core uses for dispatch. This change adds namespace support for dynamic tool specs/calls, persists it through app-server thread state, and routes dynamic tool calls by full `ToolName` while still sending the app the leaf tool name. Deferred dynamic tools must provide a namespace; non-deferred dynamic tools may remain top-level. It also introduces `LoadableToolSpec` as the shared function-or-namespace Responses shape used by both `tool_search` output and dynamic tool registration, so dynamic tools use the same wrapping logic in both paths. Validation: - `cargo test -p codex-tools` - `cargo test -p codex-core tool_search` --------- Co-authored-by: Sayan Sisodiya <sayan@openai.com>	2026-04-21 14:13:08 +08:00
Michael Bolin	1dcea729d3	chore: enable await-holding clippy lints (#18698 ) Follow-up to https://github.com/openai/codex/pull/18178, where we said the await-holding clippy rule would be enabled separately. Enable `await_holding_lock` and `await_holding_invalid_type` after the preceding commits fixed or explicitly documented the current offenders.	2026-04-21 06:06:05 +00:00
Michael Bolin	d62421d322	chore: document intentional await-holding cases (#18423 ) ## Why This PR prepares the stack to enable Clippy await-holding lints that were left disabled in #18178. The mechanical lock-scope cleanup is handled separately; this PR is the documentation/configuration layer for the remaining await-across-guard sites. Without explicit annotations, reviewers and future maintainers cannot tell whether an await-holding warning is a real concurrency smell or an intentional serialization boundary. ## What changed - Configures `clippy.toml` so `await_holding_invalid_type` also covers `tokio::sync::{MutexGuard,RwLockReadGuard,RwLockWriteGuard}`. - Adds targeted `#[expect(clippy::await_holding_invalid_type, reason = ...)]` annotations for intentional async guard lifetimes. - Documents the main categories of intentional cases: active-turn state transitions that must remain atomic, session-owned MCP manager accesses, remote-control websocket serialization, JS REPL kernel/process serialization, OAuth persistence, external bearer token refresh serialization, and tests that intentionally serialize shared global or session-owned state. - For external bearer token refresh, documents the existing serialization boundary: holding `cached_token` across the provider command prevents concurrent cache misses from starting duplicate refresh commands, and the current behavior is small enough that an explicit expectation is easier to maintain than adding another synchronization primitive. ## Verification - `cargo clippy -p codex-login --all-targets` - `cargo clippy -p codex-connectors --all-targets` - `cargo clippy -p codex-core --all-targets` - The follow-up PR #18698 enables `await_holding_invalid_type` and `await_holding_lock` as workspace `deny` lints, so any undocumented remaining offender will fail Clippy. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18423). * #18698 * __->__ #18423	2026-04-20 22:41:54 -07:00
pakrym-oai	4c2e730488	Organize context fragments (#18794 ) Organize context fragments under `core/context`. Implement same trait on all of them.	2026-04-20 22:39:17 -07:00
Abhinav	ab26554a3a	Add remote_sandbox_config to our config requirements (#18763 ) ## Why Customers need finer-grained control over allowed sandbox modes based on the host Codex is running on. For example, they may want stricter sandbox limits on devboxes while keeping a different default elsewhere. Our current cloud requirements can target user/account groups, but they cannot vary sandbox requirements by host. That makes remote development environments awkward because the same top-level `allowed_sandbox_modes` has to apply everywhere. ## What Adds a new `remote_sandbox_config` section to `requirements.toml`: ```toml allowed_sandbox_modes = ["read-only"] [[remote_sandbox_config]] hostname_patterns = [".org"] allowed_sandbox_modes = ["read-only", "workspace-write"] [[remote_sandbox_config]] hostname_patterns = [".sh", "runner-*.ci"] allowed_sandbox_modes = ["read-only", "danger-full-access"] ``` During requirements resolution, Codex resolves the local host name once, preferring the machine FQDN when available and falling back to the cleaned kernel hostname. This host classification is best effort rather than authenticated device proof. Each requirements source applies its first matching `remote_sandbox_config` entry before it is merged with other sources. The shared merge helper keeps that `apply_remote_sandbox_config` step paired with requirements merging so new requirements sources do not have to remember the extra call. That preserves source precedence: a lower-precedence requirements file with a matching `remote_sandbox_config` cannot override a higher-precedence source that already set `allowed_sandbox_modes`. This also wires the hostname-aware resolution through app-server, CLI/TUI config loading, config API reads, and config layer metadata so they all evaluate remote sandbox requirements consistently. 
## Verification - `cargo test -p codex-config remote_sandbox_config` - `cargo test -p codex-config host_name` - `cargo test -p codex-core load_config_layers_applies_matching_remote_sandbox_config` - `cargo test -p codex-core system_remote_sandbox_config_keeps_cloud_sandbox_modes` - `cargo test -p codex-config` - `cargo test -p codex-core` unit tests passed; `tests/all.rs` integration matrix was intentionally stopped after the relevant focused tests passed - `just fix -p codex-config` - `just fix -p codex-core` - `cargo check -p codex-app-server`	2026-04-21 05:05:02 +00:00
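The first-match resolution described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `RemoteSandboxConfig` struct, `pattern_matches`, and `resolve_allowed_modes` are hypothetical names, and the matching rules (a leading `.` means suffix match, `*` matches any run of characters) are assumptions, since the commit message does not spell out the pattern semantics.

```rust
/// Hypothetical stand-in for one `[[remote_sandbox_config]]` entry.
struct RemoteSandboxConfig {
    hostname_patterns: Vec<String>,
    allowed_sandbox_modes: Vec<String>,
}

/// ASSUMPTION: a pattern that starts with '.' and has no wildcard is a
/// suffix match; anything else is matched as a whole-string `*` glob.
fn pattern_matches(pattern: &str, host: &str) -> bool {
    if pattern.starts_with('.') && !pattern.contains('*') {
        return host.ends_with(pattern);
    }
    wildcard_match(pattern.as_bytes(), host.as_bytes())
}

/// Recursive glob match where `*` stands for zero or more bytes.
fn wildcard_match(p: &[u8], h: &[u8]) -> bool {
    match p.split_first() {
        None => h.is_empty(),
        Some((b'*', rest)) => (0..=h.len()).any(|i| wildcard_match(rest, &h[i..])),
        Some((c, rest)) => h
            .split_first()
            .map_or(false, |(hc, hrest)| hc == c && wildcard_match(rest, hrest)),
    }
}

/// The first entry with a matching pattern wins; otherwise the source's
/// top-level `allowed_sandbox_modes` applies unchanged.
fn resolve_allowed_modes<'a>(
    top_level: &'a [String],
    entries: &'a [RemoteSandboxConfig],
    host: &str,
) -> &'a [String] {
    entries
        .iter()
        .find(|e| e.hostname_patterns.iter().any(|p| pattern_matches(p, host)))
        .map(|e| e.allowed_sandbox_modes.as_slice())
        .unwrap_or(top_level)
}
```

Under these assumptions, a host such as `dev.example.org` would pick up the first entry from the TOML example, while a host that matches no pattern keeps the top-level list. Merging across requirements sources is a separate step and is not modeled here.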
Dylan Hurd	86535c9901	feat(auto-review) Handle request_permissions calls (#18393 ) ## Summary When auto-review is enabled, it should handle the request_permissions tool. We'll need to clean up the UX, but I'm planning to do that in a separate pass. ## Testing - [x] Ran locally --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 21:48:57 -07:00
canvrno-oai	2cc146f5ea	Fallback display names for TUI skill mentions (#18786 ) This updates TUI skill mentions to show a fallback label when a skill does not define a display name, so unnamed skills remain understandable in the picker without changing behavior for skills that already have one.	2026-04-20 20:46:55 -07:00
Matthew Zeng	1132ef887c	Make MCP resource read threadless (#18292 ) ## Summary Make the thread ID optional so that we can better cache resources for MCP connectors, since their resource templates are universal and not specific to a project. - Make `mcpServer/resource/read` accept an optional `threadId` - Read resources from the current MCP config when no thread is supplied - Keep the existing thread-scoped path when `threadId` is present - Update the generated schemas, README, and integration coverage ## Testing - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-mcp` - `cargo test -p codex-app-server --test all mcp_resource` - `just fix -p codex-mcp` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server`	2026-04-20 19:59:36 -07:00
Dylan Hurd	58e7605efc	fix(guardian) Dont hard error on feature disable (#18795 ) ## Summary This shouldn't error for now ## Test plan - [x] Updated unit test	2026-04-20 19:54:39 -07:00
Michael Bolin	3d2f123895	protocol: preserve glob scan depth in permission profiles (#18713 ) ## Why #18274 made `PermissionProfile` the canonical file-system permissions shape, but the round-trip from `FileSystemSandboxPolicy` to `PermissionProfile` still dropped one piece of policy metadata: `glob_scan_max_depth`. That field is security-relevant for deny-read globs such as `*/.env`. On Linux, bubblewrap sandbox construction uses it to bound unreadable glob expansion. If a profile copied from active runtime permissions loses this value and is submitted back as an override, the resulting `FileSystemSandboxPolicy` can behave differently even though the visible permission entries look equivalent. ## What changed - Add `glob_scan_max_depth` to protocol `FileSystemPermissions` and preserve it when converting to/from `FileSystemSandboxPolicy`. - Keep legacy `read`/`write` JSON for simple path-only permissions, but force canonical JSON when glob scan depth is present so the metadata is not silently dropped. - Carry `globScanMaxDepth` through app-server `AdditionalFileSystemPermissions`, generated JSON/TypeScript schemas, and app-server/TUI conversion call sites. - Preserve the metadata through sandboxing permission normalization, merging, and intersection. - Carry the merged scan depth into the effective `FileSystemSandboxPolicy` used for command execution, so bounded deny-read globs reach Linux bubblewrap materialization. ## Verification - `cargo test -p codex-sandboxing glob_scan -- --nocapture` - `cargo test -p codex-sandboxing policy_transforms -- --nocapture` - `just fix -p codex-sandboxing` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18713). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * #18277 * #18276 * #18275 * __->__ #18713	2026-04-20 19:42:45 -07:00
xl-openai	6e9e2c2eef	feat: Support more plugin MCP file shapes. (#18780 ) Update core-plugins MCP loading to accept either an mcpServers object or a top-level server map in .mcp.json	2026-04-20 19:42:01 -07:00
Michael Bolin	ff05532723	refactor: narrow async lock scopes (#18418 ) ## Why This is part of the follow-up work from #18178 to make Codex ready for Clippy's [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock) / [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type) lints. This bottom PR keeps the scope intentionally small: `NetworkProxyState::record_blocked()` only needs the state write lock while it mutates the blocked-request ring buffer and counters. The debug log payload and `BlockedRequestObserver` callback can be produced after that lock is released. ## What changed - Copies the blocked-request snapshot values needed for logging while updating the state. - Releases the `RwLockWriteGuard` before logging or notifying the observer. ## Verification - `cargo test -p codex-network-proxy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18418). * #18698 * #18423 * __->__ #18418	2026-04-21 02:23:30 +00:00
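The lock-narrowing pattern described in the commit above might look something like the sketch below. It uses `std::sync::RwLock` instead of an async lock so that it stays dependency-free, and `ProxyState`, `BlockedRequest`, `MAX_BLOCKED`, and the observer signature are all hypothetical stand-ins rather than the real `NetworkProxyState` API.

```rust
use std::collections::VecDeque;
use std::sync::RwLock;

/// Hypothetical record of one blocked request.
#[derive(Clone, Debug, PartialEq)]
struct BlockedRequest {
    host: String,
}

/// Hypothetical proxy state: a bounded ring buffer plus a counter.
struct ProxyState {
    blocked: VecDeque<BlockedRequest>,
    total_blocked: u64,
}

/// Assumed ring-buffer capacity, chosen small for illustration.
const MAX_BLOCKED: usize = 2;

fn record_blocked(
    state: &RwLock<ProxyState>,
    req: BlockedRequest,
    observer: impl Fn(&BlockedRequest, u64),
) {
    // Hold the write guard only while mutating; copy out what the
    // logging/notification step needs before the guard is dropped.
    let (snapshot, total) = {
        let mut guard = state.write().unwrap();
        if guard.blocked.len() == MAX_BLOCKED {
            guard.blocked.pop_front();
        }
        guard.blocked.push_back(req.clone());
        guard.total_blocked += 1;
        (req, guard.total_blocked)
    }; // write guard released here

    // Observer (and any logging) runs without the lock held, so it can
    // safely read the state or, in async code, await.
    observer(&snapshot, total);
}
```

The inner block is the whole point: because the guard goes out of scope before the callback runs, nothing that might block or await executes while the lock is held.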
Ahmed Ibrahim	d6af7a6c03	[1/4] Add executor HTTP request protocol (#18581 ) ### Why Remote streamable HTTP MCP needs a transport-shaped executor primitive before the MCP client can move network I/O to the executor. This layer keeps the executor unaware of MCP and gives later PRs an ordered streaming surface for response bodies. ### What - Add typed `http/request` and `http/request/bodyDelta` protocol payloads. - Add executor client helpers for buffered and streamed HTTP responses. - Route body-delta notifications to request-scoped streams with sequence validation and cleanup when a stream finishes or is dropped. - Document the new protocol constants, transport structs, public client methods, body-stream lifecycle, and request-scoped routing helpers. - Add in-memory JSON-RPC client coverage for streamed HTTP response-body notifications, with comments spelling out what the test proves and each setup/exercise/assert phase. ### Stack 1. #18581 protocol 2. #18582 runner 3. #18583 RMCP client 4. #18584 manager wiring and local/remote coverage ### Verification - `just fmt` - `cargo check -p codex-exec-server -p codex-rmcp-client --tests` - `cargo check -p codex-core --test all` compile-only - `git diff --check` - Online full CI is running from the `full-ci` branch, including the remote Rust test job. Co-authored-by: Codex <noreply@openai.com> --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 02:21:08 +00:00
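To illustrate what sequence validation on a request-scoped body stream could involve, here is a small sketch. The `BodyStream` type, its field names, the zero-based sequence numbering, and the `is_final` flag are assumptions for the example; the actual `http/request/bodyDelta` payload shape is defined in the PR and is not reproduced here.

```rust
/// Errors a hypothetical body stream can report for a bad delta.
#[derive(Debug, PartialEq)]
enum DeltaError {
    OutOfOrder { expected: u64, got: u64 },
    AlreadyFinished,
}

/// Hypothetical per-request accumulator for streamed response bodies.
struct BodyStream {
    next_seq: u64,
    body: Vec<u8>,
    finished: bool,
}

impl BodyStream {
    fn new() -> Self {
        Self { next_seq: 0, body: Vec::new(), finished: false }
    }

    /// Accept a delta only if it carries the next expected sequence
    /// number (ASSUMPTION: numbering starts at 0 and increments by 1).
    fn push_delta(&mut self, seq: u64, chunk: &[u8], is_final: bool) -> Result<(), DeltaError> {
        if self.finished {
            return Err(DeltaError::AlreadyFinished);
        }
        if seq != self.next_seq {
            return Err(DeltaError::OutOfOrder { expected: self.next_seq, got: seq });
        }
        self.body.extend_from_slice(chunk);
        self.next_seq += 1;
        self.finished = is_final;
        Ok(())
    }
}
```

A rejected delta leaves the stream untouched in this sketch, so a caller could decide whether to fail the request or wait for the missing chunk; which of those the real client does is not stated in the commit message.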
Celia Chen	cefcfe43b9	feat: add a built-in Amazon Bedrock model provider (#18744 ) ## Why Codex needs a first-class `amazon-bedrock` model provider so users can select Bedrock without copying a full provider definition into `config.toml`. The provider has Codex-owned defaults for the pieces that should stay consistent across users: the display `name`, Bedrock `base_url`, and `wire_api`. At the same time, users still need a way to choose the AWS credential profile used by their local environment. This change makes `amazon-bedrock` a partially modifiable built-in provider: code owns the provider identity and endpoint defaults, while user config can set `model_providers.amazon-bedrock.aws.profile`. For example: ```toml model_provider = "amazon-bedrock" [model_providers.amazon-bedrock.aws] profile = "codex-bedrock" ``` ## What Changed - Added `amazon-bedrock` to the built-in model provider map with: - `name = "Amazon Bedrock"` - `base_url = "https://bedrock-mantle.us-east-1.api.aws/v1"` - `wire_api = "responses"` - Added AWS provider auth config with a profile-only shape: `model_providers.<id>.aws.profile`. - Kept AWS auth config restricted to `amazon-bedrock`; custom providers that set `aws` are rejected. - Allowed `model_providers.amazon-bedrock` through reserved-provider validation so it can act as a partial override. - During config loading, only `aws.profile` is copied from the user-provided `amazon-bedrock` entry onto the built-in provider. Other Bedrock provider fields remain hard-coded by the built-in definition. - Updated the generated config schema for the new provider AWS profile config.	2026-04-21 00:54:05 +00:00
canvrno-oai	9a2b34213b	/statusline & /title - Shared preview values (#18435 ) This PR makes the `/statusline` and `/title` setup UIs share one preview-value source instead of each surface using its own examples. Both pickers now render consistent live values when available, and stable placeholders when they are not. It also resolves live preview values at the shared preview-item layer, so `/title` preview can use real runtime values for title-specific cases like status text, task progress, and project-name fallback behavior. - Adds a shared preview data model for status surfaces - Maps status-line items and terminal-title items onto that shared preview list - Feeds both setup views from the same chatwidget-derived preview data, with terminal-title-specific formatting applied before `/title` preview renders - Keeps project-root preview aligned with status-line behavior while project in /title keeps its title fallback/truncation behavior - Adds snapshot coverage for live-only, hardcoded-only, and mixed cases Test Steps - Open Codex TUI and launch `/statusline`. - Toggle and reorder items, then verify the preview uses current session values when possible, and placeholder values for missing values (ex: no thread ID). - Open `/title` and verify it shows the same normalized values, including live status/task-progress values when available.	2026-04-20 17:46:11 -07:00
guinness-oai	ca3246f77a	[codex] Send realtime transcript deltas on handoff (#18761 ) ## Summary - Track how many realtime transcript entries have already been attached to a background-agent handoff. - Attach only entries added since the previous handoff as `<transcript_delta>` instead of resending the accumulated transcript snapshot. - Update the realtime integration test so the second delegation carries only the second transcript delta. ## Validation - `just fmt` - `cargo test -p codex-api` - `cargo test -p codex-core inbound_handoff_request_sends_transcript_delta_after_each_handoff` - `cargo build -p codex-cli -p codex-app-server` ## Manual testing Built local debug binaries at: - `codex-rs/target/debug/codex` - `codex-rs/target/debug/codex-app-server`	2026-04-20 16:46:15 -07:00
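The delta tracking described in the commit above amounts to keeping a cursor into the transcript. A minimal sketch, assuming a hypothetical `RealtimeTranscript` type with plain string entries (the real entry type and the `<transcript_delta>` serialization are not shown in the commit message):

```rust
/// Hypothetical transcript that remembers how many entries have
/// already been attached to a handoff.
#[derive(Default)]
struct RealtimeTranscript {
    entries: Vec<String>,
    attached: usize,
}

impl RealtimeTranscript {
    fn push(&mut self, entry: &str) {
        self.entries.push(entry.to_string());
    }

    /// Return only the entries added since the previous handoff and
    /// advance the cursor so they are not sent again.
    fn take_delta(&mut self) -> Vec<String> {
        let delta = self.entries[self.attached..].to_vec();
        self.attached = self.entries.len();
        delta
    }
}
```

With this shape, two entries followed by a handoff, then one more entry and a second handoff, would yield a two-entry delta and then a one-entry delta, which mirrors the behavior the updated integration test is said to check.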
Eric Traut	216e7a0a56	Warn when trusting Git subdirectories (#18602 ) Addresses #18505 ## Summary When Codex is launched from a subdirectory of a Git repository, the onboarding trust prompt says it is trusting the current directory even though the persisted trust target is the repository root. That can make the scope of the trust decision unclear. This updates the TUI trust prompt to show a yellow note only when the current directory differs from the resolved trust target, explaining that trust applies to the repository root and displaying that root. It also removes the stale onboarding TODO now that the warning is implemented.	2026-04-20 16:43:21 -07:00
viyatb-oai	33fa952426	fix: fix stale proxy env restoration after shell snapshots (#17271 ) ## Summary This fixes a stale-environment path in shell snapshot restoration. A sandboxed command can source a shell snapshot that was captured while an older proxy process was running. If that proxy has died and come back on a different port, the snapshot can otherwise put old proxy values back into the command environment, which is how tools like `pip` end up talking to a dead proxy. The wrapper now captures the live process environment before sourcing the snapshot and then restores or clears every proxy env var from the proxy crate's canonical list. That makes proxy state after shell snapshot restoration match the current command environment, rather than whatever proxy values happened to be present in the snapshot. On macOS, the Codex-generated `GIT_SSH_COMMAND` is refreshed when the SOCKS listener changes, while custom SSH wrappers are still left alone. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 16:39:17 -07:00
Ahmed Ibrahim	9ef1cab6f7	[6/6] Fail exec client operations after disconnect (#18027 ) ## Summary - Reject new exec-server client operations once the transport has disconnected. - Convert pending RPC calls into closed errors instead of synthetic server errors. - Cover pending read and later write behavior after remote executor disconnect. ## Verification - `just fmt` - `cargo check -p codex-exec-server` ## Stack ```text @ #18027 [6/6] Fail exec client operations after disconnect │ o #18212 [5/6] Wire executor-backed MCP stdio │ o #18087 [4/6] Abstract MCP stdio server launching │ o #18020 [3/6] Add pushed exec process events │ o #18086 [2/6] Support piped stdin in exec process API │ o #18085 [1/6] Add MCP server environment config │ o main ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 23:24:06 +00:00
Eric Traut	0f1c9b8963	Fix exec inheritance of root shared flags (#18630 ) Addresses #18113 Problem: Shared flags provided before the exec subcommand were parsed by the root CLI but not inherited by the exec CLI, so exec sessions could run with stale or default sandbox and model configuration. Solution: Move shared TUI and exec flags into a common option block and merge root selections into exec before dispatch, while preserving exec's global subcommand flag behavior.	2026-04-20 16:12:17 -07:00
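One way to picture merging root selections into exec is a field-by-field fallback over a shared option block. The sketch below is illustrative only: `SharedFlags`, its two fields, and `inherit_from` are hypothetical, and the precedence (a flag given after `exec` wins over the same flag given before it) is an assumption that the commit message does not confirm.

```rust
/// Hypothetical subset of the flags shared by the root CLI and `exec`.
#[derive(Clone, Debug, Default, PartialEq)]
struct SharedFlags {
    model: Option<String>,
    sandbox: Option<String>,
}

impl SharedFlags {
    /// Fill in any flag the exec invocation left unset with the value
    /// parsed at the root level (ASSUMPTION: exec-level values win).
    fn inherit_from(self, root: &SharedFlags) -> SharedFlags {
        SharedFlags {
            model: self.model.or_else(|| root.model.clone()),
            sandbox: self.sandbox.or_else(|| root.sandbox.clone()),
        }
    }
}
```

Without a step like this, flags placed before the subcommand would be parsed and then dropped, which matches the stale-or-default configuration symptom described in the problem statement.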
Eric Traut	2af4f15479	Refactor TUI app module into submodules (#18753 ) ## Why The TUI app module had grown past the 512K source-file cap enforced by CI/CD. This keeps the app entry point below that limit while preserving the existing runtime behavior and test surface. ## What changed - Kept the top-level `App` state and run-loop wiring in `tui/src/app.rs`. - Split app responsibilities into focused private submodules under `tui/src/app/`, covering event dispatch, thread routing, session lifecycle, config persistence, background requests, startup prompts, input, history UI, platform actions, and thread event buffering. - Moved the existing app-level tests into `tui/src/app/tests.rs` and reused the existing snapshot location rather than adding new tests or snapshots. - Added module header comments for `app.rs` and the new submodules. ## Follow-up A future cleanup can move narrow unit tests from `tui/src/app/tests.rs` into the specific app submodules they exercise. This PR keeps the existing app-level tests together so the refactor stays focused on the source-file split. ## Verification - `cargo test -p codex-tui --lib app::tests::agent_picker_item_name_snapshot` - `cargo test -p codex-tui --lib app::tests::clear_ui` - `cargo test -p codex-tui --lib app::tests::ctrl_l_clear_ui_after_long_transcript_reuses_clear_header_snapshot` - `just fix -p codex-tui` Full `cargo test -p codex-tui` still fails on model-catalog drift unrelated to this refactor, including stale `gpt-5.3-codex`/`gpt-5.1-codex` snapshot and migration expectations now resolving to `gpt-5.4`.	2026-04-20 16:10:35 -07:00
Rasmus Rygaard	7b994100b3	Add session config loader interface (#18208 ) ## Why Cloud-hosted sessions need a way for the service that starts or manages a thread to provide session-owned config without treating all config as if it came from the same user/project/workspace TOML stack. The important boundary is ownership: some values should be controlled by the session/orchestrator, some by the authenticated user, and later some may come from the executor. The earlier broad config-store shape made that boundary too fuzzy and overlapped heavily with the existing filesystem-backed config loader. This PR starts with the smaller piece we need now: a typed session config loader that can feed the existing config layer stack while preserving the normal precedence and merge behavior. ## What Changed - Added `ThreadConfigLoader` and related typed payloads in `codex-config`. - `SessionThreadConfig` currently supports `model_provider`, `model_providers`, and feature flags. - `UserThreadConfig` is present as an ownership boundary, but does not yet add TOML-backed fields. - `NoopThreadConfigLoader` preserves existing behavior when no external loader is configured. - `StaticThreadConfigLoader` supports tests and simple callers. - Taught thread config sources to produce ordinary `ConfigLayerEntry` values so the existing `ConfigLayerStack` remains the place where precedence and merging happen. - Wired the loader through `ConfigBuilder`, the config loader, and app-server startup paths so app-server can provide session-owned config before deriving a thread config. - Added coverage for: - translating typed thread config into config layers, - inserting thread config layers into the stack at the right precedence, - applying session-provided model provider and feature settings when app-server derives config from thread params. ## Follow-Ups This intentionally stops short of adding the remote/service transport. The next pieces are expected to be: 1. 
Define the proto/API shape for this interface. 2. Add a client implementation that can source session config from the service side. ## Verification - Added unit coverage in `codex-config` for the loader and layer conversion. - Added `codex-core` config loader coverage for thread config layer precedence. - Added app-server coverage that verifies session thread config wins over request-provided config for model provider and feature settings.	2026-04-20 23:05:49 +00:00
Ruslan Nigmatullin	97d4b42583	uds: add async Unix socket crate (#18254 ) ## Summary - add a codex-uds crate with async UnixListener and UnixStream wrappers - expose helpers for private socket directory setup and stale socket path checks - migrate codex-stdio-to-uds onto codex-uds and Tokio-based stdio/socket relaying - update the CLI stdio-to-uds command path for the async runner ## Tests - cargo test -p codex-uds -p codex-stdio-to-uds - cargo test -p codex-cli - just fmt - just fix -p codex-uds - just fix -p codex-stdio-to-uds - just fix -p codex-cli - just bazel-lock-check - git diff --check	2026-04-20 15:59:05 -07:00


5065 Commits