codex

mirror of https://github.com/openai/codex.git synced 2026-04-28 02:11:08 +03:00

Author	SHA1	Message	Date
Abhinav	ab26554a3a	Add remote_sandbox_config to our config requirements (#18763 ) ## Why Customers need finer-grained control over allowed sandbox modes based on the host Codex is running on. For example, they may want stricter sandbox limits on devboxes while keeping a different default elsewhere. Our current cloud requirements can target user/account groups, but they cannot vary sandbox requirements by host. That makes remote development environments awkward because the same top-level `allowed_sandbox_modes` has to apply everywhere. ## What Adds a new `remote_sandbox_config` section to `requirements.toml`: ```toml allowed_sandbox_modes = ["read-only"] [[remote_sandbox_config]] hostname_patterns = [".org"] allowed_sandbox_modes = ["read-only", "workspace-write"] [[remote_sandbox_config]] hostname_patterns = [".sh", "runner-*.ci"] allowed_sandbox_modes = ["read-only", "danger-full-access"] ``` During requirements resolution, Codex resolves the local host name once, preferring the machine FQDN when available and falling back to the cleaned kernel hostname. This host classification is best effort rather than authenticated device proof. Each requirements source applies its first matching `remote_sandbox_config` entry before it is merged with other sources. The shared merge helper keeps that `apply_remote_sandbox_config` step paired with requirements merging so new requirements sources do not have to remember the extra call. That preserves source precedence: a lower-precedence requirements file with a matching `remote_sandbox_config` cannot override a higher-precedence source that already set `allowed_sandbox_modes`. This also wires the hostname-aware resolution through app-server, CLI/TUI config loading, config API reads, and config layer metadata so they all evaluate remote sandbox requirements consistently. ## Verification - `cargo test -p codex-config remote_sandbox_config` - `cargo test -p codex-config host_name` - `cargo test -p codex-core load_config_layers_applies_matching_remote_sandbox_config` - `cargo test -p codex-core system_remote_sandbox_config_keeps_cloud_sandbox_modes` - `cargo test -p codex-config` - `cargo test -p codex-core` unit tests passed; `tests/all.rs` integration matrix was intentionally stopped after the relevant focused tests passed - `just fix -p codex-config` - `just fix -p codex-core` - `cargo check -p codex-app-server`	2026-04-21 05:05:02 +00:00
Dylan Hurd	86535c9901	feat(auto-review) Handle request_permissions calls (#18393 ) ## Summary When auto-review is enabled, it should handle request_permissions tool. We'll need to clean up the UX but I'm planning to do that in a separate pass ## Testing - [x] Ran locally <img width="893" height="396" alt="Screenshot 2026-04-17 at 1 16 13 PM" src="https://github.com/user-attachments/assets/4c045c5f-1138-4c6c-ac6e-2cb6be4514d8" /> --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 21:48:57 -07:00
Dylan Hurd	543a08dac9	chore(app-server) linguist-generated (#18807 ) ## Summary Start marking app-server schema files as [linguist-generated](https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github), so we can more easily parse reviews	2026-04-20 21:42:00 -07:00
canvrno-oai	2cc146f5ea	Fallback display names for TUI skill mentions (#18786 ) This updates TUI skill mentions to show a fallback label when a skill does not define a display name, so unnamed skills remain understandable in the picker without changing behavior for skills that already have one. <img width="1028" height="198" alt="Screenshot 2026-04-20 at 6 25 15 PM" src="https://github.com/user-attachments/assets/84077b85-99d0-4db9-b533-37e1887b4506" />	2026-04-20 20:46:55 -07:00
Matthew Zeng	1132ef887c	Make MCP resource read threadless (#18292 ) ## Summary Making thread id optional so that we can better cache resources for MCPs for connectors since their resource templates is universal and not particular to projects. - Make `mcpServer/resource/read` accept an optional `threadId` - Read resources from the current MCP config when no thread is supplied - Keep the existing thread-scoped path when `threadId` is present - Update the generated schemas, README, and integration coverage ## Testing - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-mcp` - `cargo test -p codex-app-server --test all mcp_resource` - `just fix -p codex-mcp` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server`	2026-04-20 19:59:36 -07:00
Dylan Hurd	58e7605efc	fix(guardian) Dont hard error on feature disable (#18795 ) ## Summary This shouldn't error for now ## Test plan - [x] Updated unit test	2026-04-20 19:54:39 -07:00
Michael Bolin	3d2f123895	protocol: preserve glob scan depth in permission profiles (#18713 ) ## Why #18274 made `PermissionProfile` the canonical file-system permissions shape, but the round-trip from `FileSystemSandboxPolicy` to `PermissionProfile` still dropped one piece of policy metadata: `glob_scan_max_depth`. That field is security-relevant for deny-read globs such as `*/.env`. On Linux, bubblewrap sandbox construction uses it to bound unreadable glob expansion. If a profile copied from active runtime permissions loses this value and is submitted back as an override, the resulting `FileSystemSandboxPolicy` can behave differently even though the visible permission entries look equivalent. ## What changed - Add `glob_scan_max_depth` to protocol `FileSystemPermissions` and preserve it when converting to/from `FileSystemSandboxPolicy`. - Keep legacy `read`/`write` JSON for simple path-only permissions, but force canonical JSON when glob scan depth is present so the metadata is not silently dropped. - Carry `globScanMaxDepth` through app-server `AdditionalFileSystemPermissions`, generated JSON/TypeScript schemas, and app-server/TUI conversion call sites. - Preserve the metadata through sandboxing permission normalization, merging, and intersection. - Carry the merged scan depth into the effective `FileSystemSandboxPolicy` used for command execution, so bounded deny-read globs reach Linux bubblewrap materialization. ## Verification - `cargo test -p codex-sandboxing glob_scan -- --nocapture` - `cargo test -p codex-sandboxing policy_transforms -- --nocapture` - `just fix -p codex-sandboxing` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18713). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * #18277 * #18276 * #18275 * __->__ #18713	2026-04-20 19:42:45 -07:00
xl-openai	6e9e2c2eef	feat: Support more plugin MCP file shapes. (#18780 ) Update core-plugins MCP loading to accept either an mcpServers object or a top-level server map in .mcp.json	2026-04-20 19:42:01 -07:00
Michael Bolin	ff05532723	refactor: narrow async lock scopes (#18418 ) ## Why This is part of the follow-up work from #18178 to make Codex ready for Clippy's [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock) / [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type) lints. This bottom PR keeps the scope intentionally small: `NetworkProxyState::record_blocked()` only needs the state write lock while it mutates the blocked-request ring buffer and counters. The debug log payload and `BlockedRequestObserver` callback can be produced after that lock is released. ## What changed - Copies the blocked-request snapshot values needed for logging while updating the state. - Releases the `RwLockWriteGuard` before logging or notifying the observer. ## Verification - `cargo test -p codex-network-proxy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18418). * #18698 * #18423 * __->__ #18418	2026-04-21 02:23:30 +00:00
Ahmed Ibrahim	d6af7a6c03	[1/4] Add executor HTTP request protocol (#18581 ) ### Why Remote streamable HTTP MCP needs a transport-shaped executor primitive before the MCP client can move network I/O to the executor. This layer keeps the executor unaware of MCP and gives later PRs an ordered streaming surface for response bodies. ### What - Add typed `http/request` and `http/request/bodyDelta` protocol payloads. - Add executor client helpers for buffered and streamed HTTP responses. - Route body-delta notifications to request-scoped streams with sequence validation and cleanup when a stream finishes or is dropped. - Document the new protocol constants, transport structs, public client methods, body-stream lifecycle, and request-scoped routing helpers. - Add in-memory JSON-RPC client coverage for streamed HTTP response-body notifications, with comments spelling out what the test proves and each setup/exercise/assert phase. ### Stack 1. #18581 protocol 2. #18582 runner 3. #18583 RMCP client 4. #18584 manager wiring and local/remote coverage ### Verification - `just fmt` - `cargo check -p codex-exec-server -p codex-rmcp-client --tests` - `cargo check -p codex-core --test all` compile-only - `git diff --check` - Online full CI is running from the `full-ci` branch, including the remote Rust test job. Co-authored-by: Codex <noreply@openai.com> --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 02:21:08 +00:00
Celia Chen	cefcfe43b9	feat: add a built-in Amazon Bedrock model provider (#18744 ) ## Why Codex needs a first-class `amazon-bedrock` model provider so users can select Bedrock without copying a full provider definition into `config.toml`. The provider has Codex-owned defaults for the pieces that should stay consistent across users: the display `name`, Bedrock `base_url`, and `wire_api`. At the same time, users still need a way to choose the AWS credential profile used by their local environment. This change makes `amazon-bedrock` a partially modifiable built-in provider: code owns the provider identity and endpoint defaults, while user config can set `model_providers.amazon-bedrock.aws.profile`. For example: ```toml model_provider = "amazon-bedrock" [model_providers.amazon-bedrock.aws] profile = "codex-bedrock" ``` ## What Changed - Added `amazon-bedrock` to the built-in model provider map with: - `name = "Amazon Bedrock"` - `base_url = "https://bedrock-mantle.us-east-1.api.aws/v1"` - `wire_api = "responses"` - Added AWS provider auth config with a profile-only shape: `model_providers.<id>.aws.profile`. - Kept AWS auth config restricted to `amazon-bedrock`; custom providers that set `aws` are rejected. - Allowed `model_providers.amazon-bedrock` through reserved-provider validation so it can act as a partial override. - During config loading, only `aws.profile` is copied from the user-provided `amazon-bedrock` entry onto the built-in provider. Other Bedrock provider fields remain hard-coded by the built-in definition. - Updated the generated config schema for the new provider AWS profile config.	2026-04-21 00:54:05 +00:00
canvrno-oai	9a2b34213b	/statusline & /title - Shared preview values (#18435 ) This PR makes the `/statusline` and `/title` setup UIs share one preview-value source instead of each surface using its own examples. Both pickers now render consistent live values when available, and stable placeholders when they are not. It also resolves live preview values at the shared preview-item layer, so `/title` preview can use real runtime values for title-specific cases like status text, task progress, and project-name fallback behavior. - Adds a shared preview data model for status surfaces - Maps status-line items and terminal-title items onto that shared preview list - Feeds both setup views from the same chatwidget-derived preview data, with terminal-title-specific formatting applied before `/title` preview renders - Keeps project-root preview aligned with status-line behavior while project in /title keeps its title fallback/truncation behavior - Adds snapshot coverage for live-only, hardcoded-only, and mixed cases Test Steps - Open Codex TUI and launch `/statusline`. - Toggle and reorder items, then verify the preview uses current session values when possible, and placeholder values for missing values (ex: no thread ID). - Open `/title` and verify it shows the same normalized values, including live status/task-progress values when available.	2026-04-20 17:46:11 -07:00
guinness-oai	ca3246f77a	[codex] Send realtime transcript deltas on handoff (#18761 ) ## Summary - Track how many realtime transcript entries have already been attached to a background-agent handoff. - Attach only entries added since the previous handoff as `<transcript_delta>` instead of resending the accumulated transcript snapshot. - Update the realtime integration test so the second delegation carries only the second transcript delta. ## Validation - `just fmt` - `cargo test -p codex-api` - `cargo test -p codex-core inbound_handoff_request_sends_transcript_delta_after_each_handoff` - `cargo build -p codex-cli -p codex-app-server` ## Manual testing Built local debug binaries at: - `codex-rs/target/debug/codex` - `codex-rs/target/debug/codex-app-server`	2026-04-20 16:46:15 -07:00
Eric Traut	216e7a0a56	Warn when trusting Git subdirectories (#18602 ) Addresses #18505 ## Summary When Codex is launched from a subdirectory of a Git repository, the onboarding trust prompt says it is trusting the current directory even though the persisted trust target is the repository root. That can make the scope of the trust decision unclear. This updates the TUI trust prompt to show a yellow note only when the current directory differs from the resolved trust target, explaining that trust applies to the repository root and displaying that root. It also removes the stale onboarding TODO now that the warning is implemented.	2026-04-20 16:43:21 -07:00
viyatb-oai	33fa952426	fix: fix stale proxy env restoration after shell snapshots (#17271 ) ## Summary This fixes a stale-environment path in shell snapshot restoration. A sandboxed command can source a shell snapshot that was captured while an older proxy process was running. If that proxy has died and come back on a different port, the snapshot can otherwise put old proxy values back into the command environment, which is how tools like `pip` end up talking to a dead proxy. The wrapper now captures the live process environment before sourcing the snapshot and then restores or clears every proxy env var from the proxy crate's canonical list. That makes proxy state after shell snapshot restoration match the current command environment, rather than whatever proxy values happened to be present in the snapshot. On macOS, the Codex-generated `GIT_SSH_COMMAND` is refreshed when the SOCKS listener changes, while custom SSH wrappers are still left alone. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 16:39:17 -07:00
Ahmed Ibrahim	9ef1cab6f7	[6/6] Fail exec client operations after disconnect (#18027 ) ## Summary - Reject new exec-server client operations once the transport has disconnected. - Convert pending RPC calls into closed errors instead of synthetic server errors. - Cover pending read and later write behavior after remote executor disconnect. ## Verification - `just fmt` - `cargo check -p codex-exec-server` ## Stack ```text @ #18027 [6/6] Fail exec client operations after disconnect │ o #18212 [5/6] Wire executor-backed MCP stdio │ o #18087 [4/6] Abstract MCP stdio server launching │ o #18020 [3/6] Add pushed exec process events │ o #18086 [2/6] Support piped stdin in exec process API │ o #18085 [1/6] Add MCP server environment config │ o main ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 23:24:06 +00:00
Eric Traut	0f1c9b8963	Fix exec inheritance of root shared flags (#18630 ) Addresses #18113 Problem: Shared flags provided before the exec subcommand were parsed by the root CLI but not inherited by the exec CLI, so exec sessions could run with stale or default sandbox and model configuration. Solution: Move shared TUI and exec flags into a common option block and merge root selections into exec before dispatch, while preserving exec's global subcommand flag behavior.	2026-04-20 16:12:17 -07:00
Eric Traut	2af4f15479	Refactor TUI app module into submodules (#18753 ) ## Why The TUI app module had grown past the 512K source-file cap enforced by CI/CD. This keeps the app entry point below that limit while preserving the existing runtime behavior and test surface. ## What changed - Kept the top-level `App` state and run-loop wiring in `tui/src/app.rs`. - Split app responsibilities into focused private submodules under `tui/src/app/`, covering event dispatch, thread routing, session lifecycle, config persistence, background requests, startup prompts, input, history UI, platform actions, and thread event buffering. - Moved the existing app-level tests into `tui/src/app/tests.rs` and reused the existing snapshot location rather than adding new tests or snapshots. - Added module header comments for `app.rs` and the new submodules. ## Follow-up A future cleanup can move narrow unit tests from `tui/src/app/tests.rs` into the specific app submodules they exercise. This PR keeps the existing app-level tests together so the refactor stays focused on the source-file split. ## Verification - `cargo test -p codex-tui --lib app::tests::agent_picker_item_name_snapshot` - `cargo test -p codex-tui --lib app::tests::clear_ui` - `cargo test -p codex-tui --lib app::tests::ctrl_l_clear_ui_after_long_transcript_reuses_clear_header_snapshot` - `just fix -p codex-tui` Full `cargo test -p codex-tui` still fails on model-catalog drift unrelated to this refactor, including stale `gpt-5.3-codex`/`gpt-5.1-codex` snapshot and migration expectations now resolving to `gpt-5.4`.	2026-04-20 16:10:35 -07:00
Rasmus Rygaard	7b994100b3	Add session config loader interface (#18208 ) ## Why Cloud-hosted sessions need a way for the service that starts or manages a thread to provide session-owned config without treating all config as if it came from the same user/project/workspace TOML stack. The important boundary is ownership: some values should be controlled by the session/orchestrator, some by the authenticated user, and later some may come from the executor. The earlier broad config-store shape made that boundary too fuzzy and overlapped heavily with the existing filesystem-backed config loader. This PR starts with the smaller piece we need now: a typed session config loader that can feed the existing config layer stack while preserving the normal precedence and merge behavior. ## What Changed - Added `ThreadConfigLoader` and related typed payloads in `codex-config`. - `SessionThreadConfig` currently supports `model_provider`, `model_providers`, and feature flags. - `UserThreadConfig` is present as an ownership boundary, but does not yet add TOML-backed fields. - `NoopThreadConfigLoader` preserves existing behavior when no external loader is configured. - `StaticThreadConfigLoader` supports tests and simple callers. - Taught thread config sources to produce ordinary `ConfigLayerEntry` values so the existing `ConfigLayerStack` remains the place where precedence and merging happen. - Wired the loader through `ConfigBuilder`, the config loader, and app-server startup paths so app-server can provide session-owned config before deriving a thread config. - Added coverage for: - translating typed thread config into config layers, - inserting thread config layers into the stack at the right precedence, - applying session-provided model provider and feature settings when app-server derives config from thread params. ## Follow-Ups This intentionally stops short of adding the remote/service transport. The next pieces are expected to be: 1. Define the proto/API shape for this interface. 2. Add a client implementation that can source session config from the service side. ## Verification - Added unit coverage in `codex-config` for the loader and layer conversion. - Added `codex-core` config loader coverage for thread config layer precedence. - Added app-server coverage that verifies session thread config wins over request-provided config for model provider and feature settings.	2026-04-20 23:05:49 +00:00
pakrym-oai	513dc28717	Add Code Review skill (#18746 ) Adds a skill that centralizes rules used during code review for codex.	2026-04-20 16:01:16 -07:00
Ruslan Nigmatullin	97d4b42583	uds: add async Unix socket crate (#18254 ) ## Summary - add a codex-uds crate with async UnixListener and UnixStream wrappers - expose helpers for private socket directory setup and stale socket path checks - migrate codex-stdio-to-uds onto codex-uds and Tokio-based stdio/socket relaying - update the CLI stdio-to-uds command path for the async runner ## Tests - cargo test -p codex-uds -p codex-stdio-to-uds - cargo test -p codex-cli - just fmt - just fix -p codex-uds - just fix -p codex-stdio-to-uds - just fix -p codex-cli - just bazel-lock-check - git diff --check	2026-04-20 15:59:05 -07:00
guinness-oai	1029742cf7	Add realtime silence tool (#18635 ) ## Summary Adds a second realtime v2 function tool, `remain_silent`, so the realtime model has an explicit non-speaking action when the collaboration mode or latest context says it should not answer aloud. This is stacked on #18597. ## Design - Advertise `remain_silent` alongside `background_agent` in realtime v2 conversational sessions. - Parse `remain_silent` function calls into a typed `RealtimeEvent::NoopRequested` event. - Have core answer that function call with an empty `function_call_output` and deliberately avoid `response.create`, so no follow-up realtime response is requested. - Keep the event hidden from app-server/TUI surfaces; it is operational plumbing, not user-visible conversation content.	2026-04-20 15:43:20 -07:00
Tom	a718b6fd47	Read conversation summaries through thread store (#18716 ) Migrate the conversation summary App Server methods to ThreadStore Because this app server api allows explicitly fetching the thread by rollout path, intercept that case in the app server code and (a) route directly to underlying local thread store methods if we're using a local thread store, or (b) throw an unsupported error if we're using a remote thread store. This keeps the thread store API clean and all filesystem operations inside of the local thread store, which pushing the "fundamental incompatibility" check as early as possible.	2026-04-20 22:39:10 +00:00
jif-oai	660153b6de	feat: cascade thread archive (#18112 ) Cascade the thread archive endpoint to all the sub-agents in the agent tree Fix: https://github.com/openai/codex/issues/17867 --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 23:38:18 +01:00
Eric Traut	b8e78e8869	Use app server metadata for fork parent titles (#18632 ) ## Problem The TUI resolved fork parent titles from local CODEX_HOME metadata, which could show missing or stale titles when app-server metadata is authoritative. This is a lingering bug left over from the migration of the TUI to the app-server interface. I found it when I asked Codex to review all places where the TUI code was still directly accessing the local CODEX_HOME. ## Solution Route fork parent title metadata through the app-server session state and render only that supplied title, with focused snapshot coverage for stale local metadata. ## Testing I manually tested by renaming a thread then forking it and confirming that the "forked from" message indicated the parent thread's name.	2026-04-20 15:37:31 -07:00
Thibault Sottiaux	54bd07d28c	[codex] prefer inherited spawn agent model (#18701 ) This updates the spawn-agent tool contract so subagents are presented as inheriting the parent model by default. The visible model list is now framed as optional overrides, the model parameter tells callers to leave it unset and the delegation guidance no longer nudges models toward picking a smaller/mini override. Fixes reports that 5.4 would occasionally pick 5.2 or lower as sub-agents.	2026-04-20 22:34:08 +00:00
Felipe Coury	cebe57b723	fix(tui): keep /copy aligned with rollback (#18739 ) ## Why Fixes #18718. After rewinding a thread, `/copy` could still copy the latest assistant response from before the rewind. The transcript cells were rolled back, but the copy source was a single `last_agent_markdown` cache that was not synchronized with backtracking, so the visible conversation and copied content could diverge. ## What changed `ChatWidget` now keeps a bounded copy history for the most recent 32 assistant responses, keyed by the visible user-turn count. When local rollback trims transcript cells, the copy cache is trimmed to the same surviving user-turn count so `/copy` uses the latest visible assistant response. If the user rewinds past the retained copy window, `/copy` now reports: ```text Cannot copy that response after rewinding. Only the most recent 32 responses are available to /copy. ``` The change also adds coverage for copying the latest surviving response after rollback and for the over-limit rewind message. ## Verification - Manually resumed a synthetic 35-turn session, rewound within the retained window, and verified `/copy` copied the surviving response. - Manually rewound past the retained window and verified `/copy` showed the 32-response limit message. - `cargo test -p codex-tui slash_copy` - `just fix -p codex-tui` - `cargo insta pending-snapshots` Note: `cargo test -p codex-tui` currently fails on unrelated model catalog and snapshot drift around the default model changing to `gpt-5.4`; the focused `/copy` tests pass after fixing the new test setup.	2026-04-20 19:24:10 -03:00
Tom	46e5814f77	Add experimental remote thread store config (#18714 ) Add experimental config to use remote thread store rather than local thread store implementation in app server	2026-04-20 22:20:39 +00:00
Ahmed Ibrahim	cc96a03f10	Fix stale model test fixtures (#18719 ) Fixes stale test fixtures left after the active bundled model catalog updates in #18586 and #18388. Those changes made `gpt-5.4` the current default and removed several older hardcoded slugs, which left Windows Bazel shards failing TUI and config tests. What changed: - Refresh TUI model migration, availability NUX, plan-mode, status, and snapshot fixtures to use active bundled model slugs. - Update the config edit test expectation for the TOML-quoted `"gpt-5.2"` migration key. - Move the model catalog tests into `codex-rs/tui/src/app/tests/model_catalog.rs` so touching them does not trip the blob-size policy for `app.rs`. Verification: - CI Bazel/lint checks are expected to cover the affected test shards.	2026-04-20 21:52:30 +00:00
Eric Traut	baa5dd7b29	Surface TUI skills refresh failures (#18627 ) ## Why `skills/list` refreshes are best-effort metadata updates. If one fails during startup or thread switching, the TUI should keep running and show enough detail to diagnose the app-server failure instead of leaving the user with only a log entry. This addresses the recoverability and observability issue reported in #16914. ## What Changed - Preserve the full startup `skills/list` error chain before sending it back through the app event queue. - Surface failed skills refreshes as recoverable TUI error messages while still logging the warning. This is related to the recent bug fix from [PR #18370](https://github.com/openai/codex/pull/18370).	2026-04-20 14:43:04 -07:00
guinness-oai	126bd6e7a8	Update realtime handoff transcript handling (#18597 ) ## Summary This PR aims to improve integration between the realtime model and the codex agent by sharing more context with each other. In particular, we now share full realtime conversation transcript deltas in addition to the delegation message. realtime_conversation.rs now turns a handoff into: ``` <realtime_delegation> <input>...</input> <transcript_delta>...</transcript_delta> </realtime_delegation> ``` ## Implementation notes The transcript is accumulated in the realtime websocket layer as parsed realtime events arrive. When a background-agent handoff is requested, the current transcript snapshot is copied onto the handoff event and then serialized by `realtime_conversation.rs` into the hidden realtime delegation envelope that Codex receives as user-turn context. For Realtime V2, the session now explicitly enables input audio transcription, and the parser handles the relevant input/output transcript completion events so the snapshot includes both user speech and realtime model responses. The delegation `<input>` remains the actual handoff request, while `<transcript_delta>` carries the surrounding conversation history for context. Reviewers should note that the transcript payload is intended for Codex context sharing, not UI rendering. The realtime delegation envelope should stay hidden from the user-facing transcript surface, while still being included in the background-agent turn so Codex can answer with the same conversational context the realtime model had.	2026-04-20 14:04:09 -07:00
Dylan Hurd	14ebfbced9	chore(guardian) disable mcps and plugins (#18722 ) ## Summary Disables apps, plugins, mcps for the guardian subagent thread ## Testing - [x] Added unit tests	2026-04-20 13:43:50 -07:00
rhan-oai	7f53e47250	[codex-analytics] guardian review analytics schema polishing (#17692 ) ## Why Guardian review analytics needs a Rust event shape that matches the backend schema while avoiding unnecessary PII exposure from reviewed tool calls. This PR narrows the analytics payload to the fields we intend to emit and keeps shared Guardian assessment enums in protocol instead of duplicating equivalent analytics-only enums. ## What changed - Uses protocol Guardian enums directly for `risk_level`, `user_authorization`, `outcome`, and command source values. - Removes high-risk reviewed-action fields from the analytics payload, including raw commands, display strings, working directories, file paths, network targets/hosts, justification text, retry reason, and rationale text. - Makes `target_item_id` and `tool_call_count` nullable so the Codex event can represent cases where the app-server protocol or producer does not have those values. - Keeps lower-risk structured reviewed-action metadata such as sandbox permissions, permission profile, `tty`, `execve` source/program, network protocol/port, and MCP connector/tool labels. - Adds an analytics reducer/client test covering `codex_guardian_review` serialization with an optional `target_item_id` and absent removed fields. ## Verification - `cargo test -p codex-analytics guardian_review_event_ingests_custom_fact_with_optional_target_item` - `cargo fmt --check` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17692). * #17696 * #17695 * #17693 * __->__ #17692	2026-04-20 13:08:17 -07:00
caseysilver-oai	fe04d75e0f	[codex] Fix high severity dependency alerts (#18167 ) ## Summary - Pin vulnerable npm dependencies through the existing root `resolutions` mechanism so the lockfile moves only to patched versions. - Refresh `pnpm-lock.yaml` for `@modelcontextprotocol/sdk`, `handlebars`, `path-to-regexp`, `picomatch`, `minimatch`, `flatted`, `rollup`, and `glob`. - Bump `quinn-proto` from `0.11.13` to `0.11.14` and refresh `MODULE.bazel.lock`. ## Testing - `corepack pnpm --store-dir .pnpm-store install --frozen-lockfile --ignore-scripts` - `corepack pnpm audit --audit-level high` (passes; remaining advisories are low/moderate) - `corepack pnpm -r --filter ./sdk/typescript run build` - `corepack pnpm exec eslint 'src/*/.ts' 'tests/*/.ts'` - `cargo check --locked` - `cargo build -p codex-cli` - `bazel --output_user_root=/tmp/bazel-codex-dependabot --ignore_all_rc_files mod deps --lockfile_mode=error` - `just fmt` Note: `corepack pnpm -r --filter ./sdk/typescript run test` was also attempted after building `codex`; it is blocked on this workstation by host-managed Codex MDM/auth state (`approval_policy` restrictions and ChatGPT/API-key mismatch), not by this dependency change.	2026-04-20 11:59:50 -07:00
github-actions[bot]	4676cb5ff8	Update models.json (#18388 ) Automated update of models.json. --------- Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2026-04-20 11:46:52 -07:00
Adrian	6b17adc231	[codex] Fix agent identity auth test fixture (#18697 ) ## Summary - Add the missing `background_task_id: None` field to the `AgentIdentityAuthRecord` fixture introduced in `auth_tests.rs`. ## Why - Current `main` fails Bazel/rust-ci compile paths after the background-task auth field landed and a later auth test fixture constructed `AgentIdentityAuthRecord` without that new field. - I intentionally removed the earlier broader CI-stability edits from this PR. The code-mode timeout, external-agent migration snapshot, and MCP resource timeout failures appear to be general/flaky or unrelated to the agent identity merge stack rather than cleanly caused by it. ## Validation - `cargo test -p codex-login dummy_chatgpt_auth_does_not_create_cwd_auth_json_when_identity_is_set -- --nocapture` - `just fmt`	2026-04-20 11:05:58 -07:00
Eric Traut	164b6a0c78	Remove simple TUI legacy_core reexports (#18631 ) ## Problem The TUI still imported path utilities and config-loader symbols through app-server-client's legacy_core facade even though those APIs already exist in utility/config crates. This is part of our ongoing effort to whittle away at these old dependencies. ## Solution Rewire imports to avoid the TUI directly importing from the core crate and instead import from common lower-level crates. This PR doesn't include any functional changes; it's just a simple rewiring.	2026-04-20 10:48:27 -07:00
Akshay Nathan	34a3e85fcd	Wire the PatchUpdated events through app_server (#18289 ) Wires patch_updated events through app_server. These events are parsed and streamed while apply_patch is being written by the model. Also adds 500ms of buffering to the patch_updated events in the diff_consumer. The eventual goal is to use this to display better progress indicators in the codex app.	2026-04-20 10:44:03 -07:00
Ahmed Ibrahim	316cf0e90b	Update models.json (#18586 ) - Replace the active models-manager catalog with the deleted core catalog contents. - Replace stale hardcoded test model slugs with current bundled model slugs. - Keep this as a stacked change on top of the cleanup PR.	2026-04-20 10:27:01 -07:00
Michael Bolin	5d5d610740	refactor: use semaphores for async serialization gates (#18403 ) This is the second cleanup in the await-holding lint stack. The higher-level goal, following https://github.com/openai/codex/pull/18178 and https://github.com/openai/codex/pull/18398, is to enable Clippy coverage for guards held across `.await` points without carrying broad suppressions. The stack is working toward enabling Clippy's [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock) lint and the configurable [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type) lint for Tokio guard types. Several existing fields used `tokio::sync::Mutex<()>` only as one-at-a-time async gates. Those guards intentionally lived across `.await` while an operation was serialized. A mutex over `()` suggests protected data and trips the await-holding lint shape; a single-permit `tokio::sync::Semaphore` expresses the intended serialization directly. ## What changed - Replace `Mutex<()>` serialization gates with `Semaphore::new(1)` for agent identity ensure, exec policy updates, guardian review session reuse, plugin remote sync, managed network proxy refresh, auth token refresh, and RMCP session recovery. - Update call sites from `lock().await` / `try_lock()` to `acquire().await` / `try_acquire()`. - Map closed-semaphore errors into the existing local error types, even though these semaphores are owned for the lifetime of their managers. - Update session test builders for the new `managed_network_proxy_refresh_lock` type. ## Verification - The split stack was verified at the final lint-enabling head with `just clippy`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18403). * #18698 * #18423 * #18418 * __->__ #18403	2026-04-20 17:21:29 +00:00
Michael Bolin	dcec516313	protocol: canonicalize file system permissions (#18274 ) ## Why `PermissionProfile` needs stable, canonical file-system semantics before it can become the primary runtime permissions abstraction. Without a canonical form, callers have to keep re-deriving legacy sandbox maps and profile comparisons remain lossy or order-dependent. ## What changed This adds canonicalization helpers for `FileSystemPermissions` and `PermissionProfile`, expands special paths into explicit sandbox entries, and updates permission request/conversion paths to consume those canonical entries. It also tightens the legacy bridge so root-wide write profiles with narrower carveouts are not silently projected as full-disk legacy access. ## Verification - `cargo test -p codex-protocol root_write_with_read_only_child_is_not_full_disk_write -- --nocapture` - `cargo test -p codex-sandboxing permission -- --nocapture` - `cargo test -p codex-tui permissions -- --nocapture`	2026-04-20 09:57:03 -07:00
Tom	ac7c9a685f	codex: move unloaded thread writes into store (#18361 ) - Migrates unloaded `thread/name/set` and `thread/memoryModeSet` app-server writes behind the generic `ThreadStore::update_thread_metadata` API rather than adding one-off store methods for setting thread name or memory mode. - Implements the local ThreadStore metadata patch path for thread name and memory mode, including rollout append, legacy name index updates, SessionMeta validation/update, SQLite reconciliation, and re-reading the stored thread. - Adds focused local thread-store unit coverage plus app-server integration coverage for the migrated unloaded write paths.	2026-04-20 09:50:01 -07:00
Eric Traut	0dc503ba6e	Surface parent thread status in side conversations (#18591 ) ## Summary Side conversations can hide important state changes from the parent conversation while the user is focused on the side thread. In particular, the parent may finish, fail, need user input, or require an approval while the side conversation remains visible. Users need a lightweight signal for those states, but parent approval overlays should not interrupt the side conversation itself. This change adds parent-conversation status to the side conversation context label and defers parent interactive overlays while side mode is active. When the user exits side mode, pending parent approvals and input requests are restored in the main thread. The pending approval footer avoids duplicating the same parent approval status, and replayed notice cells are filtered when restoring a pending interactive request so tips or warnings do not crowd out the approval prompt. The change is contained to the TUI side-conversation and thread replay paths. Example 1: Approval pending <img width="752" height="35" alt="Screenshot 2026-04-19 at 12 56 07 PM" src="https://github.com/user-attachments/assets/1cc0f1a3-9cab-4d60-aed2-96523ccafc20" /> Example 2: Turn complete <img width="754" height="35" alt="Screenshot 2026-04-19 at 12 56 27 PM" src="https://github.com/user-attachments/assets/653521a5-e298-4366-ae1c-72b56eb88eeb" />	2026-04-20 09:00:44 -07:00
Eric Traut	43a69c50eb	Use app server thread names in TUI picker (#18633 ) ## Problem The TUI resume/fork picker was backfilling thread names from local rollout indexes. This was left over from before the TUI was moved to the app server. It should be using app-server APIs because the TUI might be connected to a remote connection. This bug wasn't (yet) reported by a user. I found it by asking Codex to review places in the TUI code where it was still directly accessing the CODEX_HOME directory rather than going through app-server APIs. ## Solution The resume picker and session lookups should use app-server thread APIs only. Remove legacy rollout name/list backfills, and avoid local name reads in fork history. ## Testing I manually tested `codex resume` and `codex resume --all` to look for functional or performance regressions in the resume picker.	2026-04-20 08:16:24 -07:00
Eric Traut	5a8700abcc	Add verbose diagnostics for /mcp (#18610 ) Fixes #18539. ## Summary The recent `/mcp` performance work kept the default command fast by avoiding resource and resource-template inventory probes, but it also removed useful diagnostics for users trying to confirm MCP server state. This keeps bare `/mcp` on the fast tools/auth path and adds `/mcp verbose` for the slower diagnostic view. Verbose mode requests full MCP server status from the app-server and restores status, resources, and resource templates in the TUI output. ## Testing In addition to running automation, I manually tested the feature to confirm that it works.	2026-04-20 08:13:44 -07:00
jif-oai	e53e6bc48f	fix: auth.json leak in tests (#18657 ) Before this some tests were leaking an auth.json file into `codex-rs/core`. This just fixes it	2026-04-20 15:35:28 +01:00
Adrian	19e2f21827	[codex] Use background task auth for additional backend calls (#18260 ) ## Summary Splits the larger PR4.1 background task auth rollout by moving additional backend/control-plane call sites into this downstream PR. This PR keeps callers on the same design as PR4.1: most code asks `AuthManager` for the default ChatGPT backend authorization header, and `AuthManager` decides bearer vs background AgentAssertion internally. Task-pinned inference auth remains separate because it needs the thread's registered task id. ## Stack - PR1: https://github.com/openai/codex/pull/17385 - add `features.use_agent_identity` - PR2: https://github.com/openai/codex/pull/17386 - register agent identities when enabled - PR3: https://github.com/openai/codex/pull/17387 - register agent tasks when enabled - PR3.1: https://github.com/openai/codex/pull/17978 - persist and prewarm registered tasks per thread - PR4: https://github.com/openai/codex/pull/17980 - use task-scoped `AgentAssertion` for downstream calls - PR4.1: https://github.com/openai/codex/pull/18094 - introduce AuthManager-owned background/control-plane `AgentAssertion` auth - PR4.2: this PR - use background task auth for additional backend/control-plane calls ## What Changed - pass full authorization header values through backend-client and cloud-tasks-client call paths where needed - move ChatGPT client, cloud requirements, cloud tasks, thread-manager, and models-manager background auth usage into this downstream slice - make app-server remote control enrollment/websocket auth ask `AuthManager` for the local backend authorization header instead of threading a background auth mode through transport options - keep the same feature-gated bearer fallback behavior from PR4.1 ## Validation - `just fmt` - `cargo check -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `cargo test -p codex-login agent_identity` - `cargo test -p codex-model-provider bearer_auth_provider` - `cargo test -p codex-core agent_assertion` - `cargo test -p codex-app-server remote_control` - `cargo test -p codex-cloud-requirements fetch_cloud_requirements` - `cargo test -p codex-models-manager manager::tests` - `cargo test -p codex-chatgpt` - `cargo test -p codex-cloud-tasks` - `just fix -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `just fix -p codex-app-server` - `git diff --check`	2026-04-20 07:24:29 -07:00
Eric Traut	fa0e2ba87c	Avoid false shell snapshot cleanup warnings (#18441 ) ## Why Fresh app-server thread startup can create a shell snapshot through a temp file and then promote it to the final snapshot path. The previous implementation briefly wrapped the temp path in `ShellSnapshot`, so after a successful rename its `Drop` attempted to delete the old temp path and could log a false `ENOENT` warning. Fixes #17549. ## What changed - Validate the temp snapshot path directly before promotion. - Rename the temp path directly to the final snapshot path. - Keep explicit cleanup of the temp path on validation or finalization failures.	2026-04-20 15:15:05 +01:00
Adrian	904c751a40	[codex] Use background agent task auth for backend calls (#18094 ) ## Summary Introduces a single background/control-plane agent task for ChatGPT backend requests that do not have a thread-scoped task, with `AuthManager` owning the default ChatGPT backend authorization decision. Callers now ask `AuthManager` for the default ChatGPT backend authorization header. `AuthManager` decides whether that is bearer or background AgentAssertion based on config/internal state, while low-level bootstrap paths can explicitly request bearer-only auth. This PR is stacked on PR4 and focuses on the shared background task auth plumbing plus the first tranche of backend/control-plane consumers. The remaining callsite wiring is split into PR4.2 to keep review size down. ## Stack - PR1: https://github.com/openai/codex/pull/17385 - add `features.use_agent_identity` - PR2: https://github.com/openai/codex/pull/17386 - register agent identities when enabled - PR3: https://github.com/openai/codex/pull/17387 - register agent tasks when enabled - PR3.1: https://github.com/openai/codex/pull/17978 - persist and prewarm registered tasks per thread - PR4: https://github.com/openai/codex/pull/17980 - use task-scoped `AgentAssertion` for downstream calls - PR4.1: this PR - introduce AuthManager-owned background/control-plane `AgentAssertion` auth - PR4.2: https://github.com/openai/codex/pull/18260 - use background task auth for additional backend/control-plane calls ## What Changed - add background task registration and assertion minting inside `codex-login` - persist `agent_identity.background_task_id` separately from per-session task state - make `BackgroundAgentTaskManager` private to `codex-login`; call sites do not instantiate or pass it around - teach `AuthManager` the ChatGPT backend base URL and feature-derived background auth mode from resolved config - expose bearer-only helpers for bootstrap/registration/refresh-style paths that must not use AgentAssertion - wire `AuthManager` default ChatGPT authorization through app listing, connector directory listing, remote plugins, MCP status/listing, analytics, and core-skills remote calls - preserve bearer fallback when the feature is disabled, the backend host is unsupported, or background task registration is not available ## Validation - `just fmt` - `cargo check -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `cargo test -p codex-login agent_identity` - `cargo test -p codex-model-provider bearer_auth_provider` - `cargo test -p codex-core agent_assertion` - `cargo test -p codex-app-server remote_control` - `cargo test -p codex-cloud-requirements fetch_cloud_requirements` - `cargo test -p codex-models-manager manager::tests` - `cargo test -p codex-chatgpt` - `cargo test -p codex-cloud-tasks` - `just fix -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `just fix -p codex-app-server` - `git diff --check`	2026-04-20 06:50:28 -07:00
jif-oai	e1c289e11b	feat: log client use min log level (#18661 ) In the log client, use the log level filter as a minimum severity instead of exact match --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 14:40:39 +01:00

... 4 5 6 7 8 ...

5858 Commits