codex

mirror of https://github.com/openai/codex.git synced 2026-05-04 05:11:37 +03:00

Author	SHA1	Message	Date
jif-oai	e8add54e5d	feat: show effective model in spawn agent event (#14944 ) Show effective model after the full config layering for the sub agent	2026-03-17 16:58:58 +00:00
daveaitel-openai	ef36d39199	Fix agent jobs finalization race and reduce status polling churn (#14843 ) ## Summary - make `report_agent_job_result` atomically transition an item from running to completed while storing `result_json` - remove brittle finalization grace-sleep logic and make finished-item cleanup idempotent - replace blind fixed-interval waiting with status-subscription-based waiting for active worker threads - add state runtime tests for atomic completion and late-report rejection ## Why This addresses the race and polling concerns in #13948 by removing timing-based correctness assumptions and reducing unnecessary status polling churn. ## Validation - `cd codex-rs && just fmt` - `cd codex-rs && cargo test -p codex-state` - `cd codex-rs && cargo test -p codex-core --test all suite::agent_jobs` - `cd codex-rs && cargo test` - fails in an unrelated app-server tracing test: `message_processor::tracing_tests::thread_start_jsonrpc_span_exports_server_span_and_parents_children` timed out waiting for response ## Notes - This PR supersedes #14129 with the same agent-jobs fix on a clean branch from `main`. - The earlier PR branch was stacked on unrelated history, which made the review diff include unrelated commits. Fixes #13948	2026-03-17 10:40:14 -04:00
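The atomic running-to-completed transition described in #14843 can be illustrated with a conditional update. This is a minimal sketch using SQLite, not the actual `codex-state` implementation; the `job_items` table, its columns, and the `report_result` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def report_result(conn: sqlite3.Connection, item_id: str, result_json: str) -> bool:
    """Move an item from running to completed and store its result in one statement.

    Returns False when the item was not running, so a late report is rejected
    instead of overwriting an earlier result.
    """
    cur = conn.execute(
        "UPDATE job_items SET status = 'completed', result_json = ? "
        "WHERE id = ? AND status = 'running'",
        (result_json, item_id),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_items (id TEXT PRIMARY KEY, status TEXT NOT NULL, result_json TEXT)"
)
conn.execute("INSERT INTO job_items VALUES ('item-1', 'running', NULL)")

first = report_result(conn, "item-1", '{"ok": true}')   # accepted
late = report_result(conn, "item-1", '{"ok": false}')   # rejected: already completed
```

Because the status check and the write happen in the same statement, correctness does not depend on sleeps or polling intervals.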
jif-oai	4ed19b0766	feat: rename to get more explicit close agent (#14935 ) https://github.com/openai/codex/issues/14907	2026-03-17 14:37:20 +00:00
jif-oai	31648563c8	feat: centralize package manager version (#14920 )	2026-03-17 12:03:07 +00:00
viyatb-oai	db7e02c739	fix: canonicalize symlinked Linux sandbox cwd (#14849 ) ## Problem On Linux, Codex can be launched from a workspace path that is a symlink (for example, a symlinked checkout or a symlinked parent directory). Our sandbox policy intentionally canonicalizes writable/readable roots to the real filesystem path before building the bubblewrap mounts. That part is correct and needed for safety. The remaining bug was that bubblewrap could still inherit the helper process's logical cwd, which might be the symlinked alias instead of the mounted canonical path. In that case, the sandbox starts in a cwd that does not exist inside the sandbox namespace even though the real workspace is mounted. This can cause sandboxed commands to fail in symlinked workspaces. ## Fix This PR keeps the sandbox policy behavior the same, but separates two concepts that were previously conflated: - the canonical cwd used to define sandbox mounts and permissions - the caller's logical cwd used when launching the command On the Linux bubblewrap path, we now thread the logical command cwd through the helper explicitly and only add `--chdir <canonical path>` when the logical cwd differs from the mounted canonical path. That means: - permissions are still computed from canonical paths - bubblewrap starts the command from a cwd that definitely exists inside the sandbox - we do not widen filesystem access or undo the earlier symlink hardening ## Why This Is Safe This is a narrow Linux-only launch fix, not a policy change. - Writable/readable root canonicalization stays intact. - Protected metadata carveouts still operate on canonical roots. - We only override bubblewrap's inherited cwd when the logical path would otherwise point at a symlink alias that is not mounted in the sandbox. 
## Tests - kept the existing protocol/core regression coverage for symlink canonicalization - added regression coverage for symlinked cwd handling in the Linux bubblewrap builder/helper path Local validation: - `just fmt` - `cargo test -p codex-protocol` - `cargo test -p codex-core normalize_additional_permissions_canonicalizes_symlinked_write_paths` - `cargo clippy -p codex-linux-sandbox -p codex-protocol -p codex-core --tests -- -D warnings` - `cargo build --bin codex` ## Context This is related to #14694. The earlier writable-root symlink fix addressed the mount/permission side; this PR fixes the remaining symlinked-cwd launch mismatch in the Linux sandbox path.	2026-03-16 22:39:18 -07:00
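The cwd rule from #14849 (add `--chdir <canonical path>` only when the logical cwd differs from the mounted canonical path) can be sketched as follows. This is an illustrative sketch, not the code in `codex-linux-sandbox`; the helper name `bwrap_chdir_args` is hypothetical, and the demo assumes a platform where `os.symlink` is permitted.

```python
import os
import tempfile


def bwrap_chdir_args(logical_cwd: str) -> list[str]:
    """Return extra bubblewrap arguments for a possibly symlinked cwd."""
    canonical = os.path.realpath(logical_cwd)
    if os.path.abspath(logical_cwd) == canonical:
        # The logical cwd is already the mounted path; inherit it as-is.
        return []
    # The logical cwd is a symlink alias that is not mounted in the sandbox.
    return ["--chdir", canonical]


# Demo: a real workspace directory and a symlink alias pointing at it.
base = os.path.realpath(tempfile.mkdtemp())
workspace = os.path.join(base, "workspace")
alias = os.path.join(base, "alias")
os.mkdir(workspace)
os.symlink(workspace, alias)

direct_args = bwrap_chdir_args(workspace)
alias_args = bwrap_chdir_args(alias)
```

Permissions are still derived from the canonical path only; the sketch changes where the command starts, not what it can access.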
Ahmed Ibrahim	79f476e47d	[stack 3/4] Add current thread context to realtime startup (#14829 ) ## Stack Position 3/4. Top-of-stack sibling built on #14830. ## Base - #14830 ## Sibling - #14827 ## Scope - Extend the realtime startup context with a bounded summary of the latest thread turns for continuity. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 05:11:05 +00:00
Thibault Sottiaux	8e34caffcc	[codex] add Jason as a predefined subagent name (#14881 ) This change adds Jason to codex-core's built-in subagent nickname pool so spawned agents can pick it without any custom role configuration. The default list was simply missing that predefined name (a grave mistake).	2026-03-16 22:01:14 -07:00
xl-openai	e5a28ba0c2	fix: align marketplace display name with existing interface conventions (#14886 ) 1. camelCase for displayName; 2. move displayName under interface.	2026-03-16 21:52:19 -07:00
Ahmed Ibrahim	fbd7f9b986	[stack 2/4] Align main realtime v2 wire and runtime flow (#14830 ) ## Stack Position 2/4. Built on top of #14828. ## Base - #14828 ## Unblocks - #14829 - #14827 ## Scope - Port the realtime v2 wire parsing, session, app-server, and conversation runtime behavior onto the split websocket-method base. - Branch runtime behavior directly on the current realtime session kind instead of parser-derived flow flags. - Keep regression coverage in the existing e2e suites. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-16 21:38:07 -07:00
xl-openai	1d85fe79ed	feat: support remote_sync for plugin install/uninstall. (#14878 ) - Added forceRemoteSync to plugin/install and plugin/uninstall. - With forceRemoteSync=true, we update the remote plugin status first, then apply the local change only if the backend call succeeds. - Kept plugin/list(forceRemoteSync=true) as the main recon path, and for now it treats remote enabled=false as uninstall. We will eventually migrate to plugin/installed for more precise state handling.	2026-03-16 21:37:27 -07:00
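The remote-first ordering described for `forceRemoteSync` in #14878 can be sketched as below. This is only an illustration of the ordering; `install_plugin`, the `installed` set, and the `remote_enable` callback are hypothetical stand-ins and not the app-server API.

```python
def install_plugin(name, installed, remote_enable, force_remote_sync=False):
    """Apply the local install only after the remote update succeeds.

    `remote_enable` is a hypothetical callback that returns True on success.
    """
    if force_remote_sync and not remote_enable(name):
        # Backend call failed: leave local state untouched.
        return False
    installed.add(name)
    return True


installed = set()
failed = install_plugin("demo", installed, lambda name: False, force_remote_sync=True)
after_failure = set(installed)
ok = install_plugin("demo", installed, lambda name: True, force_remote_sync=True)
```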
xl-openai	49c2b66ece	Add marketplace display names to plugin/list (#14861 ) Add display_name support to marketplace.json.	2026-03-16 19:04:40 -07:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. 
## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
pakrym-oai	a3ba10b44b	Add exit helper to code mode scripts (#14851 ) - Summary - expose `exit` through the code mode bridge and module so scripts can stop mid-flight - surface the helper in the description documentation - add a regression test ensuring `exit()` terminates execution cleanly - Testing - Not run (not requested)	2026-03-16 22:07:58 +00:00
Andi Liu	4c9dbc1f88	memories: exclude AGENTS and skills from stage1 input (#14268 ) ###### Why/Context/Summary - Exclude injected AGENTS.md instructions and standalone skill payloads from memory stage 1 inputs so memory generation focuses on conversation content instead of prompt scaffolding. - Strip only the AGENTS fragment from mixed contextual user messages during stage-1 serialization, which preserves environment context in the same message. - Keep subagent notifications in the memory input, and add focused unit coverage for the fragment classifier, rollout policy, and stage-1 serialization path. ###### Test plan - `just fmt` - `cargo test -p codex-core --lib contextual_user_message` - `cargo test -p codex-core --lib rollout::policy` - `cargo test -p codex-core --lib memories::phase1`	2026-03-16 19:30:38 +00:00
Anton Panasenko	663dd3f935	fix(core): fix sanitize name to use '_' everywhere (#14833 )	2026-03-16 12:22:10 -07:00
Eric Traut	db89b73a9c	Move TUI on top of app server (parallel code) (#14717 ) This PR replicates the `tui` code directory and creates a temporary parallel `tui_app_server` directory. It also implements a new feature flag `tui_app_server` to select between the two tui implementations. Once the new app-server-based TUI is stabilized, we'll delete the old `tui` directory and feature flag.	2026-03-16 10:49:19 -06:00
jif-oai	3f266bcd68	feat: make interrupt state not final for multi-agents (#13850 ) Make `interrupted` an agent state and make it not final. As a result, a `wait` won't return on an interrupted agent and no notification will be sent to the parent agent. The rationales are: * If a user interrupts a sub-agent for any reason, you don't want the parent agent to instantaneously ask the sub-agent to restart * If a parent agent interrupts a sub-agent, there is no need to add a noisy notification in the parent agent	2026-03-16 16:39:40 +00:00
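The effect of treating `interrupted` as a non-final state (#13850) can be sketched with a small state enum. Only `interrupted` comes from the commit message; the other state names and the `wait_should_return` helper are assumptions made for illustration.

```python
from enum import Enum


class AgentStatus(Enum):
    # Only INTERRUPTED is named in the commit; the rest are hypothetical.
    RUNNING = "running"
    INTERRUPTED = "interrupted"
    COMPLETED = "completed"
    ERRORED = "errored"


FINAL_STATES = {AgentStatus.COMPLETED, AgentStatus.ERRORED}


def wait_should_return(status: AgentStatus) -> bool:
    """A wait resolves (and the parent is notified) only on a final state."""
    return status in FINAL_STATES


interrupted_returns = wait_should_return(AgentStatus.INTERRUPTED)
completed_returns = wait_should_return(AgentStatus.COMPLETED)
```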
jif-oai	18ad67549c	feat: improve skills cache key to take into account config layering (#14806 ) Fix https://github.com/openai/codex/issues/14161 This fixes sub-agent [[skills.config]] overrides being ignored when parent and child share the same cwd. The root cause was that turn skill loading rebuilt from cwd-only state and reused a cwd-scoped cache, so role-local skill enable/disable overrides did not reliably affect the spawned agent's effective skill set. This change switches turn construction to use the effective per-turn config and adds a config-aware skills cache keyed by skill roots plus final disabled paths.	2026-03-16 16:12:44 +00:00
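The cache-key change in #14806 (key by skill roots plus final disabled paths instead of cwd alone) can be sketched as follows. The function name, the key layout, and the example paths are hypothetical; the sketch assumes root order is significant and disabled-path order is not.

```python
def skills_cache_key(skill_roots, disabled_paths):
    """Build a hashable key from the effective per-turn skill configuration."""
    return (tuple(skill_roots), frozenset(disabled_paths))


# Parent and child share the same cwd, hence the same roots,
# but the child's role disables one skill.
roots = ["/repo/.codex/skills"]
parent_key = skills_cache_key(roots, [])
child_key = skills_cache_key(roots, ["/repo/.codex/skills/deploy"])

cache = {parent_key: "parent skill set"}
child_hits_parent_entry = child_key in cache
```

With a cwd-only key the child would have reused the parent's entry; with the config-aware key it does not.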
jif-oai	33acc1e65f	fix: sub-agent role when using profiles (#14807 ) Fix the layering conflict when a project profile is used with agents. This PR cleans up the config layering and makes sure the agent config takes precedence over the project profile. Fixes https://github.com/openai/codex/issues/13849, https://github.com/openai/codex/issues/14671	2026-03-16 16:08:16 +00:00
Matthew Zeng	029aab5563	fix(core): preserve tool_params for elicitations (#14769 ) - [x] Preserve tool_params keys.	2026-03-15 23:15:52 -07:00
Charley Cunningham	6fdeb1d602	Reuse guardian session across approvals (#14668 ) ## Summary - reuse a guardian subagent session across approvals so reviews keep a stable prompt cache key and avoid one-shot startup overhead - clear the guardian child history before each review so prior guardian decisions do not leak into later approvals - include the `smart_approvals` -> `guardian_approval` feature flag rename in the same PR to minimize release latency on a very tight timeline - add regression coverage for prompt-cache-key reuse without prior-review prompt bleed ## Request - Bug/enhancement request: internal guardian prompt-cache and latency improvement request --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-15 22:56:18 -07:00
friel-openai	ba463a9dc7	Preserve background terminals on interrupt and rename cleanup command to /stop (#14602 ) ### Motivation - Interrupting a running turn (Ctrl+C / Esc) currently also terminates long‑running background shells, which is surprising for workflows like local dev servers or file watchers. - The existing cleanup command name was confusing; callers expect an explicit command to stop background terminals rather than a UI clear action. - Make background‑shell termination explicit and surface a clearer command name while preserving backward compatibility. ### Description - Renamed the background‑terminal cleanup slash command from `Clean` (`/clean`) to `Stop` (`/stop`) and kept `clean` as an alias in the command parsing/visibility layer, updated the user descriptions and command popup wiring accordingly. - Updated the unified‑exec footer text and snapshots to point to `/stop` (and trimmed corresponding snapshot output to match the new label). - Changed interrupt behavior so `Op::Interrupt` (Ctrl+C / Esc interrupt) no longer closes or clears tracked unified exec / background terminal processes in the TUI or core cleanup path; background shells are now preserved after an interrupt. - Updated protocol/docs to clarify that `turn/interrupt` (or `Op::Interrupt`) interrupts the active turn but does not terminate background terminals, and that `thread/backgroundTerminals/clean` is the explicit API to stop those shells. - Updated unit/integration tests and insta snapshots in the TUI and core unified‑exec suites to reflect the new semantics and command name. ### Testing - Ran formatting with `just fmt` in `codex-rs` (succeeded). - Ran `cargo test -p codex-protocol` (succeeded). 
- Attempted `cargo test -p codex-tui` but the build could not complete in this environment due to a native build dependency that requires `libcap` development headers (the `codex-linux-sandbox` vendored build step); install `libcap-dev` / make `libcap.pc` available in `PKG_CONFIG_PATH` to run the TUI test suite locally. - Updated and accepted the affected `insta` snapshots for the TUI changes so visual diffs reflect the new `/stop` wording and preserved interrupt behavior. ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69b39c44b6dc8323bd133ae206310fae)	2026-03-15 22:17:25 -07:00
Matthew Zeng	d4af6053e2	[apps] Improve search tool fallback. (#14732 ) - [x] Bypass tool search and stuff tool specs directly into model context when either a. Tool search is not available for the model or b. There are not that many tools to search for.	2026-03-15 21:41:55 -07:00
Matthew Zeng	49edf311ac	[apps] Add tool call meta. (#14647 ) - [x] Add resource_uri and other things to _meta to shortcut resource lookup and speed things up.	2026-03-14 22:24:13 -07:00
Colin Young	d692b74007	Add auth 401 observability to client bug reports (#14611 ) CXC-392 [With 401](https://openai.sentry.io/issues/7333870443/?project=4510195390611458&query=019ce8f8-560c-7f10-a00a-c59553740674&referrer=issue-stream) <img width="1909" height="555" alt="401 auth tags in Sentry" src="https://github.com/user-attachments/assets/412ea950-61c4-4780-9697-15c270971ee3" /> - auth_401_: preserved facts from the latest unauthorized response snapshot - auth_: latest auth-related facts from the latest request attempt - auth_recovery_: unauthorized recovery state and follow-up result Without 401 <img width="1917" height="522" alt="happy-path auth tags in Sentry" src="https://github.com/user-attachments/assets/3381ed28-8022-43b0-b6c0-623a630e679f" /> ###### Summary - Add client-visible 401 diagnostics for auth attachment, upstream auth classification, and 401 request id / cf-ray correlation. - Record unauthorized recovery mode, phase, outcome, and retry/follow-up status without changing auth behavior. - Surface the highest-signal auth and recovery fields on uploaded client bug reports so they are usable in Sentry. - Preserve original unauthorized evidence under `auth_401_` while keeping follow-up result tags separate. ###### Rationale (from spec findings) - The dominant bucket needed proof of whether the client attached auth before send or upstream still classified the request as missing auth. - Client uploads needed to show whether unauthorized recovery ran and what the client tried next. - Request id and cf-ray needed to be preserved on the unauthorized response so server-side correlation is immediate. - The bug-report path needed the same auth evidence as the request telemetry path, otherwise the observability would not be operationally useful. ###### Scope - Add auth 401 and unauthorized-recovery observability in `codex-rs/core`, `codex-rs/codex-api`, and `codex-rs/otel`, including feedback-tag surfacing. 
- Keep auth semantics, refresh behavior, retry behavior, endpoint classification, and geo-denial follow-up work out of this PR. ###### Trade-offs - This exports only safe auth evidence: header presence/name, upstream auth classification, request ids, and recovery state. It does not export token values or raw upstream bodies. - This keeps websocket connection reuse as a transport clue because it can help distinguish stale reused sessions from fresh reconnects. - Misroute/base-url classification and geo-denial are intentionally deferred to a separate follow-up PR so this review stays focused on the dominant auth 401 bucket. ###### Client follow-up - PR 2 will add misroute/provider and geo-denial observability plus the matching feedback-tag surfacing. - A separate host/app-server PR should log auth-decision inputs so pre-send host auth state can be correlated with client request evidence. - `device_id` remains intentionally separate until there is a safe existing source on the feedback upload path. ###### Testing - `cargo test -p codex-core refresh_available_models_sorts_by_priority` - `cargo test -p codex-core emit_feedback_request_tags_` - `cargo test -p codex-core emit_feedback_auth_recovery_tags_` - `cargo test -p codex-core auth_request_telemetry_context_tracks_attached_auth_and_retry_phase` - `cargo test -p codex-core extract_response_debug_context_decodes_identity_headers` - `cargo test -p codex-core identity_auth_details` - `cargo test -p codex-core telemetry_error_messages_preserve_non_http_details` - `cargo test -p codex-core --all-features --no-run` - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability`	2026-03-14 15:38:51 -07:00
viyatb-oai	9060dc7557	fix: fix symlinked writable roots in sandbox policies (#14674 ) ## Summary - normalize effective readable, writable, and unreadable sandbox roots after resolving special paths so symlinked roots use canonical runtime paths - add a protocol regression test for a symlinked writable root with a denied child and update protocol expectations to canonicalized effective paths - update macOS seatbelt tests to assert against effective normalized roots produced by the shared policy helpers ## Testing - just fmt - cargo test -p codex-protocol - cargo test -p codex-core explicit_unreadable_paths_are_excluded_ - cargo clippy -p codex-protocol -p codex-core --tests -- -D warnings ## Notes - This is intended to fix the symlinked TMPDIR bind failure in bubblewrap described in #14672. Fixes #14672	2026-03-14 13:24:43 -07:00
Channing Conger	70eddad6b0	dynamic tool calls: add param `exposeToContext` to optionally hide tool (#14501 ) This extends dynamic_tool_calls to allow us to hide a tool from the model context but still use it as part of the general tool calling runtime (for ex from js_repl/code_mode)	2026-03-14 01:58:43 -07:00
sayan-oai	e389091042	make defaultPrompt an array, keep backcompat (#14649 ) make plugins' `defaultPrompt` an array, but keep backcompat for strings. the array is limited by app-server to 3 entries of up to 128 chars (drops extra entries, `None`s-out ones that are too long) without erroring if those invariants are violated. added tests, tested locally.	2026-03-14 06:13:51 +00:00
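The `defaultPrompt` normalization described in #14649 can be sketched as below. This is an illustration rather than the app-server code: the function name is hypothetical, and the sketch assumes extra entries are dropped before over-long ones are replaced with `None`, which the commit message does not specify.

```python
MAX_ENTRIES = 3   # limit stated in the commit message
MAX_CHARS = 128   # limit stated in the commit message


def normalize_default_prompt(value):
    """Accept a string (back-compat) or a list; never raise on violations."""
    if value is None:
        return []
    entries = [value] if isinstance(value, str) else list(value)
    entries = entries[:MAX_ENTRIES]  # drop extra entries
    # Replace over-long entries with None instead of erroring.
    return [e if len(e) <= MAX_CHARS else None for e in entries]


from_string = normalize_default_prompt("hi")
truncated = normalize_default_prompt(["a", "b", "c", "d"])
too_long = normalize_default_prompt(["x" * 129, "ok"])
```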
Eric Traut	ae0a6510e1	Enforce errors on overriding built-in model providers (#12024 ) We receive bug reports from users who attempt to override one of the three built-in model providers (openai, ollama, or lmstudio). Currently, these overrides are silently ignored. This PR makes it an error to override them. ## Summary - add validation for `model_providers` so `openai`, `ollama`, and `lmstudio` keys now produce clear configuration errors instead of being silently ignored	2026-03-13 22:10:13 -06:00
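The validation added in #12024 can be sketched as a check for reserved keys. The three provider names come from the commit message; the function name, the error type, and the error wording are hypothetical.

```python
BUILT_IN_PROVIDERS = ("openai", "ollama", "lmstudio")


def validate_model_providers(model_providers: dict) -> None:
    """Raise instead of silently ignoring overrides of built-in providers."""
    conflicts = sorted(k for k in model_providers if k in BUILT_IN_PROVIDERS)
    if conflicts:
        raise ValueError(
            "model_providers cannot override built-in providers: "
            + ", ".join(conflicts)
        )


validate_model_providers({"my-proxy": {"base_url": "http://localhost:8080"}})  # passes

try:
    validate_model_providers({"openai": {}, "my-proxy": {}})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```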
sayan-oai	d272f45058	move plugin/skill instructions into dev msg and reorder (#14609 ) Move the general `Apps`, `Skills` and `Plugins` instructions blocks out of `user_instructions` and into the developer message, with new `Apps -> Skills -> Plugins` order for better clarity. Also wrap those sections in stable XML-style instruction tags (like other sections) and update prompt-layout tests/snapshots. This makes the tests less brittle in snapshot output (we can parse the sections), and it consolidates the capability instructions in one place. #### Tests Updated snapshots, added tests. `<AGENTS_MD>` disappearing in snapshots is expected: before this change, the wrapped user-instructions message was kept alive by `Skills` content. Now that `Skills` and `Plugins` are in the developer message, that wrapper only appears when there is real project-doc/user-instructions content. --------- Co-authored-by: Charley Cunningham <ccunningham@openai.com>	2026-03-13 20:51:01 -07:00
viyatb-oai	7f571396c8	fix: sync split sandbox policies for spawned subagents (#14650 ) ## Summary - reapply the live split filesystem and network sandbox policies when building spawned subagent configs - keep spawned child sessions aligned with the parent turn after role-layer config reloads - add regression coverage for both config construction and spawned child-turn inheritance	2026-03-14 03:03:49 +00:00
viyatb-oai	6dc04df5e6	fix: persist future network host approvals across sessions (#14619 ) ## Summary - apply persisted execpolicy network rules when booting the managed network proxy - pass the current execpolicy into managed proxy startup so host approvals selected with "allow this host in the future" survive new sessions	2026-03-14 02:46:10 +00:00
Charley Cunningham	bbd329a812	Fix turn context reconstruction after backtracking (#14616 ) ## Summary - reuse rollout reconstruction when applying a backtrack rollback so `reference_context_item` is restored from persisted rollout state - build rollback replay from the flushed rollout items plus the rollback marker, avoiding the extra reread/fallback path - add regression coverage for rollback after compaction so turn-context diffing stays aligned after backtracking Co-authored-by: Codex <noreply@openai.com>	2026-03-13 19:28:31 -07:00
Ahmed Ibrahim	69c8a1ef9e	Fix Windows CI assertions for guardian and Smart Approvals (#14645 ) - Normalize guardian assessment path serialization to use forward slashes for cross-platform stability. - Seed workspace-write defaults in the Smart Approvals override-turn-context test so Windows and non-Windows selection flows are consistent. --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-14 02:15:58 +00:00
Eric Traut	4b9d5c8c1b	Add openai_base_url config override for built-in provider (#12031 ) We regularly get bug reports from users who mistakenly have the `OPENAI_BASE_URL` environment variable set. This PR deprecates this environment variable in favor of a top-level config key `openai_base_url` that is used for the same purpose. By making it a config key, it will be more visible to users. It will also participate in all of the infrastructure we've added for layered and managed configs. Summary - introduce the `openai_base_url` top-level config key, update schema/tests, and route the built-in openai provider through it - fall back to the deprecated `OPENAI_BASE_URL` env var, with a deprecation warning, when no `openai_base_url` config key is present - update CLI, SDK, and TUI code to prefer the new config path (with a deprecated env-var fallback) and document the SDK behavior change	2026-03-13 20:12:25 -06:00
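The lookup order described in #12031 (config key first, deprecated env var as a warned fallback) can be sketched as follows. The key and variable names come from the commit message; the function, its parameters, and the warning text are hypothetical, and returning `None` here simply means "use the provider's default".

```python
def resolve_openai_base_url(config: dict, env: dict, warn) -> "str | None":
    """Prefer the `openai_base_url` config key over the deprecated env var."""
    url = config.get("openai_base_url")
    if url:
        return url
    legacy = env.get("OPENAI_BASE_URL")
    if legacy:
        warn("OPENAI_BASE_URL is deprecated; set openai_base_url in config instead")
        return legacy
    return None


warnings = []
from_config = resolve_openai_base_url(
    {"openai_base_url": "https://config.example/v1"},
    {"OPENAI_BASE_URL": "https://env.example/v1"},
    warnings.append,
)
warnings_after_config = len(warnings)
from_env = resolve_openai_base_url({}, {"OPENAI_BASE_URL": "https://env.example/v1"}, warnings.append)
```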
Michael Bolin	b859a98e0f	refactor: make unified-exec zsh-fork state explicit (#14633 )

## Why

The unified-exec path was carrying zsh-fork state in a partially flattened way.

First, the decision about whether zsh-fork was active came from feature selection in `ToolsConfig`, while the real prerequisites lived in session state. That left the handler and runtime defending against partially configured cases later.

Second, once zsh-fork was active, its two runtime-only paths were threaded through the runtime as separate arguments even though they form one coherent piece of configuration.

This change keeps unified-exec on a single session-derived source of truth and bundles the zsh-fork-specific paths into a named config type so the runtime can pass them around as one unit. In particular, this PR introduces this enum so the `ZshFork` variant can carry the appropriate state with it:

```rust
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum UnifiedExecShellMode {
    Direct,
    ZshFork(ZshForkConfig),
}

#[derive(Debug, Clone, Eq, PartialEq)]
pub struct ZshForkConfig {
    pub(crate) shell_zsh_path: AbsolutePathBuf,
    pub(crate) main_execve_wrapper_exe: AbsolutePathBuf,
}
```

This cleanup was done in preparation for https://github.com/openai/codex/pull/13432.

## What Changed

- Replaced the feature-only `UnifiedExecBackendConfig` split with `UnifiedExecShellMode` in `codex-rs/core/src/tools/spec.rs`.
- Derived the unified-exec mode from session-backed inputs when building turn `ToolsConfig`, and preserved that mode across model switches and review turns.
- Introduced `ZshForkConfig`, which stores the resolved zsh-fork `AbsolutePathBuf` values for the configured `zsh` binary and `execve` wrapper.
- Threaded `ZshForkConfig` through unified-exec command construction and the zsh-fork preparation path so zsh-fork-specific runtime code consumes a single config object instead of separate path arguments.
- Added focused tests for constructing zsh-fork mode only when session prerequisites are available, and updated the zsh-fork expectations to be target-platform aware. ## Testing - `cargo test -p codex-core zsh_fork --lib` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/14633). * #13432 * __->__ #14633	2026-03-13 17:20:01 -07:00
Andrei Eternal	9a44a7e499	[hooks] stop continuation & stop_hook_active mechanics (#14532 )

Stop hooks now receive `stop_hook_active`, which lets them loop forever if they'd like to. In the initial hooks PR, we implemented a simpler mechanic in which stop-blocking could only happen once in a row.

- support stop hooks adding a continuation prompt to add a further task
- if multiple stop-blocks happen that have continuation prompts, they are concatenated

example run:

```
› hey :)

• Running SessionStart hook: lighting the observatory

SessionStart hook (completed)
  warning: Hi, I'm a session start hook for wizard-tower (startup).
  hook context: Startup context: A wimboltine stonpet is an exotic cuisine from hyperspace

• Aloha :) Happy to jam with you. What are we building today?

• Running Stop hook: updating the guards

Stop hook (blocked)
  warning: Wizard Tower Stop hook continuing conversation
  feedback: cook the stonpet

• Aloha, here’s the hyperspace move for cooking a wimboltine stonpet:

  1. Sear the stonpet in a hot pan with moon-oil until the edges shimmer.
  2. Add star-lime, black salt, and a little fermented nebula paste.
  3. Lower the heat and let it braise for 8 cosmic minutes with a splash of comet broth.
  4. Finish with sky herbs and serve over warm asteroid rice.

  The vibe: crispy outside, tender center, deep interdimensional savor.

• Running Stop hook: updating the guards

Stop hook (completed)
  warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
```

.codex/config.toml

```
[features]
codex_hooks = true
```

.codex/hooks.json

```
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup|resume",
        "hooks": [
          {
            "type": "command",
            "command": "/usr/bin/python3 .codex/hooks/session_start_demo.py",
            "timeoutSec": 10,
            "statusMessage": "lighting the observatory"
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "/usr/bin/python3 .codex/hooks/stop_demo_block.py",
            "timeoutSec": 10,
            "statusMessage": "updating the guards"
          }
        ]
      }
    ]
  }
}
```

.codex/hooks/session_start_demo.py

```
#!/usr/bin/env python3

import json
import sys
from pathlib import Path


def main() -> int:
    payload = json.load(sys.stdin)
    cwd = Path(payload.get("cwd", ".")).name or "wizard-tower"
    source = payload.get("source", "startup")
    source_label = "resume" if source == "resume" else "startup"
    source_prefix = (
        "Resume context:"
        if source == "resume"
        else "Startup context:"
    )

    output = {
        "systemMessage": (
            f"Hi, I'm a session start hook for {cwd} ({source_label})."
        ),
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": (
                f"{source_prefix} A wimboltine stonpet is an exotic cuisine from hyperspace"
            ),
        },
    }
    print(json.dumps(output))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

.codex/hooks/stop_demo_block.py

```
#!/usr/bin/env python3

import json
import sys


def main() -> int:
    payload = json.load(sys.stdin)
    stop_hook_active = payload.get("stop_hook_active", False)
    last_assistant_message = payload.get("last_assistant_message") or ""
    char_count = len(last_assistant_message.strip())

    if stop_hook_active:
        system_message = (
            "Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop."
        )
        print(json.dumps({"systemMessage": system_message}))
    else:
        system_message = (
            f"Wizard Tower Stop hook continuing conversation"
        )
        print(json.dumps({"systemMessage": system_message, "decision": "block", "reason": "cook the stonpet"}))

    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```
	2026-03-13 15:51:19 -07:00
Charley Cunningham	467e6216bb	Fix stale create_wait_tool reference (#14639 ) ## Summary - replace the stale `create_wait_tool()` reference in `spec_tests.rs` - use `create_wait_agent_tool()` to match the actual multi-agent tool rename from `#14631` - fix the resulting `codex-core` spec-test compile failure on current `main` ## Context `#14631` renamed the model-facing multi-agent tool from `wait` to `wait_agent` and renamed the corresponding spec helper to `create_wait_agent_tool()`. One `spec_tests.rs` call site was left behind, so current `main` fails to compile `codex-core` tests with: - `cannot find function create_wait_tool` Using `create_wait_agent_tool()` is the correct fix here; `create_exec_wait_tool()` would point at the separate exec wait tool and would not match the renamed multi-agent toolset. ## Testing - not rerun locally after the rebase Co-authored-by: Codex <noreply@openai.com>	2026-03-13 15:35:25 -07:00
Charley Cunningham	bc24017d64	Add Smart Approvals guardian review across core, app-server, and TUI (#13860 ) ## Summary - add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime control for who reviews approval requests - route Smart Approvals guardian review through core for command execution, file changes, managed-network approvals, MCP approvals, and delegated/subagent approval flows - expose guardian review in app-server with temporary unstable `item/autoApprovalReview/{started,completed}` notifications carrying `targetItemId`, `review`, and `action` - update the TUI so Smart Approvals can be enabled from `/experimental`, aligned with the matching `/approvals` mode, and surfaced clearly while reviews are pending or resolved ## Runtime model This PR does not introduce a new `approval_policy`. Instead: - `approval_policy` still controls when approval is needed - `approvals_reviewer` controls who reviewable approval requests are routed to: - `user` - `guardian_subagent` `guardian_subagent` is a carefully prompted reviewer subagent that gathers relevant context and applies a risk-based decision framework before approving or denying the request. The `smart_approvals` feature flag is a rollout/UI gate. Core runtime behavior keys off `approvals_reviewer`. When Smart Approvals is enabled from the TUI, it also switches the current `/approvals` settings to the matching Smart Approvals mode so users immediately see guardian review in the active thread: - `approval_policy = on-request` - `approvals_reviewer = guardian_subagent` - `sandbox_mode = workspace-write` Users can still change `/approvals` afterward. 
Config-load behavior stays intentionally narrow: - plain `smart_approvals = true` in `config.toml` remains just the rollout/UI gate and does not auto-set `approvals_reviewer` - the deprecated `guardian_approval = true` alias migration does backfill `approvals_reviewer = "guardian_subagent"` in the same scope when that reviewer is not already configured there, so old configs preserve their original guardian-enabled behavior ARC remains a separate safety check. For MCP tool approvals, ARC escalations now flow into the configured reviewer instead of always bypassing guardian and forcing manual review. ## Config stability The runtime reviewer override is stable, but the config-backed app-server protocol shape is still settling. - `thread/start`, `thread/resume`, and `turn/start` keep stable `approvalsReviewer` overrides - the config-backed `approvals_reviewer` exposure returned via `config/read` (including profile-level config) is now marked `[UNSTABLE]` / experimental in the app-server protocol until we are more confident in that config surface ## App-server surface This PR intentionally keeps the guardian app-server shape narrow and temporary. It adds generic unstable lifecycle notifications: - `item/autoApprovalReview/started` - `item/autoApprovalReview/completed` with payloads of the form: - `{ threadId, turnId, targetItemId, review, action? }` `review` is currently: - `{ status, riskScore?, riskLevel?, rationale? }` - where `status` is one of `inProgress`, `approved`, `denied`, or `aborted` `action` carries the guardian action summary payload from core when available. This lets clients render temporary standalone pending-review UI, including parallel reviews, even when the underlying tool item has not been emitted yet. These notifications are explicitly documented as `[UNSTABLE]` and expected to change soon. This PR does not persist guardian review state onto `thread/read` tool items. 
The intended follow-up is to attach guardian review state to the reviewed tool item lifecycle instead, which would improve consistency with manual approvals and allow thread history / reconnect flows to replay guardian review state directly. ## TUI behavior - `/experimental` exposes the rollout gate as `Smart Approvals` - enabling it in the TUI enables the feature and switches the current session to the matching Smart Approvals `/approvals` mode - disabling it in the TUI clears the persisted `approvals_reviewer` override when appropriate and returns the session to default manual review when the effective reviewer changes - `/approvals` still exposes the reviewer choice directly - the TUI renders: - pending guardian review state in the live status footer, including parallel review aggregation - resolved approval/denial state in history ## Scope notes This PR includes the supporting core/runtime work needed to make Smart Approvals usable end-to-end: - shell / unified-exec / apply_patch / managed-network / MCP guardian review - delegated/subagent approval routing into guardian review - guardian review risk metadata and action summaries for app-server/TUI - config/profile/TUI handling for `smart_approvals`, `guardian_approval` alias migration, and `approvals_reviewer` - a small internal cleanup of delegated approval forwarding to dedupe fallback paths and simplify guardian-vs-parent approval waiting (no intended behavior change) Out of scope for this PR: - redesigning the existing manual approval protocol shapes - persisting guardian review state onto app-server `ThreadItem`s - delegated MCP elicitation auto-review (the current delegated MCP guardian shim only covers the legacy `RequestUserInput` path) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 15:27:00 -07:00
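The config-load rules described in #13860 can be sketched as a small resolution function. This is only an illustration under stated assumptions: `ScopeConfig` and `resolve_reviewer` are hypothetical names, not the actual Codex types, and the fallback to `User` is inferred from the message's mention of "default manual review".

```rust
/// Who reviews approval requests (the two values named in the commit message).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalsReviewer {
    User,
    GuardianSubagent,
}

/// Hypothetical flattened view of one config scope; not the real Codex config struct.
#[derive(Debug, Default, Clone, Copy)]
pub struct ScopeConfig {
    /// Rollout/UI gate only; per the commit message it never sets the reviewer on load.
    pub smart_approvals: Option<bool>,
    /// Deprecated alias that is migrated on load.
    pub guardian_approval: Option<bool>,
    /// Explicit reviewer configured in this scope, if any.
    pub approvals_reviewer: Option<ApprovalsReviewer>,
}

/// Resolve the effective reviewer for one scope.
pub fn resolve_reviewer(scope: &ScopeConfig) -> ApprovalsReviewer {
    // An explicitly configured reviewer always wins.
    if let Some(reviewer) = scope.approvals_reviewer {
        return reviewer;
    }
    // The deprecated alias backfills the guardian reviewer when none is configured.
    if scope.guardian_approval == Some(true) {
        return ApprovalsReviewer::GuardianSubagent;
    }
    // `smart_approvals = true` alone is deliberately ignored here.
    // Assumption: manual user review is the default.
    ApprovalsReviewer::User
}
```

Under this sketch, a scope with only `smart_approvals = true` still resolves to `User`, while an old config carrying `guardian_approval = true` keeps its guardian-enabled behavior.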
Charley Cunningham	e3cbf913e8	Fix wait_agent expectations in core tests (#14637 ) ## Summary - update stale core tool-spec expectations from `wait` to `wait_agent` - update the prompt-caching tool-name assertion to match the renamed tool - fix the Bazel regressions introduced after #14631 renamed the multi-agent wait tool ## Testing - cargo test -p codex-core tools::spec::tests - cargo test -p codex-core suite::prompt_caching::prompt_tools_are_consistent_across_requests Co-authored-by: Codex <noreply@openai.com>	2026-03-13 15:15:59 -07:00
pakrym-oai	cb7d8f45a1	Normalize MCP tool names to code-mode safe form (#14605 ) Code mode doesn't allow `-` in names and it's better if function names and code-mode names are the same.	2026-03-13 14:50:16 -07:00
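A minimal sketch of what such a normalization might look like. The commit message only states that code mode does not allow `-` in names; the choice of `_` as the replacement and the function name `normalize_tool_name` are assumptions for illustration, not the actual implementation.

```rust
/// Hypothetical helper: rewrite a tool name so it contains no `-`.
/// Assumption: `-` is replaced with `_`; the real replacement rule is not
/// stated in the commit message.
pub fn normalize_tool_name(name: &str) -> String {
    name.replace('-', "_")
}
```

One consequence of any rule like this is that two distinct original names (for example `a-b` and `a_b`) could map to the same normalized name, so a real implementation would likely need some collision handling.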
Ahmed Ibrahim	36dfb84427	Stabilize multi-agent feature flag (#14622 ) - make multi_agent stable and enabled by default - update feature and tool-spec coverage to match the new default --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 14:38:15 -07:00
Ahmed Ibrahim	cfd97b36da	Rename multi-agent wait tool to wait_agent (#14631 ) - rename the multi-agent tool name the model sees to wait_agent - update the model-facing prompts and tool descriptions to match --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 14:38:05 -07:00
pakrym-oai	477a2dd345	Add code_mode_only feature (#14617 ) Summary - add the code_mode_only feature flag/config schema and wire its dependency on code_mode - update code mode tool descriptions to list nested tools with detailed headers - restrict available tools for prompt and exec descriptions when code_mode_only is enabled and test the behavior Testing - Not run (not requested)	2026-03-13 13:30:19 -07:00
Michael Bolin	ef37d313c6	fix: preserve zsh-fork escalation fds across unified-exec spawn paths (#13644 ) ## Why `zsh-fork` sessions launched through unified-exec need the escalation socket to survive the wrapper -> server -> child handoff so later intercepted `exec()` calls can still reach the escalation server. The inherited-fd spawn path also needs to avoid closing Rust's internal exec-error pipe, and the shell-escalation handoff needs to tolerate the receive-side case where a transferred fd is installed into the same stdio slot it will be mapped onto. ## What Changed - Added `SpawnLifecycle::inherited_fds()` in `codex-rs/core/src/unified_exec/process.rs` and threaded inherited fds through `codex-rs/core/src/unified_exec/process_manager.rs` so unified-exec can preserve required descriptors across both PTY and no-stdin pipe spawn paths. - Updated `codex-rs/core/src/tools/runtimes/shell/zsh_fork_backend.rs` to expose the escalation socket fd through the spawn lifecycle. - Added inherited-fd-aware spawn helpers in `codex-rs/utils/pty/src/pty.rs` and `codex-rs/utils/pty/src/pipe.rs`, including Unix pre-exec fd pruning that preserves requested inherited fds while leaving `FD_CLOEXEC` descriptors alone. The pruning helper is now named `close_inherited_fds_except()` to better describe that behavior. - Updated `codex-rs/shell-escalation/src/unix/escalate_client.rs` to duplicate local stdio before transfer and send destination stdio numbers in `SuperExecMessage`, so the wrapper keeps using its own `stdin`/`stdout`/`stderr` until the escalated child takes over. - Updated `codex-rs/shell-escalation/src/unix/escalate_server.rs` so the server accepts the overlap case where a received fd reuses the same stdio descriptor number that the child setup will target with `dup2`. - Added comments around the PTY stdio wiring and the overlap regression helper to make the fd handoff and controlling-terminal setup easier to follow. 
## Verification - `cargo test -p codex-utils-pty` - covers preserved-fd PTY spawn behavior, PTY resize, Python REPL continuity, exec-failure reporting, and the no-stdin pipe path - `cargo test -p codex-shell-escalation` - covers duplicated-fd transfer on the client side and verifies the overlap case by passing a pipe-backed stdin payload through the server-side `dup2` path --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13644). * #14624 * __->__ #13644	2026-03-13 20:25:31 +00:00
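The pruning rule described for `close_inherited_fds_except()` (preserve the requested inherited fds, leave `FD_CLOEXEC` descriptors alone) can be illustrated as a pure planning function. This is a sketch, not the real helper: `fds_to_close` is a hypothetical name, the inputs are plain integers rather than live descriptors, and skipping stdio (0 to 2) is an assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical planner: given the fds currently open in the child, decide
/// which ones a pre-exec hook would close.
///
/// - `keep`: fds the caller asked to inherit (e.g. an escalation socket)
/// - `cloexec`: fds already marked FD_CLOEXEC; the kernel closes these on
///   exec, so they are left untouched (which also protects an internal
///   exec-error pipe that relies on close-on-exec)
pub fn fds_to_close(open: &[i32], keep: &[i32], cloexec: &BTreeSet<i32>) -> Vec<i32> {
    open.iter()
        .copied()
        // Assumption: stdin/stdout/stderr are never pruned.
        .filter(|fd| *fd > 2)
        // Preserve explicitly requested inherited fds.
        .filter(|fd| !keep.contains(fd))
        // Leave close-on-exec descriptors alone.
        .filter(|fd| !cloexec.contains(fd))
        .collect()
}
```

With open fds `[0, 1, 2, 3, 4, 5, 7]`, a keep list of `[4]`, and fd `5` marked close-on-exec, only `3` and `7` would be closed.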
Owen Lin	014e19510d	feat(app-server, core): add more spans (#14479 ) ## Description This PR expands tracing coverage across app-server thread startup, core session initialization, and the Responses transport layer. It also gives core dispatch spans stable operation-specific names so traces are easier to follow than the old generic `submission_dispatch` spans. Also use `fmt::Display` for types that we serialize in traces so we send strings instead of Rust types.	2026-03-13 13:16:33 -07:00
canvrno-oai	914f7c7317	Override local apps settings with requirements.toml settings (#14304 ) This PR changes app and connector enablement when `requirements.toml` is present locally or via remote configuration. For apps.* entries: - `enabled = false` in `requirements.toml` overrides the user’s local `config.toml` and forces the app to be disabled. - `enabled = true` in `requirements.toml` does not re-enable an app the user has disabled in config.toml. This behavior applies whether or not the user has an explicit entry for that app in `config.toml`. It also applies to cloud-managed policies and configurations when the admin sets the override through `requirements.toml`. Scenarios tested and verified: - Remote managed, user config (present) override - Admin-defined policies & configurations include a connector override: `[apps.<appID>] enabled = false` - User's config.toml has the same connector configured with `enabled = true` - TUI/App should show connector as disabled - Connector should be unavailable for use in the composer - Remote managed, user config (absent) override - Admin-defined policies & configurations include a connector override: `[apps.<appID>] enabled = false` - User's config.toml has no entry for the same connector - TUI/App should show connector as disabled - Connector should be unavailable for use in the composer - Locally managed, user config (present) override - Local requirements.toml includes a connector override: `[apps.<appID>] enabled = false` - User's config.toml has the same connector configured with `enabled = true` - TUI/App should show connector as disabled - Connector should be unavailable for use in the composer - Locally managed, user config (absent) override - Local requirements.toml includes a connector override: `[apps.<appID>] enabled = false` - User's config.toml has no entry for the same connector - TUI/App should show connector as disabled - Connector should be unavailable for use in the composer	2026-03-13 12:40:24 -07:00
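The precedence rule in #14304 can be summarized as a two-input function. The sketch below is illustrative only: `effective_enabled` is a hypothetical name, and treating an app with no user entry as enabled by default is an assumption that the commit message does not state.

```rust
/// Hypothetical resolver for one `apps.<appID>` entry.
///
/// - `requirement`: the `enabled` value from `requirements.toml`, if present
///   (local or remote-managed)
/// - `user`: the `enabled` value from the user's `config.toml`, if present
pub fn effective_enabled(requirement: Option<bool>, user: Option<bool>) -> bool {
    // Assumption: an app without a user entry counts as enabled.
    let user_enabled = user.unwrap_or(true);
    match requirement {
        // `enabled = false` in requirements always forces the app off.
        Some(false) => false,
        // `enabled = true` (or no requirement) never re-enables an app the
        // user has disabled; the user's setting decides.
        Some(true) | None => user_enabled,
    }
}
```

The four verified scenarios all have `requirement = Some(false)` and therefore resolve to disabled, whether or not the user has an entry for the connector.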
Ahmed Ibrahim	d58620c852	Use subagents naming in the TUI (#14618 ) - rename user-facing TUI multi-agent wording to subagents - rename the surfaced slash command to `subagents` and update tests/snapshots Co-authored-by: Codex <noreply@openai.com>	2026-03-13 19:08:38 +00:00
Ahmed Ibrahim	3aabce9e0a	Unify realtime v1/v2 session config (#14606 ) ## Summary - unify realtime websocket settings under `[realtime]` (`version` and `type`) - remove `realtime_conversation_v2` and select parser/session mode from config ## Testing - not run (per request) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 11:35:38 -07:00
Eric Traut	9dba7337f2	Start TUI on embedded app server (#14512 ) This PR is part of the effort to move the TUI on top of the app server. In a previous PR, we introduced an in-process app server and moved `exec` on top of it. For the TUI, we want to do the migration in stages. The app server doesn't currently expose all of the functionality required by the TUI, so we're going to need to support a hybrid approach as we make the transition. This PR changes the TUI initialization to instantiate an in-process app server and access its `AuthManager` and `ThreadManager` rather than constructing its own copies. It also adds a placeholder TUI event handler that will eventually translate app server events into TUI events. App server notifications are accepted but ignored for now. It also adds proper shutdown of the app server when the TUI terminates.	2026-03-13 12:04:41 -06:00

2286 Commits