Commit Graph

2286 Commits

Author SHA1 Message Date
jif-oai
e8add54e5d feat: show effective model in spawn agent event (#14944)
Show effective model after the full config layering for the sub agent
2026-03-17 16:58:58 +00:00
daveaitel-openai
ef36d39199 Fix agent jobs finalization race and reduce status polling churn (#14843)
## Summary
- make `report_agent_job_result` atomically transition an item from
running to completed while storing `result_json`
- remove brittle finalization grace-sleep logic and make finished-item
cleanup idempotent
- replace blind fixed-interval waiting with status-subscription-based
waiting for active worker threads
- add state runtime tests for atomic completion and late-report
rejection

## Why
This addresses the race and polling concerns in #13948 by removing
timing-based correctness assumptions and reducing unnecessary status
polling churn.

## Validation
- `cd codex-rs && just fmt`
- `cd codex-rs && cargo test -p codex-state`
- `cd codex-rs && cargo test -p codex-core --test all suite::agent_jobs`
- `cd codex-rs && cargo test`
- fails in an unrelated app-server tracing test:
`message_processor::tracing_tests::thread_start_jsonrpc_span_exports_server_span_and_parents_children`
timed out waiting for response

## Notes
- This PR supersedes #14129 with the same agent-jobs fix on a clean
branch from `main`.
- The earlier PR branch was stacked on unrelated history, which made the
review diff include unrelated commits.

Fixes #13948
2026-03-17 10:40:14 -04:00
jif-oai
4ed19b0766 feat: rename to get more explicit close agent (#14935)
https://github.com/openai/codex/issues/14907
2026-03-17 14:37:20 +00:00
jif-oai
31648563c8 feat: centralize package manager version (#14920) 2026-03-17 12:03:07 +00:00
viyatb-oai
db7e02c739 fix: canonicalize symlinked Linux sandbox cwd (#14849)
## Problem
On Linux, Codex can be launched from a workspace path that is a symlink
(for example, a symlinked checkout or a symlinked parent directory).

Our sandbox policy intentionally canonicalizes writable/readable roots
to the real filesystem path before building the bubblewrap mounts. That
part is correct and needed for safety.

The remaining bug was that bubblewrap could still inherit the helper
process's logical cwd, which might be the symlinked alias instead of the
mounted canonical path. In that case, the sandbox starts in a cwd that
does not exist inside the sandbox namespace even though the real
workspace is mounted. This can cause sandboxed commands to fail in
symlinked workspaces.

## Fix
This PR keeps the sandbox policy behavior the same, but separates two
concepts that were previously conflated:

- the canonical cwd used to define sandbox mounts and permissions
- the caller's logical cwd used when launching the command

On the Linux bubblewrap path, we now thread the logical command cwd
through the helper explicitly and only add `--chdir <canonical path>`
when the logical cwd differs from the mounted canonical path.

That means:
- permissions are still computed from canonical paths
- bubblewrap starts the command from a cwd that definitely exists inside
the sandbox
- we do not widen filesystem access or undo the earlier symlink
hardening

## Why This Is Safe
This is a narrow Linux-only launch fix, not a policy change.

- Writable/readable root canonicalization stays intact.
- Protected metadata carveouts still operate on canonical roots.
- We only override bubblewrap's inherited cwd when the logical path
would otherwise point at a symlink alias that is not mounted in the
sandbox.

## Tests
- kept the existing protocol/core regression coverage for symlink
canonicalization
- added regression coverage for symlinked cwd handling in the Linux
bubblewrap builder/helper path

Local validation:
- `just fmt`
- `cargo test -p codex-protocol`
- `cargo test -p codex-core
normalize_additional_permissions_canonicalizes_symlinked_write_paths`
- `cargo clippy -p codex-linux-sandbox -p codex-protocol -p codex-core
--tests -- -D warnings`
- `cargo build --bin codex`

## Context
This is related to #14694. The earlier writable-root symlink fix
addressed the mount/permission side; this PR fixes the remaining
symlinked-cwd launch mismatch in the Linux sandbox path.
2026-03-16 22:39:18 -07:00
Ahmed Ibrahim
79f476e47d [stack 3/4] Add current thread context to realtime startup (#14829)
## Stack Position
3/4. Top-of-stack sibling built on #14830.

## Base
- #14830

## Sibling
- #14827

## Scope
- Extend the realtime startup context with a bounded summary of the
latest thread turns for continuity.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-17 05:11:05 +00:00
Thibault Sottiaux
8e34caffcc [codex] add Jason as a predefined subagent name (#14881)
This change adds Jason to codex-core's built-in subagent nickname pool
so spawned agents can pick it without any custom role configuration. The
default list was simply missing that predefined name (a grave mistake).
2026-03-16 22:01:14 -07:00
xl-openai
e5a28ba0c2 fix: align marketplace display name with existing interface conventions (#14886)
1. camelCase for displayName;
2. move displayName under interface.
2026-03-16 21:52:19 -07:00
Ahmed Ibrahim
fbd7f9b986 [stack 2/4] Align main realtime v2 wire and runtime flow (#14830)
## Stack Position
2/4. Built on top of #14828.

## Base
- #14828

## Unblocks
- #14829
- #14827

## Scope
- Port the realtime v2 wire parsing, session, app-server, and
conversation runtime behavior onto the split websocket-method base.
- Branch runtime behavior directly on the current realtime session kind
instead of parser-derived flow flags.
- Keep regression coverage in the existing e2e suites.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-16 21:38:07 -07:00
xl-openai
1d85fe79ed feat: support remote_sync for plugin install/uninstall. (#14878)
- Added forceRemoteSync to plugin/install and plugin/uninstall.
- With forceRemoteSync=true, we update the remote plugin status first,
then apply the local change only if the backend call succeeds.
- Kept plugin/list(forceRemoteSync=true) as the main recon path, and for
now it treats remote enabled=false as uninstall. We
will eventually migrate to plugin/installed for more precise state
handling.
2026-03-16 21:37:27 -07:00
xl-openai
49c2b66ece Add marketplace display names to plugin/list (#14861)
Add display_name support to marketplace.json.
2026-03-16 19:04:40 -07:00
Michael Bolin
b77fe8fefe Apply argument comment lint across codex-rs (#14652)
## Why

Once the repo-local lint exists, `codex-rs` needs to follow the
checked-in convention and CI needs to keep it from drifting. This commit
applies the fallback `/*param*/` style consistently across existing
positional literal call sites without changing those APIs.

The longer-term preference is still to avoid APIs that require comments
by choosing clearer parameter types and call shapes. This PR is
intentionally the mechanical follow-through for the places where the
existing signatures stay in place.

After rebasing onto newer `main`, the rollout also had to cover newly
introduced `tui_app_server` call sites. That made it clear the first cut
of the CI job was too expensive for the common path: it was spending
almost as much time installing `cargo-dylint` and re-testing the lint
crate as a representative test job spends running product tests. The CI
update keeps the full workspace enforcement but trims that extra
overhead from ordinary `codex-rs` PRs.

## What changed

- keep a dedicated `argument_comment_lint` job in `rust-ci`
- mechanically annotate remaining opaque positional literals across
`codex-rs` with exact `/*param*/` comments, including the rebased
`tui_app_server` call sites that now fall under the lint
- keep the checked-in style aligned with the lint policy by using
`/*param*/` and leaving string and char literals uncommented
- cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
registry/git metadata in the lint job
- split changed-path detection so the lint crate's own `cargo test` step
runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
- continue to run the repo wrapper over the `codex-rs` workspace, so
product-code enforcement is unchanged

Most of the code changes in this commit are intentionally mechanical
comment rewrites or insertions driven by the lint itself.

## Verification

- `./tools/argument-comment-lint/run.sh --workspace`
- `cargo test -p codex-tui-app-server -p codex-tui`
- parsed `.github/workflows/rust-ci.yml` locally with PyYAML

---

* -> #14652
* #14651
2026-03-16 16:48:15 -07:00
pakrym-oai
a3ba10b44b Add exit helper to code mode scripts (#14851)
- **Summary**
- expose `exit` through the code mode bridge and module so scripts can
stop mid-flight
  - surface the helper in the description documentation
  - add a regression test ensuring `exit()` terminates execution cleanly
- **Testing**
  - Not run (not requested)
2026-03-16 22:07:58 +00:00
Andi Liu
4c9dbc1f88 memories: exclude AGENTS and skills from stage1 input (#14268)
###### Why/Context/Summary
- Exclude injected AGENTS.md instructions and standalone skill payloads
from memory stage 1 inputs so memory generation focuses on conversation
content instead of prompt scaffolding.
- Strip only the AGENTS fragment from mixed contextual user messages
during stage-1 serialization, which preserves environment context in the
same message.
- Keep subagent notifications in the memory input, and add focused unit
coverage for the fragment classifier, rollout policy, and stage-1
serialization path.

###### Test plan
- `just fmt`
- `cargo test -p codex-core --lib contextual_user_message`
- `cargo test -p codex-core --lib rollout::policy`
- `cargo test -p codex-core --lib memories::phase1`
2026-03-16 19:30:38 +00:00
Anton Panasenko
663dd3f935 fix(core): fix sanitize name to use '_' everywhere (#14833) 2026-03-16 12:22:10 -07:00
Eric Traut
db89b73a9c Move TUI on top of app server (parallel code) (#14717)
This PR replicates the `tui` code directory and creates a temporary
parallel `tui_app_server` directory. It also implements a new feature
flag `tui_app_server` to select between the two tui implementations.

Once the new app-server-based TUI is stabilized, we'll delete the old
`tui` directory and feature flag.
2026-03-16 10:49:19 -06:00
jif-oai
3f266bcd68 feat: make interrupt state not final for multi-agents (#13850)
Make `interrupted` an agent state and make it not final. As a result, a
`wait` won't return on an interrupted agent and no notification will be
send to the parent agent.

The rationals are:
* If a user interrupt a sub-agent for any reason, you don't want the
parent agent to instantaneously ask the sub-agent to restart
* If a parent agent interrupt a sub-agent, no need to add a noisy
notification in the parent agen
2026-03-16 16:39:40 +00:00
jif-oai
18ad67549c feat: improve skills cache key to take into account config layering (#14806)
Fix https://github.com/openai/codex/issues/14161

This fixes sub-agent [[skills.config]] overrides being ignored when
parent and child share the same cwd. The root cause was that turn skill
loading rebuilt from cwd-only state and reused a cwd-scoped cache, so
role-local skill enable/disable overrides did not reliably affect the
spawned agent's effective skill set.

This change switches turn construction to use the effective per-turn
config and adds a config-aware skills cache keyed by skill roots plus
final disabled paths.
2026-03-16 16:12:44 +00:00
jif-oai
33acc1e65f fix: sub-agent role when using profiles (#14807)
Fix the layering conflict when a project profile is used with agents.
This PR clean the config layering and make sure the agent config >
project profile

Fix https://github.com/openai/codex/issues/13849,
https://github.com/openai/codex/issues/14671
2026-03-16 16:08:16 +00:00
Matthew Zeng
029aab5563 fix(core): preserve tool_params for elicitations (#14769)
- [x] Preserve tool_params keys.
2026-03-15 23:15:52 -07:00
Charley Cunningham
6fdeb1d602 Reuse guardian session across approvals (#14668)
## Summary
- reuse a guardian subagent session across approvals so reviews keep a
stable prompt cache key and avoid one-shot startup overhead
- clear the guardian child history before each review so prior guardian
decisions do not leak into later approvals
- include the `smart_approvals` -> `guardian_approval` feature flag
rename in the same PR to minimize release latency on a very tight
timeline
- add regression coverage for prompt-cache-key reuse without
prior-review prompt bleed

## Request
- Bug/enhancement request: internal guardian prompt-cache and latency
improvement request

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-15 22:56:18 -07:00
friel-openai
ba463a9dc7 Preserve background terminals on interrupt and rename cleanup command to /stop (#14602)
### Motivation
- Interrupting a running turn (Ctrl+C / Esc) currently also terminates
long‑running background shells, which is surprising for workflows like
local dev servers or file watchers.
- The existing cleanup command name was confusing; callers expect an
explicit command to stop background terminals rather than a UI clear
action.
- Make background‑shell termination explicit and surface a clearer
command name while preserving backward compatibility.

### Description
- Renamed the background‑terminal cleanup slash command from `Clean`
(`/clean`) to `Stop` (`/stop`) and kept `clean` as an alias in the
command parsing/visibility layer, updated the user descriptions and
command popup wiring accordingly.
- Updated the unified‑exec footer text and snapshots to point to `/stop`
(and trimmed corresponding snapshot output to match the new label).
- Changed interrupt behavior so `Op::Interrupt` (Ctrl+C / Esc interrupt)
no longer closes or clears tracked unified exec / background terminal
processes in the TUI or core cleanup path; background shells are now
preserved after an interrupt.
- Updated protocol/docs to clarify that `turn/interrupt` (or
`Op::Interrupt`) interrupts the active turn but does not terminate
background terminals, and that `thread/backgroundTerminals/clean` is the
explicit API to stop those shells.
- Updated unit/integration tests and insta snapshots in the TUI and core
unified‑exec suites to reflect the new semantics and command name.

### Testing
- Ran formatting with `just fmt` in `codex-rs` (succeeded). 
- Ran `cargo test -p codex-protocol` (succeeded). 
- Attempted `cargo test -p codex-tui` but the build could not complete
in this environment due to a native build dependency that requires
`libcap` development headers (the `codex-linux-sandbox` vendored build
step); install `libcap-dev` / make `libcap.pc` available in
`PKG_CONFIG_PATH` to run the TUI test suite locally.
- Updated and accepted the affected `insta` snapshots for the TUI
changes so visual diffs reflect the new `/stop` wording and preserved
interrupt behavior.

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69b39c44b6dc8323bd133ae206310fae)
2026-03-15 22:17:25 -07:00
Matthew Zeng
d4af6053e2 [apps] Improve search tool fallback. (#14732)
- [x] Bypass tool search and stuff tool specs directly into model
context when either a. Tool search is not available for the model or b.
There are not that many tools to search for.
2026-03-15 21:41:55 -07:00
Matthew Zeng
49edf311ac [apps] Add tool call meta. (#14647)
- [x] Add resource_uri and other things to _meta to shortcut resource
lookup and speed things up.
2026-03-14 22:24:13 -07:00
Colin Young
d692b74007 Add auth 401 observability to client bug reports (#14611)
CXC-392

  [With
  401](https://openai.sentry.io/issues/7333870443/?project=4510195390611458&query=019ce8f8-560c-7f10-a00a-c59553740674&referrer=issue-stream)
  <img width="1909" height="555" alt="401 auth tags in Sentry"
  src="https://github.com/user-attachments/assets/412ea950-61c4-4780-9697-15c270971ee3"
  />


  - auth_401_*: preserved facts from the latest unauthorized response snapshot
  - auth_*: latest auth-related facts from the latest request attempt
  - auth_recovery_*: unauthorized recovery state and follow-up result


  Without 401
  <img width="1917" height="522" alt="happy-path auth tags in Sentry"
  src="https://github.com/user-attachments/assets/3381ed28-8022-43b0-b6c0-623a630e679f"
  />

  ###### Summary
  - Add client-visible 401 diagnostics for auth attachment, upstream auth classification, and 401 request id / cf-ray correlation.
  - Record unauthorized recovery mode, phase, outcome, and retry/follow-up status without changing auth behavior.
  - Surface the highest-signal auth and recovery fields on uploaded client bug reports so they are usable in Sentry.
  - Preserve original unauthorized evidence under `auth_401_*` while keeping follow-up result tags separate.

  ###### Rationale (from spec findings)
  - The dominant bucket needed proof of whether the client attached auth before send or upstream still classified the request as missing auth.
  - Client uploads needed to show whether unauthorized recovery ran and what the client tried next.
  - Request id and cf-ray needed to be preserved on the unauthorized response so server-side correlation is immediate.
  - The bug-report path needed the same auth evidence as the request telemetry path, otherwise the observability would not be operationally useful.

  ###### Scope
  - Add auth 401 and unauthorized-recovery observability in `codex-rs/core`, `codex-rs/codex-api`, and `codex-rs/otel`, including feedback-tag surfacing.
  - Keep auth semantics, refresh behavior, retry behavior, endpoint classification, and geo-denial follow-up work out of this PR.

  ###### Trade-offs
  - This exports only safe auth evidence: header presence/name, upstream auth classification, request ids, and recovery state. It does not export token values or raw upstream bodies.
  - This keeps websocket connection reuse as a transport clue because it can help distinguish stale reused sessions from fresh reconnects.
  - Misroute/base-url classification and geo-denial are intentionally deferred to a separate follow-up PR so this review stays focused on the dominant auth 401 bucket.

  ###### Client follow-up
  - PR 2 will add misroute/provider and geo-denial observability plus the matching feedback-tag surfacing.
  - A separate host/app-server PR should log auth-decision inputs so pre-send host auth state can be correlated with client request evidence.
  - `device_id` remains intentionally separate until there is a safe existing source on the feedback upload path.

  ###### Testing
  - `cargo test -p codex-core refresh_available_models_sorts_by_priority`
  - `cargo test -p codex-core emit_feedback_request_tags_`
  - `cargo test -p codex-core emit_feedback_auth_recovery_tags_`
  - `cargo test -p codex-core auth_request_telemetry_context_tracks_attached_auth_and_retry_phase`
  - `cargo test -p codex-core extract_response_debug_context_decodes_identity_headers`
  - `cargo test -p codex-core identity_auth_details`
  - `cargo test -p codex-core telemetry_error_messages_preserve_non_http_details`
  - `cargo test -p codex-core --all-features --no-run`
  - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability`
  - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability`
  - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability`
2026-03-14 15:38:51 -07:00
viyatb-oai
9060dc7557 fix: fix symlinked writable roots in sandbox policies (#14674)
## Summary
- normalize effective readable, writable, and unreadable sandbox roots
after resolving special paths so symlinked roots use canonical runtime
paths
- add a protocol regression test for a symlinked writable root with a
denied child and update protocol expectations to canonicalized effective
paths
- update macOS seatbelt tests to assert against effective normalized
roots produced by the shared policy helpers

## Testing
- just fmt
- cargo test -p codex-protocol
- cargo test -p codex-core explicit_unreadable_paths_are_excluded_
- cargo clippy -p codex-protocol -p codex-core --tests -- -D warnings

## Notes
- This is intended to fix the symlinked TMPDIR bind failure in
bubblewrap described in #14672.
Fixes #14672
2026-03-14 13:24:43 -07:00
Channing Conger
70eddad6b0 dynamic tool calls: add param exposeToContext to optionally hide tool (#14501)
This extends dynamic_tool_calls to allow us to hide a tool from the
model context but still use it as part of the general tool calling
runtime (for ex from js_repl/code_mode)
2026-03-14 01:58:43 -07:00
sayan-oai
e389091042 make defaultPrompt an array, keep backcompat (#14649)
make plugins' `defaultPrompt` an array, but keep backcompat for strings.

the array is limited by app-server to 3 entries of up to 128 chars
(drops extra entries, `None`s-out ones that are too long) without
erroring if those invariants are violating.

added tests, tested locally.
2026-03-14 06:13:51 +00:00
Eric Traut
ae0a6510e1 Enforce errors on overriding built-in model providers (#12024)
We receive bug reports from users who attempt to override one of the
three built-in model providers (openai, ollama, or lmstuio). Currently,
these overrides are silently ignored. This PR makes it an error to
override them.

## Summary
- add validation for `model_providers` so `openai`, `ollama`, and
`lmstudio` keys now produce clear configuration errors instead of being
silently ignored
2026-03-13 22:10:13 -06:00
sayan-oai
d272f45058 move plugin/skill instructions into dev msg and reorder (#14609)
Move the general `Apps`, `Skills` and `Plugins` instructions blocks out
of `user_instructions` and into the developer message, with new `Apps ->
Skills -> Plugins` order for better clarity.

Also wrap those sections in stable XML-style instruction tags (like
other sections) and update prompt-layout tests/snapshots. This makes the
tests less brittle in snapshot output (we can parse the sections), and
it consolidates the capability instructions in one place.

#### Tests
Updated snapshots, added tests.

`<AGENTS_MD>` disappearing in snapshots is expected: before this change,
the wrapped user-instructions message was kept alive by `Skills`
content. Now that `Skills` and `Plugins` are in the developer message,
that wrapper only appears when there is real
project-doc/user-instructions content.

---------

Co-authored-by: Charley Cunningham <ccunningham@openai.com>
2026-03-13 20:51:01 -07:00
viyatb-oai
7f571396c8 fix: sync split sandbox policies for spawned subagents (#14650)
## Summary
- reapply the live split filesystem and network sandbox policies when
building spawned subagent configs
- keep spawned child sessions aligned with the parent turn after
role-layer config reloads
- add regression coverage for both config construction and spawned
child-turn inheritance
2026-03-14 03:03:49 +00:00
viyatb-oai
6dc04df5e6 fix: persist future network host approvals across sessions (#14619)
## Summary
- apply persisted execpolicy network rules when booting the managed
network proxy
- pass the current execpolicy into managed proxy startup so host
approvals selected with "allow this host in the future" survive new
sessions
2026-03-14 02:46:10 +00:00
Charley Cunningham
bbd329a812 Fix turn context reconstruction after backtracking (#14616)
## Summary
- reuse rollout reconstruction when applying a backtrack rollback so
`reference_context_item` is restored from persisted rollout state
- build rollback replay from the flushed rollout items plus the rollback
marker, avoiding the extra reread/fallback path
- add regression coverage for rollback after compaction so turn-context
diffing stays aligned after backtracking

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 19:28:31 -07:00
Ahmed Ibrahim
69c8a1ef9e Fix Windows CI assertions for guardian and Smart Approvals (#14645)
- Normalize guardian assessment path serialization to use forward
slashes for cross-platform stability.
- Seed workspace-write defaults in the Smart Approvals
override-turn-context test so Windows and non-Windows selection flows
are consistent.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Charles Cunningham <ccunningham@openai.com>
2026-03-14 02:15:58 +00:00
Eric Traut
4b9d5c8c1b Add openai_base_url config override for built-in provider (#12031)
We regularly get bug reports from users who mistakenly have the
`OPENAI_BASE_URL` environment variable set. This PR deprecates this
environment variable in favor of a top-level config key
`openai_base_url` that is used for the same purpose. By making it a
config key, it will be more visible to users. It will also participate
in all of the infrastructure we've added for layered and managed
configs.

Summary
- introduce the `openai_base_url` top-level config key, update
schema/tests, and route the built-in openai provider through it while
- fall back to deprecated `OPENAI_BASE_URL` env var but warn user of
deprecation when no `openai_base_url` config key is present
- update CLI, SDK, and TUI code to prefer the new config path (with a
deprecated env-var fallback) and document the SDK behavior change
2026-03-13 20:12:25 -06:00
Michael Bolin
b859a98e0f refactor: make unified-exec zsh-fork state explicit (#14633)
## Why

The unified-exec path was carrying zsh-fork state in a partially
flattened way.

First, the decision about whether zsh-fork was active came from feature
selection in `ToolsConfig`, while the real prerequisites lived in
session state. That left the handler and runtime defending against
partially configured cases later.

Second, once zsh-fork was active, its two runtime-only paths were
threaded through the runtime as separate arguments even though they form
one coherent piece of configuration.

This change keeps unified-exec on a single session-derived source of
truth and bundles the zsh-fork-specific paths into a named config type
so the runtime can pass them around as one unit.

In particular, this PR introduces this enum so the `ZshFork` variant can
carry the appropriate state with it:

```rust
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum UnifiedExecShellMode {
    Direct,
    ZshFork(ZshForkConfig),
}

#[derive(Debug, Clone, Eq, PartialEq)]
pub struct ZshForkConfig {
    pub(crate) shell_zsh_path: AbsolutePathBuf,
    pub(crate) main_execve_wrapper_exe: AbsolutePathBuf,
}
```

This cleanup was done in preparation for
https://github.com/openai/codex/pull/13432.

## What Changed

- Replaced the feature-only `UnifiedExecBackendConfig` split with
`UnifiedExecShellMode` in `codex-rs/core/src/tools/spec.rs`.
- Derived the unified-exec mode from session-backed inputs when building
turn `ToolsConfig`, and preserved that mode across model switches and
review turns.
- Introduced `ZshForkConfig`, which stores the resolved zsh-fork
`AbsolutePathBuf` values for the configured `zsh` binary and `execve`
wrapper.
- Threaded `ZshForkConfig` through unified-exec command construction and
the zsh-fork preparation path so zsh-fork-specific runtime code consumes
a single config object instead of separate path arguments.
- Added focused tests for constructing zsh-fork mode only when session
prerequisites are available, and updated the zsh-fork expectations to be
target-platform aware.

## Testing

- `cargo test -p codex-core zsh_fork --lib`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/14633).
* #13432
* __->__ #14633
2026-03-13 17:20:01 -07:00
Andrei Eternal
9a44a7e499 [hooks] stop continuation & stop_hook_active mechanics (#14532)
Stop hooks now receive `stop_hook_active` and enable stop hooks to loop
forever if they'd like to. In the initial hooks PR, we implemented a
simpler mechanic that the stop-blocking could only happen once in a row

- support stop hook adding a continuation prompt to add a further task
- if multiple stop-blocks happen that have continuation prompts, they
are concatenated

example run:
```
› hey :)


• Running SessionStart hook: lighting the observatory

SessionStart hook (completed)
  warning: Hi, I'm a session start hook for wizard-tower (startup).
  hook context: Startup context: A wimboltine stonpet is an exotic cuisine from hyperspace

• Aloha :) Happy to jam with you. What are we building today?

• Running Stop hook: updating the guards

Stop hook (blocked)
  warning: Wizard Tower Stop hook continuing conversation
  feedback: cook the stonpet

• Aloha, here’s the hyperspace move for cooking a wimboltine stonpet:

  1. Sear the stonpet in a hot pan with moon-oil until the edges shimmer.
  2. Add star-lime, black salt, and a little fermented nebula paste.
  3. Lower the heat and let it braise for 8 cosmic minutes with a splash of comet broth.
  4. Finish with sky herbs and serve over warm asteroid rice.

  The vibe: crispy outside, tender center, deep interdimensional savor.

• Running Stop hook: updating the guards

Stop hook (completed)
  warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
```

.codex/config.toml
```
[features]
codex_hooks = true
```

.codex/hooks.json
```
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup|resume",
        "hooks": [
          {
            "type": "command",
            "command": "/usr/bin/python3 .codex/hooks/session_start_demo.py",
            "timeoutSec": 10,
            "statusMessage": "lighting the observatory"
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "/usr/bin/python3 .codex/hooks/stop_demo_block.py",
            "timeoutSec": 10,
            "statusMessage": "updating the guards"
          }
        ]
      }
    ]
  }
}
```

.codex/hooks/session_start_demo.py
```
#!/usr/bin/env python3

import json
import sys
from pathlib import Path


def main() -> int:
    payload = json.load(sys.stdin)
    cwd = Path(payload.get("cwd", ".")).name or "wizard-tower"
    source = payload.get("source", "startup")
    source_label = "resume" if source == "resume" else "startup"
    source_prefix = (
        "Resume context:"
        if source == "resume"
        else "Startup context:"
    )

    output = {
        "systemMessage": (
            f"Hi, I'm a session start hook for {cwd} ({source_label})."
        ),
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": (
                f"{source_prefix} A wimboltine stonpet is an exotic cuisine from hyperspace"
            ),
        },
    }
    print(json.dumps(output))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

.codex/hooks/stop_demo_block.py
```
#!/usr/bin/env python3

import json
import sys


def main() -> int:
    payload = json.load(sys.stdin)
    stop_hook_active = payload.get("stop_hook_active", False)
    last_assistant_message = payload.get("last_assistant_message") or ""
    char_count = len(last_assistant_message.strip())

    if stop_hook_active:
        system_message = (
            "Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop."
        )
        print(json.dumps({"systemMessage": system_message}))
    else:
        system_message = (
            f"Wizard Tower Stop hook continuing conversation"
        )
        print(json.dumps({"systemMessage": system_message, "decision": "block", "reason": "cook the stonpet"}))

    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```
2026-03-13 15:51:19 -07:00
Charley Cunningham
467e6216bb Fix stale create_wait_tool reference (#14639)
## Summary
- replace the stale `create_wait_tool()` reference in `spec_tests.rs`
- use `create_wait_agent_tool()` to match the actual multi-agent tool
rename from `#14631`
- fix the resulting `codex-core` spec-test compile failure on current
`main`

## Context
`#14631` renamed the model-facing multi-agent tool from `wait` to
`wait_agent` and renamed the corresponding spec helper to
`create_wait_agent_tool()`.

One `spec_tests.rs` call site was left behind, so current `main` fails
to compile `codex-core` tests with:
- `cannot find function create_wait_tool`

Using `create_wait_agent_tool()` is the correct fix here;
`create_exec_wait_tool()` would point at the separate exec wait tool and
would not match the renamed multi-agent toolset.

## Testing
- not rerun locally after the rebase

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 15:35:25 -07:00
Charley Cunningham
bc24017d64 Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved

## Runtime model
This PR does not introduce a new `approval_policy`.

Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
  - `user`
  - `guardian_subagent`

`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.

The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.

When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`

Users can still change `/approvals` afterward.

Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior

ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.

## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.

- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface

## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.

It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`

with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`

`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`

`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.

These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.

This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.

## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
  - resolved approval/denial state in history

## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)

Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 15:27:00 -07:00
Charley Cunningham
e3cbf913e8 Fix wait_agent expectations in core tests (#14637)
## Summary
- update stale core tool-spec expectations from `wait` to `wait_agent`
- update the prompt-caching tool-name assertion to match the renamed
tool
- fix the Bazel regressions introduced after #14631 renamed the
multi-agent wait tool

## Testing
- cargo test -p codex-core tools::spec::tests
- cargo test -p codex-core
suite::prompt_caching::prompt_tools_are_consistent_across_requests

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 15:15:59 -07:00
pakrym-oai
cb7d8f45a1 Normalize MCP tool names to code-mode safe form (#14605)
Code mode doesn't allow `-` in names and it's better if function names
and code-mode names are the same.
2026-03-13 14:50:16 -07:00
Ahmed Ibrahim
36dfb84427 Stabilize multi-agent feature flag (#14622)
- make multi_agent stable and enabled by default
- update feature and tool-spec coverage to match the new default

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 14:38:15 -07:00
Ahmed Ibrahim
cfd97b36da Rename multi-agent wait tool to wait_agent (#14631)
- rename the multi-agent tool name the model sees to wait_agent
- update the model-facing prompts and tool descriptions to match

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 14:38:05 -07:00
pakrym-oai
477a2dd345 Add code_mode_only feature (#14617)
Summary
- add the code_mode_only feature flag/config schema and wire its
dependency on code_mode
- update code mode tool descriptions to list nested tools with detailed
headers
- restrict available tools for prompt and exec descriptions when
code_mode_only is enabled and test the behavior

Testing
- Not run (not requested)
2026-03-13 13:30:19 -07:00
Michael Bolin
ef37d313c6 fix: preserve zsh-fork escalation fds across unified-exec spawn paths (#13644)
## Why

`zsh-fork` sessions launched through unified-exec need the escalation
socket to survive the wrapper -> server -> child handoff so later
intercepted `exec()` calls can still reach the escalation server.

The inherited-fd spawn path also needs to avoid closing Rust's internal
exec-error pipe, and the shell-escalation handoff needs to tolerate the
receive-side case where a transferred fd is installed into the same
stdio slot it will be mapped onto.

## What Changed

- Added `SpawnLifecycle::inherited_fds()` in
`codex-rs/core/src/unified_exec/process.rs` and threaded inherited fds
through `codex-rs/core/src/unified_exec/process_manager.rs` so
unified-exec can preserve required descriptors across both PTY and
no-stdin pipe spawn paths.
- Updated `codex-rs/core/src/tools/runtimes/shell/zsh_fork_backend.rs`
to expose the escalation socket fd through the spawn lifecycle.
- Added inherited-fd-aware spawn helpers in
`codex-rs/utils/pty/src/pty.rs` and `codex-rs/utils/pty/src/pipe.rs`,
including Unix pre-exec fd pruning that preserves requested inherited
fds while leaving `FD_CLOEXEC` descriptors alone. The pruning helper is
now named `close_inherited_fds_except()` to better describe that
behavior.
- Updated `codex-rs/shell-escalation/src/unix/escalate_client.rs` to
duplicate local stdio before transfer and send destination stdio numbers
in `SuperExecMessage`, so the wrapper keeps using its own
`stdin`/`stdout`/`stderr` until the escalated child takes over.
- Updated `codex-rs/shell-escalation/src/unix/escalate_server.rs` so the
server accepts the overlap case where a received fd reuses the same
stdio descriptor number that the child setup will target with `dup2`.
- Added comments around the PTY stdio wiring and the overlap regression
helper to make the fd handoff and controlling-terminal setup easier to
follow.

## Verification

- `cargo test -p codex-utils-pty`
- covers preserved-fd PTY spawn behavior, PTY resize, Python REPL
continuity, exec-failure reporting, and the no-stdin pipe path
- `cargo test -p codex-shell-escalation`
- covers duplicated-fd transfer on the client side and verifies the
overlap case by passing a pipe-backed stdin payload through the
server-side `dup2` path

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13644).
* #14624
* __->__ #13644
2026-03-13 20:25:31 +00:00
Owen Lin
014e19510d feat(app-server, core): add more spans (#14479)
## Description

This PR expands tracing coverage across app-server thread startup, core
session initialization, and the Responses transport layer. It also gives
core dispatch spans stable operation-specific names so traces are easier
to follow than the old generic `submission_dispatch` spans.

Also use `fmt::Display` for types that we serialize in traces so we send
strings instead of rust types
2026-03-13 13:16:33 -07:00
canvrno-oai
914f7c7317 Override local apps settings with requirements.toml settings (#14304)
This PR changes app and connector enablement when `requirements.toml` is
present locally or via remote configuration.

For apps.* entries:
- `enabled = false` in `requirements.toml` overrides the user’s local
`config.toml` and forces the app to be disabled.
- `enabled = true` in `requirements.toml` does not re-enable an app the
user has disabled in config.toml.

This behavior applies whether or not the user has an explicit entry for
that app in `config.toml`. It also applies to cloud-managed policies and
configurations when the admin sets the override through
`requirements.toml`.

Scenarios tested and verified:
- Remote managed, user config (present) override
- Admin-defined policies & configurations include a connector override:
  `[apps.<appID>]
enabled = false`
- User's config.toml has the same connector configured with `enabled =
true`
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer
  
- Remote managed, user config (absent) override
- Admin-defined policies & configurations include a connector override:
  `[apps.<appID>]
enabled = false`
  - User's config.toml has no entry for the the same connector
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer
  
- Locally managed, user config (present) override
  - Local requirements.toml includes a connector override:
  `[apps.<appID>]
enabled = false`
- User's config.toml has the same connector configured with `enabled =
true`
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer

- Locally managed, user config (absent) override
  - Local requirements.toml includes a connector override:
  `[apps.<appID>]
enabled = false`
  - User's config.toml has no entry for the the same connector
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer




<img width="1446" height="753" alt="image"
src="https://github.com/user-attachments/assets/61c714ca-dcca-4952-8ad2-0afc16ff3835"
/>
<img width="595" height="233" alt="image"
src="https://github.com/user-attachments/assets/7c8ab147-8fd7-429a-89fb-591c21c15621"
/>
2026-03-13 12:40:24 -07:00
Ahmed Ibrahim
d58620c852 Use subagents naming in the TUI (#14618)
- rename user-facing TUI multi-agent wording to subagents
- rename the surfaced slash command to `subagents` and update
tests/snapshots

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 19:08:38 +00:00
Ahmed Ibrahim
3aabce9e0a Unify realtime v1/v2 session config (#14606)
## Summary
- unify realtime websocket settings under `[realtime]` (`version` and
`type`)
- remove `realtime_conversation_v2` and select parser/session mode from
config

## Testing
- not run (per request)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 11:35:38 -07:00
Eric Traut
9dba7337f2 Start TUI on embedded app server (#14512)
This PR is part of the effort to move the TUI on top of the app server.
In a previous PR, we introduced an in-process app server and moved
`exec` on top of it.

For the TUI, we want to do the migration in stages. The app server
doesn't currently expose all of the functionality required by the TUI,
so we're going to need to support a hybrid approach as we make the
transition.

This PR changes the TUI initialization to instantiate an in-process app
server and access its `AuthManager` and `ThreadManager` rather than
constructing its own copies. It also adds a placeholder TUI event
handler that will eventually translate app server events into TUI
events. App server notifications are accepted but ignored for now. It
also adds proper shutdown of the app server when the TUI terminates.
2026-03-13 12:04:41 -06:00