Commit Graph

4344 Commits

Author SHA1 Message Date
Matthew Zeng
c9214192c5 [plugins] Update the suggestable plugins list. (#15829)
- [x] Update the suggestable plugins list to be featured plugins.
2026-03-26 15:53:22 +00:00
jif-oai
6d2f4aaafc feat: use ProcessId in exec-server (#15866)
Use a full struct for `ProcessId` to improve readability and make it
easier to evolve in the future if needed.
2026-03-26 16:45:36 +01:00
jif-oai
26c66f3ee1 fix: flaky (#15869) 2026-03-26 16:07:32 +01:00
Michael Bolin
01fa4f0212 core: remove special execve handling for skill scripts (#15812) 2026-03-26 07:46:04 -07:00
jif-oai
6dcac41d53 chore: drop artifacts lib (#15864) 2026-03-26 15:28:59 +01:00
jif-oai
7dac332c93 feat: exec-server prep for unified exec (#15691)
This PR partially rebases `unified_exec` on the `exec-server` and adapts
the `exec-server` accordingly.

## What changed in `exec-server`

1. Replaced the old "broadcast-driven; process-global" event model with
process-scoped session events. The goal is to be able to have a dedicated
handler for each process.
2. Extended the protocol contract to support explicit lifecycle status and
stream ordering:
  - `WriteResponse` now returns `WriteStatus` (Accepted, UnknownProcess,
    StdinClosed, Starting) instead of a bool.
  - Added seq fields to output/exited notifications.
  - Added terminal process/closed notification.
3. Demultiplexed remote notifications into per-process channels, as was
done for the event system.
4. Local and remote backends now both implement ExecBackend.
5. Local backend wraps internal process ID/operations into per-process
ExecProcess objects.
6. Remote backend registers a session channel before launch and
unregisters on failed launch.
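
A minimal sketch of why a status enum is more useful than a bool here. Only the four `WriteStatus` variant names come from the list above; the `Lifecycle` enum, the `Registry` type, and the string process ids are hypothetical stand-ins, not the exec-server's actual types.

```rust
use std::collections::HashMap;

/// Outcome of a stdin write, using the variant names listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteStatus {
    Accepted,
    UnknownProcess,
    StdinClosed,
    Starting,
}

/// Hypothetical per-process lifecycle, used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Starting,
    Running { stdin_open: bool },
}

/// Hypothetical registry keyed by a plain string process id.
#[derive(Default)]
pub struct Registry {
    processes: HashMap<String, Lifecycle>,
}

impl Registry {
    /// Insert or overwrite the lifecycle recorded for a process.
    pub fn set(&mut self, id: &str, state: Lifecycle) {
        self.processes.insert(id.to_string(), state);
    }

    /// Classify a write attempt instead of collapsing every failure into `false`.
    pub fn write_status(&self, id: &str) -> WriteStatus {
        match self.processes.get(id) {
            None => WriteStatus::UnknownProcess,
            Some(Lifecycle::Starting) => WriteStatus::Starting,
            Some(Lifecycle::Running { stdin_open: false }) => WriteStatus::StdinClosed,
            Some(Lifecycle::Running { stdin_open: true }) => WriteStatus::Accepted,
        }
    }
}
```

With a bool, the caller could not tell a process that is still starting (retry later) from one that is unknown (give up); the enum makes that distinction explicit.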

## What changed in `unified_exec`

1. Added unified process-state model and backend-neutral process
wrapper. This will probably disappear in the future, but it makes it
easier to keep the work flowing on both sides.
- `UnifiedExecProcess` now handles both local PTY sessions and remote
exec-server processes through a shared `ProcessHandle`.
- Added `ProcessState` to track has_exited, exit_code, and terminal
failure message consistently across backends.
2. Routed write and lifecycle handling through process-level methods.
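
A rough sketch of what a backend-neutral process state could look like. The three tracked facts (`has_exited`, `exit_code`, terminal failure message) come from the description above; the method names and the "first terminal state wins" rule are assumptions made for illustration only.

```rust
/// Sketch of a backend-neutral process state; the fields follow the text above.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ProcessState {
    pub has_exited: bool,
    pub exit_code: Option<i32>,
    pub failure_message: Option<String>,
}

impl ProcessState {
    /// Record a normal exit. Assumption: once terminal, the state never changes.
    pub fn exited(&self, exit_code: Option<i32>) -> Self {
        if self.has_exited {
            return self.clone();
        }
        Self {
            has_exited: true,
            exit_code,
            failure_message: None,
        }
    }

    /// Record a terminal failure such as a failed launch or a transport disconnect.
    pub fn failed(&self, message: &str) -> Self {
        if self.has_exited {
            return self.clone();
        }
        Self {
            has_exited: true,
            exit_code: None,
            failure_message: Some(message.to_string()),
        }
    }
}
```

Because both a local PTY session and a remote process would report into the same struct, callers only need one code path to decide whether a process is finished and why.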

## Some rationale

1. The change centralizes execution transport in exec-server while
preserving policy and orchestration ownership in core, avoiding
duplicated launch approval logic. This comes from internal discussion.
2. Session-scoped events remove coupling/cross-talk between processes
and make stream ordering and terminal state explicit (seq, closed,
failed).
3. The failure-path surfacing (remote launch failures, write failures,
transport disconnects) makes command tool output and cleanup behavior
deterministic.

## Follow-ups:
* Unify the concept of thread ID behind an obfuscated struct
* FD handling
* Full zsh-fork compatibility
* Full network sandboxing compatibility
* Handle ws disconnection
2026-03-26 15:22:34 +01:00
jif-oai
4a5635b5a0 feat: clean spawn v1 (#15861)
Avoid the usage of path in the v1 spawn
2026-03-26 15:01:00 +01:00
jif-oai
b00a05c785 feat: drop artifact tool and feature (#15851) 2026-03-26 13:21:24 +01:00
jif-oai
7ef3cfe63e feat: replace askama by custom lib (#15784)
Finalise the drop of `askama` to use our internal lib instead
2026-03-26 10:33:25 +01:00
viyatb-oai
937cb5081d fix: fix old system bubblewrap compatibility without falling back to vendored bwrap (#15693)
Fixes #15283.

## Summary
Older system bubblewrap builds reject `--argv0`, which makes our Linux
sandbox fail before the helper can re-exec. This PR keeps using system
`/usr/bin/bwrap` whenever it exists and only falls back to vendored
bwrap when the system binary is missing. That matters on stricter
AppArmor hosts, where the distro bwrap package also provides the policy
setup needed for user namespaces.

For old system bwrap, we avoid `--argv0` instead of switching binaries:
- pass the sandbox helper a full-path `argv0`,
- keep the existing `current_exe() + --argv0` path when the selected
launcher supports it,
- otherwise omit `--argv0` and re-exec through the helper's own
`argv[0]` path, whose basename still dispatches as
`codex-linux-sandbox`.

Also updates the launcher/warning tests and docs so they match the new
behavior: present-but-old system bwrap uses the compatibility path, and
only absent system bwrap falls back to vendored.
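
The selection rule described above can be sketched as a small pure function. The type and function names are hypothetical, and the sketch assumes (the summary does not say) that the vendored bwrap always supports `--argv0`.

```rust
/// Which bubblewrap binary to launch (hypothetical names for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Launcher {
    System,
    Vendored,
}

/// The launcher plus whether `--argv0` should be passed to it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LaunchPlan {
    pub launcher: Launcher,
    pub pass_argv0_flag: bool,
}

/// Prefer system bwrap whenever it exists; fall back to vendored only when absent.
pub fn plan_launch(system_bwrap_exists: bool, system_supports_argv0: bool) -> LaunchPlan {
    if !system_bwrap_exists {
        // Assumption: the vendored build is recent enough to accept `--argv0`.
        return LaunchPlan {
            launcher: Launcher::Vendored,
            pass_argv0_flag: true,
        };
    }
    // Present-but-old system bwrap stays selected; only the flag is omitted, and
    // the helper re-execs through its own full-path argv[0] instead.
    LaunchPlan {
        launcher: Launcher::System,
        pass_argv0_flag: system_supports_argv0,
    }
}
```

The key point is that an old system bwrap changes the flag, not the binary.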

### Validation

1. Install Ubuntu 20.04 in a VM
2. Compile codex and run without bubblewrap installed - see a warning
about falling back to the vendored bwrap
3. Install bwrap and verify version is 0.4.0 without `argv0` support
4. Run codex and use the apply_patch tool without errors

<img width="802" height="631" alt="Screenshot 2026-03-25 at 11 48 36 PM"
src="https://github.com/user-attachments/assets/77248a29-aa38-4d7c-9833-496ec6a458b8"
/>
<img width="807" height="634" alt="Screenshot 2026-03-25 at 11 47 32 PM"
src="https://github.com/user-attachments/assets/5af8b850-a466-489b-95a6-455b76b5050f"
/>
<img width="812" height="635" alt="Screenshot 2026-03-25 at 11 45 45 PM"
src="https://github.com/user-attachments/assets/438074f0-8435-4274-a667-332efdd5cb57"
/>
<img width="801" height="623" alt="Screenshot 2026-03-25 at 11 43 56 PM"
src="https://github.com/user-attachments/assets/0dc8d3f5-e8cf-4218-b4b4-a4f7d9bf02e3"
/>

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-03-25 23:51:39 -07:00
Tiffany Citra
6d0525ae70 Expand home-relative paths on Windows (#15817)
Follow-up to https://github.com/openai/codex/pull/9193: also support
this on Windows.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-03-25 21:19:57 -07:00
Eric Traut
1ff39b6fa8 Wire remote app-server auth through the client (#14853)
For app-server websocket auth, support the two server-side mechanisms from
PR #14847:

- `--ws-auth capability-token --ws-token-file /abs/path`
- `--ws-auth signed-bearer-token --ws-shared-secret-file /abs/path`
  with optional `--ws-issuer`, `--ws-audience`, and
  `--ws-max-clock-skew-seconds`

On the client side, add interactive remote support via:

- `--remote ws://host:port` or `--remote wss://host:port`
- `--remote-auth-token-env <ENV_VAR>`

Codex reads the bearer token from the named environment variable and sends it
as `Authorization: Bearer <token>` during the websocket handshake. Remote auth
tokens are only allowed for `wss://` URLs or loopback `ws://` URLs.
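
A minimal sketch of the "only `wss://` or loopback `ws://`" rule, using only the standard library. The function name is hypothetical, the URL parsing is deliberately simplistic, and treating the literal host `localhost` as loopback is an assumption; the real client may resolve or validate hosts differently.

```rust
use std::net::IpAddr;

/// Returns true when it is acceptable to attach a bearer token to this URL:
/// any `wss://` URL, or a `ws://` URL whose host is loopback.
pub fn remote_auth_token_allowed(url: &str) -> bool {
    if url.starts_with("wss://") {
        return true;
    }
    let Some(rest) = url.strip_prefix("ws://") else {
        return false;
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Drop any userinfo prefix ("user@host").
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    let host = if let Some(stripped) = host_port.strip_prefix('[') {
        // Bracketed IPv6 literal such as "[::1]:9000".
        match stripped.split_once(']') {
            Some((h, _)) => h,
            None => return false,
        }
    } else {
        host_port.split(':').next().unwrap_or("")
    };
    // Assumption: the literal name "localhost" counts as loopback.
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    host.parse::<IpAddr>()
        .map(|ip| ip.is_loopback())
        .unwrap_or(false)
}
```

The intent of such a check is to avoid sending a secret in cleartext to a non-local host.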

Testing:
- tested both auth methods manually to confirm connection success and
rejection for both auth types
2026-03-25 22:17:03 -06:00
Eric Traut
b565f05d79 Fix quoted command rendering in tui_app_server (#15825)
When `tui_app_server` is enabled, shell commands in the transcript
render as fully quoted invocations like `/bin/zsh -lc "..."`. The
non-app-server TUI correctly shows the parsed command body.

Root cause:
The app-server stores `ThreadItem::CommandExecution.command` as a
shell-quoted string. When `tui_app_server` bridges that item back into
the exec renderer, it was passing `vec![command]` unchanged instead of
splitting the string back into argv. That prevented
`strip_bash_lc_and_escape()` from recognizing the shell wrapper, so the
renderer displayed the wrapper literally.

Solution:
Add a shared command-string splitter that round-trips shell-quoted
commands back into argv when it is safe to do so, while preserving
non-roundtrippable inputs as a single string. Use that helper everywhere
`tui_app_server` reconstructs exec commands from app-server payloads,
including live command-execution items, replayed thread items, and exec
approval requests. This restores the same command display behavior as
the direct TUI path without breaking Windows-style commands that cannot
be safely round-tripped.
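
As a rough illustration of "split when safe, otherwise keep a single string", here is a standard-library-only sketch. It is not the shared helper from this change: the function names are hypothetical, and the splitter understands only whitespace, single quotes, and double quotes with `\"` and `\\` escapes. Anything else (for example bare backslashes in Windows-style paths) falls back to a one-element argv.

```rust
/// Split a shell-quoted string, or return None for syntax this sketch does not handle.
fn split_simple(input: &str) -> Option<Vec<String>> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_word = false;
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        match c {
            '\'' => {
                // Single quotes: everything up to the closing quote is literal.
                in_word = true;
                loop {
                    match chars.next()? {
                        '\'' => break,
                        other => current.push(other),
                    }
                }
            }
            '"' => {
                // Double quotes: only \" and \\ escapes are accepted here.
                in_word = true;
                loop {
                    match chars.next()? {
                        '"' => break,
                        '\\' => match chars.next()? {
                            escaped @ ('"' | '\\') => current.push(escaped),
                            _ => return None,
                        },
                        other => current.push(other),
                    }
                }
            }
            // Bare backslashes are treated as not safely round-trippable.
            '\\' => return None,
            c if c.is_whitespace() => {
                if in_word {
                    args.push(std::mem::take(&mut current));
                    in_word = false;
                }
            }
            other => {
                in_word = true;
                current.push(other);
            }
        }
    }
    if in_word {
        args.push(current);
    }
    Some(args)
}

/// Rebuild argv from a command string, preserving unsupported input as one element.
pub fn command_to_argv(command: &str) -> Vec<String> {
    match split_simple(command) {
        Some(argv) if !argv.is_empty() => argv,
        _ => vec![command.to_string()],
    }
}
```

Once the wrapper is split back into `/bin/zsh`, `-lc`, and the script body, a renderer that recognizes that three-element shape can show only the body.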
2026-03-25 22:03:29 -06:00
Matthew Zeng
4b50446ffa [plugins] Flip flags on. (#15820)
- [x] Flip flags on.
2026-03-26 03:24:06 +00:00
Andrei Eternal
c4d9887f9a [hooks] add non-streaming (non-stdin style) shell-only PostToolUse support (#15531)
CHAINED PR: note that the base is eternal/hooks-pretooluse-bash, not main,
so the following PR should land first.

This adds a post-tool hook matching the pre-tool functionality here:
https://github.com/openai/codex/pull/15211

PreToolUse runs for plain shell calls and allows blocking. This
PostToolUse hook runs after the command has executed.

Example run:

```
› as a test, run in parallel the following commands:
  - echo 'one'
  - echo '[block-pre-tool-use]'
  - echo '[block-post-tool-use]'


⚠ MCP startup incomplete (failed: notion, linear)

• Cruising through those three commands in parallel now, and I’ll share the exact outputs right after
  they land.

• Running PreToolUse hook: checking the observatory runes

• Running PreToolUse hook: checking the observatory runes

• Running PreToolUse hook: checking the observatory runes

PreToolUse hook (blocked)
  warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
  feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.

PreToolUse hook (completed)
  warning: wizard-tower PreToolUse demo inspected Bash: echo 'one'

PreToolUse hook (completed)
  warning: wizard-tower PreToolUse demo inspected Bash: echo '[block-post-tool-use]'

• Ran echo '[block-post-tool-use]'
  └ [block-post-tool-use]

• Ran echo 'one'
  └ one

• Running PostToolUse hook: reading the observatory aftermath

• Running PostToolUse hook: reading the observatory aftermath

PostToolUse hook (blocked)
  warning: wizard-tower PostToolUse demo blocked the Bash follow-through on purpose.
  feedback: PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to
continue.

PostToolUse hook (completed)
  warning: wizard-tower PostToolUse demo inspected Bash: echo 'one'
  hook context: PostToolUse demo saw the Bash result. Response preview: one

──────────────────────────────────────────────────────────────────────────────────────────────────────

• Ran them in parallel. Little wave report:

  - echo 'one' completed and returned one.
  - echo '[block-pre-tool-use]' was blocked before execution by the PreToolUse hook.
  - echo '[block-post-tool-use]' ran, but its result was blocked afterward by the PostToolUse hook.

  The hook messages were:

  - PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
  - PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to continue.


› for the blocked messages (correctly blocked in this test) did the tool output land back into
  context?


• Yeah, from this run:

  - PreToolUse block: the command itself did not execute, so no command output landed in context. Only
    the block message did.
  - PostToolUse block: the command did execute, but the actual stdout did not come back into context.
    Only the post-tool block message did.

  So the blocked notifications landed in context, but the blocked command payloads themselves did not.
  Nice clean guardrail, aloha.
```
2026-03-25 19:18:03 -07:00
Matthew Zeng
78799c1bcf [mcp] Improve custom MCP elicitation (#15800)
- [x] Support don't ask again for custom MCP tool calls.
- [x] Don't run arc in yolo mode.
- [x] Run arc for custom MCP tools in always allow mode.
2026-03-26 01:02:37 +00:00
Ruslan Nigmatullin
d7e35e56cf app-server: Organize app-server to allow more transports (#15810)
Make `run_main_with_transport` slightly more flexible by consolidating
logic spread across stdio and websocket transports.
2026-03-25 17:11:22 -07:00
canvrno-oai
2794e27849 Add ReloadUserConfig to tui_app_server (#15806)
- Adds ReloadUserConfig to `tui_app_server`
2026-03-25 17:03:18 -07:00
pakrym-oai
8fa88fa8ca Add cached environment manager for exec server URL (#15785)
Add a singleton environment manager that is created early in
app-server (before skill manager, before config loading).

Use an environment variable to point to a running exec server.
2026-03-25 16:14:36 -07:00
canvrno-oai
f24c55f0d5 TUI plugin menu polish (#15802)
- Add "OpenAI Curated" display name for `openai-curated` marketplace
- Hide /apps menu
- Change app install phase display text
2026-03-25 16:09:19 -07:00
arnavdugar-openai
eee692e351 Treat ChatGPT hc plan as Enterprise (#15789) 2026-03-25 15:41:29 -07:00
nicholasclark-openai
b6524514c1 Add MCP tool call spans (#15659)
## Summary
- add an explicit `mcp.tools.call` span around MCP tool execution in
core
- keep MCP span validation local to `mcp_tool_call_tests` instead of
broadening the integration test suite
- inline the turn/session correlation fields directly in the span
initializer

## Included Changes
- `codex-rs/core/src/mcp_tool_call.rs`: wrap the existing MCP tool call
in `mcp.tools.call` and inline `conversation.id`, `session.id`, and
`turn.id` in the span initializer
- `codex-rs/core/src/mcp_tool_call_tests.rs`: assert the MCP span
records the expected correlation and server fields

## Testing
- `cargo test -p codex-core`
- `just fmt`

## Notes
- `cargo test -p codex-core` still hits existing unrelated failures in
guardian-config tests and the sandboxed JS REPL `mktemp` test
- metric work moved to stacked PR #15792
- transport-level RMCP spans and trace propagation remain in stacked PR
#15792
- full workspace `cargo test` was not run

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 22:13:02 +00:00
Eric Traut
2c67a27a71 Avoid duplicate auth refreshes in getAuthStatus (#15798)
I've seen several intermittent failures of
`get_auth_status_returns_token_after_proactive_refresh_recovery` today.
I investigated, and I found a couple of issues.

First, `getAuthStatus(refreshToken=true)` could refresh twice in one
request: once via `refresh_token_if_requested()` and again via the
proactive refresh path inside `auth_manager.auth()`. In the
permanent-failure case this produced an extra `/oauth/token` call and
made the app-server auth tests flaky. Use `auth_cached()` after an
explicit refresh request so the handler reuses the post-refresh auth
state instead of immediately re-entering proactive refresh logic. Keep
the existing proactive path for `refreshToken=false`.

Second, auth refresh attempts in `AuthManager` had a startup/request race,
so they are now serialized. One proactive refresh could already be in flight
while a `getAuthStatus(refreshToken=false)` request entered
`auth().await`, causing a second `/oauth/token` call before the first
failure or refresh result had been recorded. Guarding the refresh flow
with a single async lock makes concurrent callers share one refresh
result, which prevents duplicate refreshes and stabilizes the
proactive-refresh auth tests.
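
The "one lock, one shared result" idea can be sketched with a synchronous `std::sync::Mutex` standing in for the async lock. `RefreshGate` and `refresh_once` are hypothetical names, and the sketch caches the outcome forever, which the real `AuthManager` presumably does not do.

```rust
use std::sync::Mutex;

/// Serializes refresh attempts and records the outcome for later callers.
pub struct RefreshGate {
    // None means no refresh has completed yet.
    last: Mutex<Option<Result<String, String>>>,
}

impl RefreshGate {
    pub fn new() -> Self {
        Self {
            last: Mutex::new(None),
        }
    }

    /// Run `refresh` at most once; callers that arrive later reuse its outcome.
    pub fn refresh_once<F>(&self, refresh: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        // The lock is held across the refresh, so a concurrent caller waits
        // here and then observes the recorded result instead of refreshing again.
        let mut guard = self.last.lock().unwrap();
        if let Some(outcome) = guard.as_ref() {
            return outcome.clone();
        }
        let outcome = refresh();
        *guard = Some(outcome.clone());
        outcome
    }
}
```

In the failure case this means a second caller sees the recorded error rather than issuing another token request.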
2026-03-25 16:03:53 -06:00
Ahmed Ibrahim
9dbe098349 Extract codex-core-skills crate (#15749)
## Summary
- move skill loading and management into codex-core-skills
- leave codex-core with the thin integration layer and shared wiring

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 12:57:42 -07:00
Felipe Coury
e9996ec62a fix(tui_app_server): preserve transcript events under backpressure (#15759)
## TL;DR

When running codex with `-c features.tui_app_server=true` we see
corruption when streaming large amounts of data. This PR marks other
event types as _critical_ by making them _must-deliver_.

## Problem

When the TUI consumer falls behind the app-server event stream, the
bounded `mpsc` channel fills up and the forwarding layer drops events
via `try_send`. Previously only `TurnCompleted` was marked as
must-deliver. Streamed assistant text (`AgentMessageDelta`) and the
authoritative final item (`ItemCompleted`) were treated as droppable —
the same as ephemeral command output deltas. Because the TUI renders
markdown incrementally from these deltas, dropping any of them produces
permanently corrupted or incomplete paragraphs that persist for the rest
of the session.

## Mental model

The app-server event stream has two tiers of importance:

1. **Lossless (transcript + terminal):** Events that form the
authoritative record of what the assistant said or that signal turn
lifecycle transitions. Losing any of these corrupts the visible output
or leaves surfaces waiting forever. These are: `AgentMessageDelta`,
`PlanDelta`, `ReasoningSummaryTextDelta`, `ReasoningTextDelta`,
`ItemCompleted`, and `TurnCompleted`.

2. **Best-effort (everything else):** Ephemeral status events like
`CommandExecutionOutputDelta` and progress notifications. Dropping these
under load causes cosmetic gaps but no permanent corruption.

The forwarding layer uses `try_send` for best-effort events (dropping on
backpressure) and blocking `send().await` for lossless events (applying
back-pressure to the producer until the consumer catches up).
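
The two-tier decision can be sketched with the standard library's bounded `sync_channel`, whose `try_send` and blocking `send` mirror the async calls named above. The `Event` enum here is a reduced stand-in with a few of the event names from this description, and the lag marker is left out.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Reduced stand-in for the app-server event type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    AgentMessageDelta(String),
    ItemCompleted,
    TurnCompleted,
    CommandExecutionOutputDelta(String),
}

/// Lossless tier: transcript and turn-lifecycle events.
pub fn requires_delivery(event: &Event) -> bool {
    matches!(
        event,
        Event::AgentMessageDelta(_) | Event::ItemCompleted | Event::TurnCompleted
    )
}

/// Forward one event. Returns false only when the receiver is gone.
pub fn forward(tx: &SyncSender<Event>, event: Event, skipped: &mut usize) -> bool {
    if requires_delivery(&event) {
        // Blocking send: waits until the consumer makes room in the queue.
        return tx.send(event).is_ok();
    }
    match tx.try_send(event) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) => {
            // Best-effort event dropped under backpressure; count it.
            *skipped += 1;
            true
        }
        Err(TrySendError::Disconnected(_)) => false,
    }
}
```

With a capacity-1 channel that is already full, a `CommandExecutionOutputDelta` is counted as skipped, while an `AgentMessageDelta` waits for space and is delivered.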

## Non-goals

- Eliminating backpressure entirely. The bounded queue is intentional;
this change only widens the set of events that survive it.
- Changing the event protocol or adding new notification types.
- Addressing root causes of consumer slowness (e.g. TUI render cost).

## Tradeoffs

Blocking on transcript events means a slow consumer can now stall the
producer for the duration of those events. This is acceptable because:
(a) the alternative is permanently broken output, which is worse; (b)
the consumer already had to keep up with `TurnCompleted` blocking sends;
and (c) transcript events arrive at model-output speed, not burst speed,
so sustained saturation is unlikely in practice.

## Architecture

Two parallel changes, one per transport:

- **In-process path** (`lib.rs`): The inline forwarding logic was
extracted into `forward_in_process_event`, a standalone async function
that encapsulates the lag-marker / must-deliver / try-send decision
tree. The worker loop now delegates to it. A new
`server_notification_requires_delivery` function (shared `pub(crate)`)
centralizes the notification classification.

- **Remote path** (`remote.rs`): The local `event_requires_delivery` now
delegates to the same shared `server_notification_requires_delivery`,
keeping both transports in sync.

## Observability

No new metrics or log lines. The existing `warn!` on event drops
continues to fire for best-effort events. Lossless events that block
will not produce a log line (they simply wait).

## Tests

- `event_requires_delivery_marks_transcript_and_terminal_events`: unit
test confirming the expanded classification covers `AgentMessageDelta`,
`ItemCompleted`, `TurnCompleted`, and excludes
`CommandExecutionOutputDelta` and `Lagged`.
- `forward_in_process_event_preserves_transcript_notifications_under_backpressure`:
integration-style test that fills a capacity-1 channel, verifies a
best-effort event is dropped (skipped count increments), then sends
lossless transcript events and confirms they all arrive in order with
the correct lag marker preceding them.
- `remote_backpressure_preserves_transcript_notifications`: end-to-end
test over a real websocket that verifies the remote transport preserves
transcript events under the same backpressure scenario.
- `event_requires_delivery_marks_transcript_and_disconnect_events`
(remote): unit test confirming the remote-side classification covers
transcript events and `Disconnected`.

---------

Co-authored-by: Eric Traut <etraut@openai.com>
2026-03-25 13:50:39 -06:00
viyatb-oai
6124564297 feat: add websocket auth for app-server (#14847)
## Summary
This change adds websocket authentication at the app-server transport
boundary and enforces it before JSON-RPC `initialize`, so authenticated
deployments reject unauthenticated clients during the websocket
handshake rather than after a connection has already been admitted.

During rollout, websocket auth is opt-in for non-loopback listeners so
we do not break existing remote clients. If `--ws-auth ...` is
configured, the server enforces auth during websocket upgrade. If auth
is not configured, non-loopback listeners still start, but app-server
logs a warning and the startup banner calls out that auth should be
configured before real remote use.

The server supports two auth modes: a file-backed capability token, and
a standard HMAC-signed JWT/JWS bearer token verified with the
`jsonwebtoken` crate, with optional issuer, audience, and clock-skew
validation. Capability tokens are normalized, hashed, and compared in
constant time. Short shared secrets for signed bearer tokens are
rejected at startup. Requests carrying an `Origin` header are rejected
with `403` by transport middleware, and authenticated clients present
credentials as `Authorization: Bearer <token>` during websocket upgrade.

## Validation
- `cargo test -p codex-app-server transport::auth`
- `cargo test -p codex-cli app_server_`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `just bazel-lock-check`

Note: in the broad `cargo test -p codex-app-server
connection_handling_websocket` run, the touched websocket auth cases
passed, but unrelated Unix shutdown tests failed with a timeout in this
environment.

---------

Co-authored-by: Eric Traut <etraut@openai.com>
2026-03-25 12:35:57 -07:00
Matthew Zeng
91337399fe [apps][tool_suggest] Remove tool_suggest's dependency on tool search. (#14856)
- [x] Remove tool_suggest's dependency on tool search.
2026-03-25 12:26:02 -07:00
Felipe Coury
79359fb5e7 fix(tui_app_server): fix remote subagent switching and agent names (#15513)
## TL;DR

This PR changes the `tui_app_server` _path_ in the following ways:

- add missing feature to show agent names (shows only UUIDs today) 
- add `Cmd/Alt+Arrows` navigation between agent conversations

## Problem

When the TUI connects to a remote app server, collab agent tool-call
items (spawn, wait, delegate, etc.) render thread UUIDs instead of
human-readable agent names because the `ChatWidget` never receives
nickname/role metadata for receiver threads. Separately, keyboard
next/previous agent navigation silently does nothing when the local
`AgentNavigationState` cache has not yet been populated with subagent
threads that the remote server already knows about.

Both issues share a root cause: in the remote (app-server) code path the
TUI never proactively fetches thread metadata. In the local code path
this metadata arrives naturally via spawn events the TUI itself
orchestrates, but in the remote path those events were processed by a
different client and the TUI only sees the resulting collab tool-call
notifications.

## Mental model

Collab agent tool-call notifications reference receiver threads by id,
but carry no nickname or role. The TUI needs that metadata in two
places:

1. **Rendering** -- `ChatWidget` converts `CollabAgentToolCall` items
into history cells. Without metadata, agent status lines show raw UUIDs.
2. **Navigation** -- `AgentNavigationState` tracks known threads for the
`/agent` picker and keyboard cycling. Without entries for remote
subagents, next/previous has nowhere to go.

This change closes the gap with two complementary strategies:

- **Eager hydration**: when any notification carries
`receiver_thread_ids`, the TUI fetches metadata (`thread/read`) for
threads it has not yet cached before the notification is rendered.
- **Backfill on thread switch**: when the user resumes, forks, or starts
a new app-server thread, the TUI fetches the full `thread/loaded/list`,
walks the parent-child spawn tree, and registers every descendant
subagent in both the navigation cache and the `ChatWidget` metadata map.

A new `collab_agent_metadata` side-table in `ChatWidget` stores
nickname/role keyed by `ThreadId`, kept in sync by `App` whenever it
calls `upsert_agent_picker_thread`. The `replace_chat_widget` helper
re-seeds this map from `AgentNavigationState` so that thread switches
(which reconstruct the widget) do not lose previously discovered
metadata.

## Non-goals

- This change does not alter the local (non-app-server) collab code
path. That path already receives metadata via spawn events and is
unaffected.
- No new protocol messages are introduced. The change uses existing
`thread/read` and `thread/loaded/list` RPCs.
- No changes to how `AgentNavigationState` orders or cycles through
threads. The traversal logic is unchanged; only the population of
entries is extended.

## Tradeoffs

- **Extra RPCs on notification path**:
`hydrate_collab_agent_metadata_for_notification` issues a `thread/read`
for each unknown receiver thread before the notification is forwarded to
rendering. This adds latency on the notification path but only fires
once per thread (the result is cached). The alternative -- rendering
first and backfilling names later -- would cause visible flicker as
UUIDs are replaced with names.
- **Backfill fetches all loaded threads**:
`backfill_loaded_subagent_threads` fetches the full loaded-thread list
and walks the spawn tree even when the user may only care about one
subagent. This is simple and correct but O(loaded_threads) per thread
switch. For typical session sizes this is negligible; it could become a
concern for sessions with hundreds of subagents.
- **Metadata duplication**: agent nickname/role is now stored in both
`AgentNavigationState` (for picker/label) and
`ChatWidget::collab_agent_metadata` (for rendering). The two are kept in
sync through `upsert_agent_picker_thread` and `replace_chat_widget`, but
there is no compile-time enforcement of this coupling.

## Architecture

### New module: `app::loaded_threads`

Pure function `find_loaded_subagent_threads_for_primary` that takes a
flat list of `Thread` objects and a primary thread id, then walks the
`SessionSource::SubAgent` parent-child edges to collect all transitive
descendants. Returns a sorted vec of `LoadedSubagentThread` (thread_id +
nickname + role). No async, no side effects -- designed for unit
testing.
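
A simplified sketch of such a spawn-tree walk. It is not the module's actual code: the `Thread` struct here flattens the parent link into a hypothetical `parent_id` field instead of `SessionSource::SubAgent`, ids are plain strings, and the function name is made up for the example.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified thread record; `parent_id` stands in for the sub-agent source edge.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Thread {
    pub id: String,
    pub parent_id: Option<String>,
    pub nickname: Option<String>,
    pub role: Option<String>,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct LoadedSubagentThread {
    pub thread_id: String,
    pub nickname: Option<String>,
    pub role: Option<String>,
}

/// Collect every transitive descendant of `primary_id`, sorted by thread id.
pub fn find_descendants_sketch(threads: &[Thread], primary_id: &str) -> Vec<LoadedSubagentThread> {
    // Index children by parent id so the walk does not rescan the flat list.
    let mut children: HashMap<&str, Vec<&Thread>> = HashMap::new();
    for thread in threads {
        if let Some(parent) = thread.parent_id.as_deref() {
            children.entry(parent).or_default().push(thread);
        }
    }
    // `seen` guards against cycles in malformed input.
    let mut seen: HashSet<&str> = HashSet::new();
    seen.insert(primary_id);
    let mut stack = vec![primary_id];
    let mut found = Vec::new();
    while let Some(current) = stack.pop() {
        for child in children.get(current).into_iter().flatten() {
            if seen.insert(child.id.as_str()) {
                stack.push(child.id.as_str());
                found.push(LoadedSubagentThread {
                    thread_id: child.id.clone(),
                    nickname: child.nickname.clone(),
                    role: child.role.clone(),
                });
            }
        }
    }
    found.sort();
    found
}
```

Keeping the walk free of I/O is what makes it easy to test with a hand-built list of threads.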

### New methods on `App`

| Method | Purpose |
|--------|---------|
| `collab_receiver_thread_ids` | Extracts `receiver_thread_ids` from `ItemStarted` / `ItemCompleted` collab notifications |
| `hydrate_collab_agent_metadata_for_notification` | Fetches and caches metadata for unknown receiver threads before a notification is rendered |
| `backfill_loaded_subagent_threads` | Bulk-fetches all loaded threads and registers descendants of the primary thread |
| `adjacent_thread_id_with_backfill` | Attempts navigation, falls back to backfill if the cache has no adjacent entry |
| `replace_chat_widget` | Replaces the widget and re-seeds its metadata map from `AgentNavigationState` |

### New state in `ChatWidget`

`collab_agent_metadata: HashMap<ThreadId, CollabAgentMetadata>` -- a
lookup table that rendering functions consult to attach human-readable
names to collab tool-call items. Populated externally by `App` via
`set_collab_agent_metadata`.

### New method on `AppServerSession`

`thread_loaded_list` -- thin wrapper around
`ClientRequest::ThreadLoadedList`.

## Observability

- `tracing::warn` on invalid thread ids during hydration and backfill.
- `tracing::warn` on failed `thread/read` or `thread/loaded/list` RPCs
(with thread id and error).
- No new metrics or feature flags.

## Tests

- **`loaded_threads::tests::finds_loaded_subagent_tree_for_primary_thread`**
-- unit test for the spawn-tree walk: verifies child and grandchild are
included, unrelated threads are excluded, and metadata is carried
through.
- **`app::tests::replace_chat_widget_reseeds_collab_agent_metadata_for_replay`**
-- integration test that creates a `ChatWidget`, replaces it via
`replace_chat_widget`, replays a collab wait notification, and asserts
the rendered history cell contains the agent name rather than a UUID.
- **Updated snapshot** `app_server_collab_wait_items_render_history` --
the existing collab wait rendering test now sets metadata before sending
notifications, so the snapshot shows `Robie [explorer]` / `Ada
[reviewer]` instead of raw thread ids.

---------

Co-authored-by: Eric Traut <etraut@openai.com>
2026-03-25 12:50:42 -06:00
evawong-oai
6566ab7e02 Clarify codex_home base for MDM path resolution (#15707)
## Summary

Add the follow-up code comment Michael asked for at the MDM
`managed_config_from_mdm`, as a follow-up to
https://github.com/openai/codex/pull/15351.

## Validation

1. `cargo fmt --all --check`
2. `cargo test -p codex-core managed_preferences_expand_home_directory_in_workspace_write_roots -- --nocapture`
3. `cargo test -p codex-core write_value_succeeds_when_managed_preferences_expand_home_directory_paths -- --nocapture`
4. `./tools/argument-comment-lint/run-prebuilt-linter.sh -p codex-core`
2026-03-25 18:40:43 +00:00
Ahmed Ibrahim
d273efc0f3 Extract codex-analytics crate (#15748)
## Summary
- move the analytics events client into codex-analytics
- update codex-core and app-server callsites to use the new crate

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 11:08:05 -07:00
Ahmed Ibrahim
2bb1027e37 Extract codex-plugin crate (#15747)
## Summary
- extract plugin identifiers and load-outcome types into codex-plugin
- update codex-core to consume the new plugin crate

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 11:07:31 -07:00
Ahmed Ibrahim
ad74543a6f Extract codex-utils-plugins crate (#15746)
## Summary
- extract shared plugin path and manifest helpers into
codex-utils-plugins
- update codex-core to consume the utility crate

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 11:05:35 -07:00
Jeremy Rose
6b10e186c4 Add non-interactive resume filter option (#15339)
## Summary
- add `codex resume --include-non-interactive` to include
non-interactive sessions in the picker and `--last`
- keep current-provider and cwd filtering behavior unchanged
- replace the picker API boolean with a `SessionSourceFilter` enum to
avoid a boolean trap
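
To illustrate the boolean-trap point: a call site that passes `true` says nothing about what is being enabled, whereas an enum variant names the choice. Only the enum name `SessionSourceFilter` comes from the summary; the variant names, the `SessionKind` enum, and both methods are hypothetical.

```rust
/// Which sessions the picker should list (variant names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionSourceFilter {
    InteractiveOnly,
    IncludeNonInteractive,
}

/// Hypothetical, simplified classification of a stored session.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionKind {
    Interactive,
    NonInteractive,
}

impl SessionSourceFilter {
    /// Map the CLI flag onto the enum once, at the boundary.
    pub fn from_flag(include_non_interactive: bool) -> Self {
        if include_non_interactive {
            SessionSourceFilter::IncludeNonInteractive
        } else {
            SessionSourceFilter::InteractiveOnly
        }
    }

    /// Decide whether a session of the given kind passes the filter.
    pub fn allows(self, kind: SessionKind) -> bool {
        match (self, kind) {
            (_, SessionKind::Interactive) => true,
            (SessionSourceFilter::IncludeNonInteractive, SessionKind::NonInteractive) => true,
            (SessionSourceFilter::InteractiveOnly, SessionKind::NonInteractive) => false,
        }
    }
}
```

An exhaustive `match` also means that adding a third filter mode later produces compile errors at every place that needs to handle it.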

## Tests
- `cargo test -p codex-cli`
- `cargo test -p codex-tui`
- `just fmt`
- `just fix -p codex-cli`
- `just fix -p codex-tui`
2026-03-25 11:05:07 -07:00
Ahmed Ibrahim
fba3c79885 Extract codex-instructions crate (#15744)
## Summary
- extract instruction fragment and user-instruction types into
codex-instructions
- update codex-core to consume the new crate

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 10:43:49 -07:00
jif-oai
303d0190c5 feat: add multi-thread log query (#15776)
Required for multi-agent v2
2026-03-25 16:30:04 +00:00
jif-oai
14c35a16a8 chore: remove read_file handler (#15773)
Co-authored-by: Codex <noreply@openai.com>
2026-03-25 16:27:32 +00:00
Felipe Coury
c6ffe9abab fix(tui): avoid duplicate live reasoning summaries (#15758)
## TL;DR

Fix duplicated reasoning summaries in `tui_app_server`.

<img width="1716" height="912" alt="image"
src="https://github.com/user-attachments/assets/6362f25a-ab1c-4a01-bf10-b5616c9428c2"
/>

During live turns, reasoning text is already rendered incrementally from
`ReasoningSummaryTextDelta`. When the same reasoning item later arrives
via `ItemCompleted`, we should only finalize the reasoning block, not
render the same summary again.

## What changed

- only replay rendered reasoning summaries from completed
`ThreadItem::Reasoning` items
- kept live completed reasoning items as finalize-only
- added a regression test covering the live streaming + completion path

## Why

Without this, the first reasoning summary often appears twice in the
transcript when `model_reasoning_summary = "detailed"` and
`features.tui_app_server = true`.
2026-03-25 10:14:39 -06:00
jif-oai
f190a95a4f feat: rendering library v1 (#15778)
The goal will be to replace askama
2026-03-25 16:07:04 +00:00
pakrym-oai
504aeb0e09 Use AbsolutePathBuf for cwd state (#15710)
Migrate `cwd` and related session/config state to `AbsolutePathBuf` so
downstream consumers consistently see absolute working directories.

Add test-only `.abs()` helpers for `Path`, `PathBuf`, and `TempDir`, and
update branch-local tests to use them instead of
`AbsolutePathBuf::try_from(...)`.

For the remaining TUI/app-server snapshot coverage that renders absolute
cwd values, keep the snapshots unchanged and skip the Windows-only cases
where the platform-specific absolute path layout differs.
2026-03-25 16:02:22 +00:00
jif-oai
178c3b15b4 chore: remove grep_files handler (#15775)
Co-authored-by: Codex <noreply@openai.com>
2026-03-25 16:01:45 +00:00
Fouad Matin
32c4993c8a fix(core): default approval behavior for mcp missing annotations (#15519)
- Changed `requires_mcp_tool_approval` to apply MCP spec defaults when
annotations are missing.
- Unannotated tools now default to:
  - `readOnlyHint = false`
  - `destructiveHint = true`
  - `openWorldHint = true`
- This means unannotated MCP tools now go through approval/ARC
monitoring instead of silently bypassing it.
- Explicitly read-only tools still skip approval unless they are also
explicitly marked destructive.

**Previous behavior**
The check failed open when annotations were missing, which was unsafe
for custom MCP tools that omitted them.
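
The defaulting described above can be sketched as follows. This is an
illustration under assumptions, not the real
`requires_mcp_tool_approval`: the `ToolAnnotations` struct, the function
name, and the exact way the hints are combined are hypothetical.

```rust
/// Hypothetical stand-in for MCP tool annotations; each hint may be absent.
#[derive(Debug, Clone, Copy, Default)]
struct ToolAnnotations {
    read_only_hint: Option<bool>,
    destructive_hint: Option<bool>,
    open_world_hint: Option<bool>,
}

/// Sketch: decide whether a tool call should go through approval.
fn requires_approval_sketch(annotations: Option<ToolAnnotations>) -> bool {
    // Missing annotations behave like a struct with every hint unset.
    let a = annotations.unwrap_or_default();

    // An explicitly destructive tool needs approval even if read-only.
    if a.destructive_hint == Some(true) {
        return true;
    }
    // Explicitly read-only tools skip approval (default: not read-only).
    if a.read_only_hint.unwrap_or(false) {
        return false;
    }
    // Fail closed: unset hints default to destructive and open-world.
    // Assumption: either hint being true is enough to require approval.
    a.destructive_hint.unwrap_or(true) || a.open_world_hint.unwrap_or(true)
}
```

Under these assumptions an unannotated tool (`None`) requires approval,
while a tool with only `read_only_hint = Some(true)` does not.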

---------

Co-authored-by: colby-oai <228809017+colby-oai@users.noreply.github.com>
2026-03-25 07:55:41 -07:00
jif-oai
047ea642d2 chore: tty metric (#15766) 2026-03-25 13:34:43 +00:00
xl-openai
f5dccab5cf Update plugin creator skill. (#15734)
Add support for home-local plugins and fix the policy.
2026-03-25 01:55:10 -07:00
Matthew Zeng
e590fad50b [plugins] Add a flag for tool search. (#15722)
- [x] Add a flag for tool search.
2026-03-25 07:00:25 +00:00
Eric Traut
c0ffd000dd Fix stale turn steering fallback in tui_app_server (#15714)
This PR adds code to recover from a narrow app-server timing race where
a follow-up can be sent after the previous turn has already ended but
before the TUI has observed that completion.

Instead of surfacing `turn/steer failed: no active turn to steer`, the
client now treats that as a stale active-turn cache and falls back to
starting a fresh turn, matching the intended submit behavior more
closely. This is similar to the strategy employed by other app server
clients (notably, the IDE extension and desktop app).

This race exists because the current app-server API makes the client
choose between two separate RPCs, `turn/steer` and `turn/start`, based
on its local view of whether a turn is still active. That view is
replicated from asynchronous notifications, so it can be stale for a
brief window. The server may already have ended the turn while the
client still believes it is in progress. Since the choice is made
client-side rather than atomically on the server, `tui_app_server` can
occasionally send `turn/steer` for a turn that no longer exists.
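
A minimal sketch of the fallback, assuming a hypothetical `FakeServer`
in place of the app-server RPCs. The type names, error variants, and
`submit_follow_up` are illustrative and not the actual client code:

```rust
/// Hypothetical error shape for a steer attempt.
#[derive(Debug, PartialEq)]
enum SteerError {
    NoActiveTurn,
    Other(String),
}

/// Which path the submission ended up taking.
#[derive(Debug, PartialEq)]
enum Submitted {
    Steered,
    StartedFresh,
}

/// Hypothetical server: the authoritative view of whether a turn is active.
struct FakeServer {
    turn_active: bool,
}

impl FakeServer {
    fn steer(&mut self, _input: &str) -> Result<(), SteerError> {
        if self.turn_active {
            Ok(())
        } else {
            Err(SteerError::NoActiveTurn)
        }
    }

    fn start(&mut self, _input: &str) -> Result<(), String> {
        self.turn_active = true;
        Ok(())
    }
}

/// Try to steer when the cached view says a turn is active; if the server
/// reports no active turn, drop the stale cache and start a fresh turn.
fn submit_follow_up(
    server: &mut FakeServer,
    cached_turn_active: &mut bool,
    input: &str,
) -> Result<Submitted, String> {
    if *cached_turn_active {
        match server.steer(input) {
            Ok(()) => return Ok(Submitted::Steered),
            Err(SteerError::NoActiveTurn) => *cached_turn_active = false,
            Err(SteerError::Other(msg)) => return Err(msg),
        }
    }
    server.start(input)?;
    *cached_turn_active = true;
    Ok(Submitted::StartedFresh)
}
```

The race window is modeled by a cache that says `true` while the server
says `false`. Only the specific "no active turn" error triggers the
fallback, and other failures still surface.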
2026-03-25 00:28:07 -06:00
viyatb-oai
95ba762620 fix: support split carveouts in windows restricted-token sandbox (#14172)
## Summary
- keep legacy Windows restricted-token sandboxing as the supported
baseline
- support the split-policy subset that restricted-token can enforce
directly today
- support full-disk read, the same writable root set as legacy
`WorkspaceWrite`, and extra read-only carveouts under those writable
roots via additional deny-write ACLs
- continue to fail closed for unsupported split-only shapes, including
explicit unreadable (`none`) carveouts, reopened writable descendants
under read-only carveouts, and writable root sets that do not match the
legacy workspace roots

## Example
Given a filesystem policy like:

```toml
":root" = "read"
":cwd" = "write"
"./docs" = "read"
```

the restricted-token backend can keep the workspace writable while
denying writes under `docs` by layering an extra deny-write carveout on
top of the legacy workspace-write roots.

A policy like:

```toml
"/workspace" = "write"
"/workspace/docs" = "read"
"/workspace/docs/tmp" = "write"
```

still fails closed, because the unelevated backend cannot reopen the
nested writable descendant safely.
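
The accept/reject rules above can be sketched as a small planner. This
is a simplified illustration, not the sandbox implementation:
`Access`, `plan_deny_write_carveouts`, and the use of already-resolved
paths instead of `:root`/`:cwd` tokens are all assumptions.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical access levels for a filesystem policy entry.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    Read,
    Write,
    None,
}

/// Sketch: return the extra deny-write carveouts, or fail closed.
fn plan_deny_write_carveouts(
    entries: &[(&str, Access)],
    legacy_roots: &[&str],
) -> Result<Vec<PathBuf>, String> {
    // Explicit unreadable carveouts are not supported.
    if let Some((path, _)) = entries.iter().find(|(_, a)| *a == Access::None) {
        return Err(format!("unreadable carveout not supported: {path}"));
    }

    let mut writes: Vec<&Path> = entries
        .iter()
        .filter(|(_, a)| *a == Access::Write)
        .map(|(p, _)| Path::new(*p))
        .collect();

    // A read-only carveout is a read entry strictly under a writable root.
    let carveouts: Vec<PathBuf> = entries
        .iter()
        .filter(|(_, a)| *a == Access::Read)
        .map(|(p, _)| Path::new(*p))
        .filter(|r| writes.iter().any(|w| r != w && r.starts_with(w)))
        .map(Path::to_path_buf)
        .collect();

    // A writable entry under a carveout would reopen it: fail closed.
    if let Some(w) = writes
        .iter()
        .find(|w| carveouts.iter().any(|c| w.starts_with(c)))
    {
        return Err(format!("reopened writable descendant: {}", w.display()));
    }

    // The writable root set must match the legacy workspace roots.
    writes.sort();
    let mut legacy: Vec<&Path> = legacy_roots.iter().map(|p| Path::new(*p)).collect();
    legacy.sort();
    if writes != legacy {
        return Err("writable roots do not match legacy workspace roots".to_string());
    }

    Ok(carveouts)
}
```

With the working directory resolved to `/workspace`, the first example
policy yields a single deny-write carveout for `/workspace/docs`, and
the second is rejected because of `/workspace/docs/tmp`.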

## Stack
-> fix: support split carveouts in windows restricted-token sandbox
#14172
fix: support split carveouts in windows elevated sandbox #14568
2026-03-24 22:54:18 -07:00
Matthew Zeng
8c62829a2b [plugins] Flip on additional flags. (#15719)
- [x] Flip on additional flags.
2026-03-24 21:52:11 -07:00
Matthew Zeng
0bff38c54a [plugins] Flip the flags. (#15713)
- [x] Flip the `plugins` and `apps` flags.
2026-03-25 03:31:21 +00:00
canvrno-oai
2250508c2e TUI plugin menu cleanup - hide app ID (#15708)
- Hide App ID from plugin details page.
2026-03-24 20:03:10 -07:00
Matthew Zeng
0b08d89304 [app-server] Add a method to override feature flags. (#15601)
- [x] Add a method to override feature flags globally and not just
thread level.
2026-03-25 02:27:00 +00:00