5873 Commits

Author SHA1 Message Date
jif-oai
28742866c7 Add agents.interrupt_message for interruption markers (#19351)
## Why

Agent interruptions currently always persist a model-visible
interrupted-turn marker before emitting `TurnAborted`. That marker is
useful by default because it gives the next model turn context about a
deliberately interrupted task, but some deployments need to suppress
that history injection entirely while still keeping the client-visible
interruption event.

## What changed

- Add `[agents] interrupt_message = false` to disable the model-visible
interrupted-turn marker.
- Resolve the setting into `Config::agent_interrupt_message_enabled`,
defaulting to `true` so existing behavior is unchanged.
- Apply the setting to both live interrupted turns and interrupted fork
snapshots.
- Keep emitting `TurnAborted` even when the history marker is disabled.
- Regenerate `core/config.schema.json` for the new
`agents.interrupt_message` field.

## Testing

- `cargo test -p codex-core load_config_resolves_agent_interrupt_message
-- --nocapture`
- `cargo test -p codex-core
disabled_interrupted_fork_snapshot_appends_only_interrupt_event --
--nocapture`
- `cargo test -p codex-core
multi_agent_v2_interrupted_marker_uses_developer_input_message --
--nocapture`
- `cargo test -p codex-core
multi_agent_v2_followup_task_can_disable_interrupted_marker --
--nocapture`
- `cargo test -p codex-core
multi_agent_v2_followup_task_interrupts_busy_child_without_losing_message
-- --nocapture`
- `cargo check -p codex-core`
2026-04-24 16:02:45 +02:00
jif-oai
deb4509302 feat: surface multi-agent thread limit in spawn description (#19360)
## Summary
- Thread `agent_max_threads` into `ToolsConfig` and
`SpawnAgentToolOptions`.
- Render the configured `max_concurrent_threads_per_session` value in
the MultiAgentV2 `spawn_agent` description.
- Cover the description text in `codex-tools` unit tests and
`codex-core` tool spec tests.

## Validation
- `just fmt`
- `cargo test -p codex-tools`
- `cargo test -p codex-core spawn_agent_description`
- `git diff --check`

## Notes
- `cargo test -p codex-core` was also attempted, but unrelated
environment-sensitive tests failed with the active local environment.
Examples: approvals reviewer defaults observed `AutoReview` instead of
`User`, request-permissions event tests did not emit events, and
proxy-env tests saw `http://127.0.0.1:50604` from the active proxy
environment.

Co-authored-by: Codex <noreply@openai.com>
2026-04-24 15:13:54 +02:00
jif-oai
9eadff9713 chore: alias max_concurrent_threads_per_session (#19354) 2026-04-24 14:33:03 +02:00
jif-oai
120aa07d81 Make MultiAgentV2 interruption markers assistant-authored (#19124)
## Why

`MultiAgentV2` follow-up messages are delivered to agents as
assistant-authored `InterAgentCommunication` envelopes. When
`followup_task` used `interrupt: true`, the interrupted-turn guidance
was still persisted as a contextual user message, so model-visible
history made a system-generated interruption boundary look
user-authored.

This keeps interruption guidance consistent with the rest of the v2
inter-agent message stream while preserving the legacy marker shape for
non-v2 sessions.

## What changed

- Make `interrupted_turn_history_marker` feature-aware.
- Record the interrupted-turn marker as an assistant `OutputText`
message when `Feature::MultiAgentV2` is enabled.
- Keep the existing user contextual fragment for non-v2 sessions.
- Apply the same feature-aware marker to interrupted fork snapshots.
- Add coverage for the live `followup_task` interrupt path and the
helper-level v2 marker shape.

## Testing

- `cargo test -p codex-core
multi_agent_v2_followup_task_interrupts_busy_child_without_losing_message
-- --nocapture`
- `cargo test -p codex-core
multi_agent_v2_interrupted_marker_uses_assistant_output_message --
--nocapture`
- `cargo test -p codex-core interrupted_fork_snapshot -- --nocapture`
2026-04-24 13:39:26 +02:00
jif-oai
21463a5074 fix alpha build (#19350) 2026-04-24 13:36:05 +02:00
sayan-oai
c10f95ddac Update models.json and related fixtures (#19323)
Supersedes #18735.

The scheduled rust-release-prepare workflow force-pushed
`bot/update-models-json` back to the generated models.json-only diff,
which dropped the test and snapshot updates needed for CI.

This PR keeps the latest generated `models.json` from #18735 and adds
the corresponding fixture updates:
- preserve model availability NUX in the app-server model cache fixture
- update core/TUI expectations for the new `gpt-5.4` `xhigh` default
reasoning
- refresh affected TUI chatwidget snapshots for the `gpt-5.5`
default/model copy changes

Validation run locally while preparing the fix:
- `just fmt`
- `cargo test -p codex-app-server model_list`
- `cargo test -p codex-core includes_no_effort_in_request`
- `cargo test -p codex-core
includes_default_reasoning_effort_in_request_when_defined_by_model_info`
- `cargo test -p codex-tui --lib chatwidget::tests`
- `cargo insta pending-snapshots`

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
2026-04-24 11:14:13 +02:00
Eric Traut
ddfa691752 Surface reasoning tokens in exec JSON usage (#19308)
## Summary

Fixes #19022.

`codex exec --json` currently emits `turn.completed.usage` with input,
cached input, and output token counts, but drops the reasoning-token
split that Codex already receives through thread token usage updates.
Programmatic consumers that rely on the JSON stream, especially
ephemeral runs that do not write rollout files, need this field to
accurately display reasoning-model usage.

This PR adds `reasoning_output_tokens` to the public exec JSON `Usage`
payload and maps it from the existing `ThreadTokenUsageUpdated` total
token usage data.

## Verification

- Added coverage to
`event_processor_with_json_output::token_usage_update_is_emitted_on_turn_completion`
so `turn.completed.usage.reasoning_output_tokens` is asserted.
- Updated SDK expectations for `run()` and `runStreamed()` so TypeScript
consumers see the new usage field.
- Ran `cargo test -p codex-exec`.
- Ran `pnpm --filter ./sdk/typescript run build`.
- Ran `pnpm --filter ./sdk/typescript run lint`.
- Ran `pnpm --filter ./sdk/typescript exec jest --runInBand
--testTimeout=30000`.
2026-04-24 01:54:11 -07:00
Eric Traut
6f87eb0479 Hide unsupported MCP bearer_token from config schema (#19294)
## Summary

Fixes #19275.

Codex runtime rejects inline MCP `bearer_token` config entries and asks
users to configure `bearer_token_env_var` instead, but the generated
config schema still advertised `mcp_servers.<name>.bearer_token` as a
supported field. That made editor/schema validation disagree with
runtime validation.

This keeps `bearer_token` in `RawMcpServerConfig` so Codex can continue
producing the targeted runtime error for recent or existing configs, but
skips the field during schemars generation. The checked-in
`core/config.schema.json` fixture now exposes `bearer_token_env_var`
without exposing unsupported inline `bearer_token`.

## Verification

- Added `config_schema_hides_unsupported_inline_mcp_bearer_token` to
assert the generated schema hides `bearer_token` while preserving
`bearer_token_env_var`.
- Ran `cargo test -p codex-config`.
- Ran `cargo test -p codex-core config_schema`.
2026-04-24 00:17:43 -07:00
sayan-oai
e083b6c757 chore: apply truncation policy to unified_exec (#19247)
we were not respecting turn's `truncation_policy` to clamp output tokens
for `unified_exec` and `write_stdin`.

this meant truncation was only being applied by `ContextManager` before
the output was stored in-memory (so it _was_ being truncated from
model-visible context), but the full output was persisted to rollout on
disk.

now we respect that `truncation_policy` and `ContextManager`-level
truncation remains a backup.

### Tests
added tests, tested locally.
2026-04-24 00:17:39 -07:00
Eric Traut
ac8c9fc49c Reject unsupported js_repl image MIME types (#19292)
## Summary

`codex.emitImage` accepted arbitrary image MIME types for byte payloads
and data URLs. That allowed a value like `image/rgba` to be wrapped as
an `input_image`, even though it is not a supported encoded image
format, so the invalid image could reach the model-input path and
trigger output sanitization.

This results in a panic in debug builds because the output sanitization
is meant as a final safety net, not a primary means of rejecting invalid
image types. I've hit this case multiple times when executing certain
long-running tasks.

This PR rejects unsupported image MIME types before they are emitted
from `js_repl`.

## Changes

- Validate `codex.emitImage({ bytes, mimeType })` in the JS kernel so
only encoded PNG, JPEG, WebP, or GIF payloads are accepted.
- Apply the same MIME allowlist to direct image data URLs, including the
Rust host-side validation path.
- Clarify the JS REPL instructions so agents know byte payloads must
already be encoded as PNG/JPEG/WebP/GIF.
2026-04-24 00:14:51 -07:00
Michael Bolin
b68366718b ci: reuse Bazel CI startup for target-discovery queries (#19232)
## Why

A rerun of the Windows Bazel clippy job after
[#19161](https://github.com/openai/codex/pull/19161) had exactly the
cache behavior we wanted in BuildBuddy: zero action-cache misses. Even
so, the GitHub job still took a little over five minutes.

The problem was that the job was paying for two separate Bazel startup
paths:

1. a `bazel query` to discover extra lint targets
2. the real `bazel build --config=clippy ...` invocation

On Windows, that query was bypassing the CI Bazel wrapper, so it did not
reuse the same `--output_user_root`, CI config, or remote-cache setup as
the real build. In practice that meant the rerun could still cold-start
a separate Bazel server before the actual clippy build even began.

## What

- add `.github/scripts/run-bazel-query-ci.sh` to run CI-side Bazel
queries with the same startup and cache-related flags as the main Bazel
command
- switch `scripts/list-bazel-clippy-targets.sh` to use that helper for
manual `rust_test` target discovery
- switch `tools/argument-comment-lint/list-bazel-targets.sh` to use the
same helper
- simplify `.github/scripts/run-argument-comment-lint-bazel.sh` so its
Windows-only query path also goes through the shared helper

This keeps the target-discovery queries aligned with the later
build/test invocation instead of treating them as a separate cold Bazel
session.

## Verification

- `bash -n .github/scripts/run-bazel-query-ci.sh`
- `bash -n scripts/list-bazel-clippy-targets.sh`
- `bash -n tools/argument-comment-lint/list-bazel-targets.sh`
- `bash -n .github/scripts/run-argument-comment-lint-bazel.sh`
- mocked a Windows invocation of `run-bazel-query-ci.sh` and verified it
forwards `--output_user_root`, `--config=ci-windows`, the BuildBuddy
auth header, and the repository cache flags

## Docs

No documentation updates are needed.
2026-04-23 23:26:17 -07:00
Eric Traut
d87d918716 Resolve relative agent role config paths from layers (#19261)
Fixes #19257.

## Summary

Agent roles declared in config layers can set `config_file` to a
relative path, but deserializing the layer-local `[agents.*]` table
happened without an `AbsolutePathBuf` base path. That caused configs
like `config_file = "agents/my-role.toml"` to fail with `AbsolutePathBuf
deserialized without a base path`.

This updates agent role layer loading to deserialize `[agents.*]` while
the layer config folder is active as the path base, matching the
behavior documented for `AgentRoleToml.config_file`. It also adds
coverage for a user config layer with a relative agent role
`config_file`.
2026-04-23 23:23:11 -07:00
Michael Bolin
4816b89204 permissions: make profiles represent enforcement (#19231)
## Why

`PermissionProfile` is becoming the canonical permissions abstraction,
but the old shape only carried optional filesystem and network fields.
It could describe allowed access, but not who is responsible for
enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy
when profiles were exported, cached, or round-tripped through app-server
APIs.

The important model change is that active permissions are now a disjoint
union over the enforcement mode. Conceptually:

```rust
pub enum PermissionProfile {
    Managed {
        file_system: FileSystemSandboxPolicy,
        network: NetworkSandboxPolicy,
    },
    Disabled,
    External {
        network: NetworkSandboxPolicy,
    },
}
```

This distinction matters because `Disabled` means Codex should apply no
outer sandbox at all, while `External` means filesystem isolation is
owned by an outside caller. Those are not equivalent to a broad managed
sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an
inner sandbox may require the outer Codex layer to use no sandbox rather
than a permissive one.

## How Existing Modeling Maps

Legacy `SandboxPolicy` remains a boundary projection, but it now maps
into the higher-fidelity profile model:

- `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed`
with restricted filesystem entries plus the corresponding network
policy.
- `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving
the “no outer sandbox” intent instead of treating it as a lax managed
sandbox.
- `ExternalSandbox { network_access }` maps to
`PermissionProfile::External { network }`, preserving external
filesystem enforcement while still carrying the active network policy.
- Split runtime policies that legacy `SandboxPolicy` cannot faithfully
express, such as managed unrestricted filesystem plus restricted
network, stay `Managed` instead of being collapsed into
`ExternalSandbox`.
- Per-command/session/turn grants remain partial overlays via
`AdditionalPermissionProfile`; full `PermissionProfile` is reserved for
complete active runtime permissions.

## What Changed

- Change active `PermissionProfile` into a tagged union: `managed`,
`disabled`, and `external`.
- Keep partial permission grants separate with
`AdditionalPermissionProfile` for command/session/turn overlays.
- Represent managed filesystem permissions as either `restricted`
entries or `unrestricted`; `glob_scan_max_depth` is non-zero when
present.
- Preserve old rollout compatibility by accepting the pre-tagged `{
network, file_system }` profile shape during deserialization.
- Preserve fidelity for important edge cases: `DangerFullAccess`
round-trips as `disabled`, `ExternalSandbox` round-trips as `external`,
and managed unrestricted filesystem + restricted network stays managed
instead of being mistaken for external enforcement.
- Preserve configured deny-read entries and bounded glob scan depth when
full profiles are projected back into runtime policies, including
unrestricted replacements that now become `:root = write` plus deny
entries.
- Regenerate the experimental app-server v2 JSON/TypeScript schema and
update the `command/exec` README example for the tagged
`permissionProfile` shape.

## Compatibility

Legacy `SandboxPolicy` remains available at config/API boundaries as the
compatibility projection. Existing rollout lines with the old
`PermissionProfile` shape continue to load. The app-server
`permissionProfile` field is experimental, so its v2 wire shape is
intentionally updated to match the higher-fidelity model.

## Verification

- `just write-app-server-schema`
- `cargo check --tests`
- `cargo test -p codex-protocol permission_profile`
- `cargo test -p codex-protocol
preserving_deny_entries_keeps_unrestricted_policy_enforceable`
- `cargo test -p codex-app-server-protocol
permission_profile_file_system_permissions`
- `cargo test -p codex-app-server-protocol serialize_client_response`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox`
- `just fix`
- `just fix -p codex-protocol`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
2026-04-23 23:02:18 -07:00
xli-oai
33cc135cc3 [codex] Support remote plugin install writes (#18917)
## Summary
- Add a remote plugin install write call that POSTs the selected remote
plugin to the ChatGPT cloud plugin API.
- Align remote install with the latest remote read contract:
`pluginName` carries the backend remote plugin id directly, for example
`plugins~Plugin_linear`, and install no longer synthesizes
`<name>@<marketplace>` ids.
- Validate remote install ids with the same character rules as remote
read, return the same install response shape as local installs, and
include mocked app-server coverage for the write path.

## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all plugin_install`
- `cargo test -p codex-core-plugins`
- `just fix -p codex-app-server`
- `just fix -p codex-core-plugins`
2026-04-23 22:10:15 -07:00
Ruslan Nigmatullin
19badb0be2 app-server: persist device key bindings in sqlite (#19206)
## Why

Device-key providers should only own platform key material. The
account/client binding used to authorize a signing payload is app-server
state, and keeping that state in provider-specific metadata makes the
same check harder to audit and harder to share across platform
implementations.

Persisting the binding in the shared state database gives the device-key
crate a platform-neutral source of truth before it asks a provider to
sign. It also lets app-server move potentially blocking key operations
off the main message processor path, which matters once providers may
wait for OS authentication prompts.

## What changed

- Add a `device_key_bindings` state migration plus `StateRuntime`
helpers keyed by `key_id`.
- Add an async `DeviceKeyBindingStore` abstraction to `codex-device-key`
and use it from `DeviceKeyStore::create` and `DeviceKeyStore::sign`.
- Keep provider calls behind async store methods and run the synchronous
provider work through `spawn_blocking`.
- Wire app-server device-key RPC handling to the SQLite-backed binding
store and spawn response/error delivery tasks for device-key requests.
- Run the turn-start tracing test on the existing larger current-thread
test harness after the larger async surface made the default test stack
too small locally.

## Validation

- `cargo test -p codex-device-key`
- `cargo test -p codex-state device_key`
- `cargo test -p codex-state`
- `cargo test -p codex-app-server device_key`
- `cargo test -p codex-app-server
message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spans`
- `cargo test -p codex-app-server`
- `just fix -p codex-device-key`
- `just fix -p codex-state`
- `just fix -p codex-app-server`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `git diff --check`
2026-04-23 21:55:56 -07:00
Celia Chen
e8d8080818 feat: let model providers own model discovery (#18950)
## Why

`codex-models-manager` had grown to own provider-specific concerns:
constructing OpenAI-compatible `/models` requests, resolving provider
auth, emitting request telemetry, and deciding how provider catalogs
should be sourced. That made the manager harder to reuse for providers
whose model catalog is not fetched from the OpenAI `/models` endpoint,
such as Amazon Bedrock.

This change moves provider-specific model discovery behind
provider-owned implementations, so the models manager can focus on
refresh policy, cache behavior, picker ordering, and model metadata
merging.

## What Changed

- Introduced a `ModelsManager` trait with separate `OpenAiModelsManager`
and `StaticModelsManager` implementations.
- Added `ModelsEndpointClient` so OpenAI-compatible HTTP fetching lives
outside `codex-models-manager`.
- Moved `/models` request construction, provider auth resolution,
timeout handling, and request telemetry into `codex-model-provider` via
`OpenAiModelsEndpoint`.
- Added provider-owned `models_manager(...)` construction so configured
OpenAI-compatible providers use `OpenAiModelsManager`, while
static/catalog-backed providers can return `StaticModelsManager`.
- Added an Amazon Bedrock static model catalog for the GPT OSS Bedrock
model IDs.
- Updated core/session/thread manager code and tests to depend on
`Arc<dyn ModelsManager>`.
- Moved offline model test helpers into
`codex_models_manager::test_support`.
## Metadata References

The Bedrock catalog metadata is based on the official Amazon Bedrock
OpenAI model documentation:

- [Amazon Bedrock OpenAI
models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-openai.html)
lists the Bedrock model IDs, text input/output modalities, and `128,000`
token context window for `gpt-oss-20b` and `gpt-oss-120b`.
- [Amazon Bedrock `gpt-oss-120b` model
card](https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-openai-gpt-oss-120b.html)
lists the `bedrock-runtime` model ID `openai.gpt-oss-120b-1:0`, the
`bedrock-mantle` model ID `openai.gpt-oss-120b`, text-only modalities,
and `128K` context window.
- [OpenAI `gpt-oss-120b` model
docs](https://developers.openai.com/api/docs/models/gpt-oss-120b)
document configurable reasoning effort with `low`, `medium`, and `high`,
plus text input/output modality.

The display names, default reasoning effort, and priority ordering are
Codex-local catalog choices.

## Test Plan
- Manually verified app-server model listing with an AWS profile:

```shell
CODEX_HOME="$(mktemp -d)" cargo run -p codex-app-server-test-client -- \
  --codex-bin ./target/debug/codex \
  -c 'model_provider="amazon-bedrock"' \
  -c 'model_providers.amazon-bedrock.aws.profile="codex-bedrock"' \
  -c 'model_providers.amazon-bedrock.aws.region="us-west-2"' \
  model-list
```

The response returned the Bedrock catalog with `openai.gpt-oss-120b-1:0`
as the default model and `openai.gpt-oss-20b-1:0` as the second listed
model, both text-only and supporting low/medium/high reasoning effort.
2026-04-24 04:28:25 +00:00
xl-openai
53be451673 feat: Use short SHA versions for curated plugin cache entries (#19095)
Curated plugin cache entries now use an 8-character SHA prefix, instead
of the full SHA, as the cache folder version number.
2026-04-23 21:15:03 -07:00
cassirer-openai
a9c111da54 [rollout_trace] Trace sessions and multi-agent edges (#18879)
## Summary

Adds the remaining session and multi-agent edge wiring needed to
reconstruct rollout relationships across spawned agents, resumed
sessions, and parent/child message delivery.

## Stack

This is PR 4/5 in the rollout trace stack.

- [#18876](https://github.com/openai/codex/pull/18876): Add rollout
trace crate
- [#18877](https://github.com/openai/codex/pull/18877): Record core
session rollout traces
- [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
code-mode boundaries
- [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
and multi-agent edges
- [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
reduction command

## Review Notes

This is the stack layer that makes traces useful for multi-threaded
agent workflows. The main invariant is that reconstructed relationships
should come from durable rollout data rather than transient in-memory
manager state wherever possible.

The PR is intentionally small relative to the preceding layers: it uses
the recorder and reducer contracts already established by the stack and
only adds the session/agent relationship events needed by later debug
reduction.
2026-04-24 02:29:45 +00:00
starr-openai
49fb25997f Add sticky environment API and thread state (#18897)
## Summary
- add sticky environment selections to app-server v2 thread/start and
turn/start request flow
- carry thread-level selections through core session/thread state
- add app-server coverage for sticky selections and turn overrides

## Stack
1. This PR: API and thread persistence
2. #18898: config.toml named environment loading
3. #18899: downstream tool/runtime consumers

## Validation
- Not run locally; split only.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 18:57:13 -07:00
cassirer-openai
e3c8720a99 [rollout_trace] Add debug trace reduction command (#18880)
## Summary

Adds the debug CLI entry point for reducing recorded rollout traces.
This gives developers a direct way to inspect whether the emitted trace
stream reduces into the expected conversation/runtime model.

## Stack

This is PR 5/5 in the rollout trace stack.

- [#18876](https://github.com/openai/codex/pull/18876): Add rollout
trace crate
- [#18877](https://github.com/openai/codex/pull/18877): Record core
session rollout traces
- [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
code-mode boundaries
- [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
and multi-agent edges
- [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
reduction command

## Review Notes

This PR is intentionally last: it depends on the trace crate, core
recorder, runtime/tool events, and session/agent edge data all existing.
The command should remain a debug/developer tool and avoid adding new
runtime behavior.

The useful review question is whether the CLI exposes the reducer in the
smallest practical way for local inspection without turning the debug
command into a supported user-facing workflow.
2026-04-24 01:56:48 +00:00
Celia Chen
432771c5fd feat: expose AWS account state from account/read (#19048)
## Why

AWS/Bedrock mode currently reports `account: null` with
`requiresOpenaiAuth: false` from `account/read`. That suppresses the
OpenAI-auth requirement, but it does not let app clients distinguish AWS
auth from any other non-OpenAI custom provider. For the prototype AWS
provider UX, clients need a simple provider-derived signal so they can
suppress ChatGPT/API-key login and token-refresh paths without
hardcoding Bedrock checks.

## What changed

- Adds an `aws` variant to the v2 `Account` protocol union.
- Adds `ProviderAccountKind` to `codex-model-provider` so the runtime
provider owns the app-visible account classification.
- Makes Amazon Bedrock return `ProviderAccountKind::Aws` from the
model-provider layer.
- Updates app-server `account/read` to map `ProviderAccountKind` to the
existing `GetAccountResponse` wire shape.
- Preserves the existing `account: null, requiresOpenaiAuth: false`
behavior for other non-OpenAI providers.
- Regenerates the app-server protocol schema fixtures.
- Adds coverage for provider account classification and for the Amazon
Bedrock `account/read` response.

## Testing

- `cargo test -p codex-model-provider`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server get_account_with_aws_provider`

## Notes

I attempted `just bazel-lock-update` and `just bazel-lock-check`, but
both are blocked in my local environment because `bazel` is not
installed.
2026-04-24 01:53:13 +00:00
Eric Traut
72f757d144 Increase app-server WebSocket outbound buffer (#19246)
Fixes #18203.

## Why

Remote TUI clients connected through `codex app-server --listen
ws://...` can receive short bursts of outbound turn and tool-output
notifications. The WebSocket transport previously used the shared
128-message channel capacity for its outbound writer queue, so a healthy
client that briefly lagged during normal output streaming could fill the
queue and be disconnected immediately.

This is a smaller mitigation than #18265: instead of adding a new
overflow/backpressure pipeline, keep the existing non-blocking router
behavior and give WebSocket clients enough bounded headroom for
realistic bursts.

## What Changed

- Added a WebSocket-only outbound writer capacity of `64 * 1024`
messages.
- Used that larger capacity only for the WebSocket data writer queue in
`codex-rs/app-server/src/transport/websocket.rs`.
- Left the shared `CHANNEL_CAPACITY` and the existing disconnect-on-full
behavior unchanged for internal/control channels and genuinely stuck
clients.

## Verification

- `cargo test -p codex-app-server
transport::tests::broadcast_does_not_block_on_slow_connection`
- Manually retried the #18203 repro prompt against the remote TUI and
confirmed it stayed connected.
2026-04-23 18:47:28 -07:00
efrazer-oai
5882f3f95e refactor: route Codex auth through AuthProvider (#18811)
## Summary

This PR moves Codex backend request authentication from direct
bearer-token handling to `AuthProvider`.

The new `codex-auth-provider` crate defines the shared request-auth
trait. `CodexAuth::provider()` returns a provider that can apply all
headers needed for the selected auth mode.

This lets ChatGPT token auth and AgentIdentity auth share the same
callsite path:
- ChatGPT token auth applies bearer auth plus account/FedRAMP headers
where needed.
- AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
where needed.

Reference old stack: https://github.com/openai/codex/pull/17387/changes

## Callsite Migration

| Area | Change |
| --- | --- |
| backend-client | accepts an `AuthProvider` instead of a raw
token/header |
| chatgpt client/connectors | applies auth through
`CodexAuth::provider()` |
| cloud tasks | keeps Codex-backend gating, applies auth through
provider |
| cloud requirements | uses Codex-backend auth checks and provider
headers |
| app-server remote control | applies provider headers for backend calls
|
| MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
from generic account getters |
| model refresh | treats AgentIdentity as Codex-backend auth |
| OpenAI file upload path | rejects non-Codex-backend auth before
applying headers |
| core client setup | keeps model-provider auth flow and allows
AgentIdentity through provider-backed OpenAI auth |

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
auth mode and startup task allocation
4. This PR: migrate Codex backend auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
2026-04-23 17:14:02 -07:00
Michael Bolin
a9f75e5cda ci: derive cache-stable Windows Bazel PATH (#19161)
## Why

The BuildBuddy runs for PR #19086 and the later `main` build had the
same source tree, but their Windows Bazel action and test cache keys did
not line up. Comparing the downloaded execution logs showed the full
GitHub-hosted Windows runner `PATH` had changed from
`apache-maven-3.9.14` to `apache-maven-3.9.15`.

This repo is not using Maven; the Maven entry was just ambient
hosted-runner state. The problem was that Windows Bazel CI was still
forwarding the whole runner `PATH` into Bazel via `--action_env=PATH`,
`--host_action_env=PATH`, and `--test_env=PATH`, which made otherwise
reusable cache entries sensitive to unrelated runner image churn.

After discussion with the Bazel and BuildBuddy folks, the better shape
for this change was to stop asking Bazel to inherit the ambient Windows
`PATH` and instead compute one explicit cache-stable `PATH` in the
Windows setup action that already prepares the CI toolchain environment.

## What

- remove Windows `PATH` passthrough from `.bazelrc`
- export `CODEX_BAZEL_WINDOWS_PATH` from
`.github/actions/setup-bazel-ci/action.yml`
- move the PATH derivation logic into
`.github/scripts/compute-bazel-windows-path.ps1` so the allow-list is
easier to review and document
- keep only the Windows tool locations these Bazel jobs actually need:
MSVC and SDK paths, Git, PowerShell, Node, DotSlash, and the standard
Windows system directories
- update `.github/scripts/run-bazel-ci.sh` to require that explicit
value and forward it to Bazel action, host action, and test environments
- log the derived `CODEX_BAZEL_WINDOWS_PATH` in the setup step to
simplify cache-key debugging

## Verification

- `bash -n .github/scripts/run-bazel-ci.sh`
- `ruby -e 'require "yaml"; YAML.load_file(ARGV[0])'
.github/actions/setup-bazel-ci/action.yml`
- PowerShell parse check for
`.github/scripts/compute-bazel-windows-path.ps1`
- simulated a representative Windows `PATH` in PowerShell; the
allow-list retained MSVC, Git, PowerShell, Node, Windows, and DotSlash
entries while dropping Maven
2026-04-23 22:28:00 +00:00
iceweasel-oai
867820ac7e do not attempt ACLs on installed codex dir (#19214)
We used to attempt a read-ACL on the same dir as `codex.exe` to grant
the sandbox user the ability to invoke `codex-command-runner.exe`. That
worked for the CLI case but it always fails for the installed desktop
app.

We have another solution already in place that copies
`codex-command-runner.exe` to `CODEX_HOME/.sandbox-bin` so we don't even
need this anymore. It causes a scary looking error in the logs that is a
non-issue and is therefore confusing
2026-04-23 22:21:48 +00:00
iceweasel-oai
2e228969be guide Windows to use -WindowStyle Hidden for Start-Process calls (#19044)
Sometimes codex runs `Start-Process` to start up a service or something
similar, which launches a user-visible powershell window that probably
doesn't get cleaned up. This instruction change encourages it to do so
using a hidden window.

This was reported in
https://openai.slack.com/archives/C09K6H5DZC4/p1776741272870519

One caveat is that this change won't do anything to cleanup these
processes, but it will stop them from polluting the user's visible
workspace

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 21:36:17 +00:00
Michael Bolin
040976b218 tests: isolate approval fixtures from host rules (#18288)
## Why

Several approval-focused tests were unintentionally sensitive to
host-level rule files. On machines with broader allowed command
prefixes, commonly allowed commands such as `/bin/date` could bypass the
approval path these tests were meant to exercise, making the fixtures
depend on the developer or CI host configuration.

## What changed

- Pins the approval matrix fixture to the explicit user reviewer so it
does not inherit a host reviewer.
- Changes OTel approval fixtures to request `/usr/bin/touch
codex-otel-approval-test`, avoiding a command that may be pre-approved
by local rules.
- Clears the config layer stack for the permissions-message assertion
that needs to compare only the permissions text under test.

## Verification

- `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core --test
all approval_matrix_covers_all_modes -- --nocapture`
- `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core --test
all permissions_messages -- --nocapture`
2026-04-23 14:12:09 -07:00
Abhinav
dc5cf1ff78 Mark hooks schema fixtures as generated (#19194)
## Summary
- mark generated hooks schema fixture JSON as linguist-generated
- keep the app-server protocol generated schema marking unchanged

## Validation
- `git check-attr linguist-generated --
codex-rs/hooks/schema/generated/post-tool-use.command.output.schema.json`

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 14:11:16 -07:00
Eric Traut
a50cb205b7 Stabilize plugin MCP tools test (#19191)
## Summary

The plugin MCP tool-listing test could hide MCP startup failures by
polling `ListMcpTools` until its own 30s deadline. If the plugin MCP
server startup had already failed or timed out, the session-owned MCP
manager would keep returning an empty tool list, so CI only reported
`discovered tools: []` instead of the startup state that mattered.

This makes the test synchronize on `McpStartupComplete` for the sample
plugin MCP server before asserting listed tools, and gives the
Bazel-launched test server a larger startup window.

## Notes

Confidence is about 80%. The source path strongly supports the RCA: a
failed MCP startup is represented as an empty tool list through
`ListMcpTools`, so the old polling contract could not distinguish "not
ready yet" from "startup already failed." I could not retrieve the CI
execution-log artifact to confirm the exact hidden startup error, but
the observed Ubuntu Bazel failure matches this path: repeated
`ListMcpTools` responses with no tools until the test-local timeout
fired.

I think this is the right solution because it keeps plugin behavior
unchanged and fixes only the test contract. Future startup failures
should now report the `McpStartupComplete` failure/cancellation instead
of timing out on an empty tool snapshot.

This test was introduced in https://github.com/openai/codex/pull/12864.
2026-04-23 14:08:40 -07:00
Eric Traut
3f8c06e457 Fix /review interrupt and TUI exit wedges (#18921)
Addresses #11267

## Summary
`/review` can be interrupted while it is still spawning the review
sub-agent. That spawn path lives in `codex-core` and did not observe the
task cancellation token until after `Codex::spawn` returned, so an
interrupted review could keep building a child session and leave the TUI
in a wedged state.

The TUI exit path also waited indefinitely for app-server
`thread/unsubscribe`, which made Ctrl+C look broken if the app-server
was already stuck. This makes interactive delegate startup
cancellation-aware and bounds the TUI shutdown-first unsubscribe wait
with a short UI escape-hatch timeout.

## Testing
I reproed the hang using the steps in the bug report. Confirmed hang no
longer exists after fix.
2026-04-23 13:28:12 -07:00
Eric Traut
cccc1b618e Stabilize approvals popup disabled-row test (#19178)
## Summary

The Windows Bazel job has been failing in
`chatwidget::tests::permissions::approvals_popup_navigation_skips_disabled`
because the test assumed a fixed approvals popup row order and shortcut
for the disabled permissions option. The approvals popup can include
platform-specific rows, so those assumptions made the test brittle.

This updates the test to derive the disabled row shortcut from the
rendered popup and assert navigation continues to skip disabled rows
before checking that disabled numeric shortcuts do not close or accept
the popup.
2026-04-23 13:21:35 -07:00
iceweasel-oai
d169bb541e use a version-specific suffix for command runner binary in .sandbox-bin (#19180)
we copy `codex-command-runner.exe` into `CODEX_HOME/.sandbox-bin/` so
that it can be executed by the sandbox user.
We also detect if that version is stale and copy a new one in if so.

This can fail when you are running multiple versions of the app - the
file in `.sandbox-bin` can look stale because it comes from another app
build.

This change allows us to have multiple versions in there for different
CLI versions, and it fallsback to a `size+mtime` hash in the filename
for dev builds that don't report a real CLI version.
2026-04-23 13:16:26 -07:00
xli-oai
0d6a90cd6b Add app-server marketplace upgrade RPC (#19074)
## Summary
- add a v2 `marketplace/upgrade` app-server RPC that mirrors the
existing configured Git marketplace upgrade path
- expose typed request/response/error payloads and regenerate
JSON/TypeScript schema fixtures
- add app-server integration coverage for all, named, already
up-to-date, and invalid marketplace upgrade requests

## Tests
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server marketplace_upgrade`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fmt`
2026-04-23 13:00:46 -07:00
Michael Bolin
491a3058f6 fix(exec-server): retain output until streams close (#18946)
## Why

A Mac Bazel run hit a flake in
`server::handler::tests::output_and_exit_are_retained_after_notification_receiver_closes`
where the read path observed process exit but lost the expected buffered
stdout (`first\nsecond\n`). See the [GitHub Actions
job](https://github.com/openai/codex/actions/runs/24758468552/job/72436716505)
and [BuildBuddy
invocation](https://app.buildbuddy.io/invocation/37475a12-4ef2-45fb-ab8a-e49a2aba1d59).

The underlying race is that process exit is not the same thing as
stdout/stderr closure. If a child or grandchild inherits the pipe write
end, or a process duplicates it with `dup2`, the watched process can
exit while the stream is still open and more output can still arrive.
The exec-server was starting exited-process retention cleanup from the
exit event, so the process entry could be removed before the output
streams had actually closed.

While stress-testing the exec-server unit suite,
`server::handler::tests::long_poll_read_fails_after_session_resume`
exposed a separate test race: it started a short-lived command that
could exit and wake the pending long-poll read before the session-resume
assertion observed the resumed-session error. That test is intended to
cover resume eviction, not process-exit delivery, so this change keeps
the process alive and quiet while the second connection resumes the
session.

## What changed

- Keep exec-server process entries retained until stdout/stderr streams
close, then start the post-exit retention timer from the closed event.
- Wake long-poll readers when the closed event is emitted.
- Add focused `local_process` unit coverage that proves late output is
still retained after the short test retention interval has elapsed, and
that closed process entries are eventually evicted.
- Add a local and remote regression test where a parent exits while a
child keeps inherited stdout open. The child waits on an explicit
release file, so the test deterministically observes exit first,
releases the child, then requires a nonzero-wait read from the exit
sequence to receive the late output.
- In `codex-rs/exec-server/src/server/handler/tests.rs`, make
`long_poll_read_fails_after_session_resume` run a long-lived silent
command instead of a short command that prints and exits. This isolates
the test to session-resume behavior and prevents a normal process exit
from satisfying the pending long-poll read first.

## Testing

- `cargo test -p codex-exec-server
exec_process_retains_output_after_exit_until_streams_close`
- `cargo test -p codex-exec-server local_process::tests`
- `cargo test -p codex-exec-server`
- `just fix -p codex-exec-server`
- `bazel test //codex-rs/exec-server:exec-server-unit-tests
//codex-rs/exec-server:exec-server-exec_process-test
//codex-rs/exec-server:exec-server-file_system-test
//codex-rs/exec-server:exec-server-http_client-test
//codex-rs/exec-server:exec-server-initialize-test
//codex-rs/exec-server:exec-server-process-test
//codex-rs/exec-server:exec-server-websocket-test`
- `bazel test --runs_per_test=25
//codex-rs/exec-server:exec-server-unit-tests`

## Documentation

No docs update needed; this is an internal exec-server correctness fix.
2026-04-23 19:49:58 +00:00
Michael Bolin
9c0eced391 shell-escalation: carry resolved permission profiles (#18287)
## Why

Shell escalation still has adapter code that expects a legacy sandbox
policy, but command approvals should carry the resolved
`PermissionProfile` so callers can reason about the granted permissions
canonically.

## What changed

This introduces profile-shaped resolved escalation permissions while
retaining the derived legacy sandbox policy for the Unix escalation
adapter. It updates approval types, the escalation server protocol, and
tests that inspect escalated command permissions.

## Verification

- `cargo test -p codex-core --test all handle_container_exec_ --
--nocapture`
- `cargo test -p codex-core --test all handle_sandbox_ -- --nocapture`

























































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18287).
* #18288
* __->__ #18287
2026-04-23 12:46:19 -07:00
cassirer-openai
6d09b6752d [rollout_trace] Trace tool and code-mode boundaries (#18878)
## Summary

Extends rollout tracing across tool dispatch and code-mode runtime
boundaries. This records canonical tool-call lifecycle events and links
code-mode execution/wait operations back to the model-visible calls that
caused them.

## Stack

This is PR 3/5 in the rollout trace stack.

- [#18876](https://github.com/openai/codex/pull/18876): Add rollout
trace crate
- [#18877](https://github.com/openai/codex/pull/18877): Record core
session rollout traces
- [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
code-mode boundaries
- [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
and multi-agent edges
- [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
reduction command

## Review Notes

This PR is about attribution. Reviewers should focus on whether direct
tool calls, code-mode-originated tool calls, waits, outputs, and
cancellation boundaries are recorded with enough source information for
deterministic reduction without coupling the reducer to live runtime
internals.

The stack remains valid after this layer: tool and code-mode traces
reduce through the existing crate model, while the broader session and
multi-agent relationships are added in the next PR.
2026-04-23 12:22:11 -07:00
Michael Bolin
ff22982d75 mcp: include permission profiles in sandbox state (#18286)
## Why

MCP tool calls can receive a serialized `SandboxState` when a server
declares the sandbox-state capability. That state is one of the places
MCP runtimes learn what permissions Codex is operating under. As the
permissions migration makes `PermissionProfile` the canonical
representation, MCP consumers should be able to read that profile
directly instead of reconstructing permissions from the legacy
`SandboxPolicy`.

## What changed

- Adds optional `permissionProfile` to `codex_mcp::SandboxState`, while
keeping `sandboxPolicy` for existing MCP consumers.
- Populates `permissionProfile` from the current `TurnContext` when
serializing sandbox state for MCP tool calls.

## Verification

- Current GitHub Actions for this PR are passing.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18286).
* #18288
* #18287
* __->__ #18286
2026-04-23 12:21:26 -07:00
Michael Bolin
f90cc0ee64 tui: carry permission profiles on user turns (#18285)
## Why

Per-turn permission overrides should use the same canonical profile
abstraction as session configuration. That lets TUI submissions preserve
exact configured permissions without round-tripping through legacy
sandbox fields.

## What changed

This adds `permission_profile` to user-turn operations, threads it
through TUI/app-server submission paths, fills the new field in existing
test fixtures, and adds coverage that composer submission includes the
configured profile.

## Verification

- `cargo test -p codex-tui permissions -- --nocapture`
- `cargo test -p codex-core --test all permissions_messages --
--nocapture`























































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18285).
* #18288
* #18287
* #18286
* __->__ #18285
2026-04-23 11:54:17 -07:00
Rasmus Rygaard
f11583b8f6 Add remote thread config endpoint (#18908)
## Why

App-server needs a way to fetch thread-scoped config from the remote
thread config service when the user config opts into that behavior. This
mirrors the existing experimental remote thread store endpoint while
keeping local/noop behavior as the default.

Startup paths also need to avoid silently dropping the remote config
endpoint after the first config load. The stdio app-server path
discovers the endpoint from the initial config and installs the real
thread config loader for later config builds, while in-process clients
used by TUI/exec now select the same remote loader directly from their
provided config.

## What changed

- Added `experimental_thread_config_endpoint` to `ConfigToml`, `Config`,
and `core/config.schema.json`.
- Added config parsing coverage for the new setting.
- Updated app-server startup to select `RemoteThreadConfigLoader` from
the initially loaded config, falling back to `NoopThreadConfigLoader`
when unset.
- Let `ConfigManager` replace its thread config loader after startup
discovery so later config loads use the selected loader.
- Updated in-process app-server client startup to pass
`RemoteThreadConfigLoader` when its config has
`experimental_thread_config_endpoint` set.

## Verification

- Added `experimental_thread_config_endpoint_loads_from_config_toml`.
- Added
`runtime_start_args_use_remote_thread_config_loader_when_configured`.
- Ran `cargo check -p codex-app-server --lib`.
- Ran `cargo test -p codex-app-server-client`.
2026-04-23 11:46:06 -07:00
maja-openai
cff337e4e3 Use Auto-review wording for fallback rationale (#19168)
## Why

PR #18797 currently surfaces fallback rationale text that names Guardian
directly.

## What changed

- Updated the bare allow and bare deny fallback rationales in
`codex-rs/core/src/guardian/prompt.rs` from Guardian to Auto-review.
- Updated the existing bare allow parser test and added explicit bare
deny parser coverage.

## Verification

- `cargo test -p codex-core parse_guardian_assessment_treats_bare`
2026-04-23 11:42:43 -07:00
xl-openai
198eddd25d Move marketplace add/remove and startup sync out of core. (#19099)
Move more things to core-plugins.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 11:27:17 -07:00
Ruslan Nigmatullin
e9165b9f40 ci: add macOS keychain entitlements (#19167)
## Summary

- add macOS application and team identifiers to the release signing
entitlements
- add a Codex keychain access group for release-signed macOS binaries
- keep the existing JIT entitlement unchanged

## Why

Codex release binaries are signed with the OpenAI Developer ID team, but
the current entitlements plist only grants JIT. macOS Keychain and
Secure Enclave operations that create persistent keys can require the
process to carry an application identifier and keychain access group.
Adding these entitlements gives release-signed binaries a stable
Keychain namespace for Codex-owned device keys.

## Validation

- `plutil -lint
.github/actions/macos-code-sign/codex.entitlements.plist`
2026-04-23 11:20:58 -07:00
Ruslan Nigmatullin
8a0ab3fc13 app-server: add Unix socket transport (#18255)
## Summary
- add unix:// app-server transport backed by the shared codex-uds crate
- reuse the websocket connection loop for axum and tungstenite-backed
streams
- add codex app-server proxy to bridge stdio clients to the control
socket
- tolerate Windows UDS backends that report a missing rendezvous path as
connection refused before binding

## Tests
- cargo test -p codex-app-server
control_socket_acceptor_forwards_websocket_text_messages_and_pings
- cargo test -p codex-app-server
- just fmt
- just fix -p codex-app-server
- git -c core.fsmonitor=false diff --check
2026-04-23 11:09:25 -07:00
Eric Traut
c2423f42d1 Respect explicit untrusted project config (#18626)
## Why

Fixes #18475. A `-c` override such as `projects.<cwd>.trust_level =
"untrusted"` is meant to be a runtime config override, but app-server
thread startup treated any non-trusted project as eligible for automatic
trust persistence when a permissive sandbox/cwd was requested. That
meant an explicit `untrusted` session override could still cause
`config.toml` to be updated with `trusted`.

## What changed

The app-server auto-trust path now runs only when the active project
trust level is unknown. Explicit `trusted` and explicit `untrusted`
values are both respected, regardless of whether they came from
persisted config or session flags.

A focused `thread/start` test now covers the explicit `untrusted` case
with a permissive sandbox request.

## Verification

- `cargo test -p codex-app-server`
- `just fix -p codex-app-server`
2026-04-23 10:51:17 -07:00
Tom
f1061d9d07 [codex] Implement remote thread store methods (#19008) 2026-04-23 17:49:28 +00:00
Tom
f1923a38b1 [codex] Route live thread writes through ThreadStore (#18882)
Begin migrating the thread write codepaths to ThreadStore.

This starts using ThreadStore inside of core session code, not only in
the app server code.

Rework the interfaces around thread recording/persistence. We're left
with the following:

* `ThreadManager`: owns the process-level registry of loaded threads and
handles cross-thread orchestration: start, resume, fork, lookup, remove,
and route ops to running CodexThreads.
* `CodexThread`: represents one loaded/running thread from the outside.
It is the handle app-server and callers use to submit ops, inspect
session metadata, and shut the thread down.
* `LiveThread`: session-owned persistence lifecycle handle for one
active thread. Core session code uses it to append rollout items,
materialize lazy persistence, flush, shutdown, discard init-failed
writers, and load that thread’s persisted history.
* `ThreadStore`: storage backend abstraction. It answers “how are
threads persisted, read, listed, updated, archived?” Local and remote
implementations live behind this trait.
* `LocalThreadStore`: local ThreadStore implementation. It owns the
file/sqlite-specific details and keeps RolloutRecorder as a local
implementation detail.

This is a few too many Thread abstractions for my liking, but they do
all represent different concepts / needs / layers.

Migration note: in places where the core code explicitly requires a
path, rather than a thread ID, throw an error if we're running with a
remote store.

Cover the new local live-writer lifecycle with focused tests and
preserve app-server thread-start behavior, including ephemeral pathless
sessions.
2026-04-23 10:17:09 -07:00
David de Regt
3d3028a5a9 Add excludeTurns parameter to thread/resume and thread/fork (#19014)
For callers who expect to be paginating the results for the UI, they can
now call thread/resume or thread/fork with excludeturns:true so it will
not fetch any pages of turns, and instead only set up the subscription.
That call can be immediately followed by pagination requests to
thread/turns/list to fetch pages of turns according to the UI's current
interactions.
2026-04-23 10:07:59 -07:00
Rasmus Rygaard
0b4f694347 Add remote thread config loader protos (#18892)
## Why

Thread-scoped config needs a stable boundary between the app/session
owner and the config stack. Instead of having call sites manually copy
thread config fields into individual overrides, this adds the proto and
Rust plumbing needed for a `ThreadConfigLoader` implementation to return
typed sources that can be translated into ordinary config layer entries.

Keeping the remote payload typed also makes precedence easier to reason
about: session-owned thread config maps back to the existing session
config source, while user-owned thread config is represented separately
without introducing a new config-layer source until it has TOML-backed
fields.

## What changed

- Added the `codex.thread_config.v1` protobuf service and generated Rust
module for loading thread config sources.
- Added `RemoteThreadConfigLoader`, which calls the gRPC service, parses
`SessionThreadConfig` / `UserThreadConfig`, and validates provider
fields such as `wire_api`, auth timeout, and absolute auth cwd.
- Added proto generation tooling under
`config/scripts/generate-proto.sh` and
`config/examples/generate-proto.rs`.
- Added `ThreadConfigLoader::load_config_layers`, plus static/no-op
loader helpers, so tests and callers can use the same typed loader
interface while config-layer translation stays centralized.

## Verification

- `cargo test -p codex-config thread_config`
2026-04-23 10:06:05 -07:00
jif-oai
a2f868c9d6 feat: drop spawned-agent context instructions (#19127)
## Why

MultiAgentV2 children should not receive an extra model-visible
developer fragment just because they were spawned. The parent/configured
developer instructions should carry through normally, but the dedicated
`<spawned_agent_context>` block is no longer desired.

## What changed

- Removed the `SpawnAgentInstructions` context fragment and its
`<spawned_agent_context>` wrapper.
- Stopped appending spawned-agent instructions in
`codex-rs/core/src/tools/handlers/multi_agents_v2/spawn.rs`.
- Updated subagent notification coverage to assert inherited parent
developer instructions without expecting the spawned-agent wrapper.

## Verification

- `cargo test -p codex-core --test all
spawned_multi_agent_v2_child_inherits_parent_developer_context --
--nocapture`
- `cargo test -p codex-core --test all
skills_toggle_skips_instructions_for_parent_and_spawned_child --
--nocapture`
- `cargo test -p codex-core --test all subagent_notifications --
--nocapture`
2026-04-23 18:54:45 +02:00
xli-oai
e18bfeec91 [codex] Fix plugin marketplace help usage (#18710)
## Summary
- Updates generated CLI help for plugin marketplace commands to show the
full `codex plugin marketplace ...` namespace.
- Adds a regression test covering the marketplace command and its `add`,
`upgrade`, and `remove` help pages.

## Root Cause
The marketplace parser already lived under `codex plugin marketplace`,
but Clap generated usage text from the child parser's standalone command
name. That made help output show stale `codex marketplace ...`
instructions even though the top-level `codex marketplace` command no
longer parses.

## Validation
- `just fmt`
- `cargo test -p codex-cli`
- `./target/debug/codex plugin marketplace --help`
2026-04-23 09:48:37 -07:00