Commit Graph

1721 Commits

Author SHA1 Message Date
Anton Panasenko
becc3a0424 feat: search_tool (#10657)
**Why We Did This**
- The goal is to reduce MCP tool context pollution by not exposing the
full MCP tool list up front
- It forces an explicit discovery step (`search_tool_bm25`) so the model
narrows tool scope before making MCP calls, which helps relevance and
lowers prompt/tool clutter.

**What It Changed**
- Added a new experimental feature flag `search_tool` in
`core/src/features.rs:90` and `core/src/features.rs:430`.
- Added config/schema support for that flag in
`core/config.schema.json:214` and `core/config.schema.json:1235`.
- Added BM25 dependency (`bm25`) in `Cargo.toml:129` and
`core/Cargo.toml:23`.
- Added new tool handler `search_tool_bm25` in
`core/src/tools/handlers/search_tool_bm25.rs:18`.
- Registered the handler and tool spec in
`core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and
`core/src/tools/spec.rs:1344`.
- Extended `ToolsConfig` to carry `search_tool` enablement in
`core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`.
- Injected dedicated developer instructions for tool-discovery workflow
in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using
`core/templates/search_tool/developer_instructions.md:1`.
- Added session state to store one-shot selected MCP tools in
`core/src/state/session.rs:27` and `core/src/state/session.rs:131`.
- Added filtering so when feature is enabled, only selected MCP tools
are exposed on the next request (then consumed) in
`core/src/codex.rs:3800` and `core/src/codex.rs:3843`.
- Added E2E suite coverage for
enablement/instructions/hide-until-search/one-turn-selection in
`core/tests/suite/search_tool.rs:72`,
`core/tests/suite/search_tool.rs:109`,
`core/tests/suite/search_tool.rs:147`, and
`core/tests/suite/search_tool.rs:218`.
- Refactored test helper utilities to support config-driven tool
collection in `core/tests/suite/tools.rs:281`.

**Net Behavioral Effect**
- With `search_tool` **off**: existing MCP behavior (tools exposed
normally).
- With `search_tool` **on**: MCP tools start hidden, model must call
`search_tool_bm25`, and only returned `selected_tools` are available for
the next model call.
2026-02-09 12:53:50 -08:00
Charley Cunningham
9450cd9ce5 core: add focused diagnostics for remote compaction overflows (#11133)
## Summary
- add targeted remote-compaction failure diagnostics in compact_remote
logging
- log the specific values needed to explain overflow timing:
  - last_api_response_total_tokens
  - estimated_tokens_of_items_added_since_last_successful_api_response
  - estimated_bytes_of_items_added_since_last_successful_api_response
  - failing_compaction_request_body_bytes
- simplify breakdown naming and remove
last_api_response_total_bytes_estimate (it was an approximation and not
useful for debugging)

## Why
When compaction fails with context_length_exceeded, we need concrete,
low-ambiguity numbers that map directly to:
1) what the API most recently reported, and
2) what local history added since then.

This keeps the failure logs actionable without adding broad, noisy
metrics.

## Testing
- just fmt
- cargo test -p codex-core
2026-02-09 12:42:20 -08:00
jif-oai
c2bfd1e473 Revert "chore: enable sub agents" (#11230)
Reverts openai/codex#11173
2026-02-09 20:22:38 +00:00
pakrym-oai
ccd17374cb Move warmup to the task level (#11216)
Instead of storing a special connection on the client level make the
regular task responsible for establishing a normal client session and
open a connection on it.

Then when the turn is started we pass in a pre-established session.
2026-02-09 10:57:52 -08:00
Eric Traut
9346d321d2 Fixed bug in file watcher that results in spurious skills update events and large log files (#11217)
On some platforms, the "notify" file watcher library emits events for
file opens and reads, not just file modifications or deletes. The
previous implementation didn't take this into account.

Furthermore, the `tracing.info!` call that I previously added was
emitting a lot of logs. I had assumed incorrectly that `info` level
logging was disabled by default, but it's apparently enabled for this
crate. This is resulting in large logs (hundreds of MB) for some users.
2026-02-09 10:33:57 -08:00
Rasmus Rygaard
b2d3843109 Translate websocket errors (#10937)
When getting errors over a websocket connection, translate the error
into our regular API error format
2026-02-09 17:53:09 +00:00
jif-oai
cfce286459 tools: remove get_memory tool and tests (#11198)
Drop this memory tool as the design changed
2026-02-09 17:47:36 +00:00
Charley Cunningham
0883e5d3e5 core: account for all post-response items in auto-compact token checks (#11132)
## Summary
- change compaction pre-check accounting to include all items added
after the last model-generated item, not only trailing codex-generated
outputs
- use that boundary consistently in get_total_token_usage() and
get_total_token_usage_breakdown()
- update history tests to cover user/tool-output items after the last
model item

## Why
last_token_usage.total_tokens is API-reported for the last successful
model response. After that point, local history may gain additional
items (user messages, injected context, tool outputs). Compaction
triggering must account for all of those items to avoid late compaction
attempts that can overflow context.

## Testing
- just fmt
- cargo test -p codex-core
2026-02-09 08:34:38 -08:00
gt-oai
9fe925b15a Load requirements on windows (#10770)
We support requirements on Unix, loading from
`/etc/codex/requirements.toml`. On MacOS, we also support MDM.

Now, on Windows, we'll load requirements from
`%ProgramData%\OpenAI\Codex\requirements.toml`
2026-02-09 16:05:38 +00:00
gt-oai
54b401aa5f Deflake mixed parallel tools timing test (#11193)
```
        FAIL [   1.903s] (1926/3311) codex-core::all suite::tool_parallelism::mixed_parallel_tools_run_in_parallel
  stdout ───

    running 1 test
    test suite::tool_parallelism::mixed_parallel_tools_run_in_parallel ... FAILED

    failures:

    failures:
        suite::tool_parallelism::mixed_parallel_tools_run_in_parallel

    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 684 filtered out; finished in 1.86s
    
  stderr ───

    thread 'suite::tool_parallelism::mixed_parallel_tools_run_in_parallel' (205083) panicked at core/tests/suite/tool_parallelism.rs:74:5:
    expected parallel execution to finish quickly, got 1.406255993s
    stack backtrace:
       0: __rustc::rust_begin_unwind
                 at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/std/src/panicking.rs:689:5
       1: core::panicking::panic_fmt
                 at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/core/src/panicking.rs:80:14
       2: all::suite::tool_parallelism::assert_parallel_duration
                 at ./tests/suite/tool_parallelism.rs:74:5
       3: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel::{{closure}}
                 at ./tests/suite/tool_parallelism.rs:206:5
       4: <core::pin::Pin<P> as core::future::future::Future>::poll
                 at /home/runner/.rustup/toolchains/1.93.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:133:9
       5: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/park.rs:284:71
       6: tokio::task::coop::with_budget
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/task/coop/mod.rs:167:5
       7: tokio::task::coop::budget
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/task/coop/mod.rs:133:5
       8: tokio::runtime::park::CachedParkThread::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/park.rs:284:31
       9: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/context/blocking.rs:66:14
      10: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/scheduler/multi_thread/mod.rs:89:22
      11: tokio::runtime::context::runtime::enter_runtime
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/context/runtime.rs:65:16
      12: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/scheduler/multi_thread/mod.rs:88:9
      13: tokio::runtime::runtime::Runtime::block_on_inner
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/runtime.rs:370:50
      14: tokio::runtime::runtime::Runtime::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/runtime.rs:342:18
      15: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel
                 at ./tests/suite/tool_parallelism.rs:208:7
      16: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel::{{closure}}
                 at ./tests/suite/tool_parallelism.rs:178:52
      17: core::ops::function::FnOnce::call_once
                 at /home/runner/.rustup/toolchains/1.93.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
      18: core::ops::function::FnOnce::call_once
                 at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/core/src/ops/function.rs:250:5
    note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
2026-02-09 15:16:54 +00:00
jif-oai
284c03ceab chore: enable sub agents (#11173) 2026-02-09 11:25:37 +00:00
jif-oai
753821c90f chore: enable shell snapshot (#11172) 2026-02-09 11:23:59 +00:00
jif-oai
6cf61725d0 feat: do not close unified exec processes across turns (#10799)
With this PR we do not close the unified exec processes (i.e. background
terminals) at the end of a turn unless:
* The user interrupt the turn
* The user decide to clean the processes through `app-server` or
`/clean`

I made sure that `codex exec` correctly kill all the processes
2026-02-09 10:27:46 +00:00
Michael Bolin
383b45279e feat: include NetworkConfig through ExecParams (#11105)
This PR adds the following field to `Config`:

```rust
pub network: Option<NetworkProxy>,
```

Though for the moment, it will always be initialized as `None` (this
will be addressed in a subsequent PR).

This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we span a process.
2026-02-09 03:32:17 +00:00
Michael Bolin
ff74aaae21 chore: reverse the codex-network-proxy -> codex-core dependency (#11121) 2026-02-08 17:03:24 -08:00
Matthew Zeng
45b7763c3f [apps] Improve app loading. (#10994)
There are two concepts of apps that we load in the harness:

- Directory apps, which is all the apps that the user can install.
- Accessible apps, which is what the user actually installed and can be
$ inserted and be used by the model. These are extracted from the tools
that are loaded through the gateway MCP.

Previously we wait for both sets of apps before returning the full apps
list. Which causes many issues because accessible apps won't be
available to the UI or the model if directory apps aren't loaded or
failed to load.

In this PR we are separating them so that accessible apps can be loaded
separately and are instantly available to be shown in the UI and to be
provided in model context. We also added an app-server event so that
clients can subscribe to also get accessible apps without being blocked
on the full app list.

- [x] Separate accessible apps and directory apps loading.
- [x] `app/list` request will also emit `app/list/updated` notifications
that app-server clients can subscribe. Which allows clients to get
accessible apps list to render in the $ menu without being blocked by
directory apps.
- [x] Cache both accessible and directory apps with 1 hour TTL to avoid
reloading them when creating new threads.
- [x] TUI improvements to redraw $ menu and /apps menu when app list is
updated.
2026-02-08 15:24:56 -08:00
Michael Bolin
181b721ba5 feat: include [experimental_network] in <environment_context> (#11044)
If `NetworkConstraints` is set, then include the relevant settings on `<environment_context>`. Example:

```xml
<environment_context>
  <cwd>/repo</cwd>
  <shell>bash</shell>
  <network enabled="true">
    <allowed>api.example.com</allowed>
    <allowed>*.openai.com</allowed>
    <denied>blocked.example.com</denied>
  </network>
</environment_context>
```
2026-02-08 15:16:50 -08:00
Matthew Zeng
9f1009540b Upgrade rmcp to 0.14 (#10718)
- [x] Upgrade rmcp to 0.14
2026-02-08 15:07:53 -08:00
Tom
409ec76fbc Gate view_image tool by model input_modalities (#11051)
- Plumb input modalities from model catalog through the openai model
protocol. Default to text and image.
- Conditionally add the view_image tool only if input modalities support
image.
2026-02-08 10:45:26 -08:00
Eric Traut
b3de6c7f2b Defer persistence of rollout file (#11028)
- Defer rollout persistence for fresh threads (`InitialHistory::New`):
keep rollout events in memory and only materialize rollout file + state
DB row on first `EventMsg::UserMessage`.
- Keep precomputed rollout path available before materialization.
- Change `thread/start` to build thread response from live config
snapshot and optional precomputed path.
- Improve pre-materialization behavior in app-server/TUI: clearer
invalid-request errors for file-backed ops and a friendlier `/fork` “not
ready yet” UX.
- Update tests to match deferred semantics across
start/read/archive/unarchive/fork/resume/review flows.
- Improved resilience of user_shell test, which should be unrelated to
this change but must be affected by timing changes

For Reviewers:
* The primary change is in recorder.rs
* Most of the other changes were to fix up broken assumptions in
existing tests

Testing:
* Manually tested CLI
* Exercised app server paths by manually running IDE Extension with
rebuilt CLI binary
* Only user-visible change is that `/fork` in TUI generates visible
error if used prior to first turn
2026-02-07 23:05:03 -08:00
pakrym-oai
6d08298f4e Fallback to HTTP on UPGRADE_REQUIRED (#10824)
Allow the server to trigger a connection downgrade in case the protocol
changes in incompatible ways.
2026-02-08 05:06:33 +00:00
Anton Panasenko
a94505a92a feat: enable premessage-deflate for websockets (#10966)
note:
unfortunately, tokio-tungstenite / tungstenite upgrade triggers some
problems with linker of rama-tls-boring with openssl:
```
error: linking with `/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh` failed: exit status: 1
  |
  = note:  "/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh" "-m64" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/rcrt1.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crti.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtbeginS.o" "<1 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/{liblzma_sys-662a82316f96ec30,libbzip2_sys-bf78a2d58d5cbce6,liblibsqlite3_sys-6c004987fd67a36a,libtree_sitter_bash-220b99a97d331ab7,libtree_sitter-858f0a1dbfea58bd,libzstd_sys-6eb237deec748c5b,libring-2a87376483bf916f,libopenssl_sys-7c189e68b37fe2bb,liblibz_sys-4344eef4345520b1,librama_boring_sys-0414e98115015ee0}.rlib" "-lc++" "-lc++abi" "-lunwind" "-lc" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler_builtins-*.rlib" "-L" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/raw-dylibs" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-nostartfiles" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libz-sys-ff5ea50d88c28ffb/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/ring-bdec3dddc19f5a5e/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/openssl-sys-96e0870de3ca22bc/out/openssl-build/install/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/zstd-sys-0cc37a5da1481740/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-72d2418073317c0f/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-bash-bfd293a9f333ce6a/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libsqlite3-sys-b78b2cfb81a330fc/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/bzip2-sys-69a145cc859ef275/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/lzma-sys-07e92d0b6baa6fd4/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/ssl/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib" "-o" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/deps/codex_network_proxy-d08268b863517761" "-Wl,--gc-sections" "-static-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-Wl,--strip-all" "-nodefaultlibs" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtendS.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtn.o"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: warning: ignoring deprecated linker optimization setting '1'
          warning: unable to open library directory '/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/': FileNotFound
          ld.lld: error: duplicate symbol: SSL_export_keying_material
          >>> defined at ssl_lib.c:3816 (ssl/ssl_lib.c:3816)
          >>>            libssl-lib-ssl_lib.o:(SSL_export_keying_material) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
          >>> defined at t1_enc.cc:205 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/ssl/t1_enc.cc:205)
          >>>            t1_enc.cc.o:(.text.SSL_export_keying_material+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib

          ld.lld: error: duplicate symbol: d2i_ASN1_TIME
          >>> defined at a_time.c:27 (crypto/asn1/a_time.c:27)
          >>>            libcrypto-lib-a_time.o:(d2i_ASN1_TIME) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
          >>> defined at a_time.cc:34 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/crypto/asn1/a_time.cc:34)
          >>>            a_time.cc.o:(.text.d2i_ASN1_TIME+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib
``` 

that force me to migrate away from rama-tls-boring to rama-tls-rustls
and pin `ring` for rustls.
2026-02-07 17:59:34 -08:00
pakrym-oai
8fe5066bcc Simplify pre-connect (#11040) 2026-02-07 15:52:03 -08:00
viyatb-oai
739908a12c feat(core): add network constraints schema to requirements.toml (#10958)
## Summary

Add `requirements.toml` schema support for admin-defined network
constraints in the requirements layer

example config:

```
[experimental_network]
enabled = true
allowed_domains = ["api.openai.com"]
denied_domains = ["example.com"]
```
2026-02-07 19:48:24 +00:00
jif-oai
83c74125bc Bootstrap shell commands via user shell snapshot (#10909)
Summary
- wrap `shell -lc` executions that use a snapshot with the session shell
so the saved environment is sourced before delegating to the original
shell
- escape single quotes in the generated wrapper and add tests covering
Bash/Zsh/sh session bootstrapping

Testing
- Not run (not requested)
2026-02-07 17:36:44 +01:00
jif-oai
62605fa471 Add resume_agent collab tool (#10903)
Summary
- add the new resume_agent collab tool path through core, protocol, and
the app server API, including the resume events
- update the schema/TypeScript definitions plus docs so resume_agent
appears in generated artifacts and README
- note that resumed agents rehydrate rollout history without overwriting
their base instructions

Testing
- Not run (not requested)
2026-02-07 17:31:45 +01:00
Michael Bolin
4cd0c42a28 fix: normalize line endings when reading file on Windows (#10988)
I did not wait for CI on https://github.com/openai/codex/pull/10980
because it was blocking an alpha release, but apparently it broken the
Windows build.
2026-02-06 23:49:19 -08:00
Michael Bolin
18bb25557c fix: use expected line ending in codex-rs/core/config.schema.json (#10977)
Fixes a line ending that was altered in https://github.com/openai/codex/pull/10861.

This is breaking the release due to:

a118494323/.github/workflows/rust-release.yml (L54-L55)

This PR updates the test to check for this so we should catch it in CI (or when running tests locally):

a118494323/codex-rs/core/src/config/schema.rs (L105-L131)
2026-02-06 22:30:57 -08:00
Michael Bolin
a118494323 feat: add support for allowed_web_search_modes in requirements.toml (#10964)
This PR makes it possible to disable live web search via an enterprise
config even if the user is running in `--yolo` mode (though cached web
search will still be available). To do this, create
`/etc/codex/requirements.toml` as follows:

```toml
# "live" is not allowed; "disabled" is allowed even though not listed explicitly.
allowed_web_search_modes = ["cached"]
```

Or set `requirements_toml_base64` MDM as explained on
https://developers.openai.com/codex/security/#locations.

### Why
- Enforce admin/MDM/`requirements.toml` constraints on web-search
behavior, independent of user config and per-turn sandbox defaults.
- Ensure per-turn config resolution and review-mode overrides never
crash when constraints are present.

### What
- Add `allowed_web_search_modes` to requirements parsing and surface it
in app-server v2 `ConfigRequirements` (`allowedWebSearchModes`), with
fixtures updated.
- Define a requirements allowlist type (`WebSearchModeRequirement`) and
normalize semantics:
  - `disabled` is always implicitly allowed (even if not listed).
  - An empty list is treated as `["disabled"]`.
- Make `Config.web_search_mode` a `Constrained<WebSearchMode>` and apply
requirements via `ConstrainedWithSource<WebSearchMode>`.
- Update per-turn resolution (`resolve_web_search_mode_for_turn`) to:
- Prefer `Live → Cached → Disabled` when
`SandboxPolicy::DangerFullAccess` is active (subject to requirements),
unless the user preference is explicitly `Disabled`.
- Otherwise, honor the user’s preferred mode, falling back to an allowed
mode when necessary.
- Update TUI `/debug-config` and app-server mapping to display
normalized `allowed_web_search_modes` (including implicit `disabled`).
- Fix web-search integration tests to assert cached behavior under
`SandboxPolicy::ReadOnly` (since `DangerFullAccess` legitimately prefers
`live` when allowed).
2026-02-07 05:55:15 +00:00
sayan-oai
5d2702f6b8 fix(tui): conditionally restore status indicator using message phase (#10947)
TLDR: use new message phase field emitted by preamble-supported models
to determine whether an AgentMessage is mid-turn commentary. if so,
restore the status indicator afterwards to indicate the turn has not
completed.

### Problem
`commit_tick` hides the status indicator while streaming assistant text.
For preamble-capable models, that text can be commentary mid-turn, so
hiding was correct during streaming but restore timing mattered:
- restoring too aggressively caused jitter/flashing
- not restoring caused indicator to stay hidden before subsequent work
(tool calls, web search, etc.)

### Fix
- Add optional `phase` to `AgentMessageItem` and propagate it from
`ResponseItem::Message`
- Keep indicator hidden during streamed commit ticks, restore only when:
  - assistant item completes as `phase=commentary`, and
  - stream queues are idle + task is still running.
- Treat `phase=None` as final-answer behavior (no restore) to keep
existing behavior for non-preamble models

### Tests
Add/update tests for:
- no idle-tick restore without commentary completion
- commentary completion restoring status before tool begin
- snapshot coverage for preamble/status behavior

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-07 02:39:52 +00:00
daniel-oai
84bce2b8e6 TUI/Core: preserve duplicate skill/app mention selection across submit + resume (#10855)
## What changed

- In `codex-rs/core/src/skills/injection.rs`, we now honor explicit
`UserInput::Skill { name, path }` first, then fall back to text mentions
only when safe.
- In `codex-rs/tui/src/bottom_pane/chat_composer.rs`, mention selection
is now token-bound (selected mention is tied to the specific inserted
`$token`), and we snapshot bindings at submit time so selection is not
lost.
- In `codex-rs/tui/src/chatwidget.rs` and
`codex-rs/tui/src/bottom_pane/mod.rs`, submit/queue paths now consume
the submit-time mention snapshot (instead of rereading cleared composer
state).
- In `codex-rs/tui/src/mention_codec.rs` and
`codex-rs/tui/src/bottom_pane/chat_composer_history.rs`, history now
round-trips mention targets so resume restores the same selected
duplicate.
- In `codex-rs/tui/src/bottom_pane/skill_popup.rs` and
`codex-rs/tui/src/bottom_pane/chat_composer.rs`, duplicate labels are
normalized to `[Repo]` / `[App]`, app rows no longer show `Connected -`,
and description space is a bit wider.

<img width="550" height="163" alt="Screenshot 2026-02-05 at 9 56 56 PM"
src="https://github.com/user-attachments/assets/346a7eb2-a342-4a49-aec8-68dfec0c7d89"
/>
<img width="550" height="163" alt="Screenshot 2026-02-05 at 9 57 09 PM"
src="https://github.com/user-attachments/assets/5e04d9af-cccf-4932-98b3-c37183e445ed"
/>


## Before vs now

- Before: selecting a duplicate could still submit the default/repo
match, and resume could lose which duplicate was originally selected.
- Now: the exact selected target (skill path or app id) is preserved
through submit, queue/restore, and resume.

## Manual test

1. Build and run this branch locally:
   - `cd /Users/daniels/code/codex/codex-rs`
   - `cargo build -p codex-cli --bin codex`
   - `./target/debug/codex`
2. Open mention picker with `$` and pick a duplicate entry (not the
first one).
3. Confirm duplicate UI:
   - repo duplicate rows show `[Repo]`
   - app duplicate rows show `[App]`
   - app description does **not** start with `Connected -`
4. Submit the prompt, then press Up to restore draft and submit again.  
   Expected: it keeps the same selected duplicate target.
5. Use `/resume` to reopen the session and send again.  
Expected: restored mention still resolves to the same duplicate target.
2026-02-06 15:59:00 -08:00
alexsong-oai
daeef06bec add originator to otel (#10826) 2026-02-06 15:13:56 -08:00
Brian Yu
1fbf5ed06f Support alternative websocket API (#10861)
**Test plan**

```
cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \
  ./target/debug/codex \
    --enable responses_websockets_v2 \
    --profile byok \
    --full-auto
```
2026-02-06 14:40:50 -08:00
Ahmed Ibrahim
ba8b5d9018 Treat compaction failure as failure state (#10927)
- Return compaction errors from local and remote compaction flows.\n-
Stop turns/tasks when auto-compaction fails instead of continuing
execution.
2026-02-06 13:51:46 -08:00
Charley Cunningham
143daadb31 core: refresh developer instructions after compaction replacement history (#10574)
## Summary

When replaying compacted history (especially `replacement_history` from
remote compaction), we should not keep stale developer messages from
older session state. This PR trims developer-
role messages from compacted replacement history and reinjects fresh
developer instructions derived from current turn/session state.

This aligns compaction replay behavior with the intended "fresh
instructions after summary" model.

## Problem

Compaction replay had two paths:

- `Compacted { replacement_history: None }`: rebuilt with fresh initial
context
- `Compacted { replacement_history: Some(...) }`: previously used raw
replacement history as-is

The second path could carry stale developer instructions
(permissions/personality/collab-mode guidance) across session changes.

## What Changed

### 1) Added helper to refresh compacted developer instructions

- **File:** `codex-rs/core/src/compact.rs`
- **Function:** `refresh_compacted_developer_instructions(...)`

Behavior:
- remove all `ResponseItem::Message { role: "developer", .. }` from
compacted history
- append fresh developer messages from current
`build_initial_context(...)`

### 2) Applied helper in remote compaction flow

- **File:** `codex-rs/core/src/compact_remote.rs`
- After receiving compact endpoint output, refresh developer
instructions before replacing history and persisting
`replacement_history`.

### 3) Applied helper while reconstructing history from rollout

- **File:** `codex-rs/core/src/codex.rs`
- In `reconstruct_history_from_rollout(...)`, when processing
`Compacted` entries with `replacement_history`, refresh developer
instructions instead of directly replacing with raw history.

## Non-Goals / Follow-up

This PR does **not** address the existing first-turn-after-resume
double-injection behavior.
A follow-up PR will handle resume-time dedup/idempotence separately.

If you want, I can also give you a shorter “squash-merge friendly”
version of the description.

## Codex author
`codex fork 019c25e6-706e-75d1-9198-688ec00a8256`
2026-02-06 12:25:08 -08:00
Josh McKinney
e416e578bb core: preconnect Responses websocket for first turn (#10698)
## Problem
The first user turn can pay websocket handshake latency even when a
session has already started. We want to reduce that initial delay while
preserving turn semantics and avoiding any prompt send during startup.

Reviewer feedback also called out duplicated connect/setup paths and
unnecessary preconnect state complexity.

## Mental model
`ModelClient` owns session-scoped transport state. During session
startup, it can opportunistically warm one websocket handshake slot. A
turn-scoped `ModelClientSession` adopts that slot once if available,
restores captured sticky turn-state, and otherwise opens a websocket
through the same shared connect path.

If startup preconnect is still in flight, first turn setup awaits that
task and treats it as the first connection attempt for the turn.

Preconnect is handshake-only. The first `response.create` is still sent
only when a turn starts.

## Non-goals
This change does not make preconnect required for correctness and does
not change prompt/turn payload semantics. It also does not expand
fallback behavior beyond clearing preconnect state when fallback
activates.

## Tradeoffs
The implementation prioritizes simpler ownership and shared connection
code over header-match gating for reuse. The single-slot cache keeps
lifecycle straightforward but only benefits the immediate next turn.

Awaiting in-flight preconnect has the same app-level connect-timeout
semantics as existing websocket connect behavior (no new timeout class
introduced by this PR).

## Architecture
`core/src/client.rs`:
- Added session-level preconnect lifecycle state (`Idle` / `InFlight` /
`Ready`) carrying one warmed websocket plus optional captured
turn-state.
- Added `pre_establish_connection()` startup warmup and `preconnect()`
handshake-only setup.
- Deduped auth/provider resolution into `current_client_setup()` and
websocket handshake wiring into `connect_websocket()` /
`build_websocket_headers()`.
- Updated turn websocket path to adopt preconnect first, await in-flight
preconnect when present, then create a new websocket only when needed.
- Ensured fallback activation clears warmed preconnect state.
- Added documentation for lifecycle, ownership, sticky-routing
invariants, and timeout semantics.

`core/src/codex.rs`:
- Session startup invokes `model_client.pre_establish_connection(...)`.
- Turn metadata resolution uses the shared timeout helper.

`core/src/turn_metadata.rs`:
- Centralized shared timeout helper used by both turn-time metadata
resolution and startup preconnect metadata building.

`core/tests/common/responses.rs` + websocket test suites:
- Added deterministic handshake waiting helper (`wait_for_handshakes`)
with bounded polling.
- Added startup preconnect and in-flight preconnect reuse coverage.
- Fallback expectations now assert exactly two websocket attempts in
covered scenarios (startup preconnect + turn attempt before fallback
sticks).

## Observability
Preconnect remains best-effort and non-fatal. Existing
websocket/fallback telemetry remains in place, and debug logs now make
preconnect-await behavior and preconnect failures easier to reason
about.

## Tests
Validated with:
1. `just fmt`
2. `cargo test -p codex-core websocket_preconnect -- --nocapture`
3. `cargo test -p codex-core websocket_fallback -- --nocapture`
4. `cargo test -p codex-core
websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`
2026-02-06 19:08:24 +00:00
canvrno-oai
36c16e0c58 Add app configs to config.toml (#10822)
Adds app configs to config.toml + tests
2026-02-06 10:29:08 -08:00
Eric Traut
4521a6e852 Removed "exec_policy" feature flag (#10851)
This is no longer needed because it's on by default
2026-02-06 08:59:47 -08:00
jif-oai
aab61934af Handle required MCP startup failures across components (#10902)
Summary
- add a `required` flag for MCP servers everywhere config/CLI data is
touched so mandatory helpers can be round-tripped
- have `codex exec` and `codex app-server` thread start/resume fail fast
when required MCPs fail to initialize
2026-02-06 17:14:37 +01:00
jif-oai
3800173459 feat: backfill async again (#10894) 2026-02-06 15:41:52 +01:00
Eric Traut
dd80e332c4 Removed the "remote_compaction" feature flag (#10840)
This feature is always on now
2026-02-05 23:54:57 -08:00
Eric Traut
e5c1a2d6fb Log an event (info only) when we receive a file watcher event (#10843) 2026-02-05 20:24:16 -08:00
Anton Panasenko
4ee039744e feat: expose detailed metrics to runtime metrics (#10699) 2026-02-05 18:22:30 -08:00
gt-oai
d74fa8edd1 Print warning when config does not meet requirements (#10792)
<img width="1019" height="284" alt="Screenshot 2026-02-05 at 23 34 08"
src="https://github.com/user-attachments/assets/19ec3ce1-3c3b-40f5-b251-a31d964bf3bb"
/>

Currently, if a config value is set that fails the requirements, we exit
Codex.

Now, instead of this, we print a warning and default to a
requirements-permitting value.
2026-02-06 01:12:44 +00:00
Owen Lin
0d8b2b74c4 feat(app-server): turn/steer API (#10821)
This PR adds a dedicated `turn/steer` API for appending user input to an
in-flight turn.

## Motivation
Currently, steering in the app is implemented by just calling
`turn/start` while a turn is running. This has some really weird quirks:
- Client gets back a new `turn.id`, even though streamed
events/approvals remained tied to the original active turn ID.
- All the various turn-level override params on `turn/start` do not
apply to the "steer", and would only apply to the next real turn.
- There can also be a race condition where the client thinks the turn is
active but the server has already completed it, so there might be bugs
if the client has baked in some client-specific behavior thinking it's a
steer when in fact the server kicked off a new turn. This is
particularly possible when running a client against a remote app-server.

Having a dedicated `turn/steer` API eliminates all those quirks.

`turn/steer` behavior:
- Requires an active turn on threadId. Returns a JSON-RPC error if there
is no active turn.
- If expectedTurnId is provided, it must match the active turn (more
useful when connecting to a remote app-server).
- Does not emit `turn/started`.
- Does not accept turn overrides (`cwd`, `model`, `sandbox`, etc.) or
`outputSchema` to accurately reflect that these are not applied when
steering.
2026-02-06 00:35:04 +00:00
pakrym-oai
dbe47ea01a Send beta header with websocket connects (#10727) 2026-02-05 15:05:02 -08:00
sayan-oai
378f1cabe8 go back to auto-enabling web_search for azure (#10820)
###### What
Remove special-casing that prevented auto-enabling `web_search` for
Azure model provider users. Addresses #10071, #10257.

###### Why
Azure fixed their responsesapi implementation; `web_search` is now
supported on models it wasn't before (like `gpt-5.1-codex-max`).

This request now works:
```
curl "$AZURE_API_ENDPOINT" -H "Content-Type: application/json" -H "Authorization: Bearer $AZURE_API_KEY" -d '{
  "model": "gpt-5.1-codex-max",
  "tools": [
    { "type": "web_search" }
  ],
  "tool_choice": "auto",
  "input": "Find the sunrise time in Paris today and cite the source."
}'
```

###### Tests
Tested with above curl, removed Azure-specific tests.
2026-02-05 14:57:07 -08:00
jif-oai
428a9f6035 feat: wait for backfill to be ready (#10790) 2026-02-05 20:45:16 +00:00
sayan-oai
5fdf6f5efa chore: rm web-search-eligible header (#10660)
default-enablement of web_search is now client-side, no need to send
eligibility headers to backend.

Tested locally, headers no longer sent.

will wait for corresponding backend change to deploy before merging
2026-02-05 11:48:34 -08:00
iceweasel-oai
901d5b8fd6 add sandbox policy and sandbox name to codex.tool.call metrics (#10711)
This will give visibility into the comparative success rate of the
Windows sandbox implementations compared to other platforms.
2026-02-05 11:42:12 -08:00