codex

mirror of https://github.com/openai/codex.git synced 2026-05-02 20:32:04 +03:00

Author	SHA1	Message	Date
Eric Traut	152b676597	Fix flaky test relating to metadata remote URL (#16823) This test was flaking on Windows. Problem: The Windows CI test for turn metadata compared git remote URLs byte-for-byte even though equivalent remotes can be formatted differently across Git code paths. Solution: Normalize the expected and actual origin URLs in the test by trimming whitespace, removing a trailing slash, and stripping a trailing .git suffix before comparing.	2026-04-05 10:50:29 -07:00
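The normalization described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the Codex test: the function name `normalize_remote_url` is hypothetical, and it assumes a single trailing slash is removed before the `.git` suffix is stripped.

```rust
/// Hypothetical helper (name and signature are assumptions, not taken from the repo).
/// Normalizes a git remote URL so that equivalent remotes compare equal:
/// 1. trim surrounding whitespace,
/// 2. drop one trailing slash,
/// 3. drop a trailing `.git` suffix.
fn normalize_remote_url(url: &str) -> String {
    let trimmed = url.trim();
    let without_slash = trimmed.strip_suffix('/').unwrap_or(trimmed);
    let without_git = without_slash.strip_suffix(".git").unwrap_or(without_slash);
    without_git.to_string()
}
```

A test would then compare `normalize_remote_url(expected)` with `normalize_remote_url(actual)` instead of comparing the raw strings.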
rhan-oai	4fd5c35c4f	[codex-analytics] subagent analytics (#15915) - creates custom event that emits subagent thread analytics from core - wires client metadata (`product_client_id, client_name, client_version`) through from app-server - creates `created_at` timestamp in core - subagent analytics are behind `FeatureFlag::GeneralAnalytics` PR stack - [[telemetry] thread events #15690](https://github.com/openai/codex/pull/15690) - --> [[telemetry] subagent events #15915](https://github.com/openai/codex/pull/15915) - [[telemetry] turn events #15591](https://github.com/openai/codex/pull/15591) - [[telemetry] steer events #15697](https://github.com/openai/codex/pull/15697) - [[telemetry] queued prompt data #15804](https://github.com/openai/codex/pull/15804) Notes: - core does not spawn a subagent thread for compact, but it is represented in the mapping for consistency `INFO \| 2026-04-01 13:08:12 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:399 \| Tracked codex_thread_initialized event params={'thread_id': '019d4aa9-233b-70f2-a958-c3dbae1e30fa', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': None}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'model': 'gpt-5.3-codex', 'ephemeral': False, 'initialization_mode': 'new', 'created_at': 1775074091, 'thread_source': 'subagent', 'subagent_source': 'thread_spawn', 'parent_thread_id': '019d4aa8-51ec-77e3-bafb-2c1b8e29e385'} \| ` `INFO \| 2026-04-01 13:08:41 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:399 \| Tracked codex_thread_initialized event params={'thread_id': '019d4aa9-94e3-75f1-8864-ff8ad0e55e1e', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': None}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'model': 'gpt-5.3-codex', 'ephemeral': False, 'initialization_mode': 'new', 'created_at': 1775074120, 'thread_source': 'subagent', 'subagent_source': 'review', 'parent_thread_id': None} \| ` --------- Co-authored-by: jif-oai <jif@openai.com> Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-04-04 11:06:43 -07:00
Thibault Sottiaux	9e19004bc2	[codex] add context-window lineage headers (#16758) This change adds client-owned context-window and parent thread id headers to all requests to the Responses API.	2026-04-04 05:54:31 +00:00
Michael Bolin	3a22e10172	test: avoid PowerShell startup in Windows auth fixture (#16737) ## Why `provider_auth_command_supplies_bearer_token` and `provider_auth_command_refreshes_after_401` were still flaky under Windows Bazel because the generated fixture used `powershell.exe`, whose startup can be slow enough to trip the provider-auth timeout in CI. ## What Replace the generated Windows auth fixture script in `codex-rs/core/tests/suite/client.rs` with a small `.cmd` script executed by `cmd.exe /D /Q /C`, and advance `tokens.txt` one line at a time so the refresh-after-401 test still gets the second token on the second invocation. Also align the fixture timeout with the provider-auth default (`5_000` ms) to avoid introducing a test-only timing budget that is stricter than production behavior. ## Testing Left to CI, specifically the Windows Bazel `//codex-rs/core:core-all-test` coverage for the two provider-auth command tests.	2026-04-03 20:05:39 -07:00
Ahmed Ibrahim	8a19dbb177	Add spawn context for MultiAgentV2 children (#16746)	2026-04-03 19:56:59 -07:00
Ahmed Ibrahim	e4f1b3a65e	Preempt mailbox mail after reasoning/commentary items (#16725) Send pending mailbox mail after completed reasoning or commentary items so follow-up requests can pick it up mid-turn. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-03 18:29:05 -07:00
Thibault Sottiaux	91ca49e53c	[codex] allow disabling environment context injection (#16745) This adds an `include_environment_context` config/profile flag that defaults on, and guards both initial injection and later environment updates to allow skipping injection of `<environment_context>`.	2026-04-03 18:06:52 -07:00
Thibault Sottiaux	8d19646861	[codex] allow disabling prompt instruction blocks (#16735) This PR adds root and profile config switches to omit the generated `<permissions instructions>` and `<apps_instructions>` prompt blocks while keeping both enabled by default, and it gates both the initial developer-context injection and later permissions diff injection so turning the permissions block off stays effective across turn-context overrides. Also added a prompt debug tool that can be used as `codex debug prompt-input "hello"` and dumps the constructed items list.	2026-04-03 23:47:56 +00:00
Michael Bolin	eaf12beacf	Codex/windows bazel rust test coverage no rs (#16528 ) # Why this PR exists This PR is trying to fix a coverage gap in the Windows Bazel Rust test lane. Before this change, the Windows `bazel test //...` job was nominally part of PR CI, but a non-trivial set of `//codex-rs/...` Rust test targets did not actually contribute test signal on Windows. In particular, targets such as `//codex-rs/core:core-unit-tests`, `//codex-rs/core:core-all-test`, and `//codex-rs/login:login-unit-tests` were incompatible during Bazel analysis on the Windows gnullvm platform, so they never reached test execution there. That is why the Cargo-powered Windows CI job could surface Windows-only failures that the Bazel-powered job did not report: Cargo was executing those tests, while Bazel was silently dropping them from the runnable target set. The main goal of this PR is to make the Windows Bazel test lane execute those Rust test targets instead of skipping them during analysis, while still preserving `windows-gnullvm` as the target configuration for the code under test. In other words: use an MSVC host/exec toolchain where Bazel helper binaries and build scripts need it, but continue compiling the actual crate targets with the Windows gnullvm cfgs that our current Bazel matrix is supposed to exercise. # Important scope note This branch intentionally removes the non-resource-loading `.rs` test and production-code changes from the earlier `codex/windows-bazel-rust-test-coverage` branch. The only Rust source changes kept here are runfiles/resource-loading fixes in TUI tests: - `codex-rs/tui/src/chatwidget/tests.rs` - `codex-rs/tui/tests/manager_dependency_regression.rs` That is deliberate. Since the corresponding tests already pass under Cargo, this PR is meant to test whether Bazel infrastructure/toolchain fixes alone are enough to get a healthy Windows Bazel test signal, without changing test behavior for Windows timing, shell output, or SQLite file-locking. 
# How this PR changes the Windows Bazel setup ## 1. Split Windows host/exec and target concerns in the Bazel test lane The core change is that the Windows Bazel test job now opts into an MSVC host platform for Bazel execution-time tools, but only for `bazel test`, not for the Bazel clippy build. Files: - `.github/workflows/bazel.yml` - `.github/scripts/run-bazel-ci.sh` - `MODULE.bazel` What changed: - `run-bazel-ci.sh` now accepts `--windows-msvc-host-platform`. - When that flag is present on Windows, the wrapper appends `--host_platform=//:local_windows_msvc` unless the caller already provided an explicit `--host_platform`. - `bazel.yml` passes that wrapper flag only for the Windows `bazel test //...` job. - The Bazel clippy job intentionally does not pass that flag, so clippy stays on the default Windows gnullvm host/exec path and continues linting against the target cfgs we care about. - `run-bazel-ci.sh` also now forwards `CODEX_JS_REPL_NODE_PATH` on Windows and normalizes the `node` executable path with `cygpath -w`, so tests that need Node resolve the runner's Node installation correctly under the Windows Bazel test environment. Why this helps: - The original incompatibility chain was mostly on the exec/tool side of the graph, not in the Rust test code itself. Moving host tools to MSVC lets Bazel resolve helper binaries and generators that were not viable on the gnullvm exec platform. - Keeping the target platform on gnullvm preserves cfg coverage for the crates under test, which is important because some Windows behavior differs between `msvc` and `gnullvm`. ## 2. 
Teach the repo's Bazel Rust macro about Windows link flags and integration-test knobs Files: - `defs.bzl` - `codex-rs/core/BUILD.bazel` - `codex-rs/otel/BUILD.bazel` - `codex-rs/tui/BUILD.bazel` What changed: - Replaced the old gnullvm-only linker flag block with `WINDOWS_RUSTC_LINK_FLAGS`, which now handles both Windows ABIs: - gnullvm gets `-C link-arg=-Wl,--stack,8388608` - MSVC gets `-C link-arg=/STACK:8388608`, `-C link-arg=/NODEFAULTLIB:libucrt.lib`, and `-C link-arg=ucrt.lib` - Threaded those Windows link flags into generated `rust_binary`, unit-test binaries, and integration-test binaries. - Extended `codex_rust_crate(...)` with: - `integration_test_args` - `integration_test_timeout` - Used those new knobs to: - mark `//codex-rs/core:core-all-test` as a long-running integration test - serialize `//codex-rs/otel:otel-all-test` with `--test-threads=1` - Added `src/*/.rs` to `codex-rs/tui` test runfiles, because one regression test scans source files at runtime and Bazel does not expose source-tree directories unless they are declared as data. Why this helps: - Once host-side MSVC tools are available, we still need the generated Rust test binaries to link correctly on Windows. The MSVC-side stack/UCRT flags make those binaries behave more like their Cargo-built equivalents. - The integration-test macro knobs avoid hardcoding one-off test behavior in ad hoc BUILD rules and make the generated test targets more expressive where Bazel and Cargo have different runtime defaults. ## 3. 
Patch `rules_rs` / `rules_rust` so Windows MSVC exec-side Rust and build scripts are actually usable Files: - `MODULE.bazel` - `patches/rules_rs_windows_exec_linker.patch` - `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch` - `patches/rules_rust_windows_build_script_runner_paths.patch` - `patches/rules_rust_windows_exec_msvc_build_script_env.patch` - `patches/rules_rust_windows_msvc_direct_link_args.patch` - `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch` - `patches/BUILD.bazel` What these patches do: - `rules_rs_windows_exec_linker.patch` - Adds a `rust-lld` filegroup for Windows Rust toolchain repos, symlinked to `lld-link.exe` from `PATH`. - Marks Windows toolchains as using a direct linker driver. - Supplies Windows stdlib link flags for both gnullvm and MSVC. - `rules_rust_windows_bootstrap_process_wrapper_linker.patch` - For Windows MSVC Rust targets, prefers the Rust toolchain linker over an inherited C++ linker path like `clang++`. - This specifically avoids the broken mixed-mode command line where rustc emits MSVC-style `/NOLOGO` / `/LIBPATH:` / `/OUT:` arguments but Bazel still invokes `clang++.exe`. - `rules_rust_windows_build_script_runner_paths.patch` - Normalizes forward-slash execroot-relative paths into Windows path separators before joining them on Windows. - Uses short Windows paths for `RUSTC`, `OUT_DIR`, and the build-script working directory to avoid path-length and quoting issues in third-party build scripts. - Exposes `RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER=1` to build scripts so crate-local patches can detect "this is running under Bazel's build-script runner". - Fixes the Windows runfiles cleanup filter so generated files with retained suffixes are actually retained. 
- `rules_rust_windows_exec_msvc_build_script_env.patch` - For exec-side Windows MSVC build scripts, stops force-injecting Bazel's `CC`, `CXX`, `LD`, `CFLAGS`, and `CXXFLAGS` when that would send GNU-flavored tool paths/flags into MSVC-oriented Cargo build scripts. - Rewrites or strips GNU-only `--sysroot`, MinGW include/library paths, stack-protector, and `_FORTIFY_SOURCE` flags on the MSVC exec path. - The practical effect is that build scripts can fall back to the Visual Studio toolchain environment already exported by CI instead of crashing inside Bazel's hermetic `clang.exe` setup. - `rules_rust_windows_msvc_direct_link_args.patch` - When using a direct linker on Windows, stops forwarding GNU driver flags such as `-L...` and `--sysroot=...` that `lld-link.exe` does not understand. - Passes non-`.lib` native artifacts as explicit `-Clink-arg=<path>` entries when needed. - Filters C++ runtime libraries to `.lib` artifacts on the Windows direct-driver path. - `rules_rust_windows_process_wrapper_skip_temp_outputs.patch` - Excludes transient `.tmp` and `.rcgu.o` files from process-wrapper dependency search-path consolidation, so unstable compiler outputs do not get treated as real link search-path inputs. Why this helps: - The host-platform split alone was not enough. Once Bazel started analyzing/running previously incompatible Rust tests on Windows, the next failures were in toolchain plumbing: - MSVC-targeted Rust tests were being linked through `clang++` with MSVC-style arguments. - Cargo build scripts running under Bazel's Windows MSVC exec platform were handed Unix/GNU-flavored path and flag shapes. - Some generated paths were too long or had path-separator forms that third-party Windows build scripts did not tolerate. - These patches make that mixed Bazel/Cargo/Rust/MSVC path workable enough for the test lane to actually build and run the affected crates. ## 4. 
Patch third-party crate build scripts that were not robust under Bazel's Windows MSVC build-script path Files: - `MODULE.bazel` - `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch` - `patches/ring_windows_msvc_include_dirs.patch` - `patches/zstd-sys_windows_msvc_include_dirs.patch` What changed: - `aws-lc-sys` - Detects Bazel's Windows MSVC build-script runner via `RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER` or a `bazel-out` manifest-dir path. - Uses `clang-cl` for Bazel Windows MSVC builds when no explicit `CC`/`CXX` is set. - Allows prebuilt NASM on the Bazel Windows MSVC path even when `nasm` is not available directly in the runner environment. - Avoids canonicalizing `CARGO_MANIFEST_DIR` in the Bazel Windows MSVC case, because that path may point into Bazel output/runfiles state where preserving the given path is more reliable than forcing a local filesystem canonicalization. - `ring` - Under the Bazel Windows MSVC build-script runner, copies the pregenerated source tree into `OUT_DIR` and uses that as the generated-source root. - Adds include paths needed by MSVC compilation for Fiat/curve25519/P-256 generated headers. - Rewrites a few relative includes in C sources so the added include directories are sufficient. - `zstd-sys` - Adds MSVC-only include directories for `compress`, `decompress`, and feature-gated dictionary/legacy/seekable sources. - Skips `-fvisibility=hidden` on MSVC targets, where that GCC/Clang-style flag is not the right mechanism. Why this helps: - After the `rules_rust` plumbing started running build scripts on the Windows MSVC exec path, some third-party crates still failed for crate-local reasons: wrong compiler choice, missing include directories, build-script assumptions about manifest paths, or Unix-only C compiler flags. - These crate patches address those crate-local assumptions so the larger toolchain change can actually reach first-party Rust test execution. ## 5. 
Keep the only `.rs` test changes to Bazel/Cargo runfiles parity Files: - `codex-rs/tui/src/chatwidget/tests.rs` - `codex-rs/tui/tests/manager_dependency_regression.rs` What changed: - Instead of asking `find_resource!` for a directory runfile like `src/chatwidget/snapshots` or `src`, these tests now resolve one known file runfile first and then walk to its parent directory. Why this helps: - Bazel runfiles are more reliable for explicitly declared files than for source-tree directories that happen to exist in a Cargo checkout. - This keeps the tests working under both Cargo and Bazel without changing their actual assertions. # What we tried before landing on this shape, and why those attempts did not work ## Attempt 1: Force `--host_platform=//:local_windows_msvc` for all Windows Bazel jobs This did make the previously incompatible test targets show up during analysis, but it also pushed the Bazel clippy job and some unrelated build actions onto the MSVC exec path. Why that was bad: - Windows clippy started running third-party Cargo build scripts with Bazel's MSVC exec settings and crashed in crates such as `tree-sitter` and `libsqlite3-sys`. - That was a regression in a job that was previously giving useful gnullvm-targeted lint signal. What this PR does instead: - The wrapper flag is opt-in, and `bazel.yml` uses it only for the Windows `bazel test` lane. - The clippy lane stays on the default Windows gnullvm host/exec configuration. ## Attempt 2: Broaden the `rules_rust` linker override to all Windows Rust actions This fixed the MSVC test-lane failure where normal `rust_test` targets were linked through `clang++` with MSVC-style arguments, but it broke the default gnullvm path. Why that was bad: - `@@rules_rs++rules_rust+rules_rust//util/process_wrapper:process_wrapper` on the gnullvm exec platform started linking with `lld-link.exe` and then failed to resolve MinGW-style libraries such as `-lkernel32`, `-luser32`, and `-lmingw32`. 
What this PR does instead: - The linker override is restricted to Windows MSVC targets only. - The gnullvm path keeps its original linker behavior, while MSVC uses the direct Windows linker. ## Attempt 3: Keep everything on pure Windows gnullvm and patch the V8 / Python incompatibility chain instead This would have preserved a single Windows ABI everywhere, but it is a much larger project than this PR. Why that was not the practical first step: - The original incompatibility chain ran through exec-side generators and helper tools, not only through crate code. - `third_party/v8` is already special-cased on Windows gnullvm because `rusty_v8` only publishes Windows prebuilts under MSVC names. - Fixing that path likely means deeper changes in V8/rules_python/rules_rust toolchain resolution and generator execution, not just one local CI flag. What this PR does instead: - Keep gnullvm for the target cfgs we want to exercise. - Move only the Windows test lane's host/exec platform to MSVC, then patch the build-script/linker boundary enough for that split configuration to work. ## Attempt 4: Validate compatibility with `bazel test --nobuild ...` This turned out to be a misleading local validation command. Why: - `bazel test --nobuild ...` can successfully analyze targets and then still exit 1 with "Couldn't start the build. Unable to run tests" because there are no runnable test actions after `--nobuild`. Better local check: ```powershell bazel build --nobuild --keep_going --host_platform=//:local_windows_msvc //codex-rs/login:login-unit-tests //codex-rs/core:core-unit-tests //codex-rs/core:core-all-test ``` # Which patches probably deserve upstream follow-up My rough take is that the `rules_rs` / `rules_rust` patches are the highest-value upstream candidates, because they are fixing generic Windows host/exec + MSVC direct-linker behavior rather than Codex-specific test logic. 
Strong upstream candidates: - `patches/rules_rs_windows_exec_linker.patch` - `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch` - `patches/rules_rust_windows_build_script_runner_paths.patch` - `patches/rules_rust_windows_exec_msvc_build_script_env.patch` - `patches/rules_rust_windows_msvc_direct_link_args.patch` - `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch` Why these seem upstreamable: - They address general-purpose problems in the Windows MSVC exec path: - missing direct-linker exposure for Rust toolchains - wrong linker selection when rustc emits MSVC-style args - Windows path normalization/short-path issues in the build-script runner - forwarding GNU-flavored CC/link flags into MSVC Cargo build scripts - unstable temp outputs polluting process-wrapper search-path state Potentially upstreamable crate patches, but likely with more care: - `patches/zstd-sys_windows_msvc_include_dirs.patch` - `patches/ring_windows_msvc_include_dirs.patch` - `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch` Notes on those: - The `zstd-sys` and `ring` include-path fixes look fairly generic for MSVC/Bazel build-script environments and may be straightforward to propose upstream after we confirm CI stability. - The `aws-lc-sys` patch is useful, but it includes a Bazel-specific environment probe and CI-specific compiler fallback behavior. That probably needs a cleaner upstream-facing shape before sending it out, so upstream maintainers are not forced to adopt Codex's exact CI assumptions. Probably not worth upstreaming as-is: - The repo-local Starlark/test target changes in `defs.bzl`, `codex-rs//BUILD.bazel`, and `.github/scripts/run-bazel-ci.sh` are mostly Codex-specific policy and CI wiring, not generic rules changes. 
# Validation notes for reviewers On this branch, I ran the following local checks after dropping the non-resource-loading Rust edits: ```powershell cargo test -p codex-tui just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc -- fix -p codex-tui python .\tools\argument-comment-lint\run-prebuilt-linter.py -p codex-tui just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc fmt ``` One local caveat: - `just argument-comment-lint` still fails on this Windows machine for an unrelated Bazel toolchain-resolution issue in `//codex-rs/exec:exec-all-test`, so I used the direct prebuilt linter for `codex-tui` as the local fallback. # Expected reviewer takeaway If this PR goes green, the important conclusion is that the Windows Bazel test coverage gap was primarily a Bazel host/exec toolchain problem, not a need to make the Rust tests themselves Windows-specific. That would be a strong signal that the deleted non-resource-loading Rust test edits from the earlier branch should stay out, and that future work should focus on upstreaming the generic `rules_rs` / `rules_rust` Windows fixes and reducing the crate-local patch surface.	2026-04-03 15:34:03 -07:00
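The opt-in host-platform rule from section 1 of the commit above (append `--host_platform=//:local_windows_msvc` only on Windows, only when the wrapper flag is passed, and only when the caller has not already chosen a host platform) can be restated as a small sketch. The real logic lives in the shell wrapper `.github/scripts/run-bazel-ci.sh`; the Rust function below, including its name and parameters, is a hypothetical restatement for illustration only.

```rust
/// Hypothetical restatement of the wrapper rule (the real wrapper is a shell script).
/// `is_windows` and `msvc_host_opt_in` are illustrative parameters standing in for
/// the runner OS check and the `--windows-msvc-host-platform` wrapper flag.
fn with_msvc_host_platform(
    mut bazel_args: Vec<String>,
    is_windows: bool,
    msvc_host_opt_in: bool,
) -> Vec<String> {
    // An explicit caller-provided --host_platform always wins.
    let already_set = bazel_args
        .iter()
        .any(|arg| arg == "--host_platform" || arg.starts_with("--host_platform="));
    if is_windows && msvc_host_opt_in && !already_set {
        bazel_args.push("--host_platform=//:local_windows_msvc".to_string());
    }
    bazel_args
}
```

Under this rule the clippy lane, which does not pass the wrapper flag, keeps its argument list unchanged and therefore stays on the default gnullvm host/exec configuration.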
Eric Traut	4b8bab6ad3	Remove OPENAI_BASE_URL config fallback (#16720) The `OPENAI_BASE_URL` environment variable has been a significant support issue, so we decided to deprecate it in favor of an `openai_base_url` config key. We've had the deprecation warning in place for about a month, so users have had time to migrate to the new mechanism. This PR removes support for `OPENAI_BASE_URL` entirely.	2026-04-03 15:03:21 -07:00
Michael Bolin	a70aee1a1e	Fix Windows Bazel app-server trust tests (#16711 ) ## Why Extracted from [#16528](https://github.com/openai/codex/pull/16528) so the Windows Bazel app-server test failures can be reviewed independently from the rest of that PR. This PR targets: - `suite::v2::thread_shell_command::thread_shell_command_runs_as_standalone_turn_and_persists_history` - `suite::v2::thread_start::thread_start_with_elevated_sandbox_trusts_project_and_followup_loads_project_config` - `suite::v2::thread_start::thread_start_with_nested_git_cwd_trusts_repo_root` There were two Windows-specific assumptions baked into those tests and the underlying trust lookup: - project trust keys were persisted and looked up using raw path strings, but Bazel's Windows test environment can surface canonicalized paths with `\\?\` / UNC prefixes or normalized symlink/junction targets, so follow-up `thread/start` requests no longer matched the project entry that had just been written - `item/commandExecution/outputDelta` assertions compared exact trailing line endings even though shell output chunk boundaries and CRLF handling can differ on Windows, and Bazel made that timing-sensitive mismatch visible There was also one behavior bug separate from the assertion cleanup: `thread/start` decided whether to persist trust from the final resolved sandbox policy, but on Windows an explicit `workspace-write` request may be downgraded to `read-only`. That incorrectly skipped writing trust even though the request had asked to elevate the project, so the new logic also keys off the requested sandbox mode. ## What - Canonicalize project trust keys when persisting/loading `[projects]` entries, while still accepting legacy raw keys for existing configs. - Persist project trust when `thread/start` explicitly requests `workspace-write` or `danger-full-access`, even if the resolved policy is later downgraded on Windows. 
- Make the Windows app-server tests compare persisted trust paths and command output deltas in a path/newline-normalized way. ## Verification - Existing app-server v2 tests cover the three failing Windows Bazel cases above.	2026-04-03 21:41:25 +00:00
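The trust-key mismatch described in the commit above (a key written as a raw path, then looked up with a `\\?\`-prefixed path) can be illustrated with a string-only sketch. This is not the repo's implementation: real canonicalization presumably consults the filesystem, whereas the hypothetical helpers below only strip the Windows verbatim prefix and fall back to the legacy raw key.

```rust
use std::collections::HashMap;

/// Hypothetical helper: strips the Windows verbatim (`\\?\`) prefix so that
/// `\\?\C:\repo` and `C:\repo` map to the same key. String-only by assumption;
/// it does not resolve symlinks or junctions.
fn strip_verbatim_prefix(path: &str) -> String {
    if let Some(rest) = path.strip_prefix(r"\\?\UNC\") {
        // `\\?\UNC\server\share` is the verbatim form of `\\server\share`.
        format!(r"\\{}", rest)
    } else if let Some(rest) = path.strip_prefix(r"\\?\") {
        rest.to_string()
    } else {
        path.to_string()
    }
}

/// Hypothetical lookup: prefer the normalized key, but still accept a legacy
/// entry that was persisted under the raw path string.
fn is_trusted(projects: &HashMap<String, bool>, requested: &str) -> bool {
    let normalized = strip_verbatim_prefix(requested);
    projects
        .get(&normalized)
        .or_else(|| projects.get(requested))
        .copied()
        .unwrap_or(false)
}
```

The fallback to the raw key mirrors the commit's note that legacy raw keys in existing configs are still accepted.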
Ahmed Ibrahim	567d2603b8	Sanitize forked child history (#16709) - Keep only parent system/developer/user messages plus assistant final-answer messages in forked child history. - Strip parent tool/reasoning items and remove the unmatched synthetic spawn output.	2026-04-03 21:13:34 +00:00
Michael Bolin	1d4b5f130c	fix windows-only clippy lint violation (#16722) I missed this in https://github.com/openai/codex/pull/16707.	2026-04-03 21:00:24 +00:00
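The filtering rule in the commit above can be sketched as a simple predicate over history items. The types below are hypothetical stand-ins, not the Codex protocol types; in particular, the `final_answer` flag is an assumed way of marking assistant final-answer messages.

```rust
/// Hypothetical role type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    Developer,
    User,
    Assistant,
}

/// Hypothetical history item; the real protocol types are not shown in the commit.
#[derive(Debug, Clone, PartialEq)]
enum HistoryItem {
    Message { role: Role, final_answer: bool, text: String },
    ToolCall,
    ToolOutput,
    Reasoning,
}

/// Keeps system/developer/user messages and assistant final answers;
/// drops tool calls, tool outputs, reasoning, and non-final assistant messages.
fn sanitize_forked_history(parent: &[HistoryItem]) -> Vec<HistoryItem> {
    parent
        .iter()
        .filter(|item| match item {
            HistoryItem::Message { role: Role::Assistant, final_answer, .. } => *final_answer,
            HistoryItem::Message { .. } => true,
            _ => false,
        })
        .cloned()
        .collect()
}
```

Dropping every tool output here also covers the unmatched synthetic spawn output mentioned in the commit, although the actual change may handle that item separately.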
Michael Bolin	faab4d39e1	fix: preserve platform-specific core shell env vars (#16707 ) ## Why We were seeing failures in the following tests as part of trying to get all the tests running under Bazel on Windows in CI (https://github.com/openai/codex/pull/16528): ``` suite::shell_command::unicode_output::with_login suite::shell_command::unicode_output::without_login ``` Certainly `PATHEXT` should have been included in the extra `CORE_VARS` list, so we fix that up here, but also take things a step further for now by forcibly ensuring it is set on Windows in the return value of `create_env()`. Once we get the Windows Bazel build working reliably (i.e., after #16528 is merged), we should come back to this and confirm we can remove the special case in `create_env()`. ## What - Split core env inheritance into `COMMON_CORE_VARS` plus platform-specific allowlists for Windows and Unix in [`exec_env.rs`](`1b55c88fbf/codex-rs/core/src/exec_env.rs (L45-L81)`). - Preserve `PATHEXT`, `USERNAME`, and `USERPROFILE` on Windows, and `HOME` / locale vars on Unix. - Backfill a default `PATHEXT` in `create_env()` on Windows if the parent env does not provide one, so child process launch still works in stripped-down Bazel environments. - Extend the Windows exec-env test to assert mixed-case `PathExt` survives case-insensitive core filtering, and document why the shell-command Unicode test goes through a child process. ## Verification - `cargo test -p codex-core exec_env::tests`	2026-04-03 12:07:07 -07:00
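The allowlist split and the `PATHEXT` backfill described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the contents of `exec_env.rs`: the entries in `COMMON_CORE_VARS`, the specific Unix locale variable names, the default `PATHEXT` value, and the explicit `is_windows` parameter are all hypothetical.

```rust
use std::collections::HashMap;

// Assumed contents; the commit names the constant but not its entries.
const COMMON_CORE_VARS: &[&str] = &["PATH", "TEMP", "TMP", "TMPDIR"];
// Named in the commit message.
const WINDOWS_CORE_VARS: &[&str] = &["PATHEXT", "USERNAME", "USERPROFILE"];
// `HOME` is named in the commit; the locale variable names are assumptions.
const UNIX_CORE_VARS: &[&str] = &["HOME", "LANG", "LC_ALL"];
// Assumed placeholder default, not the value used by the repo.
const DEFAULT_PATHEXT: &str = ".COM;.EXE;.BAT;.CMD";

/// Hypothetical filter: keeps only core variables from the parent environment.
/// On Windows the name match is case-insensitive, so `PathExt` survives.
fn filter_core_env(parent: &HashMap<String, String>, is_windows: bool) -> HashMap<String, String> {
    let platform = if is_windows { WINDOWS_CORE_VARS } else { UNIX_CORE_VARS };
    let allowed = |name: &str| {
        COMMON_CORE_VARS.iter().chain(platform.iter()).any(|core| {
            if is_windows { core.eq_ignore_ascii_case(name) } else { *core == name }
        })
    };
    let mut env: HashMap<String, String> = parent
        .iter()
        .filter(|(name, _)| allowed(name.as_str()))
        .map(|(name, value)| (name.clone(), value.clone()))
        .collect();
    // Backfill PATHEXT so child process launch still works in stripped-down environments.
    if is_windows && !env.keys().any(|name| name.eq_ignore_ascii_case("PATHEXT")) {
        env.insert("PATHEXT".to_string(), DEFAULT_PATHEXT.to_string());
    }
    env
}
```

Passing `is_windows` as a parameter, rather than using `cfg!(windows)`, is a choice made here so the Windows branch can be exercised on any host.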
Ahmed Ibrahim	af8a9d2d2b	remove temporary ownership re-exports (#16626) Stacked on #16508. This removes the temporary `codex-core` / `codex-login` re-export shims from the ownership split and rewrites callsites to import directly from `codex-model-provider-info`, `codex-models-manager`, `codex-api`, `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`. No behavior change intended; this is the mechanical import cleanup layer split out from the ownership move. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-03 00:33:34 -07:00
Michael Bolin	b15c918836	fix: use cmd.exe in Windows unicode shell test (#16668) ## Why This is a follow-up to #16665. The Windows `unicode_output` test should still exercise a child process so it verifies PowerShell's UTF-8 output configuration, but `$env:COMSPEC` depends on that environment variable surviving the curated Bazel test environment. Using `cmd.exe` keeps the child-process coverage while avoiding both bare `cmd` + `PATHEXT` lookup and `$env:COMSPEC` env passthrough assumptions. ## What - Run `cmd.exe /c echo naïve_café` in the Windows branch of `unicode_output`. ## Verification - `cargo test -p codex-core unicode_output`	2026-04-03 00:32:08 -07:00
Michael Bolin	14f95db57b	fix: use COMSPEC in Windows unicode shell test (#16665) ## Why Windows Bazel shell tests launch PowerShell with a curated environment, so `PATHEXT` may be absent. The existing `unicode_output` test invokes bare `cmd`, which can fail before the test exercises UTF-8 child-process output. ## What - Use `$env:COMSPEC /c echo naïve_café` in the Windows branch of `unicode_output`. - Preserve the external child-process path instead of switching the test to a PowerShell builtin. ## Verification - `cargo test -p codex-core unicode_output`	2026-04-02 23:54:02 -07:00
Michael Bolin	b4787bf4c0	fix: changes to tests that should help them pass on Windows under Bazel (#16662) https://github.com/openai/codex/pull/16460 was a large PR created by Codex to try to get the tests to pass under Bazel on Windows. Indeed, it successfully ran all of the tests under `//codex-rs/core:` with its changes to `codex-rs/core/`, though the full set of changes seems to be too broad. This PR tries to port the key changes, which are: - Under Bazel, the `USERNAME` environment variable is not guaranteed to be set on Windows, so for tests that need a non-empty env var as a convenient substitute for an env var containing an API key, just use `PATH`. Note that `PATH` is unlikely to contain characters that are not allowed in an HTTP header value. - Specify `"powershell.exe"` instead of just `"powershell"` in case the `PATHEXT` env var gets lost in the shuffle.	2026-04-02 23:06:36 -07:00
Ahmed Ibrahim	6fff9955f1	extract models manager and related ownership from core (#16508 ) ## Summary - split `models-manager` out of `core` and add `ModelsManagerConfig` plus `Config::to_models_manager_config()` so model metadata paths stop depending on `core::Config` - move login-owned/auth-owned code out of `core` into `codex-login`, move model provider config into `codex-model-provider-info`, move API bridge mapping into `codex-api`, move protocol-owned types/impls into `codex-protocol`, and move response debug helpers into a dedicated `response-debug-context` crate - move feedback tag emission into `codex-feedback`, relocate tests to the crates that now own the code, and keep broad temporary re-exports so this PR avoids a giant import-only rewrite ## Major moves and decisions - created `codex-models-manager` as the owner for model cache/catalog/config/model info logic, including the new `ModelsManagerConfig` struct - created `codex-model-provider-info` as the owner for provider config parsing/defaults and kept temporary `codex-login`/`codex-core` re-exports for old import paths - moved `api_bridge` error mapping + `CoreAuthProvider` into `codex-api`, while `codex-login::api_bridge` temporarily re-exports those symbols and keeps the `auth_provider_from_auth` wrapper - moved `auth_env_telemetry` and `provider_auth` ownership to `codex-login` - moved `CodexErr` ownership to `codex-protocol::error`, plus `StreamOutput`, `bytes_to_string_smart`, and network policy helpers to protocol-owned modules - created `codex-response-debug-context` for `extract_response_debug_context`, `telemetry_transport_error_message`, and related response-debug plumbing instead of leaving that behavior in `core` - moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and `emit_feedback_request_tags_with_auth_env` to `codex-feedback` - deferred removal of temporary re-exports and the mechanical import rewrites to a stacked follow-up PR so this PR stays reviewable ## Test moves - moved 
auth refresh coverage from `core/tests/suite/auth_refresh.rs` to `login/tests/suite/auth_refresh.rs` - moved text encoding coverage from `core/tests/suite/text_encoding_fix.rs` to `protocol/src/exec_output_tests.rs` - moved model info override coverage from `core/tests/suite/model_info_overrides.rs` to `models-manager/src/model_info_overrides_tests.rs` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-02 23:00:02 -07:00
Michael Bolin	beb3978a3b	test: use cmd.exe for ProviderAuthScript on Windows (#16629 ) ## Why The Windows `ProviderAuthScript` test helpers do not need PowerShell. Running them through `cmd.exe` is enough to emit the next fixture token and rotate `tokens.txt`, and it avoids a PowerShell-specific dependency in these tests. ## What changed - Replaced the Windows `print-token.ps1` fixtures with `print-token.cmd` in `codex-rs/core/src/models_manager/manager_tests.rs` and `codex-rs/login/src/auth/auth_tests.rs`. - Switched the failing external-auth helper in `codex-rs/login/src/auth/auth_tests.rs` from `powershell.exe -Command 'exit 1'` to `cmd.exe /d /s /c 'exit /b 1'`. - Updated Windows timeout comments so they no longer call out PowerShell specifically. ## Verification - `cargo test -p codex-login` - `cargo test -p codex-core` (fails in unrelated `core/src/config/config_tests.rs` assertions in this checkout)	2026-04-02 17:33:07 -07:00
Michael Bolin	7a3eec6fdb	core: cut codex-core compile time 48% with native async SessionTask (#16631 ) ## Why This continues the compile-time cleanup from #16630. `SessionTask` implementations are monomorphized, but `Session` stores the task behind a `dyn` boundary so it can drive and abort heterogenous turn tasks uniformly. That means we can move the `#[async_trait]` expansion off the implementation trait, keep a small boxed adapter only at the storage boundary, and preserve the existing task lifecycle semantics while reducing the amount of generated async-trait glue in `codex-core`. One measurement caveat showed up while exploring this: a warm incremental benchmark based on `touch core/src/tasks/mod.rs && cargo check -p codex-core --lib` was basically flat, but that was the wrong benchmark for this change. Using package-clean `codex-core` rebuilds, like #16630, shows the real win. Relevant pre-change code: - [`SessionTask` with `#[async_trait]`](`3c7f013f97/codex-rs/core/src/tasks/mod.rs (L129-L182)`) - [`RunningTask` storing `Arc<dyn SessionTask>`](`3c7f013f97/codex-rs/core/src/state/turn.rs (L69-L77)`) ## What changed - Switched `SessionTask::{run, abort}` to native RPITIT futures with explicit `Send` bounds. - Added a private `AnySessionTask` adapter that boxes those futures only at the `Arc<dyn ...>` storage boundary. - Updated `RunningTask` to store `Arc<dyn AnySessionTask>` and removed `#[async_trait]` from the concrete task impls plus test-only `SessionTask` impls. 
## Timing

Benchmarked package-clean `codex-core` rebuilds with dependencies left warm:

```shell
cargo check -p codex-core --lib >/dev/null
cargo clean -p codex-core >/dev/null
/usr/bin/time -p cargo +nightly rustc -p codex-core --lib -- \
  -Z time-passes \
  -Z time-passes-format=json >/dev/null
```

| revision | rustc `total` | process `real` | `generate_crate_metadata` | `MIR_borrow_checking` | `monomorphization_collector_graph_walk` |
| --- | ---: | ---: | ---: | ---: | ---: |
| parent `3c7f013f9735` | 67.21s | 67.71s | 24.61s | 23.43s | 22.43s |
| this PR `2cafd783ac22` | 35.08s | 35.60s | 8.01s | 7.25s | 7.15s |
| delta | -47.8% | -47.4% | -67.5% | -69.1% | -68.1% |

For completeness, the warm touched-file benchmark stayed flat (`1.96s` parent vs `1.97s` this PR), which is why that benchmark should not be used to evaluate this refactor.

## Verification

- Ran `cargo test -p codex-core`; this change compiled and task-related tests passed before hitting the same unrelated 5 `config::tests::guardian` failures already present on the parent stack.	2026-04-02 23:39:56 +00:00
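The pattern described in the commit above (native `impl Future` methods on the implementation trait, with boxing only in an object-safe adapter) can be sketched roughly as follows. This is a minimal illustration, not the actual `codex-core` code: the trait names `SessionTask` and `AnySessionTask` come from the commit message, but the method signatures, `run_boxed`, `EchoTask`, and the tiny `block_on` executor are assumptions made up for this example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Implementation-facing trait: returns a native `impl Future` (RPITIT) with
/// an explicit `Send` bound, so no `#[async_trait]` expansion is involved.
/// Hypothetical, simplified signature (the real trait differs).
trait SessionTask: Send + Sync + 'static {
    fn run(&self, input: String) -> impl Future<Output = String> + Send;
}

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Object-safe adapter: the only place where the future is boxed.
trait AnySessionTask: Send + Sync {
    fn run_boxed<'a>(&'a self, input: String) -> BoxFuture<'a, String>;
}

/// Every `SessionTask` gets the boxed adapter for free.
impl<T: SessionTask> AnySessionTask for T {
    fn run_boxed<'a>(&'a self, input: String) -> BoxFuture<'a, String> {
        Box::pin(self.run(input))
    }
}

/// Hypothetical concrete task; `async fn` satisfies the `impl Future` method.
struct EchoTask;

impl SessionTask for EchoTask {
    async fn run(&self, input: String) -> String {
        format!("echo: {input}")
    }
}

/// Minimal std-only executor so the sketch runs without extra crates.
/// It busy-polls, which is only acceptable for this demonstration.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    // Storage boundary: heterogeneous tasks live behind `Arc<dyn AnySessionTask>`.
    let task: Arc<dyn AnySessionTask> = Arc::new(EchoTask);
    let output = block_on(task.run_boxed("hi".to_string()));
    println!("{output}");
}
```

In this sketch concrete implementations never mention `Box` or `Pin`; only the blanket `AnySessionTask` impl does, which mirrors the commit's stated goal of keeping a small boxed adapter at the `dyn` storage boundary.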
Michael Bolin	3c7f013f97	core: cut codex-core compile time 63% with native async ToolHandler (#16630 ) ## Why `ToolHandler` was still paying a large compile-time tax from `#[async_trait]` on every concrete handler impl, even though the only object-safe boundary the registry actually stores is the internal `AnyToolHandler` adapter. This PR removes that macro-generated async wrapper layer from concrete `ToolHandler` impls while keeping the existing object-safe shim in `AnyToolHandler`. In practice, that gets essentially the same compile-time win as the larger type-erasure refactor in #16627, but with a much smaller diff and without changing the public shape of `ToolHandler<Output = T>`. That tradeoff matters here because this is a broad `codex-core` hotspot and reviewers should be able to judge the compile-time impact from hard numbers, not vibes. ## Headline result On a clean `codex-core` package rebuild (`cargo clean -p codex-core` before each command), rustc `total` dropped from 187.15s to 68.98s versus the shared `0bd31dc382bd` baseline: -63.1%. The biggest hot passes dropped by roughly 71-72%: \| Metric \| Baseline `0bd31dc382bd` \| This PR `41f7ac0adeac` \| Delta \| \|---\|---:\|---:\|---:\| \| `total` \| 187.15s \| 68.98s \| -63.1% \| \| `generate_crate_metadata` \| 84.53s \| 24.49s \| -71.0% \| \| `MIR_borrow_checking` \| 84.13s \| 24.58s \| -70.8% \| \| `monomorphization_collector_graph_walk` \| 79.74s \| 22.19s \| -72.2% \| \| `evaluate_obligation` self-time \| 180.62s \| 46.91s \| -74.0% \| Important caveat: `-Z time-passes` timings are nested, so `generate_crate_metadata` and `monomorphization_collector_graph_walk` are mostly overlapping, not additive. ## Why this PR over #16627 #16627 already proved that the `ToolHandler` stack was the right hotspot, but it got there by making `ToolHandler` object-safe and changing every handler to return `BoxFuture<Result<AnyToolResult, _>>` directly. 
This PR keeps the lower-churn shape: - `ToolHandler` remains generic over `type Output`. - Concrete handlers use native RPITIT futures with explicit `Send` bounds. - `AnyToolHandler` remains the only object-safe adapter and still does the boxing at the registry boundary, as before. - The implementation diff is only 33 files, +28/-77. The measurements are at least comparable, and in this run this PR is slightly faster than #16627 on the pass-level total: \| Metric \| #16627 \| This PR \| Delta \| \|---\|---:\|---:\|---:\| \| `total` \| 79.90s \| 68.98s \| -13.7% \| \| `generate_crate_metadata` \| 25.88s \| 24.49s \| -5.4% \| \| `monomorphization_collector_graph_walk` \| 23.54s \| 22.19s \| -5.7% \| \| `evaluate_obligation` self-time \| 43.29s \| 46.91s \| +8.4% \| ## Profile data ### Crate-level timings `cargo +nightly build -p codex-core --lib -Z unstable-options --timings=json` after `cargo clean -p codex-core`. Baseline data below is reused from the shared parent `0bd31dc382bd` profile because this PR and #16627 are both one commit on top of that same parent. \| Crate \| Baseline `duration` \| This PR `duration` \| Delta \| Baseline `rmeta_time` \| This PR `rmeta_time` \| Delta \| \|---\|---:\|---:\|---:\|---:\|---:\|---:\| \| `codex_core` \| 187.380776583s \| 69.171113833s \| -63.1% \| 174.474507208s \| 55.873015583s \| -68.0% \| \| `starlark` \| 17.90s \| 16.773824125s \| -6.3% \| n/a \| 8.8999965s \| n/a \| ### Pass-level timings `cargo +nightly rustc -p codex-core --lib -- -Z time-passes -Z time-passes-format=json` after `cargo clean -p codex-core`. 
\| Pass \| Baseline \| This PR \| Delta \| \|---\|---:\|---:\|---:\| \| `total` \| 187.150662083s \| 68.978770375s \| -63.1% \| \| `generate_crate_metadata` \| 84.531864625s \| 24.487462958s \| -71.0% \| \| `MIR_borrow_checking` \| 84.131389375s \| 24.575553875s \| -70.8% \| \| `monomorphization_collector_graph_walk` \| 79.737515042s \| 22.190207417s \| -72.2% \| \| `codegen_crate` \| 12.362532292s \| 12.695237625s \| +2.7% \| \| `type_check_crate` \| 4.4765405s \| 5.442019542s \| +21.6% \| \| `coherence_checking` \| 3.311121208s \| 4.239935292s \| +28.0% \| \| process `real` / `user` / `sys` \| 187.70s / 201.87s / 4.99s \| 69.52s / 85.90s / 2.92s \| n/a \| ### Self-profile query summary `cargo +nightly rustc -p codex-core --lib -- -Z self-profile=... -Z self-profile-events=default,query-keys,args,llvm,artifact-sizes` after `cargo clean -p codex-core`, summarized with `measureme summarize -p 0.5`. \| Query / phase \| Baseline self time \| This PR self time \| Delta \| Baseline total time \| This PR total time \| Baseline item count \| This PR item count \| Baseline cache hits \| This PR cache hits \| \|---\|---:\|---:\|---:\|---:\|---:\|---:\|---:\|---:\|---:\| \| `evaluate_obligation` \| 180.62s \| 46.91s \| -74.0% \| 182.08s \| 48.37s \| 572,234 \| 388,659 \| 1,130,998 \| 1,058,553 \| \| `mir_borrowck` \| 1.42s \| 1.49s \| +4.9% \| 93.77s \| 29.59s \| n/a \| 6,184 \| n/a \| 15,298 \| \| `typeck` \| 1.84s \| 1.87s \| +1.6% \| 2.38s \| 2.44s \| n/a \| 9,367 \| n/a \| 79,247 \| \| `LLVM_module_codegen_emit_obj` \| n/a \| 17.12s \| n/a \| 17.01s \| 17.12s \| n/a \| 256 \| n/a \| 0 \| \| `LLVM_passes` \| n/a \| 13.07s \| n/a \| 12.95s \| 13.07s \| n/a \| 1 \| n/a \| 0 \| \| `codegen_module` \| n/a \| 12.33s \| n/a \| 12.22s \| 13.64s \| n/a \| 256 \| n/a \| 0 \| \| `items_of_instance` \| n/a \| 676.00ms \| n/a \| n/a \| 24.96s \| n/a \| 99,990 \| n/a \| 0 \| \| `type_op_prove_predicate` \| n/a \| 660.79ms \| n/a \| n/a \| 24.78s \| n/a \| 78,762 \| n/a \| 235,877 \| 
\| Summary \| Baseline \| This PR \| \|---\|---:\|---:\| \| `evaluate_obligation` % of total CPU \| 70.821% \| 38.880% \| \| self-profile total CPU time \| 255.042999997s \| 120.661175956s \| \| process `real` / `user` / `sys` \| 220.96s / 235.02s / 7.09s \| 86.35s / 103.66s / 3.54s \| ### Artifact sizes From the same `measureme summarize` output: \| Artifact \| Baseline \| This PR \| Delta \| \|---\|---:\|---:\|---:\| \| `crate_metadata` \| 26,534,471 bytes \| 26,545,248 bytes \| +10,777 \| \| `dep_graph` \| 253,181,425 bytes \| 239,240,806 bytes \| -13,940,619 \| \| `linked_artifact` \| 565,366,624 bytes \| 562,673,176 bytes \| -2,693,448 \| \| `object_file` \| 513,127,264 bytes \| 510,464,096 bytes \| -2,663,168 \| \| `query_cache` \| 137,440,945 bytes \| 136,982,566 bytes \| -458,379 \| \| `cgu_instructions` \| 3,586,307 bytes \| 3,575,121 bytes \| -11,186 \| \| `codegen_unit_size_estimate` \| 2,084,846 bytes \| 2,078,773 bytes \| -6,073 \| \| `work_product_index` \| 19,565 bytes \| 19,565 bytes \| 0 \| ### Baseline hotspots before this change These are the top normalized obligation buckets from the shared baseline profile: \| Obligation bucket \| Samples \| Duration \| \|---\|---:\|---:\| \| `outlives:tasks::review::ReviewTask` \| 1,067 \| 6.33s \| \| `outlives:tools::handlers::unified_exec::UnifiedExecHandler` \| 896 \| 5.63s \| \| `trait:T as tools::registry::ToolHandler` \| 876 \| 5.45s \| \| `outlives:tools::handlers::shell::ShellHandler` \| 888 \| 5.37s \| \| `outlives:tools::handlers::shell::ShellCommandHandler` \| 870 \| 5.29s \| \| `outlives:tools::runtimes::shell::unix_escalation::CoreShellActionProvider` \| 637 \| 3.73s \| \| `outlives:tools::handlers::mcp::McpHandler` \| 695 \| 3.61s \| \| `outlives:tasks::regular::RegularTask` \| 726 \| 3.57s \| Top `items_of_instance` entries before this change were mostly concrete async handler/task impls: \| Instance \| Duration \| \|---\|---:\| \| `tasks::regular::{impl#2}::run` \| 3.79s \| \| 
`tools::handlers::mcp::{impl#0}::handle` \| 3.27s \| \| `tools::runtimes::shell::unix_escalation::{impl#2}::determine_action` \| 3.09s \| \| `tools::handlers::agent_jobs::{impl#11}::handle` \| 3.07s \| \| `tools::handlers::multi_agents::spawn::{impl#1}::handle` \| 2.84s \| \| `tasks::review::{impl#4}::run` \| 2.82s \| \| `tools::handlers::multi_agents_v2::spawn::{impl#2}::handle` \| 2.80s \| \| `tools::handlers::multi_agents::resume_agent::{impl#1}::handle` \| 2.73s \| \| `tools::handlers::unified_exec::{impl#2}::handle` \| 2.54s \| \| `tasks::compact::{impl#4}::run` \| 2.45s \| ## What changed Relevant pre-change registry shape: [`codex-rs/core/src/tools/registry.rs`](`0bd31dc382/codex-rs/core/src/tools/registry.rs (L38-L219)`) Current registry shape in this PR: [`codex-rs/core/src/tools/registry.rs`](`41f7ac0ade/codex-rs/core/src/tools/registry.rs (L38-L203)`) - `ToolHandler::{is_mutating, handle}` now return native `impl Future + Send` futures instead of using `#[async_trait]`. - `AnyToolHandler` remains the object-safe adapter and boxes those futures at the registry boundary with explicit lifetimes. - Concrete handlers and the registry test handler drop `#[async_trait]` but otherwise keep their async method bodies intact. - Representative examples: [`codex-rs/core/src/tools/handlers/shell.rs`](`41f7ac0ade/codex-rs/core/src/tools/handlers/shell.rs (L223-L379)`), [`codex-rs/core/src/tools/handlers/unified_exec.rs`](`41f7ac0ade/codex-rs/core/src/tools/handlers/unified_exec.rs`), [`codex-rs/core/src/tools/registry_tests.rs`](`41f7ac0ade/codex-rs/core/src/tools/registry_tests.rs`) ## Tradeoff This is intentionally less invasive than #16627: it does not move result boxing into every concrete handler and does not change `ToolHandler` into an object-safe trait. Instead, it keeps the existing registry-level type-erasure boundary and only removes the macro-generated async wrapper layer from concrete impls. 
So the runtime boxing story stays basically the same as before, while the compile-time savings are still large. ## Verification Existing verification for this branch still applies: - Ran `cargo test -p codex-core`; this change compiled and the suite reached the known unrelated `config::tests::guardian` failures, with no local diff under `codex-rs/core/src/config/`. Profiling commands used for the tables above: - `cargo clean -p codex-core` - `cargo +nightly build -p codex-core --lib -Z unstable-options --timings=json` - `cargo +nightly rustc -p codex-core --lib -- -Z time-passes -Z time-passes-format=json` - `cargo +nightly rustc -p codex-core --lib -- -Z self-profile=... -Z self-profile-events=default,query-keys,args,llvm,artifact-sizes` - `measureme summarize -p 0.5`	2026-04-02 16:03:52 -07:00
Michael Bolin	93380a6fac	fix: add shell fallback paths for pwsh/powershell that work on GitHub Actions Windows runners (#16617 ) Recently, I merged a number of PRs to increase startup timeouts for scripts that ran under PowerShell, but in the failure for `suite::codex_tool::test_shell_command_approval_triggers_elicitation`, I found this in the error logs when running on Bazel with BuildBuddy: ``` [mcp stderr] 2026-04-02T19:54:10.758951Z ERROR codex_core::tools::router: error=Exit code: 1 [mcp stderr] Wall time: 0.2 seconds [mcp stderr] Output: [mcp stderr] 'New-Item' is not recognized as an internal or external command, [mcp stderr] operable program or batch file. [mcp stderr] ``` This error implies that the command was run under `cmd.exe` instead of `pwsh.exe`. Under GitHub Actions, I suspect that the `%PATH%` that is passed to our Bazel builder is scrubbed such that our tests cannot find PowerShell where GitHub installs it. Having these explicit fallback paths should help. While we could enable these only for tests, I don't see any harm in keeping them in production, as well.	2026-04-02 13:47:10 -07:00
Michael Bolin	30ee9e769e	fix: increase another startup timeout for PowerShell (#16613)	2026-04-02 13:16:16 -07:00
Michael Bolin	f894c3f687	fix: add more detail to test assertion (#16606 ) In https://github.com/openai/codex/pull/16528, I am trying to get tests running under Bazel on Windows, but currently I see: ``` thread 'suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var' (10220) panicked at core/tests\suite\user_shell_cmd.rs:358:5: assertion failed: `(left == right)` Diff < left / right > : <1 >0 ``` This PR updates the `assert_eq!()` to provide more information to help diagnose the failure. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16606). * #16608 * __->__ #16606	2026-04-02 12:34:42 -07:00
jif-oai	7fc36249b5	chore: rename assign_task for followup_task (#16571)	2026-04-02 16:51:17 +02:00
jif-oai	ea27d861b2	nit: state machine desc (#16569)	2026-04-02 16:18:53 +02:00
jif-oai	ab6cce62b8	chore: rework state machine further (#16567)	2026-04-02 16:15:28 +02:00
jif-oai	e47ed5e57f	fix: races in end of turn (#16566)	2026-04-02 15:55:55 +02:00
jif-oai	bd50496411	nit: lint (#16564)	2026-04-02 15:41:18 +02:00
jif-oai	627299c551	fix: race pending (#16561)	2026-04-02 15:31:30 +02:00
jif-oai	97df35c74f	chore: memories mini model (#16559)	2026-04-02 14:48:43 +02:00
Michael Bolin	c1d18ceb6f	[codex] Remove codex-core config type shim (#16529 ) ## Why This finishes the config-type move out of `codex-core` by removing the temporary compatibility shim in `codex_core::config::types`. Callers now depend on `codex-config` directly, which keeps these config model types owned by the config crate instead of re-expanding `codex-core` as a transitive API surface. ## What Changed - Removed the `codex-rs/core/src/config/types.rs` re-export shim and the `core::config::ApprovalsReviewer` re-export. - Updated `codex-core`, `codex-cli`, `codex-tui`, `codex-app-server`, `codex-mcp-server`, and `codex-linux-sandbox` call sites to import `codex_config::types` directly. - Added explicit `codex-config` dependencies to downstream crates that previously relied on the `codex-core` re-export. - Regenerated `codex-rs/core/config.schema.json` after updating the config docs path reference.	2026-04-02 01:19:44 -07:00
Michael Bolin	e846fed2b1	fix: move some test utilities out of codex-rs/core/src/tools/spec.rs (#16524 ) The `#[cfg(test)]` in `codex-rs/core/src/tools/spec.rs` smelled funny to me and it turns out these members were straightforward to move.	2026-04-02 00:49:37 -07:00
Michael Bolin	f32a5e84bf	[codex] Move config types into codex-config (#16523 ) ## Why `codex-rs/core/src/config/types.rs` is a plain config-type module with no dependency on `codex-core`. Moving it into `codex-config` shrinks the core crate and gives config-only consumers a more natural dependency boundary. ## What Changed - Added `codex_config::types` with the moved structs, enums, constants, and unit tests. - Kept `codex_core::config::types` as a compatibility re-export to avoid a broad call-site migration in this PR. - Switched notice-table writes in `core/src/config/edit.rs` to a local `NOTICE_TABLE_KEY` constant. - Added the `wildmatch` runtime dependency and `tempfile` test dependency to `codex-config`.	2026-04-02 00:39:20 -07:00
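The compatibility re-export mentioned in the commit above can be illustrated with a small sketch. Modules stand in for the two crates here, and `ExampleNotice` is a hypothetical type invented for the example; the real `codex_config::types` contents are not reproduced.

```rust
// Stand-in for the `codex-config` crate, which owns the moved types.
mod codex_config {
    pub mod types {
        // Hypothetical config type, used only for illustration.
        #[derive(Debug, Clone, PartialEq, Eq)]
        pub struct ExampleNotice {
            pub hide_warning: bool,
        }
    }
}

// Stand-in for `codex-core`: the old path stays valid through a re-export.
mod codex_core {
    pub mod config {
        pub mod types {
            pub use crate::codex_config::types::*;
        }
    }
}

// An existing call site that still uses the old import path.
fn via_old_path() -> codex_core::config::types::ExampleNotice {
    codex_core::config::types::ExampleNotice { hide_warning: true }
}

// A migrated call site that imports from the owning module directly.
fn via_new_path() -> codex_config::types::ExampleNotice {
    codex_config::types::ExampleNotice { hide_warning: true }
}

fn main() {
    // Both paths name the same type, so the values compare directly.
    assert_eq!(via_old_path(), via_new_path());
    println!("old and new import paths resolve to the same type");
}
```

Because the re-export names the same type rather than a copy, call sites can migrate one at a time without conversion code, which is what lets a move like this avoid a broad call-site migration in the same PR.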
Michael Bolin	5131e0de45	Move tool registry plan tests into codex-tools (#16521 ) ## Why #16513 moved pure tool-registry planning into `codex-tools`, but much of the corresponding spec/feature-gating coverage still lived in `codex-core`. That leaves the tests for planner behavior in the crate that no longer owns that logic and makes the next extraction steps harder to review. ## What Move the planner-only `spec_tests.rs` coverage into `codex-rs/tools/src/tool_registry_plan_tests.rs` and wire it up from `codex-rs/tools/src/tool_registry_plan.rs` using the crate-local `#[path = "tool_registry_plan_tests.rs"] mod tests;` pattern. The `codex-core` test file now keeps the core-side integration checks: router-visible model tool lists, namespaced handler alias registration, shell adapter behavior, and MCP schema edge cases that still exercise the `core` binding layer. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-core tools::spec::tests`	2026-04-02 00:26:51 -07:00
Michael Bolin	828b837235	Extract tool registry planning into codex-tools (#16513 ) ## Why This is a larger step in the `codex-core` -> `codex-tools` migration called out in `AGENTS.md`. `codex-rs/core/src/tools/spec.rs` had become mostly pure tool-spec assembly plus handler registration. That made it hard to move more of the tool-definition layer into `codex-tools`, because the runtime binding and the crate-independent planning logic were still interleaved in one function. Splitting those concerns gives `codex-tools` ownership of the declarative registry plan while keeping `codex-core` responsible for instantiating concrete handlers. ## What Changed - Add a `codex-tools` registry-plan layer in `codex-rs/tools/src/tool_registry_plan.rs` and `codex-rs/tools/src/tool_registry_plan_types.rs`. - Move feature-gated tool-spec assembly, MCP/dynamic tool conversion, tool-search aliases, and code-mode nested-plan expansion into `codex-tools`. - Keep `codex-rs/core/src/tools/spec.rs` as the core-side adapter that maps each planned handler kind to concrete runtime handler instances. - Update `spec_tests.rs` to import the moved `codex_tools` symbols directly instead of relying on top-level `spec.rs` re-exports. This is intended to be a straight refactor with no behavior change and no new test surface. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-core tools::spec::tests` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16513). * #16521 * __->__ #16513	2026-04-02 00:18:18 -07:00
Michael Bolin	aa2403e2eb	core: remove cross-crate re-exports from lib.rs (#16512 ) ## Why `codex-core` was re-exporting APIs owned by sibling `codex-` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.	2026-04-01 23:06:24 -07:00
Michael Bolin	9f71d57a65	Extract code-mode nested tool collection into codex-tools (#16509 ) ## Why This is another small step in the `codex-core` -> `codex-tools` migration described in `AGENTS.md`. `core/src/tools/spec.rs` and `core/src/tools/code_mode/mod.rs` were both hand-rolling the same pure transformation: convert visible `ToolSpec`s into code-mode nested tool definitions, then sort and deduplicate by tool name. That logic does not depend on core runtime state or handlers, so keeping it in `codex-core` makes `spec.rs` harder to peel out later than it needs to be. ## What Changed - Add `collect_code_mode_tool_definitions()` to `codex-rs/tools/src/code_mode.rs`. - Reuse that helper from `codex-rs/core/src/tools/spec.rs` when assembling the `exec` tool description. - Reuse the same helper from `codex-rs/core/src/tools/code_mode/mod.rs` when exposing nested tool metadata to the code-mode runtime. This is intended to be a straight refactor with no behavior change and no new test surface. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-core tools::spec::tests` - `cargo test -p codex-core code_mode_only_`	2026-04-01 22:17:55 -07:00
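The "sort and deduplicate by tool name" step described in the commit above might look roughly like the sketch below. `NestedToolDef`, `collect_sorted_unique`, and the sample tool names are hypothetical; the real `collect_code_mode_tool_definitions()` signature and its duplicate-resolution rule are not shown in the commit message, so keeping the first-seen definition is an assumption.

```rust
// Hypothetical, simplified stand-in for a code-mode nested tool definition.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NestedToolDef {
    name: String,
    description: String,
}

fn def(name: &str, description: &str) -> NestedToolDef {
    NestedToolDef {
        name: name.to_string(),
        description: description.to_string(),
    }
}

/// Sorts definitions by name and drops later duplicates of the same name.
fn collect_sorted_unique(mut defs: Vec<NestedToolDef>) -> Vec<NestedToolDef> {
    // `sort_by` is stable, so the first-seen definition stays ahead of
    // any later definition with the same name.
    defs.sort_by(|a, b| a.name.cmp(&b.name));
    // `dedup_by` removes consecutive repeats and keeps the first of each run.
    defs.dedup_by(|later, earlier| later.name == earlier.name);
    defs
}

fn main() {
    let defs = vec![
        def("shell", "run a command"),
        def("apply_patch", "edit files"),
        def("shell", "duplicate entry"),
    ];
    let names: Vec<String> = collect_sorted_unique(defs)
        .into_iter()
        .map(|d| d.name)
        .collect();
    println!("{names:?}");
}
```

Since the transformation depends only on its input vector, it fits the commit's argument that such logic does not need core runtime state or handlers.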
Michael Bolin	cc97982bbb	core: use codex-mcp APIs directly (#16510 ) ## Why `codex-mcp` already owns the shared MCP API surface, including `auth`, `McpConfig`, `CODEX_APPS_MCP_SERVER_NAME`, and tool-name helpers in [`codex-rs/codex-mcp/src/mcp/mod.rs`](`f61e85dbfb/codex-rs/codex-mcp/src/mcp/mod.rs (L1-L35)`). Re-exporting that surface from `codex_core::mcp` gives downstream crates two import paths for the same API and hides the real crate dependency. This PR keeps `codex_core::mcp` focused on the local `McpManager` wrapper in [`codex-rs/core/src/mcp.rs`](`f61e85dbfb/codex-rs/core/src/mcp.rs (L13-L40)`) and makes consumers import shared MCP APIs from `codex_mcp` directly. ## What - Remove the `codex_mcp::mcp` re-export surface from `core/src/mcp.rs`. - Update `codex-core` internals plus `codex-app-server`, `codex-cli`, and `codex-tui` test code to import MCP APIs from `codex_mcp::mcp` directly. - Add explicit `codex-mcp` dependencies where those crates now use that API surface, and refresh `Cargo.lock`. ## Verification - `just bazel-lock-check` - `cargo test -p codex-core -p codex-cli -p codex-tui` - `codex-cli` passed. - `codex-core` still fails five unrelated config tests in `core/src/config/config_tests.rs` (`approvals_reviewer_` and `smart_approvals_alias_`). - A broader `cargo test -p codex-core -p codex-app-server -p codex-cli -p codex-tui` run previously hung in `codex-app-server` test `in_process_start_uses_requested_session_source_for_thread_start`.	2026-04-01 21:55:22 -07:00
Michael Bolin	1b5a16f05e	Extract request_user_input normalization into codex-tools (#16503 ) ## Why This is another incremental step in the `codex-core` -> `codex-tools` migration called out in `AGENTS.md`: keep pure tool-definition and wire-shaping logic out of `codex-core` so the core crate can stay focused on runtime orchestration. `request_user_input` already had its spec and mode-availability helpers in `codex-tools` after #16471. The remaining argument validation and normalization still lived in the core runtime handler, which left that tool split across the two crates. ## What Changed - Export `REQUEST_USER_INPUT_TOOL_NAME` and `normalize_request_user_input_args()` from `codex-rs/tools/src/request_user_input_tool.rs`. - Use that `codex-tools` surface from `codex-rs/core/src/tools/spec.rs` and `codex-rs/core/src/tools/handlers/request_user_input.rs`. - Keep the core handler responsible for payload parsing, session dispatch, cancellation handling, and response serialization. This is intended to be a straight refactor with no behavior change. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-core request_user_input`	2026-04-01 21:18:45 -07:00
Michael Bolin	7c1c633f3f	core: use codex-tools config types directly (#16504 ) ## Why `codex-rs/tools/src/lib.rs` already defines the [canonical `codex_tools` export surface](`bf081b9e28/codex-rs/tools/src/lib.rs (L83-L88)`) for `ToolsConfig`, `ToolsConfigParams`, and the shell backend config types. Re-exporting those same types from `core/src/tools/spec.rs` gives `codex-core` two import paths for one API and blurs which crate owns those config definitions. This PR removes that duplicate path so `codex-core` callsites depend on `codex_tools` directly. ## What - Remove the five `codex_tools` re-exports from `core/src/tools/spec.rs`. - Update `codex-core` production and test callsites to import `ShellCommandBackendConfig`, `ToolsConfig`, `ToolsConfigParams`, `UnifiedExecShellMode`, and `ZshForkConfig` from `codex_tools`. ## Verification - Ran `cargo test -p codex-core`. - The package run is currently red in five unrelated config tests in `core/src/config/config_tests.rs` (`approvals_reviewer_` and `smart_approvals_alias_`), while the tool/spec and shell tests touched by this import cleanup passed.	2026-04-01 21:16:44 -07:00
Michael Bolin	d1068e057a	Extract tool-suggest wire helpers into codex-tools (#16499 ) ## Why This is another straight-refactor step in the `codex-tools` migration. `core/src/tools/handlers/tool_suggest.rs` still owned request/response payload structs, elicitation metadata shaping, and connector-completion predicates that do not depend on `codex-core` session/runtime internals. Per the `AGENTS.md` guidance to keep shrinking `codex-core`, this moves that pure wire-format logic into `codex-rs/tools` so the core handler keeps only session orchestration, plugin/config refresh, and MCP cache updates. ## What changed - Added `codex-rs/tools/src/tool_suggest.rs` and exported its API from `codex-rs/tools/src/lib.rs`. - Moved `ToolSuggestArgs`, `ToolSuggestResult`, `ToolSuggestMeta`, `build_tool_suggestion_elicitation_request()`, `all_suggested_connectors_picked_up()`, and `verified_connector_suggestion_completed()` into `codex-tools`. - Rewired `core/src/tools/handlers/tool_suggest.rs` to consume those exports directly. - Ported the existing pure helper tests from `core/src/tools/handlers/tool_suggest_tests.rs` to `tools/src/tool_suggest_tests.rs` without adding new behavior coverage. ## Validation ```shell cargo test -p codex-tools cargo test -p codex-core tools::handlers::tool_suggest::tests just argument-comment-lint ```	2026-04-01 20:49:15 -07:00
Michael Bolin	c2699c666c	fix: guard guardian_command_source_tool_name with cfg(unix) (#16498) This is currently contributing to `rust-ci-full.yml` being red on `main` for Windows lint builds due to the cargo/bazel coverage gap that I'm working on. Hopefully this gets us back on track.	2026-04-01 20:16:44 -07:00
Michael Bolin	0b856a4757	Extract tool-search output helpers into codex-tools (#16497 ) ## Why This is the next straight-refactor step in the `codex-tools` migration that follows #16493. `codex-rs/core` still owned a chunk of pure tool-discovery metadata and response shaping even though the corresponding `tool_search` / `tool_suggest` specs already live in `codex-rs/tools`. Per the guidance in `AGENTS.md`, this moves that crate-agnostic logic out of `codex-core` so the handler crate keeps only the BM25 ranking/orchestration and runtime glue. ## What changed - Moved the canonical `tool_search` / `tool_suggest` tool names and the `tool_search` default limit into `codex-rs/tools/src/tool_discovery.rs`. - Added `ToolSearchResultSource` and `collect_tool_search_output_tools()` in `codex-tools` so namespace grouping and deferred Responses API tool serialization happen outside `codex-core`. - Rewired `ToolSearchHandler`, `ToolSuggestHandler`, and `core/src/tools/spec.rs` to consume those exports directly from `codex-tools`. - Ported the existing `tool_search` serializer tests from `core/src/tools/handlers/tool_search_tests.rs` to `tools/src/tool_discovery_tests.rs` without adding new behavior coverage. ## Validation ```shell cargo test -p codex-tools cargo test -p codex-core tools::spec::tests just argument-comment-lint ```	2026-04-01 20:16:21 -07:00
Michael Bolin	5a2f3a8102	Extract built-in tool spec constructors into codex-tools (#16493 ) ## Why `core/src/tools/spec.rs` still had a few built-in tool specs assembled inline even though those definitions are pure metadata and already live conceptually in `codex-tools`. Keeping that construction in `codex-core` makes `spec.rs` do more than registry orchestration and slows the migration toward a right-sized `codex-tools` crate. This continues the extraction stack from #16379, #16471, #16477, #16481, and #16482. ## What Changed - added `create_local_shell_tool()`, `create_web_search_tool(...)`, and `create_image_generation_tool(...)` to `codex-rs/tools/src/tool_spec.rs` - exported those helpers from `codex-rs/tools/src/lib.rs` - switched `codex-rs/core/src/tools/spec.rs` to call those helpers instead of constructing `ToolSpec::LocalShell`, `ToolSpec::WebSearch`, and `ToolSpec::ImageGeneration` inline - removed the remaining core-local web-search content-type constant and made the affected spec test assert the literal expected values directly This is intended to be a straight refactor: tool behavior and wire shape should not change. ## Testing - `cargo test -p codex-tools` - `cargo test -p codex-core tools::spec::tests`	2026-04-01 19:31:24 -07:00
Michael Bolin	d4464125c5	Remove client_common tool re-exports (#16482 ) ## Why `codex-rs/core/src/client_common.rs` still had a `tools` re-export module that forwarded `codex_tools` types back into `codex-core`. After the earlier extraction work in #16379, #16471, #16477, and #16481, that extra layer no longer adds value. Removing it keeps dependencies explicit: the `codex-core` modules that actually use `ToolSpec` and related types now depend on `codex_tools` directly instead of reaching through `client_common`. ## What Changed - removed the `client_common::tools` re-export module from `core/src/client_common.rs` - updated the remaining `codex-core` consumers to import `codex_tools` directly - adjusted the affected test code to reference `codex_tools::ResponsesApiTool` directly as well This is a mechanical cleanup only. It does not change tool behavior or runtime logic. ## Testing - `cargo test -p codex-core client_common::tests` - `cargo test -p codex-core tools::router::tests` - `cargo test -p codex-core tools::context::tests` - `cargo test -p codex-core tools::spec::tests`	2026-04-01 19:15:15 -07:00
Ahmed Ibrahim	59b68f5519	Extract MCP into codex-mcp crate (#15919) - Split MCP runtime/server code out of `codex-core` into the new `codex-mcp` crate. New/moved public structs/types include `McpConfig`, `McpConnectionManager`, `ToolInfo`, `ToolPluginProvenance`, `CodexAppsToolsCacheKey`, and the `McpManager` API (`codex_mcp::mcp::McpManager` plus the `codex_core::mcp::McpManager` wrapper/shim). New/moved functions include `with_codex_apps_mcp`, `configured_mcp_servers`, `effective_mcp_servers`, `collect_mcp_snapshot`, `collect_mcp_snapshot_from_manager`, `qualified_mcp_tool_name_prefix`, and the MCP auth/skill-dependency helpers. Why: this creates a focused MCP crate boundary and shrinks `codex-core` without forcing every consumer to migrate in the same PR. - Move MCP server config schema and persistence into `codex-config`. New/moved structs/enums include `AppToolApproval`, `McpServerToolConfig`, `McpServerConfig`, `RawMcpServerConfig`, `McpServerTransportConfig`, `McpServerDisabledReason`, and `codex_config::ConfigEditsBuilder`. New/moved functions include `load_global_mcp_servers` and `ConfigEditsBuilder::replace_mcp_servers`/`apply`. Why: MCP TOML parsing/editing is config ownership, and this keeps config validation/round-tripping (including per-tool approval overrides and inline bearer-token rejection) in the config crate instead of `codex-core`. - Rewire `codex-core`, app-server, and plugin call sites onto the new crates. Updated `Config::to_mcp_config(&self, plugins_manager)`, `codex-rs/core/src/mcp.rs`, `codex-rs/core/src/connectors.rs`, `codex-rs/core/src/codex.rs`, `CodexMessageProcessor::list_mcp_server_status_task`, and `utils/plugins/src/mcp_connector.rs` to build/pass the new MCP config/runtime types. Why: plugin-provided MCP servers still merge with user-configured servers, and runtime auth (`CodexAuth`) is threaded into `with_codex_apps_mcp` / `collect_mcp_snapshot` explicitly so `McpConfig` stays config-only.	2026-04-01 19:03:26 -07:00
Michael Bolin	6cf832fc63	Extract update_plan tool spec into codex-tools (#16481) ## Why `codex-rs/core/src/tools/handlers/plan.rs` still owned both the `update_plan` runtime handler and the static tool definition. The tool definition is pure metadata, so keeping it in `codex-core` works against the ongoing effort to move tool-spec code into `codex-tools` and keep `codex-core` focused on orchestration and execution paths. This continues the extraction work from #16379, #16471, and #16477. ## What Changed - added `codex-rs/tools/src/plan_tool.rs` with `create_update_plan_tool()` - re-exported that constructor from `codex-rs/tools/src/lib.rs` - updated `codex-rs/core/src/tools/spec.rs` and `codex-rs/core/src/tools/spec_tests.rs` to use the `codex-tools` export instead of a core-local static - removed the old `PLAN_TOOL` definition from `codex-rs/core/src/tools/handlers/plan.rs`; the `PlanHandler` runtime logic still stays in `codex-core` - tightened two `codex-core` aliases to `#[cfg(test)]` now that production code no longer needs them ## Testing - `cargo test -p codex-tools` - `cargo test -p codex-core tools::spec::tests`	2026-04-01 15:51:52 -07:00
Owen Lin	30f6786d62	fix(guardian): make GuardianAssessmentEvent.action strongly typed (#16448) ## Description Previously the `action` field on `EventMsg::GuardianAssessment`, which describes what Guardian is reviewing, was typed as an arbitrary JSON blob. This PR cleans it up and defines a sum type representing all the various actions that Guardian can review. This is a breaking change (on purpose), which is fine because: - the Codex app / VSCE does not actually use `action` at the moment - the TUI code that consumes `action` is updated in this PR as well - rollout files that serialized old `EventMsg::GuardianAssessment` will just silently drop these guardian events - the contract is defined as unstable, so other clients have a fair warning :) This will make things much easier for followup Guardian work. ## Why The old guardian review payloads worked, but they pushed too much shape knowledge into downstream consumers. The TUI had custom JSON parsing logic for commands, patches, network requests, and MCP calls, and the app-server protocol was effectively just passing through an opaque blob. Typing this at the protocol boundary makes the contract clearer.	2026-04-01 15:42:18 -07:00
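
Note on 30f6786d62: the move from an opaque JSON blob to a sum type can be illustrated with a minimal Rust sketch. This is not the code from the PR. The enum name `GuardianAction`, its variants, and every field name below are hypothetical, chosen only to mirror the four kinds of reviewed actions the message lists (commands, patches, network requests, MCP calls). The point it shows is that a consumer such as a TUI can match exhaustively on the type instead of probing JSON keys.

```rust
use std::fmt;

/// Hypothetical stand-in for a typed `action` payload.
/// Variant and field names are illustrative assumptions, not the real protocol.
#[derive(Debug, Clone, PartialEq)]
pub enum GuardianAction {
    Command { argv: Vec<String> },
    Patch { files: Vec<String> },
    NetworkRequest { host: String, port: u16 },
    McpCall { server: String, tool: String },
}

impl fmt::Display for GuardianAction {
    /// Renders a one-line summary. The match is exhaustive, so adding a
    /// variant later becomes a compile error here rather than a silent
    /// fallthrough in hand-written JSON parsing.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GuardianAction::Command { argv } => {
                write!(f, "command: {}", argv.join(" "))
            }
            GuardianAction::Patch { files } => {
                write!(f, "patch touching {} file(s)", files.len())
            }
            GuardianAction::NetworkRequest { host, port } => {
                write!(f, "network request to {host}:{port}")
            }
            GuardianAction::McpCall { server, tool } => {
                write!(f, "MCP call {server}/{tool}")
            }
        }
    }
}
```

Under these assumptions, a consumer would call `action.to_string()` (or format it with `{}`) and never inspect raw JSON; the real protocol type may carry different variants and fields.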

2719 Commits