## Why
`core/src/tools/spec.rs` still had a few built-in tool specs assembled
inline even though those definitions are pure metadata and already live
conceptually in `codex-tools`. Keeping that construction in `codex-core`
makes `spec.rs` do more than registry orchestration and slows the
migration toward a right-sized `codex-tools` crate.
This continues the extraction stack from #16379, #16471, #16477, #16481,
and #16482.
## What Changed
- added `create_local_shell_tool()`, `create_web_search_tool(...)`, and
`create_image_generation_tool(...)` to `codex-rs/tools/src/tool_spec.rs`
- exported those helpers from `codex-rs/tools/src/lib.rs`
- switched `codex-rs/core/src/tools/spec.rs` to call those helpers
instead of constructing `ToolSpec::LocalShell`, `ToolSpec::WebSearch`,
and `ToolSpec::ImageGeneration` inline
- removed the remaining core-local web-search content-type constant and
made the affected spec test assert the literal expected values directly
This is intended to be a straight refactor: tool behavior and wire shape
should not change.
## Testing
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::spec::tests`
## Why
`codex-rs/core/src/client_common.rs` still had a `tools` re-export
module that forwarded `codex_tools` types back into `codex-core`. After
the earlier extraction work in #16379, #16471, #16477, and #16481, that
extra layer no longer adds value.
Removing it keeps dependencies explicit: the `codex-core` modules that
actually use `ToolSpec` and related types now depend on `codex_tools`
directly instead of reaching through `client_common`.
## What Changed
- removed the `client_common::tools` re-export module from
`core/src/client_common.rs`
- updated the remaining `codex-core` consumers to import `codex_tools`
directly
- adjusted the affected test code to reference
`codex_tools::ResponsesApiTool` directly as well
This is a mechanical cleanup only. It does not change tool behavior or
runtime logic.
## Testing
- `cargo test -p codex-core client_common::tests`
- `cargo test -p codex-core tools::router::tests`
- `cargo test -p codex-core tools::context::tests`
- `cargo test -p codex-core tools::spec::tests`
- Split MCP runtime/server code out of `codex-core` into the new
`codex-mcp` crate. New/moved public structs/types include `McpConfig`,
`McpConnectionManager`, `ToolInfo`, `ToolPluginProvenance`,
`CodexAppsToolsCacheKey`, and the `McpManager` API
(`codex_mcp::mcp::McpManager` plus the `codex_core::mcp::McpManager`
wrapper/shim). New/moved functions include `with_codex_apps_mcp`,
`configured_mcp_servers`, `effective_mcp_servers`,
`collect_mcp_snapshot`, `collect_mcp_snapshot_from_manager`,
`qualified_mcp_tool_name_prefix`, and the MCP auth/skill-dependency
helpers. Why: this creates a focused MCP crate boundary and shrinks
`codex-core` without forcing every consumer to migrate in the same PR.
- Move MCP server config schema and persistence into `codex-config`.
New/moved structs/enums include `AppToolApproval`,
`McpServerToolConfig`, `McpServerConfig`, `RawMcpServerConfig`,
`McpServerTransportConfig`, `McpServerDisabledReason`, and
`codex_config::ConfigEditsBuilder`. New/moved functions include
`load_global_mcp_servers` and
`ConfigEditsBuilder::replace_mcp_servers`/`apply`. Why: MCP TOML
parsing/editing is config ownership, and this keeps config
validation/round-tripping (including per-tool approval overrides and
inline bearer-token rejection) in the config crate instead of
`codex-core`.
- Rewire `codex-core`, app-server, and plugin call sites onto the new
crates. Updated `Config::to_mcp_config(&self, plugins_manager)`,
`codex-rs/core/src/mcp.rs`, `codex-rs/core/src/connectors.rs`,
`codex-rs/core/src/codex.rs`,
`CodexMessageProcessor::list_mcp_server_status_task`, and
`utils/plugins/src/mcp_connector.rs` to build/pass the new MCP
config/runtime types. Why: plugin-provided MCP servers still merge with
user-configured servers, and runtime auth (`CodexAuth`) is threaded into
`with_codex_apps_mcp` / `collect_mcp_snapshot` explicitly so `McpConfig`
stays config-only.
## Why
`codex-rs/core/src/tools/handlers/plan.rs` still owned both the
`update_plan` runtime handler and the static tool definition. The tool
definition is pure metadata, so keeping it in `codex-core` works against
the ongoing effort to move tool-spec code into `codex-tools` and keep
`codex-core` focused on orchestration and execution paths.
This continues the extraction work from #16379, #16471, and #16477.
## What Changed
- added `codex-rs/tools/src/plan_tool.rs` with
`create_update_plan_tool()`
- re-exported that constructor from `codex-rs/tools/src/lib.rs`
- updated `codex-rs/core/src/tools/spec.rs` and
`codex-rs/core/src/tools/spec_tests.rs` to use the `codex-tools` export
instead of a core-local static
- removed the old `PLAN_TOOL` definition from
`codex-rs/core/src/tools/handlers/plan.rs`; the `PlanHandler` runtime
logic still stays in `codex-core`
- tightened two `codex-core` aliases to `#[cfg(test)]` now that
production code no longer needs them
## Testing
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::spec::tests`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16481).
* #16482
* __->__ #16481
## Description
Previously the `action` field on `EventMsg::GuardianAssessment`, which
describes what Guardian is reviewing, was typed as an arbitrary JSON
blob. This PR cleans it up and defines a sum type representing all the
various actions that Guardian can review.
This is a breaking change (on purpose), which is fine because:
- the Codex app / VSCE does not actually use `action` at the moment
- the TUI code that consumes `action` is updated in this PR as well
- rollout files that serialized old `EventMsg::GuardianAssessment` will
just silently drop these guardian events
- the contract is defined as unstable, so other clients have a fair
warning :)
This will make things much easier for followup Guardian work.
## Why
The old guardian review payloads worked, but they pushed too much shape
knowledge into downstream consumers. The TUI had custom JSON parsing
logic for commands, patches, network requests, and MCP calls, and the
app-server protocol was effectively just passing through an opaque blob.
Typing this at the protocol boundary makes the contract clearer.
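To illustrate the idea only, a typed action could look roughly like the enum below. The four categories come from the TUI parsing logic described above; the enum name, variant names, and fields are assumptions made for this sketch, not the actual protocol definitions.

```rust
// Hypothetical sketch: variant and field names are illustrative assumptions,
// not the real protocol types.
#[derive(Debug, Clone, PartialEq)]
enum GuardianAction {
    Command { argv: Vec<String> },
    Patch { files: Vec<String> },
    NetworkRequest { host: String },
    McpCall { server: String, tool: String },
}

// Consumers match on the variant instead of probing an untyped JSON blob.
fn describe(action: &GuardianAction) -> String {
    match action {
        GuardianAction::Command { argv } => format!("command: {}", argv.join(" ")),
        GuardianAction::Patch { files } => format!("patch touching {} file(s)", files.len()),
        GuardianAction::NetworkRequest { host } => format!("network request to {host}"),
        GuardianAction::McpCall { server, tool } => format!("MCP call {server}/{tool}"),
    }
}
```

With a sum type, adding a new reviewable action becomes a compile-time-checked change in every `match`, rather than a silent gap in hand-written JSON parsing.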
## Why
Follow-up to #16288: the new dynamic provider auth token flow currently
defaults `refresh_interval_ms` to a non-zero value and rejects `0`
entirely.
For command-backed bearer auth, `0` should mean "never auto-refresh".
That lets callers keep using the cached token until the backend actually
returns `401 Unauthorized`, at which point Codex can rerun the auth
command as part of the existing retry path.
## What changed
- changed `ModelProviderAuthInfo.refresh_interval_ms` to accept `0` and
documented that value as disabling proactive refresh
- updated the external bearer token refresher to treat
`refresh_interval_ms = 0` as an indefinitely reusable cached token,
while still rerunning the auth command during unauthorized recovery
- regenerated `core/config.schema.json` so the schema minimum is `0` and
the new behavior is described in the field docs
- added coverage for both config deserialization and the no-auto-refresh
plus `401` recovery behavior
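The refresh rule described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `CachedToken` type and helper names; the real refresher in `codex-login` may be structured differently.

```rust
use std::time::{Duration, Instant};

// Hypothetical cache entry; the real refresher's fields may differ.
struct CachedToken {
    fetched_at: Instant,
}

/// Returns true when the cached token should be refreshed proactively.
/// An interval of 0 disables proactive refresh entirely.
fn needs_proactive_refresh(cached: &CachedToken, refresh_interval_ms: u64, now: Instant) -> bool {
    if refresh_interval_ms == 0 {
        return false;
    }
    now.duration_since(cached.fetched_at) >= Duration::from_millis(refresh_interval_ms)
}

/// A 401 always reruns the auth command, regardless of the interval.
fn should_rerun_command(
    got_401: bool,
    cached: &CachedToken,
    refresh_interval_ms: u64,
    now: Instant,
) -> bool {
    got_401 || needs_proactive_refresh(cached, refresh_interval_ms, now)
}
```

Passing `now` in explicitly keeps the rule deterministic under test; only the `401` path can force a rerun when the interval is `0`.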
## How tested
- `cargo test -p codex-protocol`
- `cargo test -p codex-login`
- `cargo test -p codex-core test_deserialize_provider_auth_config_`
## Why
Follow-up to #16379 and #16471.
`codex-rs/core/src/tools/spec.rs` still owned the pure discovery-shaping
helpers that turn app metadata and discoverable tool metadata into the
inputs used by `tool_search` and `tool_suggest`. Those helpers do not
need `codex-core` runtime state, so keeping them in `codex-core`
continued to blur the crate boundary this migration is trying to
tighten.
This change keeps pushing spec-only logic behind the `codex-tools` API
so `codex-core` can focus on wiring runtime handlers to the resulting
tool definitions.
## What Changed
- Added `collect_tool_search_app_infos` and
`collect_tool_suggest_entries` to
`codex-rs/tools/src/tool_discovery.rs`.
- Added a small `ToolSearchAppSource` adapter type in `codex-tools` so
`codex-core` can pass app metadata into that shared helper logic without
exposing `ToolInfo` across the crate boundary.
- Re-exported the new discovery helpers from
`codex-rs/tools/src/lib.rs`, which remains exports-only.
- Updated `codex-rs/core/src/tools/spec.rs` to use those `codex-tools`
helpers instead of maintaining local `tool_search_app_infos` and
`tool_suggest_entries` functions.
- Removed the now-redundant helper implementations from `codex-core`.
## Testing
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::spec::tests`
## Why
Now that workspace crate features have been removed and
`.github/scripts/verify_cargo_workspace_manifests.py` hard-bans new
ones, Rust CI should stop building and testing with `--all-features`.
Keeping `--all-features` in CI no longer buys us meaningful coverage for
`codex-rs`, but it still makes the workflow look like we rely on Cargo
feature permutations that we are explicitly trying to eliminate. It also
leaves stale examples in the repo that suggest `--all-features` is a
normal or recommended way to run the workspace.
## What changed
- removed `--all-features` from the Rust CI `cargo chef cook`, `cargo
clippy`, and `cargo nextest` invocations in
`.github/workflows/rust-ci-full.yml`
- updated the `just test` guidance in `justfile` to reflect that
workspace crate features are banned and there should be no need to add
`--all-features`
- updated the multiline command example and snapshot in
`codex-rs/tui/src/history_cell.rs` to stop rendering `cargo test
--all-features --quiet`
- tightened the verifier docstring in
`.github/scripts/verify_cargo_workspace_manifests.py` so it no longer
talks about temporary remaining exceptions
## How tested
- `python3 .github/scripts/verify_cargo_workspace_manifests.py`
- `cargo test -p codex-tui`
## Why
Follow-up to #16379.
`codex-rs/core/src/tools/spec.rs` and the corresponding handlers still
owned several pure tool-definition helpers even though they do not need
`codex-core` runtime state. Keeping that spec-only logic in `codex-core`
keeps the crate boundary blurry and works against the guidance in
`AGENTS.md` to keep shared tooling out of `codex-core` when possible.
This change takes another step toward a dedicated `codex-tools` crate by
moving more metadata and schema-building code behind the `codex-tools`
API while leaving the actual tool execution paths in `codex-core`.
## What Changed
- Added `codex-rs/tools/src/apply_patch_tool.rs` to own
`ApplyPatchToolArgs`, the freeform/json `apply_patch` tool specs, and
the moved `tool_apply_patch.lark` grammar.
- Updated `codex-rs/tools/BUILD.bazel` so Bazel exposes the moved
grammar file to `codex-tools`.
- Moved the `request_user_input` availability and description helpers
into `codex-rs/tools/src/request_user_input_tool.rs`, with the related
unit tests moved alongside that business logic.
- Moved `request_permissions_tool_description()` into
`codex-rs/tools/src/local_tool.rs`.
- Rewired `codex-rs/core/src/tools/spec.rs`,
`codex-rs/core/src/tools/handlers/apply_patch.rs`, and
`codex-rs/core/src/tools/handlers/request_user_input.rs` to consume the
new `codex-tools` exports instead of local helper code.
- Removed the now-redundant helper implementations and tests from
`codex-core`, plus a couple of stale `client_common` re-exports that
became unused after the move.
## Testing
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::spec::tests`
- `cargo test -p codex-core tools::handlers::apply_patch::tests`
## Why
`codex-otel` still carried `disable-default-metrics-exporter`, which was
the last remaining workspace crate feature.
We are removing workspace crate features because they do not fit our
current build model well:
- our Bazel setup does not honor crate features today, which can let
feature-gated issues go unnoticed
- they create extra crate build permutations that we want to avoid
For this case, the feature was only being used to keep the built-in
Statsig metrics exporter off in test and debug-oriented contexts. This
repo already treats `debug_assertions` as the practical proxy for that
class of behavior, so OTEL should follow the same convention instead of
keeping a dedicated crate feature alive.
## What changed
- removed `disable-default-metrics-exporter` from
`codex-rs/otel/Cargo.toml`
- removed the `codex-otel` dev-dependency feature activation from
`codex-rs/core/Cargo.toml`
- changed `codex-rs/otel/src/config.rs` so the built-in
`OtelExporter::Statsig` default resolves to `None` when
`debug_assertions` is enabled, with a focused unit test covering that
behavior
- removed the final feature exceptions from
`.github/scripts/verify_cargo_workspace_manifests.py`, so workspace
crate features are now hard-banned instead of temporarily allowlisted
- expanded the verifier error message to explain the Bazel mismatch and
build-permutation cost behind that policy
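As a rough sketch of the `debug_assertions` convention, the default resolution could look like the following. The `OtelExporter` enum here is a two-variant stand-in and the function names are assumptions; the actual code in `codex-rs/otel/src/config.rs` may differ.

```rust
// Hypothetical stand-in for the exporter setting; only the two variants
// mentioned in the text are modeled.
#[derive(Debug, Clone, PartialEq)]
enum OtelExporter {
    None,
    Statsig,
}

/// Resolves the built-in Statsig default. `debug_build` is a parameter so the
/// rule can be exercised in both modes from a single test binary.
fn resolve_default(exporter: OtelExporter, debug_build: bool) -> OtelExporter {
    match exporter {
        OtelExporter::Statsig if debug_build => OtelExporter::None,
        other => other,
    }
}

/// Uses the compiler's own flag, which Cargo enables by default in the dev
/// and test profiles.
fn resolve_default_for_this_build(exporter: OtelExporter) -> OtelExporter {
    resolve_default(exporter, cfg!(debug_assertions))
}
```

Because `cfg!(debug_assertions)` is set by the build profile rather than by a crate feature, Cargo and Bazel do not need a separate feature permutation to get this behavior.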
## How tested
- `python3 .github/scripts/verify_cargo_workspace_manifests.py`
- `cargo test -p codex-otel`
- `cargo test -p codex-core
metrics_exporter_defaults_to_statsig_when_missing`
- `cargo test -p codex-app-server app_server_default_analytics_`
- `just bazel-lock-check`
## Why
`codex-core` already owns too much of the tool stack, and `AGENTS.md`
explicitly pushes us to move shared code out of `codex-core` instead of
letting it keep growing. This PR takes the next incremental step in
moving `core/src/tools` toward `codex-rs/tools` by extracting
low-coupling tool configuration and image-detail gating logic into
`codex-tools`.
That gives later extraction work a cleaner boundary to build on without
trying to move the entire tools subtree in one shot.
## What changed
- moved `ToolsConfig`, `ToolsConfigParams`, shell backend config, and
unified-exec session selection from `core/src/tools/spec.rs` into
`codex-tools`
- moved original image-detail gating and normalization into
`codex-tools`
- updated `codex-core` to consume the new `codex-tools` exports and pass
a rendered agent-type description instead of raw role config
- kept `codex-rs/tools/src/lib.rs` exports-only, with extracted unit
tests living in sibling `*_tests.rs` modules
## Testing
- `cargo test -p codex-tools`
- `cargo test -p codex-core --lib tools::spec::`
## Why
`voice-input` is the only remaining TUI crate feature, but it is also a
default feature and nothing in the workspace selects it explicitly. In
practice it is just acting as a proxy for platform support, which is
better expressed with target-specific dependencies and cfgs.
## What changed
- remove the `voice-input` feature from `codex-tui`
- make `cpal` a normal non-Linux target dependency
- replace the feature-based voice and audio cfgs with pure
Linux-vs-non-Linux cfgs
- shrink the workspace-manifest verifier allowlist to remove the
remaining `codex-tui` exception
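The Linux-vs-non-Linux gating can be sketched with plain target cfgs. The function name below is hypothetical and only demonstrates the pattern; it is not taken from `codex-tui`.

```rust
// Hypothetical helper: the name is an assumption used to show the cfg pattern.
#[cfg(not(target_os = "linux"))]
fn voice_input_supported() -> bool {
    // Audio capture code (for example, anything using `cpal`) would live
    // behind this cfg.
    true
}

#[cfg(target_os = "linux")]
fn voice_input_supported() -> bool {
    false
}
```

Exactly one of the two definitions is compiled for any target, so no Cargo feature is needed to select between them.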
## How tested
- `python3 .github/scripts/verify_cargo_workspace_manifests.py`
- `cargo test -p codex-tui`
- `just bazel-lock-check`
- `just argument-comment-lint -p codex-tui`
## Why
The remaining `vt100-tests` and `debug-logs` features in `codex-tui`
were only gating test-only and debug-only behavior. Those feature
toggles add Cargo and Bazel permutations without buying anything, and
they make it easier for more crate features to linger in the workspace.
## What changed
- delete `vt100-tests` and `debug-logs` from `codex-tui`
- always compile the VT100 integration tests in the TUI test target
instead of hiding them behind a Cargo feature
- remove the unused textarea debug logging branch instead of replacing
it with another gate
- add the required argument-comment annotations in the VT100 tests now
that Bazel sees those callsites during linting
- shrink the manifest verifier allowlist again so only the remaining
real feature exceptions stay permitted
## How tested
- `cargo test -p codex-tui`
- `just argument-comment-lint -p codex-tui`
## Why
`codex-cloud-tasks-client` was mixing two different roles: the real HTTP
client and the mock implementation used by tests and local mock mode.
Keeping both in the same crate forced Cargo feature toggles and Bazel
`crate_features` just to pick an implementation.
This change keeps `codex-cloud-tasks-client` focused on the shared API
surface and real backend client, and moves the mock implementation into
its own crate so we can remove those feature permutations cleanly.
## What changed
- add a new `codex-cloud-tasks-mock-client` crate that owns `MockClient`
- remove the `mock` and `online` features from
`codex-cloud-tasks-client`
- make `codex-cloud-tasks-client` unconditionally depend on
`codex-backend-client` and export `HttpClient` directly
- gate the mock-mode path in `codex-cloud-tasks` behind
`#[cfg(debug_assertions)]`, so release builds always initialize the real
HTTP client
- update `codex-cloud-tasks` and its tests to use
`codex-cloud-tasks-mock-client::MockClient` wherever mock behavior is
needed
- remove the matching Bazel `crate_features` override and shrink the
manifest verifier allowlist accordingly
## How tested
- `cargo test -p codex-cloud-tasks-client`
- `cargo test -p codex-cloud-tasks-mock-client`
- `cargo test -p codex-cloud-tasks`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16456).
* #16457
* __->__ #16456
Makes fuzzy file search use case-insensitive matching instead of
smart-case in `codex-file-search`. I find smart-case to be a poor user
experience: using the wrong case for a single letter penalizes the match
so heavily that it often drops off the results list, effectively making
the search case-sensitive.
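The difference between the two modes can be shown with a deliberately simplified matcher. This sketch uses plain subsequence matching rather than a scored fuzzy matcher, and every function name is an assumption for illustration; it is not the `codex-file-search` implementation.

```rust
/// Simplified subsequence matcher; real fuzzy scorers also rank results.
fn is_subsequence(query: &str, candidate: &str) -> bool {
    let mut chars = candidate.chars();
    query.chars().all(|q| chars.any(|c| c == q))
}

/// Smart-case: case-sensitive as soon as the query contains an uppercase letter.
fn smart_case_match(query: &str, candidate: &str) -> bool {
    if query.chars().any(char::is_uppercase) {
        is_subsequence(query, candidate)
    } else {
        is_subsequence(query, &candidate.to_lowercase())
    }
}

/// Case-insensitive: both sides are lowercased before matching.
fn case_insensitive_match(query: &str, candidate: &str) -> bool {
    is_subsequence(&query.to_lowercase(), &candidate.to_lowercase())
}
```

In this simplified model, the query `Readme` misses `README.md` under smart-case because the lowercase `e` must match exactly, while the case-insensitive variant still finds it.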
## Summary
In #11871 we started consolidating on `ExecRequest.sandbox_policy` instead
of passing in a separate policy object that theoretically could differ
(but did not). This finishes that parameter cleanup.
This should be a simple noop, since all 3 callsites of this function
already used a cloned object from the ExecRequest value.
## Testing
- [x] Existing tests pass
## Summary
- switch MultiAgentV2 send_message to accept a single message string
instead of items
- keep the old assign_task item parser in place for the next branch
- update send_message schema/spec and focused handler tests
## Verification
- cargo test -p codex-tools
send_message_tool_requires_message_and_uses_submission_output
- cargo test -p codex-core multi_agent_v2_send_message
- just fix -p codex-tools
- just fix -p codex-core
- just argument-comment-lint
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Follow-up to #16351.
That PR synchronized Bazel clippy lint levels with Cargo, but two
intentional `expect()` calls in `codex-rs/tui/src/status/card.rs` still
tripped `clippy::expect_used` (I believe #16201 raced with #16351, which
is why it was missed).
## Why
Follow-up to #16345, the Bazel clippy rollout in #15955, and the cleanup
pass in #16353.
`cargo clippy` was enforcing the workspace deny-list from
`codex-rs/Cargo.toml` because the member crates opt into `[lints]
workspace = true`, but Bazel clippy was only using `rules_rust` plus
`clippy.toml`. That left the Bazel lane vulnerable to drift:
`clippy.toml` can tune lint behavior, but it cannot set
allow/warn/deny/forbid levels.
This PR now closes both sides of the follow-up. It keeps `.bazelrc` in
sync with `[workspace.lints.clippy]`, and it fixes the real clippy
violations that the newly-synced Windows Bazel lane surfaced once that
deny-list started matching Cargo.
## What Changed
- added `.github/scripts/verify_bazel_clippy_lints.py`, a Python check
that parses `codex-rs/Cargo.toml` with `tomllib`, reads the Bazel
`build:clippy` `clippy_flag` entries from `.bazelrc`, and reports
missing, extra, or mismatched lint levels
- ran that verifier from the lightweight `ci.yml` workflow so the sync
check does not depend on a Rust toolchain being installed first
- expanded the `.bazelrc` comment to explain the Cargo `workspace =
true` linkage and why Bazel needs the deny-list duplicated explicitly
- fixed the Windows-only `codex-windows-sandbox` violations that Bazel
clippy reported after the sync, using the same style as #16353: inline
`format!` args, method references instead of trivial closures, removed
redundant clones, and replaced SID conversion `unwrap` and `expect`
calls with proper errors
- cleaned up the remaining cross-platform violations the Bazel lane
exposed in `codex-backend-client` and `core_test_support`
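The comparison the verifier performs can be sketched as a three-way diff over two lint-level maps. The actual check is a Python script; the version below is written in Rust purely for illustration, and the `LintDiff` type and `diff_lints` function are hypothetical names.

```rust
use std::collections::BTreeMap;

// Hypothetical result type for the sketch.
#[derive(Debug, PartialEq)]
struct LintDiff {
    missing: Vec<String>,    // in Cargo.toml but absent from .bazelrc
    extra: Vec<String>,      // in .bazelrc but absent from Cargo.toml
    mismatched: Vec<String>, // present in both with different levels
}

/// Compares lint name -> level maps extracted from the two sources.
fn diff_lints(cargo: &BTreeMap<String, String>, bazel: &BTreeMap<String, String>) -> LintDiff {
    let mut diff = LintDiff { missing: vec![], extra: vec![], mismatched: vec![] };
    for (lint, level) in cargo {
        match bazel.get(lint) {
            None => diff.missing.push(lint.clone()),
            Some(bazel_level) if bazel_level != level => diff.mismatched.push(lint.clone()),
            Some(_) => {}
        }
    }
    for lint in bazel.keys() {
        if !cargo.contains_key(lint) {
            diff.extra.push(lint.clone());
        }
    }
    diff
}
```

A CI check would fail whenever any of the three lists is non-empty.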
## Testing
Key new test introduced by this PR:
`python3 .github/scripts/verify_bazel_clippy_lints.py`
Fix stale weekly limit in `/status` (#16194): /status reused the
session’s cached rate-limit snapshot, so the weekly remaining limit
could stay frozen within an active session.
With this change, we now dynamically update the rate limits after status
is displayed.
I needed to delete a few low-value test cases from the `chatwidget` tests
because `tests.rs` is already very large, and the new tests in this PR
pushed it over the mandated 512K limit. I'm working on a separate PR to
refactor that test file.
Problem: `chatwidget/tests.rs` had grown into a single oversized test
blob that was hard to maintain and exceeded the repo's blob size limit.
Solution: split the chatwidget tests into topical modules with a thin
root `tests.rs`, shared helper utilities, preserved snapshot naming, and
hermetic test config so the refactor stays stable and passes the
`codex-tui` test suite.
## Why
Bazel clippy now catches lints that `cargo clippy` can still miss when a
crate under `codex-rs` forgets to opt into workspace lints. The concrete
example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
flagged a clippy violation in `models_cache.rs`, but Cargo did not
because that crate inherited workspace package metadata without
declaring `[lints] workspace = true`.
We already mirror the workspace clippy deny list into Bazel after
[#15955](https://github.com/openai/codex/pull/15955), so we also need a
repo-side check that keeps every `codex-rs` manifest opted into the same
workspace settings.
## What changed
- add `.github/scripts/verify_cargo_workspace_manifests.py`, which
parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
- `version.workspace = true`
- `edition.workspace = true`
- `license.workspace = true`
- `[lints] workspace = true`
- top-level crate names follow the `codex-*` / `codex-utils-*`
conventions, with explicit exceptions for `windows-sandbox-rs` and
`utils/path-utils`
- run that script in `.github/workflows/ci.yml`
- update the current outlier manifests so the check is enforceable
immediately
- fix the newly exposed clippy violations in the affected crates
(`app-server/tests/common`, `file-search`, `feedback`,
`shell-escalation`, and `debug-client`)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
* #16351
* __->__ #16353
## Why
https://github.com/openai/codex/pull/16287 introduced a change to
`codex-rs/login/src/auth/auth_tests.rs` that uses a PowerShell helper to
read the next token from `tokens.txt` and rewrite the remainder back to
disk. On Windows, `Get-Content` can return a scalar when the file has
only one remaining line, so `$lines[0]` reads the first character
instead of the full token. That breaks the external bearer refresh test
once the token list is nearly exhausted.
https://github.com/openai/codex/pull/16288 introduced similar changes to
`codex-rs/core/src/models_manager/manager_tests.rs` and
`codex-rs/core/tests/suite/client.rs`.
These went unnoticed because the failures showed up when the test was
run via Cargo on Windows, but not in our Bazel harness. Figuring out
that Cargo-vs-Bazel delta will happen in a follow-up PR.
## Verification
On my Windows machine, I verified `cargo test` passes when run in
`codex-rs/login` and `codex-rs/core`. Once this PR is merged, I will
keep an eye on
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml to
verify it goes green.
## What changed
- Wrap `Get-Content -Path tokens.txt` in `@(...)` so the script always
gets array semantics before counting, indexing, and rewriting the
remaining lines.
## Summary
- Replace the separate external auth enum and refresher trait with a
single `ExternalAuth` trait in login auth flow
- Move bearer token auth behind `BearerTokenRefresher` and update
`AuthManager` and app-server wiring to use the generic external auth API
The TUI’s `/feedback` flow was still uploading directly through the
local feedback crate, which bypassed app-server behavior such as
auth-derived feedback tags like `chatgpt_user_id` and made TUI feedback
handling diverge from other clients. It also meant that remote TUI
sessions failed to upload the correct feedback logs and session details.
Testing: Manually tested `/feedback` flow and confirmed that it didn't
regress.
I noticed that
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml
started failing on my own PR,
https://github.com/openai/codex/pull/16288, even though CI was green
when I merged it.
Apparently, it introduced a lint violation that was [correctly!] caught
by our Cargo-based clippy runner, but not our Bazel-based one.
My next step is to figure out the reason for the delta between the two
setups, but I wanted to get us green again quickly, first.
Adds this:
```rust
properties.insert(
    "fork_turns".to_string(),
    JsonSchema::String {
        description: Some(
            "Optional MultiAgentV2 fork mode. Use `none`, `all`, or a positive integer string such as `3` to fork only the most recent turns."
                .to_string(),
        ),
    },
);
```
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Fixes #15189.
Custom model providers that set `requires_openai_auth = false` could
only use static credentials via `env_key` or
`experimental_bearer_token`. That is not enough for providers that mint
short-lived bearer tokens, because Codex had no way to run a command to
obtain a bearer token, cache it briefly in memory, and retry with a
refreshed token after a `401`.
This PR adds that provider config and wires it through the existing auth
design: request paths still go through `AuthManager.auth()` and
`UnauthorizedRecovery`, with `core` only choosing when to use a
provider-backed bearer-only `AuthManager`.
## Scope
To keep this PR reviewable, `/models` only uses provider auth for the
initial request in this change. It does **not** add a dedicated `401`
retry path for `/models`; that can be follow-up work if we still need it
after landing the main provider-token support.
## Example Usage
```toml
model_provider = "corp-openai"
[model_providers.corp-openai]
name = "Corp OpenAI"
base_url = "https://gateway.example.com/openai"
requires_openai_auth = false
[model_providers.corp-openai.auth]
command = "gcloud"
args = ["auth", "print-access-token"]
timeout_ms = 5000
refresh_interval_ms = 300000
```
The command contract is intentionally small:
- write the bearer token to `stdout`
- exit `0`
- any leading or trailing whitespace is trimmed before the token is used
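The contract above can be sketched as a small helper. This is an illustration only: the function names are hypothetical, timeout handling is omitted, and rejecting an empty token is an assumption of the sketch rather than something the contract states.

```rust
use std::process::Command;

/// Applies the contract to a finished command: successful exit plus trimmed stdout.
/// Rejecting an empty token is an assumption made for this sketch.
fn token_from_output(success: bool, stdout: &[u8]) -> Result<String, String> {
    if !success {
        return Err("auth command exited with a non-zero status".to_string());
    }
    let token = String::from_utf8_lossy(stdout).trim().to_string();
    if token.is_empty() {
        return Err("auth command printed an empty token".to_string());
    }
    Ok(token)
}

/// Runs the command and extracts the token. `timeout_ms` handling is omitted.
fn run_token_command(program: &str, args: &[&str]) -> Result<String, String> {
    let output = Command::new(program)
        .args(args)
        .output()
        .map_err(|err| format!("failed to run {program}: {err}"))?;
    token_from_output(output.status.success(), &output.stdout)
}
```

For the example config, this would correspond to calling `run_token_command("gcloud", &["auth", "print-access-token"])`.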
## What Changed
- add `model_providers.<id>.auth` to the config model and generated
schema
- validate that command-backed provider auth is mutually exclusive with
`env_key`, `experimental_bearer_token`, and `requires_openai_auth`
- build a bearer-only `AuthManager` for `ModelClient` and
`ModelsManager` when a provider configures `auth`
- let normal Responses requests and realtime websocket connects use the
provider-backed bearer source through the same `AuthManager.auth()` path
- allow `/models` online refresh for command-auth providers and attach
the provider token to the initial `/models` request
- keep `auth.cwd` available as an advanced escape hatch and include it
in the generated config schema
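The mutual-exclusion rule from the list above could be validated along these lines. The `ProviderEntry` struct is a hypothetical, trimmed-down model (with `auth_command` standing in for the whole `auth` table), and the error wording is invented for the sketch.

```rust
// Hypothetical, trimmed-down provider entry: only the fields named in the
// text are modeled, and `auth_command` stands in for the `auth` table.
struct ProviderEntry {
    env_key: Option<String>,
    experimental_bearer_token: Option<String>,
    requires_openai_auth: bool,
    auth_command: Option<String>,
}

/// Rejects configs that combine command-backed auth with another credential source.
fn validate_provider_auth(entry: &ProviderEntry) -> Result<(), String> {
    if entry.auth_command.is_none() {
        return Ok(());
    }
    let mut conflicts = Vec::new();
    if entry.env_key.is_some() {
        conflicts.push("env_key");
    }
    if entry.experimental_bearer_token.is_some() {
        conflicts.push("experimental_bearer_token");
    }
    if entry.requires_openai_auth {
        conflicts.push("requires_openai_auth");
    }
    if conflicts.is_empty() {
        Ok(())
    } else {
        Err(format!("`auth` cannot be combined with: {}", conflicts.join(", ")))
    }
}
```

Collecting every conflict before returning lets a single error message list all offending keys at once.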
## Testing
- `cargo test -p codex-core provider_auth_command`
- `cargo test -p codex-core
refresh_available_models_uses_provider_auth_token`
- `cargo test -p codex-core
test_deserialize_provider_auth_config_defaults`
## Docs
- `developers.openai.com/codex` should document the new
`[model_providers.<id>.auth]` block and the token-command contract
## Summary
`AuthManager` and `UnauthorizedRecovery` already own token resolution
and staged `401` recovery. The missing piece for provider auth was a
bearer-only mode that still fit that design, instead of pushing a second
auth abstraction into `codex-core`.
This PR keeps the design centered on `AuthManager`: it teaches
`codex-login` how to own external bearer auth directly so later provider
work can keep calling `AuthManager.auth()` and `UnauthorizedRecovery`.
## Motivation
This is the middle layer for #15189.
The intended design is still:
- `AuthManager` encapsulates token storage and refresh
- `UnauthorizedRecovery` powers staged `401` recovery
- all request tokens go through `AuthManager.auth()`
This PR makes that possible for provider-backed bearer tokens by adding
a bearer-only auth mode inside `AuthManager` instead of building
parallel request-auth plumbing in `core`.
## What Changed
- move `ModelProviderAuthInfo` into `codex-protocol` so `core` and
`login` share one config shape
- add `login/src/auth/external_bearer.rs`, which runs the configured
command, caches the bearer token in memory, and refreshes it after `401`
- add `AuthManager::external_bearer_only(...)` for provider-scoped
request paths that should use command-backed bearer auth without
mutating the shared OpenAI auth manager
- add `AuthManager::shared_with_external_chatgpt_auth_refresher(...)`
and rename the other `AuthManager` helpers that only apply to external
ChatGPT auth so the ChatGPT-only path is explicit at the call site
- keep external ChatGPT refresh behavior unchanged while ensuring
bearer-only external auth never persists to `auth.json`
## Testing
- `cargo test -p codex-login`
- `cargo test -p codex-protocol`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16287).
* #16288
* __->__ #16287
## Summary
`ExternalAuthRefresher` was still shaped around external ChatGPT auth:
`ExternalAuthTokens` always implied ChatGPT account metadata even when a
caller only needed a bearer token.
This PR generalizes that contract so bearer-only sources are
first-class, while keeping the existing ChatGPT paths strict anywhere we
persist or rebuild ChatGPT auth state.
## Motivation
This is the first step toward #15189.
The follow-on provider-auth work needs one shared external-auth contract
that can do both of these things:
- resolve the current bearer token before a request is sent
- return a refreshed bearer token after a `401`
That should not require a second token result type just because there is
no ChatGPT account metadata attached.
## What Changed
- change `ExternalAuthTokens` to carry `access_token` plus optional
`ExternalAuthChatgptMetadata`
- add helper constructors for bearer-only tokens and ChatGPT-backed
tokens
- add `ExternalAuthRefresher::resolve()` with a default no-op
implementation so refreshers can optionally provide the current token
before a request is sent
- keep ChatGPT-only persistence strict by continuing to require ChatGPT
metadata anywhere the login layer seeds or reloads ChatGPT auth state
- update the app-server bridge to construct the new token shape for
external ChatGPT auth refreshes
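The generalized contract might be shaped roughly as below. Only `access_token`, `ExternalAuthChatgptMetadata`, `ExternalAuthTokens`, and `resolve()` are named in the description; the `account_id` field, the constructor names, the `refresh` signature, and the synchronous trait are all assumptions made to keep the sketch self-contained.

```rust
// Hypothetical shapes based on the description; see the lead-in for which
// names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct ExternalAuthChatgptMetadata {
    account_id: String,
}

#[derive(Debug, Clone, PartialEq)]
struct ExternalAuthTokens {
    access_token: String,
    chatgpt_metadata: Option<ExternalAuthChatgptMetadata>,
}

impl ExternalAuthTokens {
    /// Bearer-only token: no ChatGPT metadata attached.
    fn bearer_only(access_token: impl Into<String>) -> Self {
        Self { access_token: access_token.into(), chatgpt_metadata: None }
    }

    /// ChatGPT-backed token: metadata is required by construction.
    fn chatgpt(access_token: impl Into<String>, metadata: ExternalAuthChatgptMetadata) -> Self {
        Self { access_token: access_token.into(), chatgpt_metadata: Some(metadata) }
    }
}

trait ExternalAuthRefresher {
    /// Default no-op: refreshers that cannot supply a current token return `None`.
    fn resolve(&self) -> Option<ExternalAuthTokens> {
        None
    }

    /// Called after a `401` to obtain a fresh token.
    fn refresh(&self) -> Result<ExternalAuthTokens, String>;
}

// Minimal implementor that relies on the default `resolve()`.
struct StaticBearer(String);

impl ExternalAuthRefresher for StaticBearer {
    fn refresh(&self) -> Result<ExternalAuthTokens, String> {
        Ok(ExternalAuthTokens::bearer_only(self.0.clone()))
    }
}
```

Making the metadata optional lets one result type serve both bearer-only and ChatGPT-backed sources, while code that persists ChatGPT state can still insist on `Some(metadata)`.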
## Testing
- `cargo test -p codex-login`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16286).
* #16288
* #16287
* __->__ #16286
## Why
Follow-up to #16106.
`argument-comment-lint` already runs as a native Bazel aspect on Linux
and macOS, but Windows is still the long pole in `rust-ci`. To move
Windows onto the same native Bazel lane, the toolchain split has to let
exec-side helper binaries build in an MSVC environment while still
linting repo crates as `windows-gnullvm`.
Pushing the Windows lane onto the native Bazel path exposed a second
round of Windows-only issues in the mixed exec-toolchain plumbing after
the initial wrapper/target fixes landed.
## What Changed
- keep the Windows lint lanes on the native Bazel/aspect path in
`rust-ci.yml` and `rust-ci-full.yml`
- add a dedicated `local_windows_msvc` platform for exec-side helper
binaries while keeping `local_windows` as the `windows-gnullvm` target
platform
- patch `rules_rust` so `repository_set(...)` preserves explicit
exec-platform constraints for the generated toolchains, keep the
Windows-specific bootstrap/direct-link fixes needed for the nightly lint
driver, and expose exec-side `rustc-dev` `.rlib`s to the MSVC sysroot
- register the custom Windows nightly toolchain set with MSVC exec
constraints while still exposing both `x86_64-pc-windows-msvc` and
`x86_64-pc-windows-gnullvm` targets
- enable `dev_components` on the custom Windows nightly repository set
so the MSVC exec helper toolchain actually downloads the
compiler-internal crates that `clippy_utils` needs
- teach `run-argument-comment-lint-bazel.sh` to enumerate concrete
Windows Rust rules, normalize the resulting labels, and skip explicitly
requested incompatible targets instead of failing before the lint run
starts
- patch `rules_rust` build-script env propagation so exec-side
`windows-msvc` helper crates drop forwarded MinGW include and linker
search paths as whole flag/path pairs instead of emitting malformed
`CFLAGS`, `CXXFLAGS`, and `LDFLAGS`
- export the Windows VS/MSVC SDK environment in `setup-bazel-ci` and
pass the relevant variables through `run-bazel-ci.sh` via `--action_env`
/ `--host_action_env` so Bazel build scripts can see the MSVC and UCRT
headers on native Windows runs
- add inline comments to the Windows `setup-bazel-ci` MSVC environment
export step so it is easier to audit how `vswhere`, `VsDevCmd.bat`, and
the filtered `GITHUB_ENV` export fit together
- patch `aws-lc-sys` to skip its standalone `memcmp` probe under Bazel
`windows-msvc` build-script environments, which avoids a Windows-native
toolchain mismatch that blocked the lint lane before it reached aspect
execution
- patch `aws-lc-sys` to prefer its bundled `prebuilt-nasm` objects for
Bazel `windows-msvc` build-script runs, which avoids missing
`generated-src/win-x86_64/*.asm` runfiles in the exec-side helper
toolchain
- annotate the Linux test-only callsites in `codex-rs/linux-sandbox` and
`codex-rs/core` that the wider native lint coverage surfaced
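
To illustrate the "whole flag/path pairs" point from the build-script env bullet above: if a filter removes only the MinGW path, a dangling `-isystem` or `-I` is left behind and the resulting `CFLAGS` is malformed. The sketch below is in Rust for illustration only and is not the actual `rules_rust` patch. The function names, the set of flags handled, and the "path contains `mingw`" heuristic are all assumptions, and it splits on whitespace, so paths containing spaces are not handled.

```rust
/// Flags whose path argument may follow as a separate token.
/// The exact set handled by the real patch is not stated in the source.
const PAIR_FLAGS: [&str; 3] = ["-I", "-isystem", "-L"];

/// Hypothetical heuristic for recognising a MinGW search path.
fn is_mingw_path(path: &str) -> bool {
    path.to_ascii_lowercase().contains("mingw")
}

/// Drops MinGW include/linker search paths from a flag string, removing the
/// flag together with its path so no dangling flag is left behind.
pub fn drop_mingw_search_paths(flags: &str) -> String {
    let tokens: Vec<&str> = flags.split_whitespace().collect();
    let mut kept: Vec<&str> = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let tok = tokens[i];

        // Separate form: `-isystem <path>`, `-I <path>`, `-L <path>`.
        if PAIR_FLAGS.contains(&tok) {
            if let Some(&path) = tokens.get(i + 1) {
                if !is_mingw_path(path) {
                    kept.push(tok);
                    kept.push(path);
                }
                // Either way, the flag and its path are consumed together.
                i += 2;
                continue;
            }
        }

        // Joined form: `-I<path>`, `-L<path>`.
        let joined_mingw = PAIR_FLAGS.iter().any(|flag| {
            tok.strip_prefix(flag)
                .is_some_and(|rest| !rest.is_empty() && is_mingw_path(rest))
        });
        if !joined_mingw {
            kept.push(tok);
        }
        i += 1;
    }
    kept.join(" ")
}
```

For example, given `-O2 -isystem C:/msys64/mingw64/include -I include -DFOO`, this keeps `-O2 -I include -DFOO`, whereas a path-only filter would leave `-isystem` followed directly by `-I`.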
## Patches
This PR introduces a large patch stack because the Windows Bazel lint
lane currently depends on behavior that upstream dependencies do not
provide out of the box in the mixed `windows-gnullvm` target /
`windows-msvc` exec-toolchain setup.
- Most of the `rules_rust` patches look like upstream candidates rather
than OpenAI-only policy. Preserving explicit exec-platform constraints,
forwarding the right MSVC/UCRT environment into exec-side build scripts,
exposing exec-side `rustc-dev` artifacts, and keeping the Windows
bootstrap/linker behavior coherent all look like fixes to the Bazel/Rust
integration layer itself.
- The two `aws-lc-sys` patches are more tactical. They special-case
Bazel `windows-msvc` build-script environments to avoid a `memcmp` probe
mismatch and missing NASM runfiles. Those may be harder to upstream
as-is because they rely on Bazel-specific detection instead of a general
Cargo/build-script contract.
- Short term, carrying these patches in-tree is reasonable because they
unblock a real CI lane and are still narrow enough to audit. Long term,
the goal should not be to keep growing a permanent local fork of either
dependency.
- My current expectation is that the `rules_rust` patches are less
controversial and should be broken out into focused upstream proposals,
while the `aws-lc-sys` patches are more likely to be temporary escape
hatches unless that crate wants a more general hook for hermetic build
systems.
Suggested follow-up plan:
1. Split the `rules_rust` deltas into upstream-sized PRs or issues with
minimized repros.
2. Revisit the `aws-lc-sys` patches during the next dependency bump and
see whether they can be replaced by an upstream fix, a crate upgrade, or
a cleaner opt-in mechanism.
3. Treat each dependency update as a chance to delete patches one by one
so the local patch set only contains still-needed deltas.
## Verification
- `./.github/scripts/run-argument-comment-lint-bazel.sh --config=argument-comment-lint --keep_going`
- `RUNNER_OS=Windows ./.github/scripts/run-argument-comment-lint-bazel.sh --nobuild --config=argument-comment-lint --platforms=//:local_windows --keep_going`
- `cargo test -p codex-linux-sandbox`
- `cargo test -p codex-core shell_snapshot_tests`
- `just argument-comment-lint`
## References
- #16106