## Why
Follow-up to #16351.
That PR synchronized Bazel clippy lint levels with Cargo, but two
intentional `expect()` calls in `codex-rs/tui/src/status/card.rs` still
tripped `clippy::expect_used` (I believe #16201 raced with #16351, which
is why it was missed).
## Why
Follow-up to #16345, the Bazel clippy rollout in #15955, and the cleanup
pass in #16353.
`cargo clippy` was enforcing the workspace deny-list from
`codex-rs/Cargo.toml` because the member crates opt into `[lints]
workspace = true`, but Bazel clippy was only using `rules_rust` plus
`clippy.toml`. That left the Bazel lane vulnerable to drift:
`clippy.toml` can tune lint behavior, but it cannot set
allow/warn/deny/forbid levels.
This PR now closes both sides of the follow-up. It keeps `.bazelrc` in
sync with `[workspace.lints.clippy]`, and it fixes the real clippy
violations that the newly-synced Windows Bazel lane surfaced once that
deny-list started matching Cargo.
## What Changed
- added `.github/scripts/verify_bazel_clippy_lints.py`, a Python check
that parses `codex-rs/Cargo.toml` with `tomllib`, reads the Bazel
`build:clippy` `clippy_flag` entries from `.bazelrc`, and reports
missing, extra, or mismatched lint levels
- ran that verifier from the lightweight `ci.yml` workflow so the sync
check does not depend on a Rust toolchain being installed first
- expanded the `.bazelrc` comment to explain the Cargo `workspace =
true` linkage and why Bazel needs the deny-list duplicated explicitly
- fixed the Windows-only `codex-windows-sandbox` violations that Bazel
clippy reported after the sync, using the same style as #16353: inline
`format!` args, method references instead of trivial closures, removed
redundant clones, and replaced SID conversion `unwrap` and `expect`
calls with proper errors
- cleaned up the remaining cross-platform violations the Bazel lane
exposed in `codex-backend-client` and `core_test_support`
## Testing
Key new test introduced by this PR:
`python3 .github/scripts/verify_bazel_clippy_lints.py`
Fix stale weekly limit in `/status` (#16194): /status reused the
session’s cached rate-limit snapshot, so the weekly remaining limit
could stay frozen within an active session.
With this change, we now dynamically update the rate limits after status
is displayed.
I needed to delete a few low-value test cases from the chatWidget tests
because the test.rs file is really large, and the new tests in this PR
pushed us over the 512K mandated limit. I'm working on a separate PR to
refactor that test file.
Problem: `chatwidget/tests.rs` had grown into a single oversized test
blob that was hard to maintain and exceeded the repo's blob size limit.
Solution: split the chatwidget tests into topical modules with a thin
root `tests.rs`, shared helper utilities, preserved snapshot naming, and
hermetic test config so the refactor stays stable and passes the
`codex-tui` test suite.
## Why
Bazel clippy now catches lints that `cargo clippy` can still miss when a
crate under `codex-rs` forgets to opt into workspace lints. The concrete
example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
flagged a clippy violation in `models_cache.rs`, but Cargo did not
because that crate inherited workspace package metadata without
declaring `[lints] workspace = true`.
We already mirror the workspace clippy deny list into Bazel after
[#15955](https://github.com/openai/codex/pull/15955), so we also need a
repo-side check that keeps every `codex-rs` manifest opted into the same
workspace settings.
## What changed
- add `.github/scripts/verify_cargo_workspace_manifests.py`, which
parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
- `version.workspace = true`
- `edition.workspace = true`
- `license.workspace = true`
- `[lints] workspace = true`
- top-level crate names follow the `codex-*` / `codex-utils-*`
conventions, with explicit exceptions for `windows-sandbox-rs` and
`utils/path-utils`
- run that script in `.github/workflows/ci.yml`
- update the current outlier manifests so the check is enforceable
immediately
- fix the newly exposed clippy violations in the affected crates
(`app-server/tests/common`, `file-search`, `feedback`,
`shell-escalation`, and `debug-client`)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
* #16351
* __->__ #16353
## Why
https://github.com/openai/codex/pull/16287 introduced a change to
`codex-rs/login/src/auth/auth_tests.rs` that uses a PowerShell helper to
read the next token from `tokens.txt` and rewrite the remainder back to
disk. On Windows, `Get-Content` can return a scalar when the file has
only one remaining line, so `$lines[0]` reads the first character
instead of the full token. That breaks the external bearer refresh test
once the token list is nearly exhausted.
https://github.com/openai/codex/pull/16288 introduced similar changes to
`codex-rs/core/src/models_manager/manager_tests.rs` and
`codex-rs/core/tests/suite/client.rs`.
These went unnoticed because the failures showed up when the test was
run via Cargo on Windows, but not in our Bazel harness. Figuring out
that Cargo-vs-Bazel delta will happen in a follow-up PR.
## Verification
On my Windows machine, I verified `cargo test` passes when run in
`codex-rs/login` and `codex-rs/core`. Once this PR is merged, I will
keep an eye on
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml to
verify it goes green.
## What changed
- Wrap `Get-Content -Path tokens.txt` in `@(...)` so the script always
gets array semantics before counting, indexing, and rewriting the
remaining lines.
## Summary
- Replace the separate external auth enum and refresher trait with a
single `ExternalAuth` trait in login auth flow
- Move bearer token auth behind `BearerTokenRefresher` and update
`AuthManager` and app-server wiring to use the generic external auth API
The TUI’s `/feedback` flow was still uploading directly through the
local feedback crate, which bypassed app-server behavior such as
auth-derived feedback tags like chatgpt_user_id and made TUI feedback
handling diverge from other clients. It also meant that remove TUI
sessions failed to upload the correct feedback logs and session details.
Testing: Manually tested `/feedback` flow and confirmed that it didn't
regress.
I noticed that
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml
started failing on my own PR,
https://github.com/openai/codex/pull/16288, even though CI was green
when I merged it.
Apparently, it introduced a lint violation that was [correctly!] caught
by our Cargo-based clippy runner, but not our Bazel-based one.
My next step is to figure out the reason for the delta between the two
setups, but I wanted to get us green again quickly, first.
Adds this:
```
properties.insert(
"fork_turns".to_string(),
JsonSchema::String {
description: Some(
"Optional MultiAgentV2 fork mode. Use `none`, `all`, or a positive integer string such as `3` to fork only the most recent turns."
.to_string(),
),
},
);
```
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Fixes#15189.
Custom model providers that set `requires_openai_auth = false` could
only use static credentials via `env_key` or
`experimental_bearer_token`. That is not enough for providers that mint
short-lived bearer tokens, because Codex had no way to run a command to
obtain a bearer token, cache it briefly in memory, and retry with a
refreshed token after a `401`.
This PR adds that provider config and wires it through the existing auth
design: request paths still go through `AuthManager.auth()` and
`UnauthorizedRecovery`, with `core` only choosing when to use a
provider-backed bearer-only `AuthManager`.
## Scope
To keep this PR reviewable, `/models` only uses provider auth for the
initial request in this change. It does **not** add a dedicated `401`
retry path for `/models`; that can be follow-up work if we still need it
after landing the main provider-token support.
## Example Usage
```toml
model_provider = "corp-openai"
[model_providers.corp-openai]
name = "Corp OpenAI"
base_url = "https://gateway.example.com/openai"
requires_openai_auth = false
[model_providers.corp-openai.auth]
command = "gcloud"
args = ["auth", "print-access-token"]
timeout_ms = 5000
refresh_interval_ms = 300000
```
The command contract is intentionally small:
- write the bearer token to `stdout`
- exit `0`
- any leading or trailing whitespace is trimmed before the token is used
## What Changed
- add `model_providers.<id>.auth` to the config model and generated
schema
- validate that command-backed provider auth is mutually exclusive with
`env_key`, `experimental_bearer_token`, and `requires_openai_auth`
- build a bearer-only `AuthManager` for `ModelClient` and
`ModelsManager` when a provider configures `auth`
- let normal Responses requests and realtime websocket connects use the
provider-backed bearer source through the same `AuthManager.auth()` path
- allow `/models` online refresh for command-auth providers and attach
the provider token to the initial `/models` request
- keep `auth.cwd` available as an advanced escape hatch and include it
in the generated config schema
## Testing
- `cargo test -p codex-core provider_auth_command`
- `cargo test -p codex-core
refresh_available_models_uses_provider_auth_token`
- `cargo test -p codex-core
test_deserialize_provider_auth_config_defaults`
## Docs
- `developers.openai.com/codex` should document the new
`[model_providers.<id>.auth]` block and the token-command contract
## Summary
`AuthManager` and `UnauthorizedRecovery` already own token resolution
and staged `401` recovery. The missing piece for provider auth was a
bearer-only mode that still fit that design, instead of pushing a second
auth abstraction into `codex-core`.
This PR keeps the design centered on `AuthManager`: it teaches
`codex-login` how to own external bearer auth directly so later provider
work can keep calling `AuthManager.auth()` and `UnauthorizedRecovery`.
## Motivation
This is the middle layer for #15189.
The intended design is still:
- `AuthManager` encapsulates token storage and refresh
- `UnauthorizedRecovery` powers staged `401` recovery
- all request tokens go through `AuthManager.auth()`
This PR makes that possible for provider-backed bearer tokens by adding
a bearer-only auth mode inside `AuthManager` instead of building
parallel request-auth plumbing in `core`.
## What Changed
- move `ModelProviderAuthInfo` into `codex-protocol` so `core` and
`login` share one config shape
- add `login/src/auth/external_bearer.rs`, which runs the configured
command, caches the bearer token in memory, and refreshes it after `401`
- add `AuthManager::external_bearer_only(...)` for provider-scoped
request paths that should use command-backed bearer auth without
mutating the shared OpenAI auth manager
- add `AuthManager::shared_with_external_chatgpt_auth_refresher(...)`
and rename the other `AuthManager` helpers that only apply to external
ChatGPT auth so the ChatGPT-only path is explicit at the call site
- keep external ChatGPT refresh behavior unchanged while ensuring
bearer-only external auth never persists to `auth.json`
## Testing
- `cargo test -p codex-login`
- `cargo test -p codex-protocol`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16287).
* #16288
* __->__ #16287
## Summary
`ExternalAuthRefresher` was still shaped around external ChatGPT auth:
`ExternalAuthTokens` always implied ChatGPT account metadata even when a
caller only needed a bearer token.
This PR generalizes that contract so bearer-only sources are
first-class, while keeping the existing ChatGPT paths strict anywhere we
persist or rebuild ChatGPT auth state.
## Motivation
This is the first step toward #15189.
The follow-on provider-auth work needs one shared external-auth contract
that can do both of these things:
- resolve the current bearer token before a request is sent
- return a refreshed bearer token after a `401`
That should not require a second token result type just because there is
no ChatGPT account metadata attached.
## What Changed
- change `ExternalAuthTokens` to carry `access_token` plus optional
`ExternalAuthChatgptMetadata`
- add helper constructors for bearer-only tokens and ChatGPT-backed
tokens
- add `ExternalAuthRefresher::resolve()` with a default no-op
implementation so refreshers can optionally provide the current token
before a request is sent
- keep ChatGPT-only persistence strict by continuing to require ChatGPT
metadata anywhere the login layer seeds or reloads ChatGPT auth state
- update the app-server bridge to construct the new token shape for
external ChatGPT auth refreshes
## Testing
- `cargo test -p codex-login`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16286).
* #16288
* #16287
* __->__ #16286
## Why
Follow-up to #16106.
`argument-comment-lint` already runs as a native Bazel aspect on Linux
and macOS, but Windows is still the long pole in `rust-ci`. To move
Windows onto the same native Bazel lane, the toolchain split has to let
exec-side helper binaries build in an MSVC environment while still
linting repo crates as `windows-gnullvm`.
Pushing the Windows lane onto the native Bazel path exposed a second
round of Windows-only issues in the mixed exec-toolchain plumbing after
the initial wrapper/target fixes landed.
## What Changed
- keep the Windows lint lanes on the native Bazel/aspect path in
`rust-ci.yml` and `rust-ci-full.yml`
- add a dedicated `local_windows_msvc` platform for exec-side helper
binaries while keeping `local_windows` as the `windows-gnullvm` target
platform
- patch `rules_rust` so `repository_set(...)` preserves explicit
exec-platform constraints for the generated toolchains, keep the
Windows-specific bootstrap/direct-link fixes needed for the nightly lint
driver, and expose exec-side `rustc-dev` `.rlib`s to the MSVC sysroot
- register the custom Windows nightly toolchain set with MSVC exec
constraints while still exposing both `x86_64-pc-windows-msvc` and
`x86_64-pc-windows-gnullvm` targets
- enable `dev_components` on the custom Windows nightly repository set
so the MSVC exec helper toolchain actually downloads the
compiler-internal crates that `clippy_utils` needs
- teach `run-argument-comment-lint-bazel.sh` to enumerate concrete
Windows Rust rules, normalize the resulting labels, and skip explicitly
requested incompatible targets instead of failing before the lint run
starts
- patch `rules_rust` build-script env propagation so exec-side
`windows-msvc` helper crates drop forwarded MinGW include and linker
search paths as whole flag/path pairs instead of emitting malformed
`CFLAGS`, `CXXFLAGS`, and `LDFLAGS`
- export the Windows VS/MSVC SDK environment in `setup-bazel-ci` and
pass the relevant variables through `run-bazel-ci.sh` via `--action_env`
/ `--host_action_env` so Bazel build scripts can see the MSVC and UCRT
headers on native Windows runs
- add inline comments to the Windows `setup-bazel-ci` MSVC environment
export step so it is easier to audit how `vswhere`, `VsDevCmd.bat`, and
the filtered `GITHUB_ENV` export fit together
- patch `aws-lc-sys` to skip its standalone `memcmp` probe under Bazel
`windows-msvc` build-script environments, which avoids a Windows-native
toolchain mismatch that blocked the lint lane before it reached the
aspect execution
- patch `aws-lc-sys` to prefer its bundled `prebuilt-nasm` objects for
Bazel `windows-msvc` build-script runs, which avoids missing
`generated-src/win-x86_64/*.asm` runfiles in the exec-side helper
toolchain
- annotate the Linux test-only callsites in `codex-rs/linux-sandbox` and
`codex-rs/core` that the wider native lint coverage surfaced
## Patches
This PR introduces a large patch stack because the Windows Bazel lint
lane currently depends on behavior that upstream dependencies do not
provide out of the box in the mixed `windows-gnullvm` target /
`windows-msvc` exec-toolchain setup.
- Most of the `rules_rust` patches look like upstream candidates rather
than OpenAI-only policy. Preserving explicit exec-platform constraints,
forwarding the right MSVC/UCRT environment into exec-side build scripts,
exposing exec-side `rustc-dev` artifacts, and keeping the Windows
bootstrap/linker behavior coherent all look like fixes to the Bazel/Rust
integration layer itself.
- The two `aws-lc-sys` patches are more tactical. They special-case
Bazel `windows-msvc` build-script environments to avoid a `memcmp` probe
mismatch and missing NASM runfiles. Those may be harder to upstream
as-is because they rely on Bazel-specific detection instead of a general
Cargo/build-script contract.
- Short term, carrying these patches in-tree is reasonable because they
unblock a real CI lane and are still narrow enough to audit. Long term,
the goal should not be to keep growing a permanent local fork of either
dependency.
- My current expectation is that the `rules_rust` patches are less
controversial and should be broken out into focused upstream proposals,
while the `aws-lc-sys` patches are more likely to be temporary escape
hatches unless that crate wants a more general hook for hermetic build
systems.
Suggested follow-up plan:
1. Split the `rules_rust` deltas into upstream-sized PRs or issues with
minimized repros.
2. Revisit the `aws-lc-sys` patches during the next dependency bump and
see whether they can be replaced by an upstream fix, a crate upgrade, or
a cleaner opt-in mechanism.
3. Treat each dependency update as a chance to delete patches one by one
so the local patch set only contains still-needed deltas.
## Verification
- `./.github/scripts/run-argument-comment-lint-bazel.sh
--config=argument-comment-lint --keep_going`
- `RUNNER_OS=Windows
./.github/scripts/run-argument-comment-lint-bazel.sh --nobuild
--config=argument-comment-lint --platforms=//:local_windows
--keep_going`
- `cargo test -p codex-linux-sandbox`
- `cargo test -p codex-core shell_snapshot_tests`
- `just argument-comment-lint`
## References
- #16106
## Why
The Bazel-backed `argument-comment-lint` CI path had two gaps:
- Bazel wildcard target expansion skipped inline unit-test crates from
`src/` modules because the generated `*-unit-tests-bin` `rust_test`
targets are tagged `manual`.
- `argument-comment-mismatch` was still only a warning in the Bazel and
packaged-wrapper entrypoints, so a typoed `/*param_name*/` comment could
still pass CI even when the lint detected it.
That left CI blind to real linux-sandbox examples, including the missing
`/*local_port*/` comment in
`codex-rs/linux-sandbox/src/proxy_routing.rs` and typoed argument
comments in `codex-rs/linux-sandbox/src/landlock.rs`.
## What Changed
- Added `tools/argument-comment-lint/list-bazel-targets.sh` so Bazel
lint runs cover `//codex-rs/...` plus the manual `rust_test`
`*-unit-tests-bin` targets.
- Updated `just argument-comment-lint`, `rust-ci.yml`, and
`rust-ci-full.yml` to use that helper.
- Promoted both `argument-comment-mismatch` and
`uncommented-anonymous-literal-argument` to errors in every strict
entrypoint:
- `tools/argument-comment-lint/lint_aspect.bzl`
- `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
- `tools/argument-comment-lint/wrapper_common.py`
- Added wrapper/bin coverage for the stricter lint flags and documented
the behavior in `tools/argument-comment-lint/README.md`.
- Fixed the now-covered callsites in
`codex-rs/linux-sandbox/src/proxy_routing.rs`,
`codex-rs/linux-sandbox/src/landlock.rs`, and
`codex-rs/core/src/shell_snapshot_tests.rs`.
This keeps the Bazel target expansion narrow while making the Bazel and
prebuilt-linter paths enforce the same strict lint set.
## Verification
- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `cargo +nightly-2025-09-18 test --manifest-path
tools/argument-comment-lint/Cargo.toml`
- `just argument-comment-lint`
## Why
`#16193` moved the pure `tool_search` and `tool_suggest` spec builders
into `codex-tools`, but `codex-core` still owned the shared
discoverable-tool model that those builders and the `tool_suggest`
runtime both depend on. This change continues the migration by moving
that reusable model boundary out of `codex-core` as well, so the
discovery/suggestion stack uses one shared set of types and
`core/src/tools` no longer needs its own `discoverable.rs` module.
## What changed
- Moved `DiscoverableTool`, `DiscoverablePluginInfo`, and
`filter_tool_suggest_discoverable_tools_for_client()` into
`codex-rs/tools/src/tool_discovery.rs` alongside the extracted
discovery/suggestion spec builders.
- Added `codex-app-server-protocol` as a `codex-tools` dependency so the
shared discoverable-tool model can own the connector-side `AppInfo`
variant directly.
- Updated `core/src/tools/handlers/tool_suggest.rs`,
`core/src/tools/spec.rs`, `core/src/tools/router.rs`,
`core/src/connectors.rs`, and `core/src/codex.rs` to consume the shared
`codex-tools` model instead of the old core-local declarations.
- Changed `core/src/plugins/discoverable.rs` to return
`DiscoverablePluginInfo` directly, moved the pure client-filter coverage
into `tool_discovery_tests.rs`, and deleted the old
`core/src/tools/discoverable.rs` module.
- Updated `codex-rs/tools/README.md` so the crate boundary documents
that `codex-tools` now owns the discoverable-tool models in addition to
the discovery/suggestion spec builders.
## Test plan
- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib tools::handlers::tool_suggest::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib plugins::discoverable::`
- `just bazel-lock-check`
- `just argument-comment-lint`
## References
- #16193
- #16154
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
## Why
`core/src/tools/spec.rs` still owned the pure `tool_search` and
`tool_suggest` spec builders even though that logic no longer needed
`codex-core` runtime state. This change continues the `codex-tools`
migration by moving the reusable discovery and suggestion spec
construction out of `codex-core` so `spec.rs` is left with the
core-owned policy decisions about when these tools are exposed and what
metadata is available.
## What changed
- Added `codex-rs/tools/src/tool_discovery.rs` with the shared
`tool_search` and `tool_suggest` spec builders, plus focused unit tests
in `tool_discovery_tests.rs`.
- Moved the shared `DiscoverableToolAction` and `DiscoverableToolType`
declarations into `codex-tools` so the `tool_suggest` handler and the
extracted spec builders use the same wire-model enums.
- Updated `core/src/tools/spec.rs` to translate `ToolInfo` and
`DiscoverableTool` values into neutral `codex-tools` inputs and delegate
the actual spec building there.
- Removed the old template-based description rendering helpers from
`core/src/tools/spec.rs` and deleted the now-dead helper methods in
`core/src/tools/discoverable.rs`.
- Updated `codex-rs/tools/README.md` to document that discovery and
suggestion models/spec builders now live in `codex-tools`.
## Test plan
- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-discovery-specs cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discovery-specs cargo test -p
codex-core --lib tools::handlers::tool_suggest::`
- `just argument-comment-lint`
## References
- #16154
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
## Summary
A Windows-only snapshot assertion in the app-server MCP startup warning
test compared the raw rendered path, so CI saw `C:\tmp\project` instead
of the normalized `/tmp/project` snapshot fixture.
## Fix
Route that snapshot assertion through the existing
`normalize_snapshot_paths(...)` helper so the test remains
platform-stable.
## Why
The previous `codex-tools` migration steps moved the shared schema
models, local-host specs, collaboration specs, and related adapters out
of `codex-core`, but `core/src/tools/spec.rs` still contained a grab bag
of pure utility tool builders. Those specs do not need session state or
handler logic; they only describe wire shapes for tools that
`codex-core` already knows how to execute.
Moving that remaining low-coupling layer into `codex-tools` keeps the
migration moving in meaningful chunks and trims another large block of
passive tool-spec construction out of `codex-core` without touching the
runtime-coupled handlers.
## What changed
- extended `codex-tools` to own the pure spec builders for:
- code-mode `exec` / `wait`
- `js_repl` / `js_repl_reset`
- MCP resource tools `list_mcp_resources`,
`list_mcp_resource_templates`, and `read_mcp_resource`
- utility tools `list_dir` and `test_sync_tool`
- split those builders across small module files with sibling
`*_tests.rs` coverage, keeping `src/lib.rs` exports-only
- rewired `core/src/tools/spec.rs` to call the extracted builders and
deleted the duplicated core-local implementations
- moved the direct JS REPL grammar seam test out of
`core/src/tools/spec_tests.rs` so it now lives with the extracted
implementation in `codex-tools`
- updated `codex-rs/tools/README.md` so the documented crate boundary
matches the new utility-spec surface
## Test plan
- `CARGO_TARGET_DIR=/tmp/codex-tools-utility-specs cargo test -p
codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-utility-specs cargo test -p
codex-core --lib tools::spec::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`
## References
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
Fixes#16092
The app-server-backed TUI could accumulate ghost subagent entries in
`/agent` after resume/backfill flows. Some of those rows were no longer
live according to the backend, but still appeared selectable in the
picker and could open as blank threads.
*Cause*
Unlike the legacy tui behavior, tui_app_server was creating local
picker/replay state for subagents discovered through metadata refresh
and loaded-thread backfill, even when no real local session or
transcript had been attached. That let stale ids survive in the picker
as if they were replayable threads.
*Fix*
Stop creating empty local thread channels during subagent metadata
hydration and loaded-thread backfill.
When opening /agent, prune metadata-only entries that thread/read
reports as terminally unavailable.
When selecting a discovered subagent that is still live but not yet
locally attached, materialize a real local session on demand from
thread/read instead of falling back to an empty replay state.
This addresses #16038
The default `tui_app_server` path stopped surfacing MCP startup failures
during cold start, even though the legacy TUI still showed warnings like
`MCP startup incomplete (...)`. The app-server bridge emitted per-server
startup status notifications, but `tui_app_server` ignored them, so
failed MCP handshakes could look like a clean startup.
This change teaches `tui_app_server` to consume MCP startup status
notifications, preserve the immediate per-server failure warning, and
synthesize the same aggregate startup warning the legacy TUI shows once
startup settles.
## Why
The recent `codex-tools` migration steps have moved shared tool models
and low-coupling spec helpers out of `codex-core`, but
`core/src/tools/spec.rs` still owned a large block of pure
collaboration-tool spec construction. Those builders do not need session
state or runtime behavior; they only need a small amount of core-owned
configuration injected at the seam.
Moving that cohesive slice into `codex-tools` makes the crate boundary
more honest and removes a substantial amount of passive tool-spec logic
from `codex-core` without trying to move the runtime-coupled multi-agent
handlers at the same time.
## What changed
- added `agent_tool.rs`, `request_user_input_tool.rs`, and
`agent_job_tool.rs` to `codex-tools`, with sibling `*_tests.rs` coverage
and an exports-only `lib.rs`
- moved the pure `ToolSpec` builders for:
- collaboration tools such as `spawn_agent`, `send_input`,
`send_message`, `assign_task`, `resume_agent`, `wait_agent`,
`list_agents`, and `close_agent`
- `request_user_input`
- agent-job specs `spawn_agents_on_csv` and `report_agent_job_result`
- rewired `core/src/tools/spec.rs` to call the extracted builders while
still supplying the core-owned inputs, such as spawn-agent role
descriptions and wait timeout bounds
- updated the `core/src/tools/spec.rs` seam tests to build expected
collaboration specs through `codex-tools`
- updated `codex-rs/tools/README.md` so the crate documentation reflects
the broader collaboration-tool boundary
## Test plan
- `CARGO_TARGET_DIR=/tmp/codex-tools-collab-specs cargo test -p
codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-collab-specs cargo test -p
codex-core --lib tools::spec::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`
## References
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
## Why
`core/src/tools/spec.rs` still bundled a set of pure local-host tool
builders with the orchestration that actually decides when those tools
are exposed and which handlers back them. That made `codex-core`
responsible for JSON/tool-shape construction that does not depend on
session state, and it kept the `codex-tools` migration from taking a
meaningfully larger bite out of `spec.rs`.
This PR moves that reusable spec-building layer into `codex-tools` while
leaving feature gating, handler registration, and runtime-coupled
descriptions in `codex-core`.
## What changed
- added `codex-rs/tools/src/local_tool.rs` for the pure builders for
`exec_command`, `write_stdin`, `shell`, `shell_command`, and
`request_permissions`
- added `codex-rs/tools/src/view_image.rs` for the `view_image` tool
spec and output schema so the extracted modules stay right-sized
- rewired `codex-rs/core/src/tools/spec.rs` to call those extracted
builders instead of constructing these specs inline
- kept the `request_permissions` description source in `codex-core`,
with `codex-tools` taking the description as input so the crate boundary
does not grow a dependency on handler/runtime code
- moved the direct constructor coverage for this slice from
`codex-rs/core/src/tools/spec_tests.rs` into
`codex-rs/tools/src/local_tool_tests.rs` and
`codex-rs/tools/src/view_image_tests.rs`
- updated `codex-rs/tools/README.md` to reflect that `codex-tools` now
owns this local-host spec layer
## Test plan
- `CARGO_TARGET_DIR=/tmp/codex-tools-local-host cargo test -p
codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-local-tools cargo test -p codex-core
--lib tools::spec::`
- `just argument-comment-lint`
## References
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
Fixes#16091.
The app-server TUI was truncating the filtered mention candidate list to
`MAX_POPUP_ROWS`, so the `$` skills picker only exposed the first 8
matches. That made it look like many skills were missing and prevented
keyboard navigation beyond the first page, even though direct
`$skill-name` insertion still worked.
Testing: I manually verified the regression and confirmed the fix.
## Why
`thread_start_params_from_config()` is supposed to forward the effective
`approvals_reviewer` into the app-server request, but these tests were
constructing that config through `ConfigBuilder::build()`, which also
loads ambient system and managed config layers. On machines with an
admin or host-level reviewer override, the manual-only case could
inherit `guardian_subagent` and fail even though the exec-side mapping
was correct.
## What changed
- Set `approvals_reviewer` explicitly via `harness_overrides` in the two
`thread_start_params_*review_policy*` tests in
`codex-rs/exec/src/lib.rs`.
- Removed the dependence on default config resolution and temp
`config.toml` writes so the tests exercise only the reviewer-to-request
mapping in `codex-exec`.
## Testing
- `cargo test -p codex-exec`
## Why
The longer-term `codex-tools` migration is to move pure tool-definition
and tool-spec plumbing out of `codex-core` while leaving session- and
runtime-coupled orchestration behind.
The remaining code-mode adapter layer in
`core/src/tools/code_mode_description.rs` was a good next extraction
seam because it only transformed `ToolSpec` values for code mode and
already delegated the low-level description rendering to
`codex-code-mode`.
## What Changed
- added `codex-rs/tools/src/code_mode.rs` with
`augment_tool_spec_for_code_mode()` and
`tool_spec_to_code_mode_tool_definition()`
- added focused unit coverage in `codex-rs/tools/src/code_mode_tests.rs`
- rewired `core/src/tools/spec.rs` and `core/src/tools/code_mode/mod.rs`
to use the extracted adapters from `codex-tools`
- removed the old `core/src/tools/code_mode_description.rs` shim and its
test file from `codex-core`
- added the `codex-code-mode` dependency to `codex-tools`, updated
`Cargo.lock`, and refreshed the `codex-tools` README to reflect the
expanded boundary
## Test Plan
- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
codex-core --lib tools::code_mode::`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just argument-comment-lint`
## References
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
## Why
The `plugin/list` force-sync path can race app-server startup's curated
plugin cache refresh.
Startup was capturing the configured curated plugin IDs from the initial
config snapshot. If `plugin/list` with `forceRemoteSync` removed curated
plugin entries from `config.toml` while that background refresh was
still in flight, the startup task could recreate cache directories for
plugins that had just been uninstalled.
That leaves the `plugin/list` response logically correct but the on-disk
cache stale, which matches the flaky Ubuntu arm failure seen in
`codex-app-server::all
suite::v2::plugin_list::plugin_list_force_remote_sync_reconciles_curated_plugin_state`
while validating [#16047](https://github.com/openai/codex/pull/16047).
## What
- change `codex-rs/core/src/plugins/manager.rs` so startup curated-repo
refresh rereads the current user `config.toml` before deciding which
curated plugin cache entries to refresh
- factor the configured-plugin parsing so the same logic can be reused
from either the config layer stack or the persisted user config value
- add a regression test that verifies curated plugin IDs are read from
the latest user config state before cache refresh runs
## Testing
- `cargo test -p codex-core
configured_curated_plugin_ids_from_codex_home_reads_latest_user_config
-- --nocapture`
- `cargo test -p codex-app-server
suite::v2::plugin_list::plugin_list_force_remote_sync_reconciles_curated_plugin_state
-- --nocapture`
- `just argument-comment-lint`
## Why
This continues the `codex-tools` migration by moving another passive
tool-spec layer out of `codex-core`.
After `ToolSpec` moved into `codex-tools`, `codex-core` still owned
`ConfiguredToolSpec` and `create_tools_json_for_responses_api()`. Both
are data-model and serialization helpers rather than runtime
orchestration, so keeping them in `core/src/tools/registry.rs` and
`core/src/tools/spec.rs` left passive tool-definition code coupled to
`codex-core` longer than necessary.
## What changed
- moved `ConfiguredToolSpec` into `codex-rs/tools/src/tool_spec.rs`
- moved `create_tools_json_for_responses_api()` into
`codex-rs/tools/src/tool_spec.rs`
- re-exported the new surface from `codex-rs/tools/src/lib.rs`, which
remains exports-only
- updated `core/src/client.rs`, `core/src/tools/registry.rs`, and
`core/src/tools/router.rs` to consume the extracted types and serializer
from `codex-tools`
- moved the tool-list serialization test into
`codex-rs/tools/src/tool_spec_tests.rs`
- added focused unit coverage for `ConfiguredToolSpec::name()`
- simplified `core/src/tools/spec_tests.rs` to use the extracted
`ConfiguredToolSpec::name()` directly and removed the now-redundant
local `tool_name()` helper
- updated `codex-rs/tools/README.md` so the crate boundary reflects the
newly extracted tool-spec wrapper and serialization helper
## Test plan
- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-configured-spec cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-configured-spec cargo test -p
codex-core --lib client::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`
## References
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
## Why
This continues the `codex-tools` migration by moving another passive
tool-definition layer out of `codex-core`.
After `ResponsesApiTool` and the lower-level schema adapters moved into
`codex-tools`, `core/src/client_common.rs` was still owning `ToolSpec`
and the web-search request wire types even though they are serialized
data models rather than runtime orchestration. Keeping those types in
`codex-core` makes the crate boundary look smaller than it really is and
leaves non-runtime tool-shape code coupled to core.
## What changed
- moved `ToolSpec`, `ResponsesApiWebSearchFilters`, and
`ResponsesApiWebSearchUserLocation` into
`codex-rs/tools/src/tool_spec.rs`
- added focused unit tests in `codex-rs/tools/src/tool_spec_tests.rs`
for:
- `ToolSpec::name()`
- web-search config conversions
- `ToolSpec` serialization for `web_search` and `tool_search`
- kept `codex-rs/tools/src/lib.rs` exports-only by re-exporting the new
module from `lib.rs`
- reduced `core/src/client_common.rs` to a compatibility shim that
re-exports the extracted tool-spec types for current core call sites
- updated `core/src/tools/spec_tests.rs` to consume the extracted
web-search types directly from `codex-tools`
- updated `codex-rs/tools/README.md` so the crate contract reflects that
`codex-tools` now owns the passive tool-spec request models in addition
to the lower-level Responses API structs
## Test plan
- `cargo test -p codex-tools`
- `cargo test -p codex-core --lib tools::spec::`
- `cargo test -p codex-core --lib client_common::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`
## References
- #15923
- #15928
- #15944
- #15953
- #16031
## Summary
- remove protocol and core support for discovering and listing custom
prompts
- simplify the TUI slash-command flow and command popup to built-in
commands only
- delete obsolete custom prompt tests, helpers, and docs references
- clean up downstream event handling for the removed protocol events
## Why
`argument-comment-lint` had become a PR bottleneck because the repo-wide
lane was still effectively running a `cargo dylint`-style flow across
the workspace instead of reusing Bazel's Rust dependency graph. That
kept the lint enforced, but it threw away the main benefit of moving
this job under Bazel in the first place: metadata reuse and cacheable
per-target analysis in the same shape as Clippy.
This change moves the repo-wide lint onto a native Bazel Rust aspect so
Linux and macOS can lint `codex-rs` without rebuilding the world
crate-by-crate through the wrapper path.
## What Changed
- add a nightly Rust toolchain with `rustc-dev` for Bazel and a
dedicated crate-universe repo for `tools/argument-comment-lint`
- add `tools/argument-comment-lint/driver.rs` and
`tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint
as a custom `rustc_driver`
- switch repo-wide `just argument-comment-lint` and the Linux/macOS
`rust-ci` lanes to `bazel build --config=argument-comment-lint
//codex-rs/...`
- keep the Python/DotSlash wrappers as the package-scoped fallback path
and as the current Windows CI path
- gate the Dylint entrypoint behind a `bazel_native` feature so the
Bazel-native library avoids the `dylint_*` packaging stack
- update the aspect runtime environment so the driver can locate
`rustc_driver` correctly under remote execution
- keep the dedicated `tools/argument-comment-lint` package tests and
wrapper unit tests in CI so the source and packaged entrypoints remain
covered
## Verification
- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `cargo test` in `tools/argument-comment-lint`
- `bazel build
//tools/argument-comment-lint:argument-comment-lint-driver
--@rules_rust//rust/toolchain/channel=nightly`
- `bazel build --config=argument-comment-lint
//codex-rs/utils/path-utils:all`
- `bazel build --config=argument-comment-lint
//codex-rs/rollout:rollout`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16106).
* #16120
* __->__ #16106