## Why
`PermissionProfile` needs stable, canonical file-system semantics before
it can become the primary runtime permissions abstraction. Without a
canonical form, callers have to keep re-deriving legacy sandbox maps and
profile comparisons remain lossy or order-dependent.
## What changed
This adds canonicalization helpers for `FileSystemPermissions` and
`PermissionProfile`, expands special paths into explicit sandbox
entries, and updates permission request/conversion paths to consume
those canonical entries. It also tightens the legacy bridge so root-wide
write profiles with narrower carveouts are not silently projected as
full-disk legacy access.
## Verification
- `cargo test -p codex-protocol
root_write_with_read_only_child_is_not_full_disk_write -- --nocapture`
- `cargo test -p codex-sandboxing permission -- --nocapture`
- `cargo test -p codex-tui permissions -- --nocapture`
- Migrates unloaded `thread/name/set` and `thread/memoryModeSet`
app-server writes behind the generic
`ThreadStore::update_thread_metadata` API rather than adding one-off
store methods for setting thread name or memory mode.
- Implements the local ThreadStore metadata patch path for thread name
and memory mode, including rollout append, legacy name index updates,
SessionMeta validation/update, SQLite reconciliation, and re-reading the
stored thread.
- Adds focused local thread-store unit coverage plus app-server
integration coverage for the migrated unloaded write paths.
## Summary
Introduces a single background/control-plane agent task for ChatGPT
backend requests that do not have a thread-scoped task, with
`AuthManager` owning the default ChatGPT backend authorization decision.
Callers now ask `AuthManager` for the default ChatGPT backend
authorization header. `AuthManager` decides whether that is bearer or
background AgentAssertion based on config/internal state, while
low-level bootstrap paths can explicitly request bearer-only auth.
This PR is stacked on PR4 and focuses on the shared background task auth
plumbing plus the first tranche of backend/control-plane consumers. The
remaining callsite wiring is split into PR4.2 to keep review size down.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: https://github.com/openai/codex/pull/17980 - use task-scoped
`AgentAssertion` for downstream calls
- PR4.1: this PR - introduce AuthManager-owned background/control-plane
`AgentAssertion` auth
- PR4.2: https://github.com/openai/codex/pull/18260 - use background
task auth for additional backend/control-plane calls
## What Changed
- add background task registration and assertion minting inside
`codex-login`
- persist `agent_identity.background_task_id` separately from
per-session task state
- make `BackgroundAgentTaskManager` private to `codex-login`; call sites
do not instantiate or pass it around
- teach `AuthManager` the ChatGPT backend base URL and feature-derived
background auth mode from resolved config
- expose bearer-only helpers for bootstrap/registration/refresh-style
paths that must not use AgentAssertion
- wire `AuthManager` default ChatGPT authorization through app listing,
connector directory listing, remote plugins, MCP status/listing,
analytics, and core-skills remote calls
- preserve bearer fallback when the feature is disabled, the backend
host is unsupported, or background task registration is not available
## Validation
- `just fmt`
- `cargo check -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `cargo test -p codex-login agent_identity`
- `cargo test -p codex-model-provider bearer_auth_provider`
- `cargo test -p codex-core agent_assertion`
- `cargo test -p codex-app-server remote_control`
- `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
- `cargo test -p codex-models-manager manager::tests`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-cloud-tasks`
- `just fix -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `just fix -p codex-app-server`
- `git diff --check`
## Summary
Add a new app-server `marketplace/remove` RPC on top of the shared
marketplace-remove implementation.
This change:
- adds `MarketplaceRemoveParams` / `MarketplaceRemoveResponse` to the
app-server protocol
- wires the new request through `codex_message_processor`
- reuses the shared core marketplace-remove flow from the stacked
refactor PR
- updates generated schema files and adds focused app-server coverage
## Validation
- `just write-app-server-schema`
- `just fmt`
- heavy compile/test coverage deferred to GitHub CI per request
## Summary
- Add the executor-backed RMCP stdio transport.
- Wire MCP stdio placement through the executor environment config.
- Cover local and executor-backed stdio paths with the existing MCP test
helpers.
## Stack
```text
o #18027 [6/6] Fail exec client operations after disconnect
│
@ #18212 [5/6] Wire executor-backed MCP stdio
│
o #18087 [4/6] Abstract MCP stdio server launching
│
o #18020 [3/6] Add pushed exec process events
│
o #18086 [2/6] Support piped stdin in exec process API
│
o #18085 [1/6] Add MCP server environment config
│
o main
```
---------
Co-authored-by: Codex <noreply@openai.com>
Cap the model-visible skills section to a small share of the context
window, with a fallback character budget, and keep only as many implicit
skills as fit within that budget.
Emit a non-fatal warning when enabled skills are omitted, and add a new
app-server warning notification
Record thread-start skill metrics for total enabled skills, kept skills,
and whether truncation happened
---------
Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: Codex <noreply@openai.com>
## Summary
- trust-gate project `.codex` layers consistently, including repos that
have `.codex/hooks.json` or `.codex/execpolicy/*.rules` but no
`.codex/config.toml`
- keep disabled project layers in the config stack so nested trusted
project layers still resolve correctly, while preventing hooks and exec
policies from loading until the project is trusted
- update app-server/TUI onboarding copy to make the trust boundary
explicit and add regressions for loader, hooks, exec-policy, and
onboarding coverage
## Security
Before this change, an untrusted repo could auto-load project hooks or
exec policies from `.codex/` as long as `config.toml` was absent. This
makes trust the single gate for project-local config, hooks, and exec
policies.
## Stack
- Parent of #15936
## Test
- cargo test -p codex-core without_config_toml
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Update the plugin API for the new remote plugin model.
The mental model is no longer “keep local plugin state in sync with
remote.” Instead, local and remote plugins are becoming separate
sources. Remote catalog entries can be shown directly from the remote
API before installation; after installation they are still downloaded
into the local cache for execution, but remote installed state will come
from the API and be held in memory rather than being read from config.
• ## API changes
- Remove `forceRemoteSync` from `plugin/list`, `plugin/install`, and
`plugin/uninstall`.
- Remove `remoteSyncError` from `plugin/list`.
- Add remote-capable metadata to `plugin/list` / `plugin/read`:
- nullable `marketplaces[].path`
- `source: { type: "remote", downloadUrl }`
- URL asset fields alongside local path fields:
`composerIconUrl`, `logoUrl`, `screenshotUrls`
- Make `plugin/read` and `plugin/install` source-compatible:
- `marketplacePath?: AbsolutePathBuf | null`
- `remoteMarketplaceName?: string | null`
- exactly one source is required at runtime
## Summary
- add first-class marketplace support for git-backed plugin sources
- keep the newer marketplace parsing behavior from `main`, including
alternate manifest locations and string local sources
- materialize remote plugin sources during install, detail reads, and
non-curated cache refresh
- expose git plugin source metadata through the app-server protocol
## Details
This teaches the marketplace parser to accept all of the following:
- local string sources such as `"source": "./plugins/foo"`
- local object sources such as
`{"source":"local","path":"./plugins/foo"}`
- remote repo-root sources such as
`{"source":"url","url":"https://github.com/org/repo.git"}`
- remote subdir sources such as
`{"source":"git-subdir","url":"owner/repo","path":"plugins/foo","ref":"main","sha":"..."}`
It also preserves the newer tolerant behavior from `main`: invalid or
unsupported plugin entries are skipped instead of breaking the whole
marketplace.
## Validation
- `cargo test -p codex-core plugins::marketplace::tests`
- `just fix -p codex-core`
- `just fmt`
## Notes
- A full `cargo test -p codex-core` run still hit unrelated existing
failures in agent and multi-agent tests during this session; the
marketplace-focused suite passed after the rebase resolution.
Follow-up to https://github.com/openai/codex/pull/18178, where we called
out enabling the await-holding lint as a follow-up.
The long-term goal is to enable Clippy coverage for async guards held
across awaits. This PR is intentionally only the first, low-risk cleanup
pass: it narrows obvious lock guard lifetimes and leaves
`codex-rs/Cargo.toml` unchanged so the lint is not enabled until the
remaining cases are fixed or explicitly justified. It intentionally
leaves the active-turn/turn-state locking pattern alone because those
checks and mutations need to stay atomic.
## Common fixes used here
These are the main patterns reviewers should expect in this PR, and they
are also the patterns to reach for when fixing future `await_holding_*`
findings:
- **Scope the guard to the synchronous work.** If the code only needs
data from a locked value, move the lock into a small block, clone or
compute the needed values, and do the later `.await` after the block.
- **Use direct one-line mutations when there is no later await.** Cases
like `map.lock().await.remove(&id)` are acceptable when the guard is
only needed for that single mutation and the statement ends before any
async work.
- **Drain or clone work out of the lock before notifying or awaiting.**
For example, the JS REPL drains pending exec senders into a local vector
and the websocket writer clones buffered envelopes before it serializes
or sends them.
- **Use a `Semaphore` only when serialization is intentional across
async work.** The test serialization guards intentionally span awaited
setup or execution, so using a semaphore communicates "one at a time"
without holding a mutex guard.
- **Remove the mutex when there is only one owner.** The PTY stdin
writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not
protect shared access because nothing else had access to the writer.
- **Do not split locks that protect an atomic invariant.** This PR
deliberately leaves active-turn/turn-state paths alone because those
checks and mutations need to stay atomic. Those cases should be fixed
separately with a design change or documented with `#[expect]`.
## What changed
- Narrow scoped async mutex guards in app-server, JS REPL, network
approval, remote-control websocket, and the RMCP test server.
- Replace test-only async mutex serialization guards with semaphores
where the guard intentionally lives across async work.
- Let the PTY pipe writer task own stdin directly instead of wrapping it
in an async mutex.
## Verification
- `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p
codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness`
- `just clippy -p codex-core`
- `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p
codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was
run; the app-server suite passed, and `codex-core` failed in the local
sandbox on six otel approval tests plus
`suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`,
which appear to depend on local command approval/default rules and
`CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.
## Summary
First PR in the split from #17956.
- adds the core/app-server `RateLimitReachedType` shape
- maps backend `rate_limit_reached_type` into Codex rate-limit snapshots
- carries the field through app-server notifications/responses and
generated schemas
- updates existing constructors/tests for the new optional field
## Validation
- `cargo test -p codex-backend-client`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server rate_limits`
- `cargo test -p codex-tui workspace_`
- `cargo test -p codex-tui status_`
- `just fmt`
- `just fix -p codex-backend-client`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
To improve performance of UI loads from the app, add two main
improvements:
1. The `thread/list` api now gets a `sortDirection` request field and a
`backwardsCursor` to the response, which lets you paginate forwards and
backwards from a window. This lets you fetch the first few items to
display immediately while you paginate to fill in history, then can
paginate "backwards" on future loads to catch up with any changes since
the last UI load without a full reload of the entire data set.
2. Added a new `thread/turns/list` api which also has sortDirection and
backwardsCursor for the same behavior as `thread/list`, allowing you the
same small-fetch for immediate display followed by background fill-in
and resync catchup.
Summary
- replace the thread/read persisted-load helper with
ThreadStore::read_thread
- move SQLite/rollout summary, name, fork metadata, and history loading
for persisted reads into LocalThreadStore
- leave getConversationSummary unchanged for a later PR
Context
- Replaces closed stacked PR #18232 after PR #18231 merged and its base
branch was deleted.
## Summary
- adds managed requirements support for deny-read filesystem entries
- constrains config layers so managed deny-read requirements cannot be
widened by user-controlled config
- surfaces managed deny-read requirements through debug/config plumbing
This PR lets managed requirements inject deny-read filesystem
constraints into the effective filesystem sandbox policy.
User-controlled config can still choose the surrounding permission
profile, but it cannot remove or weaken the managed deny-read entries.
## Managed deny-read shape
A managed requirements file can declare exact paths and glob patterns
under `[permissions.filesystem]`:
```toml
# /etc/codex/requirements.toml
[permissions.filesystem]
deny_read = [
"/Users/alice/.gitconfig",
"/Users/alice/.ssh",
"./managed-private/**/*.env",
]
```
Those entries are compiled into the effective filesystem policy as
`access = none` rules, equivalent in shape to filesystem permission
entries like:
```toml
[permissions.workspace.filesystem]
"/Users/alice/.gitconfig" = "none"
"/Users/alice/.ssh" = "none"
"/absolute/path/to/managed-private/**/*.env" = "none"
```
The important difference is that the managed entries come from
requirements, so lower-precedence user config cannot remove them or make
those paths readable again.
Relative managed `deny_read` entries are resolved relative to the
directory containing the managed requirements file. Glob entries keep
their glob suffix after the non-glob prefix is normalized.
## Runtime behavior
- Managed `deny_read` entries are appended to the effective
`FileSystemSandboxPolicy` after the selected permission profile is
resolved.
- Exact paths become `FileSystemPath::Path { access: None }`; glob
patterns become `FileSystemPath::GlobPattern { access: None }`.
- When managed deny-read entries are present, `sandbox_mode` is
constrained to `read-only` or `workspace-write`; `danger-full-access`
and `external-sandbox` cannot silently bypass the managed read-deny
policy.
- On Windows, the managed deny-read policy is enforced for direct file
tools, but shell subprocess reads are not sandboxed yet, so startup
emits a warning for that platform.
- `/debug-config` shows the effective managed requirement as
`permissions.filesystem.deny_read` with its source.
## Stack
1. #15979 - glob deny-read policy/config/direct-tool support
2. #18096 - macOS and Linux sandbox enforcement
3. This PR - managed deny-read requirements
---------
Co-authored-by: Codex <noreply@openai.com>
… import
## Why
`externalAgentConfig/import` used to spawn plugin imports in the
background and return immediately. That meant local marketplace imports
could still be in flight when the caller refreshed plugin state, so
newly imported plugins would not show up right away.
This change makes local marketplace imports complete before the RPC
returns, while keeping remote marketplace imports asynchronous so we do
not block on remote fetches.
## What changed
- split plugin migration details into local and remote marketplace
imports based on the external config source
- import local marketplaces synchronously during
`externalAgentConfig/import`
- return pending remote plugin imports to the app-server so it can
finish them in the background
- clear the plugin and skills caches before responding to plugin
imports, and again after background remote imports complete, so the next
`plugin/list` reloads fresh state
- keep marketplace source parsing encapsulated behind
`is_local_marketplace_source(...)` instead of re-exporting the internal
enum
- add core and app-server coverage for the synchronous local import path
and the pending remote import path
## Verification
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-core` (currently fails an existing unrelated
test:
`config_loader::tests::cli_override_can_update_project_local_mcp_server_when_project_is_trusted`)
- `cargo test` (currently fails existing `codex-app-server` integration
tests in MCP/skills/thread-start areas, plus the unrelated `codex-core`
failure above)
## Summary
This changes Codex logout so managed ChatGPT auth is revoked against
AuthAPI before local auth state is removed. CLI logout, TUI `/logout`,
and the app-server account logout path now use the token-revoking logout
flow instead of only deleting `auth.json` / credential store state.
## Root Cause
Logout previously cleared only local auth storage. That removed Codex's
local credentials but did not ask the backend to invalidate the
refresh/access token state associated with a managed ChatGPT login.
## Behavior
For managed ChatGPT auth, logout sends the stored refresh token to
`https://auth.openai.com/oauth/revoke` with `token_type_hint:
refresh_token` and the Codex OAuth client id, then deletes all local
auth stores after revocation succeeds. If only an access token is
available, it falls back to revoking that access token. API key auth and
externally supplied `chatgptAuthTokens` are still only cleared locally
because Codex does not own a refresh token for those modes.
Revocation failures are fail-closed: if Codex cannot load stored auth or
the backend revoke call fails, logout returns an error and leaves local
auth in place so the user can retry instead of silently clearing local
state while backend tokens remain valid.
## Validation
ran local version of `codex-cli` with staging overrides/harness for auth
ran `codex login` then `codex logout`:
saw auth.json clear and backend revocation endpoints were called
```
POST /oauth/revoke
status: 200
revoking access token
should clear auth session
clearing auth session due to token revocation
successfully revoked session and access token
CANONICAL-API-LINE Response: status='200' method='POST' path='/oauth/revoke
```
---------
Co-authored-by: Codex <noreply@openai.com>
Summary
- refactor thread/read into explicit persisted-load, live-load, and
merge steps
- preserve existing SQLite/filesystem/live-thread behavior exactly
- keep ThreadStore migration out of this PR so the next PR is easier to
review
Validation
- this one's a pure reorganization that relies on existing test coverage
## Problem
When a user resumed or forked a session, the TUI could render the
restored thread history immediately, but it did not receive token usage
until a later model turn emitted a fresh usage event. That left the
context/status UI blank or stale during the exact window where the user
expects resumed state to look complete. Core already reconstructed token
usage from the rollout; the missing behavior was app-server lifecycle
replay to the client that just attached.
## Mental model
Token usage has two representations. The rollout is the durable source
of historical `TokenCount` events, and the core session cache is the
in-memory snapshot reconstructed from that rollout on resume or fork.
App-server v2 clients do not read core state directly; they learn about
usage through `thread/tokenUsage/updated`. The fix keeps those roles
separate: core exposes the restored `TokenUsageInfo`, and app-server
sends one targeted notification after a successful `thread/resume` or
`thread/fork` response when that restored snapshot exists.
This notification is not a new model event. It is a replay of
already-persisted state for the client that just attached. That
distinction matters because using the normal core event path here would
risk duplicating `TokenCount` entries in the rollout and making future
resumes count historical usage twice.
## Non-goals
This change does not add a new protocol method or payload shape. It
reuses the existing v2 `thread/tokenUsage/updated` notification and the
TUI’s existing handler for that notification.
This change does not alter how token usage is computed, accumulated,
compacted, or written during turns. It only exposes the token usage that
resume and fork reconstruction already restored.
This change does not broadcast historical usage replay to every
subscribed client. The replay is intentionally scoped to the connection
that requested resume or fork so already-attached clients are not
surprised by an old usage update while they may be rendering live
activity.
## Tradeoffs
Sending the usage notification after the JSON-RPC response preserves a
clear lifecycle order: the client first receives the thread object, then
receives restored usage for that thread. The tradeoff is that usage is
still a notification rather than part of the `thread/resume` or
`thread/fork` response. That keeps the protocol shape stable and avoids
duplicating usage fields across response types, but clients must
continue listening for notifications after receiving the response.
The helper selects the latest non-in-progress turn id for the replayed
usage notification. This is conservative because restored usage belongs
to completed persisted accounting, not to newly attached in-flight work.
The fallback to the last turn preserves a stable wire payload for
unusual histories, but histories with no meaningful completed turn still
have a weak attribution story.
## Architecture
Core already seeds `Session` token state from the last persisted rollout
`TokenCount` during `InitialHistory::Resumed` and
`InitialHistory::Forked`. The new core accessor exposes the complete
`TokenUsageInfo` through `CodexThread` without giving app-server direct
session mutation authority.
App-server calls that accessor from three lifecycle paths: cold
`thread/resume`, running-thread resume/rejoin, and `thread/fork`. In
each path, the server sends the normal response first, then calls a
shared helper that converts core usage into
`ThreadTokenUsageUpdatedNotification` and sends it only to the
requesting connection.
The tests build fake rollouts with a user turn plus a persisted token
usage event. They then exercise `thread/resume` and `thread/fork`
without starting another model turn, proving that restored usage arrives
before any next-turn token event could be produced.
## Observability
The primary debug path is the app-server JSON-RPC stream. After
`thread/resume` or `thread/fork`, a client should see the response
followed by `thread/tokenUsage/updated` when the source rollout includes
token usage. If the notification is absent, check whether the rollout
contains an `event_msg` payload of type `token_count`, whether core
reconstruction seeded `Session::token_usage_info`, and whether the
connection stayed attached long enough to receive the targeted
notification.
The notification is sent through the existing
`OutgoingMessageSender::send_server_notification_to_connections` path,
so existing app-server tracing around server notifications still
applies. Because this is a replay, not a model turn event, debugging
should start at the resume/fork handlers rather than the turn event
translation in `bespoke_event_handling`.
## Tests
The focused regression coverage is `cargo test -p codex-app-server
emits_restored_token_usage`, which covers both resume and fork. The core
reconstruction guard is `cargo test -p codex-core
record_initial_history_seeds_token_info_from_rollout`.
Formatting and lint/fix passes were run with `just fmt`, `just fix -p
codex-core`, and `just fix -p codex-app-server`. Full crate test runs
surfaced pre-existing unrelated failures in command execution and plugin
marketplace tests; the new token usage tests passed in focused runs and
within the app-server suite before the unrelated command execution
failure.
## Summary
- Add best-effort auto-upgrade for user-configured Git marketplaces
recorded in `config.toml`.
- Track the last activated Git revision with `last_revision` so
unchanged marketplace sources skip clone work.
- Trigger the upgrade from plugin startup and `plugin/list`, while
preserving existing fail-open plugin behavior with warning logs rather
than new user-visible errors.
## Details
- Remote configured marketplaces use `git ls-remote` to compare the
source/ref against the recorded revision.
- Upgrades clone into a staging directory, validate that
`.agents/plugins/marketplace.json` exists and that the manifest name
matches the configured marketplace key, then atomically activate the new
root.
- Local `.agents/plugins/marketplace.json` marketplaces remain live
filesystem state and are not auto-pulled.
- Existing non-curated plugin cache refresh is kicked after successful
marketplace root upgrades.
## Validation
- `just write-config-schema`
- `cargo test -p codex-core marketplace_upgrade`
- `cargo check -p codex-cli -p codex-app-server`
- `just fix -p codex-core`
Did not run the complete `cargo test` suite because the repo
instructions require asking before a full core workspace run.
Fix app-server startup when `remote_control = true` is enabled without
ChatGPT auth.
Remote control now starts in a degraded/retrying state instead of
failing app-server initialization, so Desktop is not stranded before the
initial initialize handshake.
Addresses https://github.com/openai/codex/issues/17143
Problem: TUI interrupts without an active turn stopped cancelling slow
MCP startup after routing through the app-server APIs.
Solution: Route no-active-turn interrupts through app-server as startup
cancels, acknowledge them immediately, and emit cancelled MCP startup
updates.
Testing: I manually confirmed that MCP cancellation didn't work prior to
this PR and works after the fix was in place.
Split plugin loading, marketplace, and related infrastructure out of
core into codex-core-plugins, while keeping the core-facing
configuration and orchestration flow in codex-core.
---------
Co-authored-by: Codex <noreply@openai.com>
- [x] Add resource uri meta to tool call item so that the app-server
client can start prefetching resources immediately without loading mcp
server status.
# Summary
- implement local ThreadStore archive/unarchive operations
- implement local ThreadStore read_thread operation
- break up the various ThreadStore local method implementations into
separate files
- migrate app-server archive/unarchive and core archive fixture to use
ThreadStore (but not all read operations yet!)
- use the ThreadStore's read operation as a proxy check for thread
persistence/existence in the app server code
- move all other filesystem operations related to archive (path
validation etc) into the local thread store.
# Tests
- add dedicated local store archive/unarchive tests
## Summary
- Track outbound remote-control sequence IDs independently for each
client stream.
- Retain unacked outbound messages per stream using FIFO buffers.
- Require stream-scoped acks and update tests for contiguous per-stream
sequencing.
## Why
The remote-control peer uses outbound sequence gaps to detect lost
messages and re-initialize. A single global outbound sequence counter
can create apparent gaps on an individual stream when another stream
receives an interleaved message.
## Validation
- `just fmt`
- `cargo test -p codex-app-server remote_control`
- `just fix -p codex-app-server`
- `git diff --check`
Builds on top of #17659
Move the filesystem + sqlite thread listing-related operations inside of
a local ThreadStore implementation and call ThreadStore from the places
that used to perform these filesystem/sqlite operations.
This is the first of a series of PRs that will implement the rest of the
local ThreadStore.
Testing:
- added unit tests for the thread store implementation
- adjusted some unit tests in the realtime + personality packages whose
callsites changed. Specifically I'm trying to hide ThreadMetadata inside
of the local implementation and make ThreadMetadata a sqlite
implementation detail concern rather than a public interface, preferring
the more generate StoredThread interface instead
- added a corner case test for the personality migration package that
wasn't covered by the existing test suite
- adjust the behavior of searched thread listing to run the existing
local rollout repair/backfill pass _before_ querying SQLite results, so
callers using ThreadStore::list_threads do not miss matches after a
partial metadata warm-up
## Summary
Stack PR 2 of 4 for feature-gated agent identity support.
This PR adds agent identity registration behind
`features.use_agent_identity`. It keeps the app-server protocol
unchanged and starts registration after ChatGPT auth exists rather than
requiring a client restart.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - this PR
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR4: https://github.com/openai/codex/pull/17388 - use `AgentAssertion`
downstream when enabled
## Validation
Covered as part of the local stack validation pass:
- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`
## Notes
The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
Problem: PR #17372 moved initialized request handling into
`dispatch_initialized_client_request`, leaving analytics code that uses
`connection_id` without a local binding and breaking `codex-app-server`
builds.
Solution: Restore the `connection_id` binding from
`connection_request_id` before initialized request validation and
analytics tracking.
## Summary
- Refactors `MessageProcessor` and per-connection session state so
initialized service RPC handling can be moved into spawned tasks in a
follow-up PR.
- Shares the processor and initialized session data with
`Arc`/`OnceLock` instead of mutable borrowed connection state.
- Keeps initialized request handling synchronous in this PR; it does
**not** call `tokio::spawn` for service RPCs yet.
## Testing
- `just fmt`
- `cargo test -p codex-app-server` *(fails on existing hardening gaps
covered by #17375, #17376, and #17377; the pipelined config regression
passed before the unrelated failures)*
- `just fix -p codex-app-server`
To allow the ability to have guaranteed-unique cursors, we make two
important updates:
* Add new updated_at_ms and created_at_ms columns that are in
millisecond precision
* Guarantee uniqueness -- if multiple items are inserted at the same
millisecond, bump the new one by one millisecond until it becomes unique
This lets us use single-number cursors for forwards and backwards paging
through resultsets and guarantee that the cursor is a fixed point to do
(timestamp > cursor) and get new items only.
This updated implementation is backwards-compatible since multiple
appservers can be running and won't handle the previous method well.
- Add outputModality to thread/realtime/start and wire text/audio output
selection through app-server, core, API, and TUI.\n- Rename the realtime
transcript delta notification and add a separate transcript done
notification that forwards final text from item done without correlating
it with deltas.