codex

mirror of https://github.com/openai/codex.git synced 2026-05-04 13:21:54 +03:00

Author	SHA1	Message	Date
Ruslan Nigmatullin	0700f979ba	app-server: run initialized rpcs with keyed serialization (#17373 ) ## Why Initialized app-server RPCs no longer need to bottleneck behind one request processor path. Running them concurrently improves responsiveness, but several request families still mutate shared state or depend on ordered side effects. Those stateful families need an auditable serialization contract so concurrency does not reorder thread, config, auth, command, watcher, MCP, or similar state transitions. This PR keeps that boundary explicit: stateful work is serialized by the smallest useful key, while intentionally read-only or externally concurrent work remains unkeyed. In particular, `thread/list` and `thread/turns/list` explicitly have no serialization because they primarily read append-only rollout storage and should continue to be served concurrently. ## What changed - Adds `ClientRequest::serialization_scope()` in `app-server-protocol` and requires every client request definition to declare its serialization behavior. - Introduces keyed request scopes for thread, thread path, command exec process, fuzzy search session, fs watch, MCP OAuth, and global state buckets such as config, account auth, memory, and device keys. - Routes initialized app-server RPCs through per-key FIFO serialization while allowing unkeyed initialized requests to run concurrently. - Cancels in-flight initialized RPC work when the connection disconnects or the app-server exits so spawned request tasks do not outlive their session. - Adds focused coverage for representative keyed and unkeyed serialization scopes, including explicitly concurrent `thread/turns/list` behavior. ## Validation - Added protocol tests for representative keyed serialization scopes and intentionally unkeyed request families. - Added app-server request serialization tests covering per-key FIFO behavior, concurrent unkeyed execution, disconnect shutdown, and config read-after-write ordering. 
- Local focused protocol validation after the latest rebase is currently blocked by packageproxy failing to resolve locked `rustls-webpki 0.103.13`; CI is expected to provide the full validation signal.	2026-04-28 12:23:34 -07:00
stefanstokic-oai	4c68bd728f	External agent session support (#19895 ) ## Summary This extends external agent detection/import beyond config artifacts so Codex can detect recent session files from the external agent home and import them into Codex rollout history. ## What changed - Added a focused `external_agent_sessions` module for: - session discovery - source-record parsing - rollout construction - import ledger tracking - Wired session detection/import into the app-server external agent config API. - Added compaction handling so large imported sessions can be resumed safely before the first follow-up turn. ## Testing Added coverage for: - recent-session detection - custom-title handling - recency filtering - dedupe and re-detect-after-source-change behavior - visible imported turn construction - backward-compatible import payload deserialization - end-to-end RPC import flow - rejection of undetected session paths - repeat-import behavior - large-session compaction before first follow-up Ran: - `cargo test -p codex-app-server external_agent_config_import_ --test all`	2026-04-28 17:42:36 +00:00
jif-oai	a9e5c34083	feat: trigger memories from user turns with cooldown (#19970 ) ## Why Memory startup was tied to thread lifecycle events such as create, load, and fork. That can run memory work before a thread receives real user input, and it makes startup cost scale with thread management instead of actual turns. Moving the trigger to `thread/sendInput` keeps memory startup aligned with the first real user turn and lets it use the current thread config at turn time. The idea is to prevent ghost cost due to pre-warm triggered by the app. Turn-based startup can also make global phase-2 consolidation easier to request repeatedly, so this adds a success cooldown and tightens the default startup scan window. ## What Changed - Start `codex_memories_write::start_memories_startup_task` after a non-empty `thread/sendInput` turn is submitted, instead of from thread create/load/fork paths: `d4a6885b78/codex-rs/app-server/src/codex_message_processor.rs (L6477-L6487)` - Expose `CodexThread::config()` so app-server can pass the live config into memory startup at turn time. - Add a six-hour successful-run cooldown for global phase-2 consolidation via `SkippedCooldown`: `d4a6885b78/codex-rs/state/src/runtime/memories.rs (L963-L966)` - Reduce memory startup defaults to at most 2 rollouts over 10 days: `d4a6885b78/codex-rs/config/src/types.rs (L31-L34)` ## Verification Updated the memory runtime coverage around phase-2 reclaim behavior, including `phase2_global_lock_respects_success_cooldown`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 16:23:13 +02:00
jif-oai	431ebeaef7	feat: split memories part 2 (#19860 ) Keep extracting memories out of core and move the write trigger into the app-server. This is temporary; the trigger should move to the client level as a follow-up. This makes core fully independent of `codex-memories-write`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 13:03:28 +02:00
xli-oai	803705f795	Add remote plugin uninstall API (#19456 ) ## Summary - Adds the remote `plugin/uninstall` request form using required `pluginId` plus optional `remoteMarketplaceName`, while preserving local `pluginId` uninstall. - Adds `codex_core_plugins::remote::uninstall_remote_plugin` for the deployed ChatGPT plugin backend uninstall path and validates the backend returns the same id with `enabled: false`. - Routes app-server remote uninstall through feature checks, remote plugin id validation, backend mutation, local downloaded cache deletion, cache clearing, docs, and regenerated protocol schemas. ## Tests - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol plugin_uninstall_params_serialization_omits_force_remote_sync` - `cargo test -p codex-app-server plugin_uninstall --test all` - `cargo test -p codex-app-server plugin_uninstall` - `cargo build -p codex-cli` - `CODEX_BIN=/Users/xli/code/codex/codex-rs/target/debug/codex python3 /Users/xli/.codex/skills/xli-test-marketplace-api/scripts/run_marketplace_api_matrix.py` (44 pass / 0 fail) - `just fix -p codex-app-server-protocol -p codex-app-server -p codex-tui` - `just fix -p codex-app-server`	2026-04-28 03:27:53 -07:00
xl-openai	7d72fc8f53	feat: Cache remote plugin bundles on install (#19914 ) Remote installs now fetch, validate, download, and cache the plugin bundle locally	2026-04-28 00:53:27 -07:00
Michael Bolin	fc2a69107c	permissions: derive snapshot sandbox projections (#19775 ) ## Why `ThreadConfigSnapshot` is used by app-server and thread metadata code as a stable view of active runtime settings. Keeping both `sandbox_policy` and `permission_profile` in the snapshot duplicates permission state and makes it possible for the legacy projection to drift from the canonical profile. The legacy `sandbox` value is still needed at app-server compatibility boundaries, so this PR derives it on demand from the snapshot profile and cwd instead of storing it. ## What Changed - Removes `ThreadConfigSnapshot.sandbox_policy`. - Adds `ThreadConfigSnapshot::sandbox_policy()` as a compatibility projection from `permission_profile` plus `cwd`. - Updates app-server response/metadata code and tests to call the projection only where legacy fields still exist. - Keeps snapshot construction profile-only so split filesystem rules, disabled enforcement, and external enforcement remain represented by the canonical profile. ## Verification - `cargo test -p codex-app-server thread_response_permission_profile_preserves_enforcement --lib` - `cargo test -p codex-core dispatch_reclaims_stale_global_lock_and_starts_consolidation --lib` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19775). * #19900 * #19899 * #19776 * __->__ #19775	2026-04-27 22:30:47 -07:00
Michael Bolin	bf38def44e	permissions: make SessionConfigured profile-only (#19774 ) ## Why `SessionConfiguredEvent` is the internal event that tells clients what permissions are active for a session. Emitting both `sandbox_policy` and `permission_profile` leaves two possible authorities and forces every consumer to decide which one to honor. At this point in the migration, the profile is expressive enough to represent managed, disabled, and external sandbox enforcement, so the internal event can be profile-only. The wire compatibility concern is older serialized events or rollout data that only contain `sandbox_policy`; those still need to deserialize. ## What Changed - Removes `sandbox_policy` from `SessionConfiguredEvent` and makes `permission_profile` required. - Adds custom deserialization so old payloads with only `sandbox_policy` are upgraded to a cwd-anchored `PermissionProfile`. - Updates core event emission and TUI session handling to sync permissions from the profile directly. - Updates app-server response construction to derive the legacy `sandbox` response field from the active thread snapshot instead of from `SessionConfiguredEvent`. - Updates yolo-mode display logic to treat both `PermissionProfile::Disabled` and managed unrestricted filesystem plus enabled network as full-access, while still preserving the distinction between no sandbox and external sandboxing. 
## Verification - `cargo test -p codex-protocol session_configured_event --lib` - `cargo test -p codex-protocol serialize_event --lib` - `cargo test -p codex-exec session_configured --lib` - `cargo test -p codex-app-server thread_response_permission_profile_preserves_enforcement --lib` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox --lib` - `cargo test -p codex-tui session_configured --lib` - `cargo test -p codex-tui yolo_mode_includes_managed_full_access_profiles --lib` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19774). * #19900 * #19899 * #19776 * #19775 * __->__ #19774	2026-04-27 22:06:47 -07:00
pakrym-oai	c5a495c2cd	Streamline review and feedback handlers (#19498 ) ## Why The remaining review, interrupt, fuzzy search, feedback, and git-diff handlers still had local send-error branches that obscured otherwise simple request handling. This final slice flattens those handlers without changing the public protocol behavior. ## What Changed - Streamlined review start, turn interrupt, fuzzy search session, feedback upload, and git diff handlers in `codex-rs/app-server/src/codex_message_processor.rs`. - Converted validation and upload failures into returned JSON-RPC errors where that avoids nested `send_error`/`return` blocks. - Left unrelated sandbox setup and notification code untouched. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::review -- --test-threads=1`	2026-04-27 16:36:04 -07:00
pakrym-oai	e903d000b0	Streamline turn and realtime handlers (#19497 ) ## Why Turn and realtime handlers had nested validation and send-error branches that made the request path longer than the behavior warranted. This slice keeps the same request semantics while letting the handlers return errors from the failing step. ## What Changed - Streamlined turn start, injected item, and turn steer request handling in `codex-rs/app-server/src/codex_message_processor.rs`. - Applied the same result-returning shape to realtime session response handlers. - Preserved existing request validation and thread-manager interactions. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::turn_start -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::turn_steer -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::thread_inject_items -- --test-threads=1`	2026-04-27 15:21:59 -07:00
pakrym-oai	739ab6bc51	Streamline thread resume and fork handlers (#19495 ) ## Why Thread resume and fork had some of the deepest error-handling indentation in this area because helpers emitted request errors directly. Returning those failures gives the handlers a single request boundary while preserving the async pending-resume behavior. ## What Changed - Converted thread resume helpers in `codex-rs/app-server/src/codex_message_processor.rs` to return `Result` values for validation and view loading failures. - Applied the same pattern to thread fork request handling. - Simplified pending resume error construction by using the shared JSON-RPC error helpers. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::thread_resume -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::thread_fork -- --test-threads=1`	2026-04-27 15:04:53 -07:00
pakrym-oai	2be9fd5a93	Streamline thread read handlers (#19494 ) ## Why The thread read/list handlers mostly assemble views, but their error handling was interleaved with response emission. Returning view-building errors from the helper path keeps those handlers focused on data assembly. ## What Changed - Added a small mapper for `ThreadReadViewError` to JSON-RPC errors in `codex-rs/app-server/src/codex_message_processor.rs`. - Streamlined thread list, loaded-thread, read, turn-list, and summary handlers to produce result values for the request boundary. - Kept the existing invalid-request vs internal-error distinctions for missing or unreadable thread data. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all conversation_summary -- --test-threads=1`	2026-04-27 14:30:24 -07:00
pakrym-oai	5c30d79afb	Streamline thread mutation handlers (#19493 ) ## Why Thread mutation handlers had many short error branches whose only job was to emit a JSON-RPC error and stop. This slice keeps those errors visible, but lets each handler build a result and return early from validation helpers instead of nesting the main path. ## What Changed - Streamlined thread archive/unarchive, rename, memory, metadata, rollback, compact, background terminal, shell, and guardian handlers in `codex-rs/app-server/src/codex_message_processor.rs`. - Reused shared JSON-RPC error constructors in `codex-rs/app-server/src/bespoke_event_handling.rs` for rollback-related request failures. - Preserved direct `send_error` calls where they remain the simplest boundary for pending async event responses. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::thread_rollback -- --test-threads=1`	2026-04-27 14:18:55 -07:00
pakrym-oai	c5e2921e1d	Streamline thread start handler (#19492 ) ## Why The thread start handler mixed request validation, thread construction, dynamic-tool validation, and JSON-RPC error emission in one nested flow. Returning request errors from the helper path makes the successful setup path easier to follow. ## What Changed - Reworked `thread/start` handling in `codex-rs/app-server/src/codex_message_processor.rs` so helper methods return `Result` and the handler emits one result. - Moved dynamic-tool validation failures into returned JSON-RPC errors instead of local `send_error` branches. - Preserved the existing thread creation and task-spawning behavior. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::dynamic_tools -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::turn_start -- --test-threads=1`	2026-04-27 13:56:20 -07:00
Michael Bolin	4b55979755	permissions: remove cwd special path (#19841 ) ## Why The experimental `PermissionProfile` API had both `:cwd` and `:project_roots` special filesystem paths, which made the permission root ambiguous. This PR removes the unstable `current_working_directory` special path before the permissions API is stabilized, so callers use `:project_roots` for symbolic project-root access. ## What changed - Removes `FileSystemSpecialPath::CurrentWorkingDirectory` from protocol and app-server protocol models, plus regenerated app-server JSON/TypeScript schemas. - Replaces internal `:cwd` permission entries with `:project_roots` entries. - Keeps the existing cwd-update behavior for legacy-shaped workspace-write profiles, while removing the deleted `CurrentWorkingDirectory` case from that compatibility path. - Keeps `PermissionProfile::workspace_write()` as the reusable symbolic workspace-write helper, with docs noting that `:project_roots` entries resolve at enforcement time. - Updates app-server docs/examples and approval UI labeling to stop advertising `:cwd` as a permission token. ## Compatibility Persisted rollout items may contain the old `{"kind":"current_working_directory"}` tag from earlier experimental `permissionProfile` snapshots. This PR keeps that tag as a deserialize-only alias for `ProjectRoots { subpath: None }`, while continuing to serialize only the new `project_roots` tag. ## Follow-up This PR intentionally does not introduce an explicit project-root set on `SessionConfiguration` or runtime sandbox resolution. Today, the resolver still uses the active cwd as the single implicit project root. A follow-up should model project roots separately from tool cwd so `:project_roots` entries can resolve against the configured project roots, and resolve to no entries when there are no project roots. 
## Verification - `cargo test -p codex-protocol permissions:: --lib` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-sandboxing -p codex-exec-server --lib` - `cargo test -p codex-core session_configuration_apply_ --lib` - `cargo test -p codex-app-server command_exec_permission_profile_project_roots_use_command_cwd --test all` - `cargo test -p codex-tui thread_read_session_state_does_not_reuse_primary_permission_profile --lib` - `cargo test -p codex-tui preset_matching_accepts_workspace_write_with_extra_roots --lib` - `cargo test -p codex-config --lib`	2026-04-27 13:41:27 -07:00
rhan-oai	215d5a8f7c	[codex-analytics] remove ga flag (#19863 )	2026-04-27 19:29:19 +00:00
pakrym-oai	e5709db6dc	Streamline account and command handlers (#19491 ) ## Why Account login/logout and command exec handlers were doing local error sends in the middle of each handler. That made these request flows branch heavily even though most of the logic is validate, perform the operation, and return the response. ## What Changed - Converted ChatGPT/API-key login, login cancel, logout, rate-limit, and add-credit handlers in `codex-rs/app-server/src/codex_message_processor.rs` to compute `Result` values and send them once at the request boundary. - Applied the same shape to command exec start/write/resize/terminate handlers. - Kept side-effect notifications in the same places after successful request handling. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::account -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::command_exec -- --test-threads=1`	2026-04-27 12:03:49 -07:00
efrazer-oai	2009f6e894	refactor: make auth loading async (#19762 ) ## Summary Auth loading used to expose synchronous construction helpers in several places even though some auth sources now need async work. This PR makes the auth-loading surface async and updates the callers to await it. This is intentionally only plumbing. It does not change how AgentIdentity tokens are decoded, how task runtime ids are allocated, or how JWT signatures are verified. ## Stack 1. This PR: [refactor: make auth loading async](https://github.com/openai/codex/pull/19762) 2. [refactor: load AgentIdentity runtime eagerly](https://github.com/openai/codex/pull/19763) 3. [feat: verify AgentIdentity JWTs with JWKS](https://github.com/openai/codex/pull/19764) ## Important call sites \| Area \| Change \| \| --- \| --- \| \| `codex-login` auth loading \| `CodexAuth` and `AuthManager` construction paths now await auth loading. \| \| app-server startup \| Auth manager construction is awaited during initialization. \| \| CLI/TUI/exec/MCP/chatgpt callers \| Existing auth-loading calls now await the same behavior. \| \| cloud requirements storage loader \| The loader becomes async so it can share the same auth construction path. \| \| auth tests \| Tests that load auth now run in async contexts. \| ## Testing Tests: targeted Rust auth test compilation, formatter, scoped Clippy fix, and Bazel lock check.	2026-04-27 11:00:27 -07:00
pakrym-oai	4ed22fc7d2	Streamline plugin, apps, and skills handlers (#19490 ) ## Why The plugin, app, and skills handlers had a lot of repeated `send_error`/`return` branches that made the success path hard to scan. This slice keeps behavior the same while moving fallible steps into local response-producing helpers, so the request boundary can send one result. ## What Changed - Converted plugin list/install/uninstall handlers in `codex-rs/app-server/src/codex_message_processor/plugins.rs` to return `Result<*Response, JSONRPCErrorError>` from helper methods and call `send_result` once. - Added local error-mapping helpers for plugin install/uninstall and marketplace failures. - Applied the same mechanical shape to app list, skills list/config, and marketplace add/remove/upgrade handlers in `codex-rs/app-server/src/codex_message_processor.rs`. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server --test all v2::app_list -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::plugin_ -- --test-threads=1` - `cargo test -p codex-app-server --test all v2::skills_list -- --test-threads=1`	2026-04-27 10:18:25 -07:00
Michael Bolin	0d8cdc0510	permissions: centralize legacy sandbox projection (#19734 ) ## Why The remaining migration work still needs `SandboxPolicy` at a few compatibility boundaries, but those projections should come from one canonical path. Keeping ad hoc legacy projections scattered through app-server, CLI, and config code makes it easy for behavior to drift as `PermissionProfile` gains fidelity that the legacy enum cannot represent. ## What Changed - Adds `Permissions::legacy_sandbox_policy(cwd)` and `Config::legacy_sandbox_policy()` as the compatibility projection from the canonical `PermissionProfile`. - Adds `Permissions::can_set_legacy_sandbox_policy()` so legacy inputs are checked after they are converted into profile semantics. - Updates app-server command handling, Windows sandbox setup, session configuration, and sandbox summaries to use the centralized projection helper. - Leaves `SandboxPolicy` in place only for boundary inputs/outputs that still speak the legacy abstraction. ## Verification - `cargo check -p codex-config -p codex-core -p codex-sandboxing -p codex-app-server -p codex-cli -p codex-tui` - `cargo test -p codex-tui permissions_selection_history_snapshot_full_access_to_default -- --nocapture` - `cargo test -p codex-tui permissions_selection_sends_approvals_reviewer_in_override_turn_context -- --nocapture` - `bazel test //codex-rs/tui:tui-unit-tests-bin --test_arg=permissions_selection_history_snapshot_full_access_to_default --test_output=errors` - `bazel test //codex-rs/tui:tui-unit-tests-bin --test_arg=permissions_selection_sends_approvals_reviewer_in_override_turn_context --test_output=errors` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19734). * #19737 * #19736 * #19735 * __->__ #19734	2026-04-26 20:31:23 -07:00
Abhinav	c3e60849e5	inline hostname resolution for remote sandbox config (#19739 ) # Why Requirements support host-specific `remote_sandbox_config.hostname_patterns`, but config loading previously resolved and passed the system hostname through every config-loading path even when no requirements layer used `remote_sandbox_config`. On machines where hostname lookup is slow, startup and app-server config reads paid for a feature that was not active. We only need the hostname when a requirements layer actually declares `remote_sandbox_config`, so this moves hostname resolution to the single requirements merge point and keeps all other config callers unaware of hostname matching. # What - Removed the eager `host_name` plumbing from `load_config_layers_state`, `load_requirements_toml`, `ConfigBuilder`, app-server `ConfigManager`, network proxy loading, and related call sites. - Resolve the hostname inside `merge_requirements_with_remote_sandbox_config` only when the incoming requirements contain `remote_sandbox_config`.	2026-04-27 03:18:57 +00:00
Michael Bolin	ad57a3fee2	permissions: finish profile-backed app surfaces (#19395 )	2026-04-26 19:42:39 -07:00
Andrey Mishchenko	35bc6e3d01	Delete unused ResponseItem::Message.end_turn (#19605 ) This field is unused. Delete it.	2026-04-26 17:18:09 -07:00
Michael Bolin	dda8199b73	permissions: migrate approval and sandbox consumers to profiles (#19393 ) ## Why Runtime decisions should not infer permissions from the lossy legacy sandbox projection once `PermissionProfile` is available. In particular, `Disabled` and `External` need to remain distinct, and managed profiles with split filesystem or deny-read rules should not be collapsed before approval, network, safety, or analytics code makes decisions. ## What Changed - Changes managed network proxy setup and network approval logic to use `PermissionProfile` when deciding whether a managed sandbox is active. - Migrates patch safety, Guardian/user-shell approval paths, Landlock helper setup, analytics sandbox classification, and selected turn/session code to profile-backed permissions. - Validates command-level profile overrides against the constrained `PermissionProfile` rather than a strict `SandboxPolicy` round trip. - Preserves configured deny-read restrictions when command profiles are narrowed. - Adds coverage for profile-backed trust, network proxy/approval behavior, patch safety, analytics classification, and command-profile narrowing. ## Verification - `cargo test -p codex-core direct_write_roots` - `cargo test -p codex-core runtime_roots_to_legacy_projection` - `cargo test -p codex-app-server requested_permissions_trust_project_uses_permission_profile_intent` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19393). * #19395 * #19394 * __->__ #19393	2026-04-26 15:30:40 -07:00
pakrym-oai	9c3abcd46c	[codex] Move config loading into codex-config (#19487 ) ## Why Config loading had become split across crates: `codex-config` owned the config types and merge logic, while `codex-core` still owned the loader that assembled the layer stack. This change consolidates that responsibility in `codex-config`, so the crate that defines config behavior also owns how configs are discovered and loaded. To make that move possible without reintroducing the old dependency cycle, the shell-environment policy types and helpers that `codex-exec-server` needs now live in `codex-protocol` instead of flowing through `codex-config`. This also makes the migrated loader tests more deterministic on machines that already have managed or system Codex config installed by letting tests override the system config and requirements paths instead of reading the host's `/etc/codex`. ## What Changed - moved the config loader implementation from `codex-core` into `codex-config::loader` and deleted the old `core::config_loader` module instead of leaving a compatibility shim - moved shell-environment policy types and helpers into `codex-protocol`, then updated `codex-exec-server` and other downstream crates to import them from their new home - updated downstream callers to use loader/config APIs from `codex-config` - added test-only loader overrides for system config and requirements paths so loader-focused tests do not depend on host-managed config state - cleaned up now-unused dependency entries and platform-specific cfgs that were surfaced by post-push CI ## Testing - `cargo test -p codex-config` - `cargo test -p codex-core config_loader_tests::` - `cargo test -p codex-protocol -p codex-exec-server -p codex-cloud-requirements -p codex-rmcp-client --lib` - `cargo test --lib -p codex-app-server-client -p codex-exec` - `cargo test --no-run --lib -p codex-app-server` - `cargo test -p codex-linux-sandbox --lib` - `cargo shear` - `just bazel-lock-check` ## Notes - I did not chase 
unrelated full-suite failures outside the migrated loader surface. - `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive failures on this machine, and Windows CI still shows unrelated long-running/timeouting test noise outside the loader migration itself.	2026-04-26 15:10:53 -07:00
pakrym-oai	2a020f1a0a	Lift app-server JSON-RPC error handling to request boundary (#19484 ) ## Why App-server request handling had a lot of repeated JSON-RPC error construction and one-off `send_error`/`return` branches. This made small handlers noisy and pushed error response details into leaf code that otherwise only needed to validate input or call the underlying API. ## What Changed - Added shared JSON-RPC error constructors in `codex-rs/app-server/src/error_code.rs`. - Lifted straightforward request result emission into `codex-rs/app-server/src/message_processor.rs` so response/error dispatch happens at the request boundary. - Reused the result helpers across command exec, config, filesystem, device-key, external-agent config, fs-watch, and outgoing-message paths. - Removed leaf wrapper handlers where the method body was only forwarding to a response helper. - Returned request validation errors upward in the simple cases instead of sending an error locally and immediately returning. ## Verification - `cargo test -p codex-app-server --lib command_exec::tests` - `cargo test -p codex-app-server --lib outgoing_message::tests` - `cargo test -p codex-app-server --lib in_process::tests` - `cargo test -p codex-app-server --test all v2::fs` - `cargo test -p codex-app-server --test all v2::config_rpc` - `cargo test -p codex-app-server --test all v2::external_agent_config` - `cargo test -p codex-app-server --test all v2::initialize` - `just fix -p codex-app-server` - `git diff --check` Note: full `cargo test -p codex-app-server` was attempted and stopped in `message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spans` with a stack overflow after unrelated tests had already passed.	2026-04-26 15:10:35 -07:00
Michael Bolin	deaa307fb2	permissions: derive compatibility policies from profiles (#19392 ) ## Why After #19391, `PermissionProfile` and the split filesystem/network policies could still be stored in parallel. That creates drift risk: a profile can preserve deny globs, external enforcement, or split filesystem entries while a cached projection silently loses those details. This PR makes the profile the runtime source and derives compatibility views from it. ## What Changed - Removes stored filesystem/network sandbox projections from `Permissions` and `SessionConfiguration`; their accessors now derive from the canonical `PermissionProfile`. - Derives legacy `SandboxPolicy` snapshots from profiles only where an older API still needs that field. - Updates MCP connection and elicitation state to track `PermissionProfile` instead of `SandboxPolicy` for auto-approval decisions. - Adds semantic filesystem-policy comparison so cwd changes can preserve richer profiles while still recognizing equivalent legacy projections independent of entry ordering. - Updates config/session tests to assert profile-derived projections instead of parallel stored fields. ## Verification - `cargo test -p codex-core direct_write_roots` - `cargo test -p codex-core runtime_roots_to_legacy_projection` - `cargo test -p codex-app-server requested_permissions_trust_project_uses_permission_profile_intent` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19392). * #19395 * #19394 * #19393 * __->__ #19392	2026-04-26 15:06:42 -07:00
Michael Bolin	4d7ce3447d	permissions: make runtime config profile-backed (#19606 ) ## Why This supersedes #19391. During stack repair, GitHub marked #19391 as merged into a temporary stack branch rather than into `main`, so the runtime-config change needed a fresh PR. `PermissionProfile` is now the canonical permissions shape after #19231 because it can distinguish `Managed`, `Disabled`, and `External` enforcement while also carrying filesystem rules that legacy `SandboxPolicy` cannot represent cleanly. Core config and session state still needed to accept profile-backed permissions without forcing every profile through the strict legacy bridge, which rejected valid runtime profiles such as direct write roots. The unrelated CI/test hardening that previously rode along with this PR has been split into #19683 so this PR stays focused on the permissions model migration. ## What Changed - Adds `Permissions.permission_profile` and `SessionConfiguration.permission_profile` as constrained runtime state, while keeping `sandbox_policy` as a legacy compatibility projection. - Introduces profile setters that keep `PermissionProfile`, split filesystem/network policies, and legacy `SandboxPolicy` projections synchronized. - Uses a compatibility projection for requirement checks and legacy consumers instead of rejecting profiles that cannot round-trip through `SandboxPolicy` exactly. - Updates config loading, config overrides, session updates, turn context plumbing, prompt permission text, sandbox tags, and exec request construction to carry profile-backed runtime permissions. - Preserves configured deny-read entries and `glob_scan_max_depth` when command/session profiles are narrowed. - Adds `PermissionProfile::read_only()` and `PermissionProfile::workspace_write()` presets that match legacy defaults. 
## Verification - `cargo test -p codex-core direct_write_roots` - `cargo test -p codex-core runtime_roots_to_legacy_projection` - `cargo test -p codex-app-server requested_permissions_trust_project_uses_permission_profile_intent` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19606). * #19395 * #19394 * #19393 * #19392 * __->__ #19606	2026-04-26 13:29:54 -07:00
Michael Bolin	ac2bffa443	test: harden app-server integration tests (#19683 ) ## Why Windows Bazel runs in the permissions stack exposed that app-server integration tests were launching normal plugin startup warmups in every subprocess. Those warmups can call `https://chatgpt.com/backend-api/plugins/featured` when a test is not specifically exercising plugin startup, which adds slow background work, noisy stderr, and dependence on external network state. The relevant startup/featured-plugin behavior was introduced across #15042 and #15264. A few app-server tests also had long optional waits or unbounded cleanup paths, making failures expensive to diagnose and contributing to slow Windows shards. One external-agent config test from #18246 used a GitHub-style marketplace source, which was enough to exercise the pending remote-import path but also meant the background completion task could attempt a real clone. ## What Changed - Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks` plumbing and a hidden debug-only `--disable-plugin-startup-tasks-for-tests` app-server flag, so integration tests can suppress startup plugin warmups without adding a production env-var gate. - Has the app-server test harness pass that hidden flag by default, while opting plugin-startup coverage back in for tests that intentionally exercise startup sync and featured-plugin warmup behavior. - Lowers normal app-server subprocess logging from `info`/`debug` to `warn` to avoid multi-megabyte stderr output in Bazel logs. - Prevents the external-agent config test from attempting a real marketplace clone by using an invalid non-local source while still exercising the pending-import completion path. - Bounds optional filesystem/realtime waits and fake WebSocket test-server shutdown so failures produce targeted timeouts instead of hanging a shard. - Fixes the Unix script-resolution test in `rmcp-client` to exercise PATH resolution directly and include the actual spawn error in failures. 
## Verification - `cargo check -p codex-app-server` - `cargo clippy -p codex-app-server --tests -- -D warnings` - `cargo test -p codex-rmcp-client program_resolver::tests::test_unix_executes_script_without_extension` - `cargo test -p codex-app-server --test all external_agent_config_import_sends_completion_notification_after_pending_plugins_finish -- --nocapture` - `cargo test -p codex-app-server --test all plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request -- --nocapture` - Windows Local Bazel passed with this test-hardening bundle before it was extracted from #19606. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683). * #19395 * #19394 * #19393 * #19392 * #19606 * __->__ #19683	2026-04-26 12:43:16 -07:00
Michael Bolin	9881dc7306	fix: restore 30-minute timeout for Bazel builds (#19609 ) I think raising it to 45 minutes in https://github.com/openai/codex/pull/19578 was a mistake for the reasons explained in the comments in the code. Instead, we attempt to defend against timeouts by increasing the number of shards in `app-server-all-test` so that a "true failure" that gets run 3x should not take as much wall clock time.	2026-04-25 16:34:06 -07:00
Michael Bolin	d54493ba1c	test: stabilize app-server path assertions on Windows (#19604 ) ## Why Windows can represent the same canonical local path with either a normal drive path or a verbatim device path prefix. The failure pattern that motivated this PR was an assertion diff like `C:\...` versus `\\?\C:\...`: different spellings, same file. That became visible while validating the permissions stack above this PR. The stack increasingly routes paths through `AbsolutePathBuf`, which normalizes supported Windows device prefixes, while several existing tests still built expected values directly with `std::fs::canonicalize()` or compared `AbsolutePathBuf::as_path()` to a raw `PathBuf`. On Windows, that can make tests fail because the two sides choose different textual forms for an otherwise equivalent canonical path. This PR is intentionally split out as the bottom PR below #19606. The runtime permissions migration should not carry unrelated Windows test stabilization, and reviewers should be able to verify this as a test-only change before looking at the larger permissions changes. ## Failure Modes Covered - `conversation_summary` expected rollout paths were built from raw canonicalized `PathBuf`s, while app-server responses could carry `AbsolutePathBuf`-normalized paths. - `thread_resume` compared returned thread paths directly to previously stored or fixture paths, so a verbatim-prefix spelling could fail an otherwise correct resume. - `marketplace_add` compared plugin install roots through `as_path()` against raw canonicalized paths, reproducing the same `C:\...` versus `\\?\C:\...` mismatch in both app-server and core-plugin coverage. ## What Changed - In `app-server/tests/suite/conversation_summary.rs`, normalize both expected rollout paths and received `ConversationSummary.path` values through `AbsolutePathBuf` before comparing the full summary object. 
- In `app-server/tests/suite/v2/thread_resume.rs`, normalize both sides of thread path comparisons before asserting equality. This keeps the tests focused on whether resume returned the same existing path, not whether Windows used the same string spelling. - In `app-server/tests/suite/v2/marketplace_add.rs` and `core-plugins/src/marketplace_add.rs`, compare install roots as `AbsolutePathBuf` values instead of comparing an absolute-path wrapper to a raw canonicalized `PathBuf`. ## Behavior This PR does not change production app-server or marketplace behavior. It only changes tests to assert semantic path identity across Windows path spelling variants. It also leaves API response values untouched; the normalization happens inside assertions only. ## Verification Targeted local checks run while extracting this fix: - `cargo test -p codex-app-server get_conversation_summary_by_thread_id_reads_rollout` - `cargo test -p codex-app-server get_conversation_summary_by_relative_rollout_path_resolves_from_codex_home` - `cargo test -p codex-app-server thread_resume_prefers_path_over_thread_id` Windows-specific confidence comes from the Bazel Windows CI job for this PR, since the failure is platform-specific. ## Docs No docs update is needed because this is test-only infrastructure stabilization. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19604). * #19395 * #19394 * #19393 * #19392 * #19606 * __->__ #19604	2026-04-25 16:25:28 -07:00
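The path mismatch described in the commit above can be illustrated with a minimal sketch. It assumes only the two spellings named in the message (`C:\...` and `\\?\C:\...`). The helpers `strip_verbatim_prefix`, `looks_like_drive_path`, and `same_path` are hypothetical names invented for this illustration; this is not how `AbsolutePathBuf` is implemented, and it works on plain strings so it can run on any platform.

```rust
/// Hypothetical helper: drops the Windows verbatim prefix (`\\?\`) when it is
/// followed by a drive path such as `C:\`, so both spellings compare equal.
fn strip_verbatim_prefix(path: &str) -> &str {
    match path.strip_prefix(r"\\?\") {
        // Only strip when the remainder looks like `X:\...`; other verbatim
        // forms (for example `\\?\UNC\...`) are returned unchanged.
        Some(rest) if looks_like_drive_path(rest) => rest,
        _ => path,
    }
}

/// Returns true when `s` starts with an ASCII drive letter, `:` and `\`.
fn looks_like_drive_path(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'\\'
}

/// Compares two path spellings after normalizing both sides, mirroring the
/// "normalize both expected and received values" approach from the commit.
fn same_path(a: &str, b: &str) -> bool {
    strip_verbatim_prefix(a) == strip_verbatim_prefix(b)
}
```

Normalizing both sides inside the assertion, rather than rewriting the value under test, keeps the check about path identity and leaves the API response untouched.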
Michael Bolin	f41306b4f3	test: isolate remote thread store regression from plugin warmups (#19593 ) Follow-up to #19266. ## Why `thread_start_with_non_local_thread_store_does_not_create_local_persistence` is meant to catch accidental local thread persistence when a non-local thread store is configured. The Windows flake reported in [this BuildBuddy invocation](https://app.buildbuddy.io/invocation/0b75dde4-6828-4e7b-a35b-e45b73fb005d) showed that the assertion was tripping on an unexpected top-level `.tmp` entry: ```diff { + ".tmp", "config.toml", "installation_id", "memories", "skills", } ``` That `.tmp` does not appear to come from `tempfile::TempDir`; it comes from unrelated plugin startup work that can legitimately materialize `codex_home/.tmp`, including the startup remote plugin sync marker in [`core/src/plugins/startup_sync.rs`](`bce74c70ce/codex-rs/core/src/plugins/startup_sync.rs (L13-L15)`) and the curated plugin snapshot under [`.tmp/plugins`](`bce74c70ce/codex-rs/core-plugins/src/startup_sync.rs (L25-L26)`). That makes the regression race unrelated background startup tasks instead of validating the thread-store invariant it was added to cover. Rather than weakening the assertion to allow arbitrary `.tmp` entries, this change isolates the test from plugin warmups so it can stay strict about unexpected local thread persistence artifacts. ## What changed - disable plugins in the generated config used by `app-server/tests/suite/v2/remote_thread_store.rs` - keep the existing `codex_home` assertions unchanged so the test still fails if local session or sqlite persistence is introduced ## Verification - `cargo test -p codex-app-server suite::v2::remote_thread_store::thread_start_with_non_local_thread_store_does_not_create_local_persistence -- --exact`	2026-04-25 20:45:31 +00:00
Eric Traut	bce74c70ce	Restore persisted model provider on thread resume (#19287 ) Fixes #15219. ## Why `thread/resume` should continue a persisted thread with the same model provider that created the thread. The app server already restores the persisted model and reasoning effort before resuming, but it was leaving `model_provider` unset. If a user created a thread with one provider and later switched their active profile to another provider, resumed encrypted history could be sent to the wrong endpoint and fail with `invalid_encrypted_content`. The thread metadata already records the original provider, so resume should apply it when the caller has not explicitly requested a different model/provider/reasoning configuration. ## What changed This updates `merge_persisted_resume_metadata` in `app-server/src/codex_message_processor.rs` to copy `ThreadMetadata::model_provider` into `ConfigOverrides::model_provider` alongside the persisted model. The existing resume metadata tests now also assert that: - the persisted provider is restored for normal resume - explicit model, provider, or reasoning-effort overrides still prevent persisted resume metadata from being applied - a thread with no persisted model or reasoning effort still resumes with its persisted provider ## Verification - `cargo test -p codex-app-server` passed the app-server unit tests, including the updated resume metadata coverage. The broader integration portion of that command failed in an unrelated environment-sensitive skills-budget warning assertion, where this run saw 8 omitted skills instead of the expected 7. - `just fix -p codex-app-server` completed successfully.	2026-04-25 12:40:00 -07:00
Eric Traut	4167628622	Add goal core runtime (4 / 5) (#18076 ) Adds the core runtime behavior for active goals on top of the model tools from PR 3. ## Why A long-running goal should be a core runtime concern, not something every client has to implement. Core owns the turn lifecycle, tool completion boundaries, interruptions, resume behavior, and token usage, so it is the right place to account progress, enforce budgets, and decide when to continue work. ## What changed - Centralized goal lifecycle side effects behind `Session::goal_runtime_apply(GoalRuntimeEvent::...)`. - Starts goal continuation turns only when the session is idle; pending user input and mailbox work take priority. - Accounts token and wall-clock usage at turn, tool, mutation, interrupt, and resume boundaries; `get_thread_goal` remains read-only. - Preserves sub-second wall-clock remainder across accounting boundaries so long-running goals do not drift downward over time. - Treats token budget exhaustion as a soft stop by marking the goal `budget_limited` and injecting wrap-up steering instead of aborting the active turn. - Suppresses budget steering when `update_goal` marks a goal complete. - Pauses active goals on interrupt and auto-reactivates paused goals when a thread resumes outside plan mode. - Suppresses repeated automatic continuation when a continuation turn makes no tool calls. - Added continuation and budget-limit prompt templates. ## Verification - Added focused core coverage for continuation scheduling, accounting boundaries, budget-limit steering, completion accounting, interrupt pause behavior, resume auto-activation, and wall-clock remainder accounting.	2026-04-24 21:16:00 -07:00
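The "preserve the sub-second wall-clock remainder" point in the commit above can be sketched as follows. This is a minimal illustration that assumes usage is charged in whole seconds at each accounting boundary; the type `WallClockUsage` and its fields are hypothetical and are not taken from the codex source.

```rust
use std::time::Duration;

/// Hypothetical accumulator for wall-clock usage charged in whole seconds.
#[derive(Debug, Default)]
struct WallClockUsage {
    /// Whole seconds charged so far.
    charged_secs: u64,
    /// Sub-second time not yet charged; carried over to the next boundary.
    remainder: Duration,
}

impl WallClockUsage {
    /// Accounts for `elapsed` at a boundary and returns the whole seconds
    /// charged by this call. The fractional part is kept, not discarded.
    fn account(&mut self, elapsed: Duration) -> u64 {
        let total = self.remainder + elapsed;
        let whole = total.as_secs();
        self.remainder = total - Duration::from_secs(whole);
        self.charged_secs += whole;
        whole
    }
}
```

Without the carried remainder, four boundaries of 600 ms each would each truncate to zero seconds; with it, the same four boundaries charge two seconds and leave 400 ms pending.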
Eric Traut	6c874f9b34	Add goal app-server API (2 / 5) (#18074 ) Adds the app-server v2 goal API on top of the persisted goal state from PR 1. ## Why Clients need a stable app-server surface for reading and controlling materialized thread goals before the model tools and TUI can use them. Goal changes also need to be observable by app-server clients, including clients that resume an existing thread. ## What changed - Added v2 `thread/goal/get`, `thread/goal/set`, and `thread/goal/clear` RPCs for materialized threads. - Added `thread/goal/updated` and `thread/goal/cleared` notifications so clients can keep local goal state in sync. - Added resume/snapshot wiring so reconnecting clients see the current goal state for a thread. - Added app-server handlers that reconcile persisted rollout state before direct goal mutations. - Updated the app-server README plus generated JSON and TypeScript schema fixtures for the new API surface. ## Verification - Added app-server v2 coverage for goal get/set/clear behavior, notification emission, resume snapshots, and non-local thread-store interactions.	2026-04-24 20:53:41 -07:00
Curtis 'Fjord' Hawthorne	8a559e7938	Remove js_repl feature (#19410 )	2026-04-24 17:49:29 -07:00
Michael Bolin	789f387982	permissions: remove legacy read-only access modes (#19449 ) ## Why `ReadOnlyAccess` was a transitional legacy shape on `SandboxPolicy`: `FullAccess` meant the historical read-only/workspace-write modes could read the full filesystem, while `Restricted` tried to carry partial readable roots. The partial-read model now belongs in `FileSystemSandboxPolicy` and `PermissionProfile`, so keeping it on `SandboxPolicy` makes every legacy projection reintroduce lossy read-root bookkeeping and creates unnecessary noise in the rest of the permissions migration. This PR makes the legacy policy model narrower and explicit: `SandboxPolicy::ReadOnly` and `SandboxPolicy::WorkspaceWrite` represent the old full-read sandbox modes only. Split readable roots, deny-read globs, and platform-default/minimal read behavior stay in the runtime permissions model. ## What changed - Removes `ReadOnlyAccess` from `codex_protocol::protocol::SandboxPolicy`, including the generated `access` and `readOnlyAccess` API fields. - Updates legacy policy/profile conversions so restricted filesystem reads are represented only by `FileSystemSandboxPolicy` / `PermissionProfile` entries. - Keeps app-server v2 compatible with legacy `fullAccess` read-access payloads by accepting and ignoring that no-op shape, while rejecting legacy `restricted` read-access payloads instead of silently widening them to full-read legacy policies. - Carries Windows sandbox platform-default read behavior with an explicit override flag instead of depending on `ReadOnlyAccess::Restricted`. - Refreshes generated app-server schema/types and updates tests/docs for the simplified legacy policy shape. ## Verification - `cargo check -p codex-app-server-protocol --tests` - `cargo check -p codex-windows-sandbox --tests` - `cargo test -p codex-app-server-protocol sandbox_policy_` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19449). * #19395 * #19394 * #19393 * #19392 * #19391 * __->__ #19449	2026-04-24 17:16:58 -07:00
Tom	588f7a9fc4	[codex] add non-local thread store regression harness (#19266 ) - Add an integration test that guarantees nothing gets written to codex home dir or sqlite when running a rollout with a non-local ThreadStore - Add an in-memory "spy" ThreadStore for tests like this Note I could not find a good way to also ensure there were no filesystem _reads_ that didn't go through threadstore. I explored a more elaborate sandboxed-subprocess approach but it isn't platform portable and felt like it wasn't (yet) worth it.	2026-04-24 15:45:44 -07:00
Tom	0a9b559c0b	Migrate fork and resume reads to thread store (#18900 ) - Route cold thread/resume and thread/fork source loading through ThreadStore reads instead of direct rollout path operations - Keep lookups that explicitly specify a rollout-path using the local thread store methods but return an invalid-request error for remote ThreadStore configurations - Add some additional unit tests for code path coverage	2026-04-24 13:51:37 -07:00
Michael Bolin	13e0ec1614	permissions: make legacy profile conversion cwd-free (#19414 ) ## Why The profile conversion path still required a `cwd` even when it was only translating a legacy `SandboxPolicy` into a `PermissionProfile`. That made profile producers invent an ambient `cwd`, which is exactly the anchoring we are trying to remove from permission-profile data. A legacy workspace-write policy can be represented symbolically instead: `:cwd = write` plus read-only `:project_roots` metadata subpaths. This PR creates that cwd-free base so the rest of the stack can stop threading cwd through profile construction. Callers that actually need a concrete runtime filesystem policy for a specific cwd still have an explicitly named cwd-bound conversion. ## What Changed - `PermissionProfile::from_legacy_sandbox_policy` now takes only `&SandboxPolicy`. - `FileSystemSandboxPolicy::from_legacy_sandbox_policy` is now the symbolic, cwd-free projection for profiles. - The old concrete projection is retained as `FileSystemSandboxPolicy::from_legacy_sandbox_policy_for_cwd` for runtime/boundary code that must materialize legacy cwd behavior. - Workspace-write profiles preserve `CurrentWorkingDirectory` and `ProjectRoots` special entries instead of materializing cwd into absolute paths. ## Verification - `cargo check -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-exec -p codex-exec-server -p codex-tui -p codex-sandboxing -p codex-linux-sandbox -p codex-analytics --tests` - `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-exec -p codex-exec-server -p codex-tui -p codex-sandboxing -p codex-linux-sandbox -p codex-analytics` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19414). * #19395 * #19394 * #19393 * #19392 * #19391 * __->__ #19414	2026-04-24 13:42:05 -07:00
willwang-openai	687c5d9081	Update unix socket transport to use WebSocket upgrade (#19244 ) ## Summary - Switch Unix socket app-server connections to perform the standard WebSocket HTTP Upgrade handshake - Update the Unix socket test to exercise a real upgrade over the Unix stream - Refresh the app-server README to describe the new Unix socket behavior ## Testing - `cargo test -p codex-app-server transport::unix_socket_tests` - `just fmt` - `git diff --check`	2026-04-24 13:06:51 -07:00
Ruslan Nigmatullin	a3cccbd8ed	[codex] Omit fork turns from thread started notifications (#19093 ) ## Why `thread/fork` responses intentionally include copied history so the caller can render the fork immediately, but `thread/started` is a lifecycle notification. The v2 `Thread` contract says notifications should return `turns: []`, and the fork path was reusing the response thread directly, causing copied turns to be emitted through `thread/started` as well. ## What Changed - Route app-server `thread/started` notification construction through a helper that clears `thread.turns` before sending. - Keep `thread/fork` responses unchanged so callers still receive copied history. - Add persistent and ephemeral fork coverage that asserts `thread/started` emits an empty `turns` array while the response retains fork history. ## Testing - `just fmt` - `cargo test -p codex-app-server`	2026-04-24 12:31:13 -07:00
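The response-versus-notification split described in the commit above can be sketched roughly like this. The `Thread` and `Turn` structs here are simplified stand-ins, and `thread_for_started_notification` is a hypothetical name; the commit only says a helper clears `thread.turns` before the notification is sent.

```rust
/// Simplified stand-in for a turn; the real type carries more fields.
#[derive(Debug, Clone, PartialEq)]
struct Turn {
    id: String,
}

/// Simplified stand-in for the v2 `Thread` payload.
#[derive(Debug, Clone, PartialEq)]
struct Thread {
    id: String,
    turns: Vec<Turn>,
}

/// Hypothetical helper: builds the payload for a `thread/started`
/// notification from the thread used in the response, with `turns` emptied.
fn thread_for_started_notification(response_thread: &Thread) -> Thread {
    let mut thread = response_thread.clone();
    // Lifecycle notifications carry `turns: []`; copied history stays only
    // in the `thread/fork` response.
    thread.turns.clear();
    thread
}
```

Working on a clone keeps the response thread intact, so the `thread/fork` caller still receives the copied history.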
Alex Zamoshchin	bcc1caa920	respect workspace option for disabling plugins (#18907 ) Respects the workspace setting for plugins in Codex. When the setting disables plugins: - The Plugins menu disappears - Plugins do not load - Plugins do not load in the composer. No plugins loaded: ![Screenshot 2026-04-23 at 3 20 45 PM](https://github.com/user-attachments/assets/3a4dba8e-69c3-4046-a77e-f13ab77f84b4) No plugins in menu: ![Screenshot 2026-04-23 at 3 20 35 PM](https://github.com/user-attachments/assets/5cb9bf52-ad72-488f-b90c-5eb457da09a3)	2026-04-24 17:38:45 +00:00
danwang-oai	11806faf71	Fix hang on turn/interrupt (#18392 ) Fix a bug where the `turn/interrupt` RPC hangs when interrupting a turn that has already completed. Before this change, `turn/interrupt` requests were queued in app-server and only answered when a later `TurnAborted` event arrived. If the target turn was already complete, core treated `Op::Interrupt` as a no-op, so no abort event was emitted and the RPC could hang indefinitely. This change fixes that in two places: * Reject `turn/interrupt` immediately with `INVALID_REQUEST` when the requested turn is no longer the active turn. * Resolve any already-accepted pending interrupt requests when the turn reaches `TurnComplete`, covering the case where a turn finishes naturally after the interrupt request is accepted but before it aborts. I tested this by adding a failing test in `707487c063`. You may view the results here: https://github.com/openai/codex/actions/runs/24585182419/ ![CleanShot 2026-04-17 at 16 33 30@2x](https://github.com/user-attachments/assets/f4a88228-b2a4-41f4-9aaa-ec82814096af)	2026-04-24 10:47:50 -04:00
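The two-part fix in the commit above can be sketched as a small state machine. This is an illustration under assumptions: `TurnState`, `InterruptOutcome`, `request_interrupt`, and `finish_turn` are hypothetical names, request ids are modeled as plain `u64` values, and the real app-server code is asynchronous and considerably more involved.

```rust
/// Result of handling a `turn/interrupt` request in this sketch.
#[derive(Debug, PartialEq)]
enum InterruptOutcome {
    /// Accepted and queued until the turn aborts or completes.
    Pending,
    /// Rejected immediately (the commit uses `INVALID_REQUEST` for this case).
    Rejected,
}

/// Hypothetical per-thread bookkeeping for the active turn.
#[derive(Default)]
struct TurnState {
    active_turn: Option<String>,
    /// Ids of accepted interrupt requests still awaiting a response.
    pending_interrupts: Vec<u64>,
}

impl TurnState {
    /// Part 1: reject when the requested turn is not the active turn.
    fn request_interrupt(&mut self, request_id: u64, turn_id: &str) -> InterruptOutcome {
        if self.active_turn.as_deref() != Some(turn_id) {
            return InterruptOutcome::Rejected;
        }
        self.pending_interrupts.push(request_id);
        InterruptOutcome::Pending
    }

    /// Part 2: when the turn ends (aborted or completed), return every
    /// pending request id so each one can be answered instead of hanging.
    fn finish_turn(&mut self) -> Vec<u64> {
        self.active_turn = None;
        std::mem::take(&mut self.pending_interrupts)
    }
}
```

Draining the pending list on completion as well as on abort is what covers the race where a turn finishes naturally after the interrupt was accepted.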
sayan-oai	c10f95ddac	Update models.json and related fixtures (#19323 ) Supersedes #18735. The scheduled rust-release-prepare workflow force-pushed `bot/update-models-json` back to the generated models.json-only diff, which dropped the test and snapshot updates needed for CI. This PR keeps the latest generated `models.json` from #18735 and adds the corresponding fixture updates: - preserve model availability NUX in the app-server model cache fixture - update core/TUI expectations for the new `gpt-5.4` `xhigh` default reasoning - refresh affected TUI chatwidget snapshots for the `gpt-5.5` default/model copy changes Validation run locally while preparing the fix: - `just fmt` - `cargo test -p codex-app-server model_list` - `cargo test -p codex-core includes_no_effort_in_request` - `cargo test -p codex-core includes_default_reasoning_effort_in_request_when_defined_by_model_info` - `cargo test -p codex-tui --lib chatwidget::tests` - `cargo insta pending-snapshots` --------- Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>	2026-04-24 11:14:13 +02:00
Michael Bolin	4816b89204	permissions: make profiles represent enforcement (#19231 ) ## Why `PermissionProfile` is becoming the canonical permissions abstraction, but the old shape only carried optional filesystem and network fields. It could describe allowed access, but not who is responsible for enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy when profiles were exported, cached, or round-tripped through app-server APIs. The important model change is that active permissions are now a disjoint union over the enforcement mode. Conceptually: ```rust pub enum PermissionProfile { Managed { file_system: FileSystemSandboxPolicy, network: NetworkSandboxPolicy, }, Disabled, External { network: NetworkSandboxPolicy, }, } ``` This distinction matters because `Disabled` means Codex should apply no outer sandbox at all, while `External` means filesystem isolation is owned by an outside caller. Those are not equivalent to a broad managed sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an inner sandbox may require the outer Codex layer to use no sandbox rather than a permissive one. ## How Existing Modeling Maps Legacy `SandboxPolicy` remains a boundary projection, but it now maps into the higher-fidelity profile model: - `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed` with restricted filesystem entries plus the corresponding network policy. - `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving the “no outer sandbox” intent instead of treating it as a lax managed sandbox. - `ExternalSandbox { network_access }` maps to `PermissionProfile::External { network }`, preserving external filesystem enforcement while still carrying the active network policy. - Split runtime policies that legacy `SandboxPolicy` cannot faithfully express, such as managed unrestricted filesystem plus restricted network, stay `Managed` instead of being collapsed into `ExternalSandbox`. 
- Per-command/session/turn grants remain partial overlays via `AdditionalPermissionProfile`; full `PermissionProfile` is reserved for complete active runtime permissions. ## What Changed - Change active `PermissionProfile` into a tagged union: `managed`, `disabled`, and `external`. - Keep partial permission grants separate with `AdditionalPermissionProfile` for command/session/turn overlays. - Represent managed filesystem permissions as either `restricted` entries or `unrestricted`; `glob_scan_max_depth` is non-zero when present. - Preserve old rollout compatibility by accepting the pre-tagged `{ network, file_system }` profile shape during deserialization. - Preserve fidelity for important edge cases: `DangerFullAccess` round-trips as `disabled`, `ExternalSandbox` round-trips as `external`, and managed unrestricted filesystem + restricted network stays managed instead of being mistaken for external enforcement. - Preserve configured deny-read entries and bounded glob scan depth when full profiles are projected back into runtime policies, including unrestricted replacements that now become `:root = write` plus deny entries. - Regenerate the experimental app-server v2 JSON/TypeScript schema and update the `command/exec` README example for the tagged `permissionProfile` shape. ## Compatibility Legacy `SandboxPolicy` remains available at config/API boundaries as the compatibility projection. Existing rollout lines with the old `PermissionProfile` shape continue to load. The app-server `permissionProfile` field is experimental, so its v2 wire shape is intentionally updated to match the higher-fidelity model. 
## Verification - `just write-app-server-schema` - `cargo check --tests` - `cargo test -p codex-protocol permission_profile` - `cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable` - `cargo test -p codex-app-server-protocol permission_profile_file_system_permissions` - `cargo test -p codex-app-server-protocol serialize_client_response` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox` - `just fix` - `just fix -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core` - `just fix -p codex-app-server`	2026-04-23 23:02:18 -07:00
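The legacy-to-profile mapping listed under "How Existing Modeling Maps" in the commit above can be sketched as a single `match`. The enum shapes below are simplified stand-ins: the real `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` carry entries and more detail, the `network_access: bool` field on `WorkspaceWrite` and the restricted network for `ReadOnly` are assumptions made for this illustration, and `profile_from_legacy` and `network_policy` are hypothetical names rather than the crate's API.

```rust
/// Simplified stand-in; assumed to be a two-state policy for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum NetworkSandboxPolicy {
    Restricted,
    Enabled,
}

/// Simplified stand-in; the real policy holds filesystem entries.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum FileSystemSandboxPolicy {
    Restricted,
    Unrestricted,
}

/// Tagged union over the enforcement mode, following the commit's sketch.
#[derive(Debug, Clone, PartialEq)]
enum PermissionProfile {
    Managed {
        file_system: FileSystemSandboxPolicy,
        network: NetworkSandboxPolicy,
    },
    Disabled,
    External {
        network: NetworkSandboxPolicy,
    },
}

/// Simplified legacy policy; field shapes are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum SandboxPolicy {
    ReadOnly,
    WorkspaceWrite { network_access: bool },
    DangerFullAccess,
    ExternalSandbox { network_access: bool },
}

fn network_policy(enabled: bool) -> NetworkSandboxPolicy {
    if enabled {
        NetworkSandboxPolicy::Enabled
    } else {
        NetworkSandboxPolicy::Restricted
    }
}

/// Hypothetical projection from the legacy policy into the profile model.
fn profile_from_legacy(policy: &SandboxPolicy) -> PermissionProfile {
    match policy {
        // Legacy read-only and workspace-write become managed profiles.
        SandboxPolicy::ReadOnly => PermissionProfile::Managed {
            file_system: FileSystemSandboxPolicy::Restricted,
            network: NetworkSandboxPolicy::Restricted, // assumption
        },
        SandboxPolicy::WorkspaceWrite { network_access } => PermissionProfile::Managed {
            file_system: FileSystemSandboxPolicy::Restricted,
            network: network_policy(*network_access),
        },
        // "No outer sandbox" is kept distinct from a lax managed sandbox.
        SandboxPolicy::DangerFullAccess => PermissionProfile::Disabled,
        // Filesystem enforcement is external; the network policy is kept.
        SandboxPolicy::ExternalSandbox { network_access } => PermissionProfile::External {
            network: network_policy(*network_access),
        },
    }
}
```

Keeping `Disabled` and `External` as separate variants is what lets `DangerFullAccess` and `ExternalSandbox` round-trip without being collapsed into a permissive managed profile.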
xli-oai	33cc135cc3	[codex] Support remote plugin install writes (#18917 ) ## Summary - Add a remote plugin install write call that POSTs the selected remote plugin to the ChatGPT cloud plugin API. - Align remote install with the latest remote read contract: `pluginName` carries the backend remote plugin id directly, for example `plugins~Plugin_linear`, and install no longer synthesizes `<name>@<marketplace>` ids. - Validate remote install ids with the same character rules as remote read, return the same install response shape as local installs, and include mocked app-server coverage for the write path. ## Validation - `just fmt` - `cargo test -p codex-app-server --test all plugin_install` - `cargo test -p codex-core-plugins` - `just fix -p codex-app-server` - `just fix -p codex-core-plugins`	2026-04-23 22:10:15 -07:00
Ruslan Nigmatullin	19badb0be2	app-server: persist device key bindings in sqlite (#19206 ) ## Why Device-key providers should only own platform key material. The account/client binding used to authorize a signing payload is app-server state, and keeping that state in provider-specific metadata makes the same check harder to audit and harder to share across platform implementations. Persisting the binding in the shared state database gives the device-key crate a platform-neutral source of truth before it asks a provider to sign. It also lets app-server move potentially blocking key operations off the main message processor path, which matters once providers may wait for OS authentication prompts. ## What changed - Add a `device_key_bindings` state migration plus `StateRuntime` helpers keyed by `key_id`. - Add an async `DeviceKeyBindingStore` abstraction to `codex-device-key` and use it from `DeviceKeyStore::create` and `DeviceKeyStore::sign`. - Keep provider calls behind async store methods and run the synchronous provider work through `spawn_blocking`. - Wire app-server device-key RPC handling to the SQLite-backed binding store and spawn response/error delivery tasks for device-key requests. - Run the turn-start tracing test on the existing larger current-thread test harness after the larger async surface made the default test stack too small locally. ## Validation - `cargo test -p codex-device-key` - `cargo test -p codex-state device_key` - `cargo test -p codex-state` - `cargo test -p codex-app-server device_key` - `cargo test -p codex-app-server message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spans` - `cargo test -p codex-app-server` - `just fix -p codex-device-key` - `just fix -p codex-state` - `just fix -p codex-app-server` - `just bazel-lock-update` - `just bazel-lock-check` - `git diff --check`	2026-04-23 21:55:56 -07:00
Celia Chen	e8d8080818	feat: let model providers own model discovery (#18950 ) ## Why `codex-models-manager` had grown to own provider-specific concerns: constructing OpenAI-compatible `/models` requests, resolving provider auth, emitting request telemetry, and deciding how provider catalogs should be sourced. That made the manager harder to reuse for providers whose model catalog is not fetched from the OpenAI `/models` endpoint, such as Amazon Bedrock. This change moves provider-specific model discovery behind provider-owned implementations, so the models manager can focus on refresh policy, cache behavior, picker ordering, and model metadata merging. ## What Changed - Introduced a `ModelsManager` trait with separate `OpenAiModelsManager` and `StaticModelsManager` implementations. - Added `ModelsEndpointClient` so OpenAI-compatible HTTP fetching lives outside `codex-models-manager`. - Moved `/models` request construction, provider auth resolution, timeout handling, and request telemetry into `codex-model-provider` via `OpenAiModelsEndpoint`. - Added provider-owned `models_manager(...)` construction so configured OpenAI-compatible providers use `OpenAiModelsManager`, while static/catalog-backed providers can return `StaticModelsManager`. - Added an Amazon Bedrock static model catalog for the GPT OSS Bedrock model IDs. - Updated core/session/thread manager code and tests to depend on `Arc<dyn ModelsManager>`. - Moved offline model test helpers into `codex_models_manager::test_support`. ## Metadata References The Bedrock catalog metadata is based on the official Amazon Bedrock OpenAI model documentation: - [Amazon Bedrock OpenAI models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-openai.html) lists the Bedrock model IDs, text input/output modalities, and `128,000` token context window for `gpt-oss-20b` and `gpt-oss-120b`. 
- [Amazon Bedrock `gpt-oss-120b` model card](https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-openai-gpt-oss-120b.html) lists the `bedrock-runtime` model ID `openai.gpt-oss-120b-1:0`, the `bedrock-mantle` model ID `openai.gpt-oss-120b`, text-only modalities, and `128K` context window. - [OpenAI `gpt-oss-120b` model docs](https://developers.openai.com/api/docs/models/gpt-oss-120b) document configurable reasoning effort with `low`, `medium`, and `high`, plus text input/output modality. The display names, default reasoning effort, and priority ordering are Codex-local catalog choices. ## Test Plan - Manually verified app-server model listing with an AWS profile: ```shell CODEX_HOME="$(mktemp -d)" cargo run -p codex-app-server-test-client -- \ --codex-bin ./target/debug/codex \ -c 'model_provider="amazon-bedrock"' \ -c 'model_providers.amazon-bedrock.aws.profile="codex-bedrock"' \ -c 'model_providers.amazon-bedrock.aws.region="us-west-2"' \ model-list ``` The response returned the Bedrock catalog with `openai.gpt-oss-120b-1:0` as the default model and `openai.gpt-oss-20b-1:0` as the second listed model, both text-only and supporting low/medium/high reasoning effort.	2026-04-24 04:28:25 +00:00
starr-openai	49fb25997f	Add sticky environment API and thread state (#18897 ) ## Summary - add sticky environment selections to app-server v2 thread/start and turn/start request flow - carry thread-level selections through core session/thread state - add app-server coverage for sticky selections and turn overrides ## Stack 1. This PR: API and thread persistence 2. #18898: config.toml named environment loading 3. #18899: downstream tool/runtime consumers ## Validation - Not run locally; split only. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-23 18:57:13 -07:00


929 Commits