codex

mirror of https://github.com/openai/codex.git synced 2026-05-03 04:42:20 +03:00

Author	SHA1	Message	Date
alexsong-oai	1afbbc11c3	Ensure the env values of imported shell_environment_policy.set is string (#13402 )	2026-03-03 16:12:23 -08:00
Curtis 'Fjord' Hawthorne	b92146d48b	Add under-development original-resolution view_image support (#13050 ) ## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. 
- History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 15:56:54 -08:00
joeytrasatti-openai	935754baa3	Add thread metadata update endpoint to app server (#13280 ) ## Summary - add the v2 `thread/metadata/update` API, including protocol/schema/TypeScript exports and app-server docs - patch stored thread `gitInfo` in sqlite without resuming the thread, with validation plus support for explicit `null` clears - repair missing sqlite thread rows from rollout data before patching, and make those repairs safe by inserting only when absent and updating only git columns so newer metadata is not clobbered - keep sqlite authoritative for mutable thread git metadata by preserving existing sqlite git fields during reconcile/backfill and only using rollout `SessionMeta` git fields to fill gaps - add regression coverage for the endpoint, repair paths, concurrent sqlite writes, clearing git fields, and rollout/backfill reconciliation - fix the login server shutdown race so cancelling before the waiter starts still terminates `block_until_done()` correctly ## Testing - `cargo test -p codex-state apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-state update_thread_git_info_preserves_newer_non_git_metadata` - `cargo test -p codex-core backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-app-server thread_metadata_update` - `cargo test` - currently fails in existing `codex-core` grep-files tests with `unsupported call: grep_files`: - `suite::grep_files::grep_files_tool_collects_matches` - `suite::grep_files::grep_files_tool_reports_empty_results`	2026-03-03 15:56:11 -08:00
Charley Cunningham	299b8ac445	tui: align pending steers with core acceptance (#12868 ) ## Summary - submit `Enter` steers immediately while a turn is already running instead of routing them through `queued_user_messages` - keep those submitted steers visible in the footer as `pending_steers` until core records them as a user message or aborts the turn - reconcile pending steers on `ItemCompleted(UserMessage)`, not `RawResponseItem` - emit user-message item lifecycle for leftover pending input at task finish, then remove the TUI `TurnComplete` fallback - keep `queued_user_messages` for actual queued drafts, rendered below pending steers ## Problem While the assistant was generating, pressing `Enter` could send the input into `queued_user_messages`. That queue only drains after the turn ends, so ordinary steers behaved like queued drafts instead of landing at the next core sampling boundary. The first version of this fix also used `RawResponseItem` to decide when a steer had landed. Review feedback was that this is the wrong abstraction for client behavior. There was also a late edge case in core: if pending steer input was accepted after the final sampling decision but before `TurnComplete`, core would record that user message into history at task finish without emitting `ItemStarted(UserMessage)` / `ItemCompleted(UserMessage)`. TUI had a fallback to paper over that gap locally. 
## Approach - `Enter` during an active turn now submits a normal `Op::UserTurn` immediately - TUI keeps a local pending-steer preview instead of rendering that user message into history immediately - when core records the steer as `ItemCompleted(UserMessage)`, TUI matches and removes the corresponding pending preview, then renders the committed user message - core now emits the same user-message lifecycle when `on_task_finished(...)` drains leftover pending user input, before `TurnComplete` - with that lifecycle gap closed in core, TUI no longer needs to flush pending steers into history on `TurnComplete` - if the turn is interrupted, pending steers and queued drafts are both restored into the composer, with pending steers first ## Notes - `Tab` still uses the real queued-message path - `queued_user_messages` and `pending_steers` are separate state with separate semantics - the pending-steer matching key is built directly from `UserInput` - this removes the new TUI dependency on `RawResponseItem` ## Validation - `just fmt` - `cargo test -p codex-core task_finish_emits_turn_item_lifecycle_for_leftover_pending_user_input -- --nocapture` - `cargo test -p codex-tui`	2026-03-03 15:31:52 -08:00
xl-openai	9b004e2db1	Refactor plugin config and cache path (#13333 ) Update config.toml plugin entries to use <plugin_name>@<marketplace_name> as the key. Plugins now stay in [plugins/cache/marketplace-name/plugin-name/$version/]. Clean up the plugin code structure. Add plugin install functionality (not used yet).	2026-03-03 15:00:18 -08:00
Ahmed Ibrahim	041c896509	Revert "Revert "realtime prompt changes"" (#13398 ) Reverts openai/codex#13385	2026-03-03 14:41:26 -08:00
Ahmed Ibrahim	6bee02a346	Build delegated realtime handoff text from all messages (#13395 ) ## Summary - Route delegated realtime handoff turns from all handoff message texts, preserving order - Fallback to input_transcript only when no messages are present - Add regression coverage for multi-message handoff requests	2026-03-03 14:07:51 -08:00
pakrym-oai	69df12efb3	Remove Responses V1 websocket implementation (#13364 ) V2 is the way to go!	2026-03-03 11:32:53 -07:00
jif-oai	8159f05dfd	feat: wire spreadsheet artifact (#13362 )	2026-03-03 15:27:37 +00:00
jif-oai	24ba01b9da	feat: artifact presentation part 7 (#13360 )	2026-03-03 15:03:25 +00:00
jif-oai	1df040e62b	feat: add multi-actions to presentation tool (#13357 )	2026-03-03 14:37:26 +00:00
jif-oai	ad393fa753	feat: pres artifact part 5 (#13355 ) Mostly written by Codex	2026-03-03 14:08:01 +00:00
jif-oai	a7d90b867d	feat: presentation part 4 (#13348 )	2026-03-03 12:51:31 +00:00
jif-oai	564a883c2a	feat: pres artifact 3 (#13346 )	2026-03-03 12:18:25 +00:00
jif-oai	72dc444b2c	feat: pres artifact 2 (#13344 )	2026-03-03 12:00:34 +00:00
jif-oai	4874b9291a	feat: presentation artifact p1 (#13341 ) Part 1 of presentation tool artifact	2026-03-03 11:38:03 +00:00
pash-openai	07e532dcb9	app-server service tier plumbing (plus some cleanup) (#13334 ) followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None | "fast", so "standard" is not needed.	2026-03-03 02:35:09 -08:00
jif-oai	938c6dd388	fix: db windows path (#13336 )	2026-03-03 09:50:52 +00:00
jif-oai	cacefb5228	fix: agent when profile (#13235 ) Co-authored-by: Josh McKinney <joshka@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-03 09:20:25 +00:00
jif-oai	3166a5ba82	fix: agent race (#13248 ) https://github.com/openai/codex/issues/13244	2026-03-03 09:19:37 +00:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
Celia Chen	0bb152b01d	chore: remove SkillMetadata.permissions and derive skill sandboxing from permission_profile (#13061 ) ## Summary This change removes the compiled permissions field from skill metadata and keeps permission_profile as the single source of truth. Skill loading no longer compiles skill permissions eagerly. Instead, the zsh-fork skill escalation path compiles `skill.permission_profile` when it needs to determine the sandbox to apply for a skill script. ## Behavior change For skills that declare `permissions: {}`, we now treat that the same as having no skill permissions override, instead of creating and using a default readonly sandbox. This change makes the behavior more intuitive: - only non-empty skill permission profiles affect sandboxing - omitting permissions and writing `permissions: {}` now mean the same thing - skill metadata keeps a single permissions representation instead of storing derived state too Overall, this makes skill sandbox behavior easier to understand and more predictable.	2026-03-03 01:29:53 +00:00
Brian Fioca	50084339a6	Adjusting plan prompt for clarity and verbosity (#13284 ) `plan.md` prompt changes to tighten plan clarity and verbosity.	2026-03-03 01:14:39 +00:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00
Ruslan Nigmatullin	14fcb6645c	app-server: Update `thread/name/set` to support not-loaded threads (#13282 ) Currently `thread/name/set` only works for loaded threads. Expand the scope to also support persisted but not-yet-loaded ones for a more predictable API surface. This will make it possible to rename threads discovered via `thread/list` and similar operations.	2026-03-02 15:13:18 -08:00
Dylan Hurd	e10df4ba10	fix(core) shell_snapshot multiline exports (#12642 ) ## Summary Codex discovered this one - shell_snapshot tests were breaking on my machine because I had a multiline env var. We should handle these! ## Testing - [x] existing tests pass - [x] Updated unit tests	2026-03-02 12:08:17 -07:00
Eric Traut	7709bf32a3	Fix project trust config parsing so CLI overrides work (#13090 ) Fixes #13076 This PR fixes a bug that causes command-line config overrides for MCP subtables to not be merged correctly. Summary - make project trust loading go through the dedicated struct so CLI overrides can update trusted project-local MCP transports --------- Co-authored-by: jif-oai <jif@openai.com>	2026-03-02 11:10:38 -07:00
daveaitel-openai	c2e126f92a	core: reuse parent shell snapshot for thread-spawn subagents (#13052 ) ## Summary - reuse the parent shell snapshot when spawning/forking/resuming `SessionSource::SubAgent(SubAgentSource::ThreadSpawn { .. })` sessions - plumb inherited snapshot through `AgentControl -> ThreadManager -> Codex::spawn -> SessionConfiguration` - skip shell snapshot refresh on cwd updates for thread-spawn subagents so inherited snapshots are not replaced ## Why - avoids per-subagent shell snapshot creation and cleanup work - keeps thread-spawn subagents on the parent snapshot path, matching the intended parent/child snapshot model ## Validation - `just fmt` (in `codex-rs`) - `cargo test -p codex-core --no-run` - `cargo test -p codex-core spawn_agent -- --nocapture` - `cargo test -p codex-core --test all suite::agent_jobs::spawn_agents_on_csv_runs_and_exports` ## Notes - full `cargo test -p codex-core --test all` was left running separately for broader verification Co-authored-by: Codex <noreply@openai.com>	2026-03-02 15:53:15 +00:00
jif-oai	1905597017	feat: update memories config names (#13237 )	2026-03-02 15:25:39 +00:00
jif-oai	b649953845	feat: polluted memories (#13008 ) Add a feature flag to disable memory creation for "polluted"	2026-03-02 11:57:32 +00:00
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
Leo Shimonaka	4ae60cf03c	fix: MacOSAutomationPermission::BundleIDs should allow communicating … (#12989 ) …with launchservicesd Add mach lookup for `launchservicesd` when extending the sandbox for `MacOSAutomationPermission::BundleIDs`. This is necessary so that the target application can be launched for automation. This omission was due to a spec error in a document, which has been fixed.	2026-03-01 11:00:54 -08:00
xl-openai	752402c4fe	feat: load from plugins (#12864 ) Support loading plugins. Plugins can now be enabled via [plugins.<name>] in config.toml. They are loaded as first-class entities through PluginsManager, and their default skills/ and .mcp.json contributions are integrated into the existing skills and MCP flows.	2026-03-01 10:50:56 -08:00
Michael Bolin	6a673e7339	core: resolve host_executable() rules during preflight (#13065 ) ## Why [#12964](https://github.com/openai/codex/pull/12964) added `host_executable()` support to `codex-execpolicy`, and [#13046](https://github.com/openai/codex/pull/13046) adopted it in the zsh-fork interception path. The remaining gap was the preflight execpolicy check in `core/src/exec_policy.rs`. That path derives approval requirements before execution for `shell`, `shell_command`, and `unified_exec`, but it was still using the default exact-token matcher. As a result, a command that already included an absolute executable path, such as `/usr/bin/git status`, could still miss a basename rule like `prefix_rule(pattern = ["git"], ...)` during preflight even when the policy also defined a matching `host_executable(name = "git", ...)` entry. This PR brings the same opt-in `host_executable()` resolution to the preflight approval path when an absolute program path is already present in the parsed command. ## What Changed - updated `ExecPolicyManager::create_exec_approval_requirement_for_command()` in `core/src/exec_policy.rs` to use `check_multiple_with_options(...)` with `MatchOptions { resolve_host_executables: true }` - kept the existing shell parsing flow for approval derivation, but now allow basename rules to match absolute executable paths during preflight when `host_executable()` permits it - updated requested-prefix amendment evaluation to use the same host-executable-aware matching mode, so suggested `prefix_rule()` amendments are checked consistently for absolute-path commands - added preflight coverage for: - absolute-path commands that should match basename rules through `host_executable()` - absolute-path commands whose paths are not in the allowed `host_executable()` mapping - requested prefix-rule amendments for absolute-path commands ## Verification - `just fix -p codex-core` - `cargo test -p codex-core --lib exec_policy::tests::`	2026-02-28 17:25:30 +00:00
jif-oai	74e5150b1e	fix: package `models.json` for Bazel tests (#13129 )	2026-02-28 17:21:02 +01:00
jif-oai	84b662e74f	nit: disable on windows (#13127 )	2026-02-28 14:55:16 +01:00
daveaitel-openai	eec3b1e235	Speed up subagent startup (#12935 ) ## Summary - skip online model refresh for subagent sessions - avoid rollout flushes during subagent startup - keep /models refresh for non-subagent sessions ## Testing - cargo test -p codex-core --test all suite::models_etag_responses::refresh_models_on_models_etag_mismatch_and_avoid_duplicate_models_fetch - cargo test -p codex-core --test all suite::remote_models::remote_models_long_model_slug_is_sent_with_high_reasoning - cargo test -p codex-core --test all suite::model_switching::model_switch_to_smaller_model_updates_token_context_window - cargo test -p codex-core --test all suite::compact::pre_sampling_compact_runs_on_switch_to_smaller_context_model - cargo test -p codex-core --test all suite::compact::pre_sampling_compact_runs_after_resume_and_switch_to_smaller_model - cargo test -p codex-core --test all suite::personality::remote_model_friendly_personality_instructions_with_feature --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-28 14:54:08 +01:00
Andi Liu	5f7c38baa9	Tune memory read-path for stale facts (#13088 ) ## Why - tighten Codex memory-read behavior around stale facts and conflicting memory - encode the risk-of-drift vs verification-effort decision rule directly in the read-path prompt - make partial stale-detail updates explicit so correcting only the answer is not treated as sufficient ## What changed - update `codex-rs/core/templates/memories/read_path.md` - add guidance for when to verify cheap local facts vs when to answer from older memory with visible provenance - strengthen same-turn `MEMORY.md` updates when stored concrete details are stale ## Notes - this is based on some staleness eval work	2026-02-28 14:48:47 +01:00
jif-oai	bee93ca2f3	chore: change mem default (#13125 )	2026-02-28 14:45:27 +01:00
jif-oai	d33f4b54ac	feat: skill disable respect config layer (#13027 )	2026-02-28 14:17:05 +01:00
alexsong-oai	e2fef7a3d2	Make cloud_requirements fail close (#13063 ) Make it fail-close only for CLI for now. Will extend this to app-server later.	2026-02-27 18:22:05 -08:00
Ruslan Nigmatullin	8c1e3f3e64	app-server: Add `ephemeral` field to `Thread` object (#13084 ) Currently there is no other way to know that a thread is ephemeral; only the client that created it has that knowledge.	2026-02-27 17:42:25 -08:00
Michael Bolin	1a8d930267	core: adopt host_executable() rules in zsh-fork (#13046 ) ## Why [#12964](https://github.com/openai/codex/pull/12964) added `host_executable()` support to `codex-execpolicy`, but the zsh-fork interception path in `unix_escalation.rs` was still evaluating commands with the default exact-token matcher. That meant an intercepted absolute executable such as `/usr/bin/git status` could still miss basename rules like `prefix_rule(pattern = ["git", "status"])`, even when the policy also defined a matching `host_executable(name = "git", ...)` entry. This PR adopts the new matching behavior in the zsh-fork runtime only. That keeps the rollout intentionally narrow: zsh-fork already requires explicit user opt-in, so it is a safer first caller to exercise the new `host_executable()` scheme before expanding it to other execpolicy call sites. It also brings zsh-fork back in line with the current `prefix_rule()` execution model. Until prefix rules can carry their own permission profiles, a matched `prefix_rule()` is expected to rerun the intercepted command unsandboxed on `allow`, or after the user accepts `prompt`, instead of merely continuing inside the inherited shell sandbox. 
## What Changed - added `evaluate_intercepted_exec_policy()` in `core/src/tools/runtimes/shell/unix_escalation.rs` to centralize execpolicy evaluation for intercepted commands - switched intercepted direct execs in the zsh-fork path to `check_multiple_with_options(...)` with `MatchOptions { resolve_host_executables: true }` - added `commands_for_intercepted_exec_policy()` so zsh-fork policy evaluation works from intercepted `(program, argv)` data instead of reconstructing a synthetic command before matching - left shell-wrapper parsing intentionally disabled by default behind `ENABLE_INTERCEPTED_EXEC_POLICY_SHELL_WRAPPER_PARSING`, so path-sensitive matching relies on later direct exec interception rather than shell-script parsing - made matched `prefix_rule()` decisions rerun intercepted commands with `EscalationExecution::Unsandboxed`, while unmatched-command fallback keeps the existing sandbox-preserving behavior - extracted the zsh-fork test harness into `core/tests/common/zsh_fork.rs` so both the skill-focused and approval-focused integration suites can exercise the same runtime setup - limited this change to the intercepted zsh-fork path rather than changing every execpolicy caller at once - added runtime coverage in `core/src/tools/runtimes/shell/unix_escalation_tests.rs` for allowed and disallowed `host_executable()` mappings and the wrapper-parsing modes - added integration coverage in `core/tests/suite/approvals.rs` to verify a saved `prefix_rule(pattern=["touch"], decision="allow")` reruns under zsh-fork outside a restrictive `WorkspaceWrite` sandbox --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13046). * #13065 * __->__ #13046	2026-02-28 01:41:23 +00:00
Ahmed Ibrahim	ec6f6aacbf	Add model availability NUX tooltips (#13021 ) - override startup tooltips with model availability NUX and persist per-model show counts in config - stop showing each model after four exposures and fall back to normal tooltips	2026-02-27 17:14:06 -08:00
Eric Traut	ff5cbfd7d4	Handle missing plan info for ChatGPT accounts (#13072 ) Addresses https://github.com/openai/codex/issues/13007 and https://github.com/openai/codex/issues/12170 There are situations where the ChatGPT auth backend might return a JWT that contains no plan information. Most code paths already handle this case well, but the internal implementation of the "account/read" app server call was failing in this case (returning an error rather than properly returning None for the plan). This resulted in a situation where users needed to log in every time the extension or app started even if they successfully logged in the last time. Summary - allow ChatGPT-authenticated accounts to fall back to `AccountPlanType::Unknown` when the token omits the plan claim - add regression coverage in `app-server/tests/suite/v2/account.rs` to confirm `account/read` returns `plan_type: Unknown` when the claim is absent - ensure the Rust auth helpers and fixtures treat missing plan claims as Optional and default to `Unknown`	2026-02-27 17:51:21 -07:00
Charley Cunningham	695957a348	Unify rollout reconstruction with resume/fork TurnContext hydration (#12612 ) ## Summary This PR unifies rollout history reconstruction and resume/fork metadata hydration under a single `Session::reconstruct_history_from_rollout` implementation. The key change from main is that replay metadata now comes from the same reconstruction pass that rebuilds model-visible history, instead of doing a second bespoke rollout scan to recover `previous_model` / `reference_context_item`. ## What Changed ### Unified reconstruction output `reconstruct_history_from_rollout` now returns a single `RolloutReconstruction` bundle containing: - rebuilt `history` - `previous_model` - `reference_context_item` Resume and fork both consume that shared output directly. ### Reverse replay core The reconstruction logic moved into `codex-rs/core/src/codex/rollout_reconstruction.rs` and now scans rollout items newest-to-oldest. That reverse pass: - derives `previous_model` - derives whether `reference_context_item` is preserved or cleared - stops early once it has both resume metadata and a surviving `replacement_history` checkpoint History materialization is still bridged eagerly for now by replaying only the surviving suffix forward, which keeps the history result stable while moving the control flow toward the future lazy reverse loader design. ### Removed bespoke context lookup This deletes `last_rollout_regular_turn_context_lookup` and its separate compaction-aware scan. The previous model / baseline metadata is now computed from the same replay state that rebuilds history, so resume/fork cannot drift from the reconstructed transcript view. ### `TurnContextItem` persistence contract `TurnContextItem` is now treated as the replay source of truth for durable model-visible baselines. 
This PR keeps the following contract explicit: - persist `TurnContextItem` for the first real user turn so resume can recover `previous_model` - persist it for later turns that emit model-visible context updates - if mid-turn compaction reinjects full initial context into replacement history, persist a fresh `TurnContextItem` after `Compacted` so resume/fork can re-establish the baseline from the rewritten history - do not treat manual compaction or pre-sampling compaction as creating a new durable baseline on their own ## Behavior Preserved - rollback replay stays aligned with `drop_last_n_user_turns` - rollback skips only user turns - incomplete active user turns are dropped before older finalized turns when rollback applies - unmatched aborts do not consume the current active turn - missing abort IDs still conservatively clear stale compaction state - compaction clears `reference_context_item` until a later `TurnContextItem` re-establishes it - `previous_model` still comes from the newest surviving user turn that established one ## Tests Targeted validation run for the current branch shape: - `cd codex-rs && cargo test -p codex-core --lib codex::rollout_reconstruction_tests -- --nocapture` - `cd codex-rs && just fmt` The branch also extracts the rollout reconstruction tests into `codex-rs/core/src/codex/rollout_reconstruction_tests.rs` so this logic has a dedicated home instead of living inline in `codex.rs`.	2026-02-27 13:50:45 -08:00
Michael Bolin	b148d98e0e	execpolicy: add host_executable() path mappings (#12964 ) ## Why `execpolicy` currently keys `prefix_rule()` matching off the literal first token. That works for rules like `["/usr/bin/git"]`, but it means shared basename rules such as `["git"]` do not help when a caller passes an absolute executable path like `/usr/bin/git`. This PR lays the groundwork for basename-aware matching without changing existing callers yet. It adds typed host-executable metadata and an opt-in resolution path in `codex-execpolicy`, so a follow-up PR can adopt the new behavior in `unix_escalation.rs` and other call sites without having to redesign the policy layer first. ## What Changed - added `host_executable(name = ..., paths = [...])` to the execpolicy parser and validated it with `AbsolutePathBuf` - stored host executable mappings separately from prefix rules inside `Policy` - added `MatchOptions` and opt-in `*_with_options()` APIs that preserve existing behavior by default - implemented exact-first matching with optional basename fallback, gated by `host_executable()` allowlists when present - normalized executable names for cross-platform matching so Windows paths like `git.exe` can satisfy `host_executable(name = "git", ...)` - updated `match` / `not_match` example validation to exercise the host-executable resolution path instead of only raw prefix-rule matching - preserved source locations for deferred example-validation errors so policy load failures still point at the right file and line - surfaced `resolvedProgram` on `RuleMatch` so callers can tell when a basename rule matched an absolute executable path - preserved host executable metadata when requirements policies overlay file-based policies in `core/src/exec_policy.rs` - documented the new rule shape and CLI behavior in `execpolicy/README.md` ## Verification - `cargo test -p codex-execpolicy` - added coverage in `execpolicy/tests/basic.rs` for parsing, precedence, empty allowlists, basename 
fallback, exact-match precedence, and host-executable-backed `match` / `not_match` examples - added a regression test in `core/src/exec_policy.rs` to verify requirements overlays preserve `host_executable()` metadata - verified `cargo test -p codex-core --lib`, including source-rendering coverage for deferred validation errors	2026-02-27 12:59:24 -08:00
Felipe Coury	3b5996f988	fix(tui): promote windows terminal diff ansi16 to truecolor (#13016 ) ## Summary - Promote ANSI-16 to truecolor for diff rendering when running inside Windows Terminal - Respect explicit `FORCE_COLOR` override, skipping promotion when set - Extract a pure `diff_color_level_for_terminal` function for testability - Strip background tints from ANSI-16 diff output, rendering add/delete lines with foreground color only - Introduce `RichDiffColorLevel` to type-safely restrict background fills to truecolor and ansi256 ## Problem Windows Terminal fully supports 24-bit (truecolor) rendering but often does not provide the usual TERM metadata (`TERM`, `TERM_PROGRAM`, `COLORTERM`) in `cmd.exe`/PowerShell sessions. In those environments, `supports-color` can report only ANSI-16 support. The diff renderer therefore falls back to a 16-color palette, producing washed-out, hard-to-read diffs. The screenshots below demonstrate that both PowerShell and cmd.exe don't set any `TERM` environment variables. \| PowerShell \| cmd.exe \| \|---\|---\| \| <img width="2032" height="1162" alt="SCR-20260226-nfvy" src="https://github.com/user-attachments/assets/59e968cc-4add-4c7b-a415-07163297e86a" /> \| <img width="2032" height="1162" alt="SCR-20260226-nfyc" src="https://github.com/user-attachments/assets/d06b3e39-bf91-4ce3-9705-82bf9563a01b" /> \| ## Mental model `StdoutColorLevel` (from `supports-color`) is the _detected_ capability. `DiffColorLevel` is the _intended_ capability for diff rendering. A new intermediary — `diff_color_level_for_terminal` — maps one to the other and is the single place where terminal-specific overrides live. Windows Terminal is detected two independent ways: the `TerminalName` parsed by `terminal_info()` and the raw presence of `WT_SESSION`. When `WT_SESSION` is present and `FORCE_COLOR` is not set, we promote unconditionally to truecolor. 
When `WT_SESSION` is absent but `TerminalName::WindowsTerminal` is detected, we promote only the ANSI-16 level (not `Unknown`). A single override helper — `has_force_color_override()` — checks whether `FORCE_COLOR` is set. When it is, both the `WT_SESSION` fast-path and the `TerminalName`-based promotion are suppressed, preserving explicit user intent. \| PowerShell \| cmd.exe \| WSL \| Bash for Windows \| \|---\|---\|---\|---\| \| ![SCR-20260226-msrh](https://github.com/user-attachments/assets/0f6297a6-4241-4dbf-b7ff-cf02da8941b0) \| ![SCR-20260226-nbao](https://github.com/user-attachments/assets/bb5ff8a9-903c-4677-a2de-1f6e1f34b18e) \| ![SCR-20260226-nbej](https://github.com/user-attachments/assets/26ecec2c-a7e9-410a-8702-f73995b490a6) \| ![SCR-20260226-nbkz](https://github.com/user-attachments/assets/80c4bf9a-3b41-40e1-bc87-f5c565f96075) \| ## Non-goals - This does not change color detection for anything outside the diff renderer (e.g. the chat widget, markdown rendering). - This does not add a user-facing config knob; `FORCE_COLOR` already serves that role. ## Tradeoffs - The `has_wt_session` signal is intentionally kept separate from `TerminalName::WindowsTerminal`. `terminal_info()` is derived with `TERM_PROGRAM` precedence, so it can differ from raw `WT_SESSION`. - Real-world validation in this issue: in both `cmd.exe` and PowerShell, `TERM`/`TERM_PROGRAM`/`COLORTERM` were absent, so TERM-based capability hints were unavailable in those sessions. - Checking `FORCE_COLOR` for presence rather than parsing its value is a simplification. In practice `supports-color` has already parsed it, so our check is a coarse "did the user set _anything_?" gate. The effective color level still comes from `supports-color`. - When `WT_SESSION` is present without `FORCE_COLOR`, we promote to truecolor regardless of `stdout_level` (including `Unknown`). This is aggressive but correct: `WT_SESSION` is a strong signal that we're in Windows Terminal. 
- ANSI-16 add/delete backgrounds (bright green/red) overpower syntax-highlighted token colors, making diffs harder to read. Foreground-only cues (colored text, gutter signs) preserve readability on low-color terminals. ## Architecture ``` stdout_color_level() ──┐ terminal_info().name ──┤ WT_SESSION presence ──┼──▶ diff_color_level_for_terminal() ──▶ DiffColorLevel FORCE_COLOR presence ──┘ │ ▼ RichDiffColorLevel::from_diff_color_level() │ ┌──────────┴──────────┐ │ Some(TrueColor\|256) │ → bg tints │ None (Ansi16) │ → fg only └─────────────────────┘ ``` `diff_color_level()` is the environment-reading entry point; it gathers the four runtime signals and delegates to the pure, testable `diff_color_level_for_terminal()`. ## Observability No new logs or metrics. Incorrect color selection is immediately visible as broken diff rendering; the test suite covers the decision matrix exhaustively. ## Tests Six new unit tests exercise every branch of `diff_color_level_for_terminal`: \| Test \| Inputs \| Expected \| \|------\|--------\|----------\| \| `windows_terminal_promotes_ansi16_to_truecolor_for_diffs` \| Ansi16 + WindowsTerminal name \| TrueColor \| \| `wt_session_promotes_ansi16_to_truecolor_for_diffs` \| Ansi16 + WT_SESSION only \| TrueColor \| \| `non_windows_terminal_keeps_ansi16_diff_palette` \| Ansi16 + WezTerm \| Ansi16 \| \| `wt_session_promotes_unknown_color_level_to_truecolor` \| Unknown + WT_SESSION \| TrueColor \| \| `explicit_force_override_keeps_ansi16_on_windows_terminal` \| Ansi16 + WindowsTerminal + FORCE_COLOR \| Ansi16 \| \| `explicit_force_override_keeps_ansi256_on_windows_terminal` \| Ansi256 + WT_SESSION + FORCE_COLOR \| Ansi256 \| \| `ansi16_add_style_uses_foreground_only` \| Dark + Ansi16 \| fg=Green, bg=None \| \| (and any other new snapshot/assertion tests from commits `d757fee` and `d7c78b3`) \| \| \| ## Test plan - [x] Verify all new unit tests pass (`cargo test -p codex-tui --lib`) - [x] On Windows Terminal: confirm diffs render with truecolor 
backgrounds - [x] On Windows Terminal with `FORCE_COLOR` set: confirm promotion is disabled and output follows the forced `supports-color` level - [x] On macOS/Linux terminals: confirm no behavior change Fixes https://github.com/openai/codex/issues/12904 Fixes https://github.com/openai/codex/issues/12890 Fixes https://github.com/openai/codex/issues/12912 Fixes https://github.com/openai/codex/issues/12840	2026-02-27 10:45:59 -07:00
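The decision logic described in the "Mental model" and "Tests" sections of 3b5996f988 (#13016) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the enum variants, the four-argument signature, and the fallback that maps `Unknown` to `Ansi16` when no promotion applies are all assumptions. Only the names `StdoutColorLevel`, `DiffColorLevel`, `RichDiffColorLevel`, `TerminalName`, and `diff_color_level_for_terminal` come from the description.

```rust
// Illustrative sketch only. Type names follow the PR description; the exact
// variants, the signature, and the Unknown -> Ansi16 fallback are assumptions.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StdoutColorLevel { TrueColor, Ansi256, Ansi16, Unknown }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiffColorLevel { TrueColor, Ansi256, Ansi16 }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TerminalName { WindowsTerminal, WezTerm, Other }

/// Levels that are allowed to use background tints in diffs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RichDiffColorLevel { TrueColor, Ansi256 }

impl RichDiffColorLevel {
    /// Returns None for Ansi16, so callers fall back to foreground-only styling.
    fn from_diff_color_level(level: DiffColorLevel) -> Option<Self> {
        match level {
            DiffColorLevel::TrueColor => Some(Self::TrueColor),
            DiffColorLevel::Ansi256 => Some(Self::Ansi256),
            DiffColorLevel::Ansi16 => None,
        }
    }
}

/// Pure mapping from detected capability to intended diff capability.
fn diff_color_level_for_terminal(
    stdout_level: StdoutColorLevel,
    terminal_name: TerminalName,
    has_wt_session: bool,
    has_force_color_override: bool,
) -> DiffColorLevel {
    // Fast path: WT_SESSION present and no FORCE_COLOR -> always truecolor.
    if has_wt_session && !has_force_color_override {
        return DiffColorLevel::TrueColor;
    }
    // Name-based path: promote only the Ansi16 level (not Unknown).
    if !has_force_color_override
        && terminal_name == TerminalName::WindowsTerminal
        && stdout_level == StdoutColorLevel::Ansi16
    {
        return DiffColorLevel::TrueColor;
    }
    // Otherwise follow the detected level (assumed fallback for Unknown).
    match stdout_level {
        StdoutColorLevel::TrueColor => DiffColorLevel::TrueColor,
        StdoutColorLevel::Ansi256 => DiffColorLevel::Ansi256,
        StdoutColorLevel::Ansi16 | StdoutColorLevel::Unknown => DiffColorLevel::Ansi16,
    }
}
```

Keeping the function free of environment reads is what makes the decision matrix testable: a caller such as `diff_color_level()` would gather the four signals and pass them in, and each row of the test table becomes a single call with fixed arguments.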
Michael Bolin	d09a7535ed	fix: use AbsolutePathBuf for permission profile file roots (#12970 ) ## Why `PermissionProfile` should describe filesystem roots as absolute paths at the type level. Using `PathBuf` in `FileSystemPermissions` made the shared type too permissive and blurred together three different deserialization cases: - skill metadata in `agents/openai.yaml`, where relative paths should resolve against the skill directory - app-server API payloads, where callers should have to send absolute paths - local tool-call payloads for commands like `shell_command` and `exec_command`, where `additional_permissions.file_system` may legitimately be relative to the command `workdir` This change tightens the shared model without regressing the existing local command flow. ## What Changed - changed `protocol::models::FileSystemPermissions` and the app-server `AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf` - wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so relative permission roots in `agents/openai.yaml` resolve against the containing skill directory - kept app-server/API deserialization strict, so relative `additionalPermissions.fileSystem.*` paths are rejected at the boundary - restored cwd/workdir-relative deserialization for local tool-call payloads by parsing `shell`, `shell_command`, and `exec_command` arguments under an `AbsolutePathBufGuard` rooted at the resolved command working directory - simplified runtime additional-permission normalization so it only canonicalizes and deduplicates absolute roots instead of trying to recover relative ones later - updated the app-server schema fixtures, `app-server/README.md`, and the affected transport/TUI tests to match the final behavior	2026-02-27 17:42:52 +00:00
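The three deserialization cases described in d09a7535ed (#12970) differ only in which base directory, if any, a relative root may resolve against. The sketch below illustrates that idea with the standard library alone. `resolve_root` is a hypothetical helper, not the `AbsolutePathBuf` or `AbsolutePathBufGuard` API from the commit, and the example paths assume Unix-style absolute paths.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the codex API): resolve one permission root.
///
/// - `base = Some(dir)` models the skill-metadata and local tool-call cases,
///   where a relative root resolves against the skill directory or workdir.
/// - `base = None` models the strict app-server boundary, where relative
///   roots are rejected.
fn resolve_root(raw: &str, base: Option<&Path>) -> Result<PathBuf, String> {
    let path = Path::new(raw);
    // Absolute roots are accepted as-is in every mode.
    if path.is_absolute() {
        return Ok(path.to_path_buf());
    }
    match base {
        // Relative root with an absolute base: join them.
        Some(dir) if dir.is_absolute() => Ok(dir.join(path)),
        // A relative base would defeat the purpose, so refuse it.
        Some(dir) => Err(format!("base directory {} is not absolute", dir.display())),
        // Strict mode: no base, so a relative root is an error.
        None => Err(format!("relative path rejected: {raw}")),
    }
}
```

Under these assumptions, `resolve_root("data", Some(Path::new("/skills/demo")))` yields `/skills/demo/data`, while `resolve_root("data", None)` returns an error. Resolving at the deserialization boundary is what lets later normalization handle only absolute roots.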
jif-oai	8cf5b00aef	fix: more stable notify script (#13011 )	2026-02-27 16:05:44 +01:00

2012 Commits