codex

rad/codex

mirror of https://github.com/openai/codex.git synced 2026-03-05 21:45:28 +03:00

Author	SHA1	Message	Date
Xin Lin	8ea5996f33	support plugin/list.	2026-03-04 23:56:02 -05:00
xl-openai	1e877ccdd2	plugin: support local-based marketplace.json + install endpoint. (#13422 ) Support marketplace.json that points to a local file, with ``` "source": { "source": "local", "path": "./plugin-1" }, ``` Add a new plugin/install endpoint which add the plugin to the cache folder and enable it in config.toml.	2026-03-04 19:08:18 -05:00
Ahmed Ibrahim	294079b0b1	Prefix handoff messages with role (#13505 ) Format handoff context by prefixing each message with its role (for example "user:" and "assistant:") before forwarding to the agent.	2026-03-04 15:37:31 -08:00
Michael Bolin	4907096d13	[release] temporarily use thin LTO for releases (#13506 )	2026-03-04 14:10:54 -08:00
Eric Traut	f80e5d979d	Notify TUI about plan mode prompts and user input requests (#13495 ) Addresses #13478 Summary - Add two new scopes for `tui.notifications` config: `plan-mode-prompt` and `user-input-requested`. - Add Plan Mode prompt and user-input-requested notifications to the TUI so these events surface consistently outside of plan mode - Add helpers and tests to ensure the new notification types publish the right titles, summaries, and type tags for filtering - Add prioritization mechanism to fix an existing bug where one notification event could arbitrarily overwrite others Testing - Manually tested plan mode to ensure that notification appeared	2026-03-04 15:08:57 -07:00
alexsong-oai	ce139bb1af	add metrics for external config import (#13501 )	2026-03-04 13:59:50 -08:00
Owen Lin	8dfd654196	feat(app-server-test-client): OTEL setup for tracing (#13493 ) ### Overview This PR: - Updates `app-server-test-client` to load OTEL settings from `$CODEX_HOME/config.toml` and initializes its own OTEL provider. - Add real client root spans to app-server test client traces. This updates `codex-app-server-test-client` so its Datadog traces reflect the full client-driven flow instead of a set of server spans stitched together under a synthetic parent. Before this change, the test client generated a fake `traceparent` once and reused it for every JSON-RPC request. That kept the requests in one trace, but there was no real client span at the top, so Datadog ended up showing the sequence in a slightly misleading way, where all RPCs were anchored under `initialize`. Now the test client: - loads OTEL settings from the normal Codex config path, including `$CODEX_HOME/config.toml` and existing --config overrides - initializes tracing the same way other Codex binaries do when trace export is enabled - creates a real client root span for each scripted command - creates per-request client spans for JSON-RPC methods like `initialize`, `thread/start`, and `turn/start` - injects W3C trace context from the current client span into request.trace instead of reusing a fabricated carrier This gives us a cleaner trace shape in Datadog: - one trace URL for the whole scripted flow - a visible client root span - proper client/server parent-child relationships for each app-server request	2026-03-04 13:30:09 -08:00
jif-oai	2322e49549	feat: external artifacts builder (#13485 ) This PR reverts the built-in artifact render while a decision is being reached. No impact expected on any features	2026-03-04 20:22:34 +00:00
Felipe Coury	98923e53cc	fix(tui): decode ANSI alpha-channel encoding in syntax themes (#13382 ) ## Problem The `ansi`, `base16`, and `base16-256` syntax themes are designed to emit ANSI palette colors so that highlighted code respects the user's terminal color scheme. Syntect encodes this intent in the alpha channel of its `Color` struct — a convention shared with `bat` — but `convert_style` was ignoring it entirely, treating every foreground color as raw RGB. This caused ANSI-family themes to produce hard-coded RGB values (e.g. `Rgb(0x02, 0, 0)` instead of `Green`), defeating their purpose and rendering them as near-invisible dark colors on most terminals. Reported in #12890. ## Mental model Syntect themes use a compact encoding in their `Color` struct: \| `alpha` \| Meaning of `r` \| Mapped to \| \|---------\|----------------\|-----------\| \| `0x00` \| ANSI palette index (0–255) \| `RtColor::Black`…`Gray` for 0–7, `Indexed(n)` for 8–255 \| \| `0x01` \| Unused (sentinel) \| `None` — inherit terminal default fg/bg \| \| `0xFF` \| True RGB red channel \| `RtColor::Rgb(r, g, b)` \| \| other \| Unexpected \| `RtColor::Rgb(r, g, b)` (silent fallback) \| This encoding is a bat convention that three bundled themes rely on. The new `convert_syntect_color` function decodes it; `ansi_palette_color` maps indices 0–7 to ratatui's named ANSI variants. \| macOS - Dark \| macOS - Light \| Windows - ansi \| Windows - base16 \| \|---\|---\|---\|---\| \| <img width="1064" height="1205" alt="macos-dark" src="https://github.com/user-attachments/assets/f03d92fb-b44b-4939-b2b9-503fde133811" /> \| <img width="1073" height="1227" alt="macos-light" src="https://github.com/user-attachments/assets/2ecb2089-73b5-4676-bed8-e4e6794250b4" /> \| ![windows-ansi](https://github.com/user-attachments/assets/d41029e6-ffd3-454e-ab72-6751607e5d5c) \| ![windows-base16](https://github.com/user-attachments/assets/b48aafcc-0196-4977-8ee1-8f8eaddd1698) \| ## Non-goals - Background color decoding — we intentionally skip backgrounds to preserve the terminal's own background. The decoder supports it, but `convert_style` does not apply it. - Italic/underline changes — those remain suppressed as before. - Custom `.tmTheme` support for ANSI encoding — only the bundled themes use this convention. ## Tradeoffs - The alpha-channel encoding is an undocumented bat/syntect convention, not a formal spec. We match bat's behavior exactly, trading formality for ecosystem compatibility. - Indices 0–7 are mapped to ratatui's named variants (`Black`, `Red`, …, `Gray`) rather than `Indexed(0)`…`Indexed(7)`. This lets terminals apply bold/bright semantics to named colors, which is the expected behavior for ANSI themes, but means the two representations are not perfectly round-trippable. ## Architecture All changes are in `codex-rs/tui/src/render/highlight.rs`, within the style-conversion layer between syntect and ratatui: ``` syntect::highlighting::Color └─ convert_syntect_color(color) [NEW — alpha-dispatch] ├─ a=0x00 → ansi_palette_color() [NEW — index→named/indexed] ├─ a=0x01 → None (terminal default) ├─ a=0xFF → Rgb(r,g,b) (standard opaque path) └─ other → Rgb(r,g,b) (silent fallback) ``` `convert_style` delegates foreground mapping to `convert_syntect_color` instead of inlining the `Rgb(r,g,b)` conversion. The core highlighter is refactored into `highlight_to_line_spans_with_theme` (accepts an explicit theme reference) so tests can highlight against specific themes without mutating process-global state. ### ANSI-family theme contract The ANSI-family themes (`ansi`, `base16`, `base16-256`) rely on upstream alpha-channel encoding from two_face/syntect. We intentionally do not validate this contract at runtime — if the upstream format changes, the `ansi_themes_use_only_ansi_palette_colors` test catches it at build time, long before it reaches users. A runtime warning would be unactionable noise. ### Warning copy cleanup User-facing warning messages were rewritten for clarity: - Removed internal jargon ("alpha-encoded ANSI color markers", "RGB fallback semantics", "persisted override config") - Dropped "syntax" prefix from "syntax theme" — users just think "theme" - Downgraded developer-only diagnostics (duplicate override, resolve fallback) from `warn` to `debug` ## Observability - The `ansi_themes_use_only_ansi_palette_colors` test enforces the ANSI-family contract at build time. - The snapshot test provides a regression tripwire for palette color output. - User-facing warnings are limited to actionable issues: unknown theme names and invalid custom `.tmTheme` files. ## Tests - Unit tests for each alpha branch: `alpha=0x00` with low index (named color), `alpha=0x00` with high index (`Indexed`), `alpha=0x01` (terminal default), unexpected alpha (falls back to RGB), ANSI white → Gray mapping. - Integration test: `ansi_family_themes_use_terminal_palette_colors_not_rgb` — highlights a Rust snippet with each ANSI-family theme and asserts zero `Rgb` foreground colors appear. - Snapshot test: `ansi_family_foreground_palette_snapshot` — records the exact set of unique foreground colors each ANSI-family theme produces, guarding against regressions. - Warning validation tests: verify user-facing warnings for missing custom themes, invalid `.tmTheme` files, and bundled theme resolution. ## Test plan - [ ] `cargo test -p codex-tui` passes all new and existing tests - [ ] Select `ansi`, `base16`, or `base16-256` theme and verify code blocks render with terminal palette colors (not near-black RGB) - [ ] Select a standard RGB theme (e.g. `dracula`) and verify no regression in color output	2026-03-04 12:03:34 -08:00
pash-openai	b200a5f45b	[tui] Update Fast slash command description (#13458 ) ## Summary - update the /fast slash command description to mention fastest inference - mention the 3X plan usage tradeoff in the help copy ## Testing - cargo test -p codex-tui slash_command (currently blocked by an unrelated latest-main codex-tui compile error in chatwidget.rs: refresh_queued_user_messages missing) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-04 19:30:51 +00:00
Val Kharitonov	26f4b8e2f1	remove serviceTier from app-server examples (#13489 ) Documentation-only	2026-03-04 19:12:40 +00:00
Owen Lin	27724f6ead	feat(core, tracing): add a span representing a turn (#13424 ) This is PR 3 of the app-server tracing rollout. PRs https://github.com/openai/codex/pull/13285 and https://github.com/openai/codex/pull/13368 gave us inbound request spans in app-server and propagated trace context through Submission. This change finishes the next piece in core: when a request actually starts a turn, we now create a core-owned long-lived span that stays open for the real lifetime of the turn. What changed: - `Session::spawn_task` can now optionally create a long-lived turn span and run the spawned task inside it - `turn/start` uses that path, so normal turn execution stays under a single core-owned span after the async handoff - `review/start` uses the same pattern - added a unit test that verifies the spawned turn task inherits the submission dispatch trace ancestry Why The app-server request span is intentionally short-lived. Once work crosses into core, we still want one span that covers the actual execution window until completion or interruption. This keeps that ownership where it belongs: in the layer that owns the runtime lifecycle.	2026-03-04 11:09:17 -08:00
iceweasel-oai	54a1c81d73	allow apps to specify cwd for sandbox setup. (#13484 ) The electron app doesn't start up the app-server in a particular workspace directory. So sandbox setup happens in the app-installed directory instead of the project workspace. This allows the app do specify the workspace cwd so that the sandbox setup actually sets up the ACLs instead of exiting fast and then having the first shell command be slow.	2026-03-04 10:54:30 -08:00
Alex Daley	8a59386273	add new scopes to login (#12383 ) Validated login + refresh flows. Removing scopes from the refresh request until we have upgrade flow in place. Confirmed that tokens refresh with existing scopes.	2026-03-04 16:41:54 +00:00
jif-oai	f72ab43fd1	feat: memories in workspace write (#13467 ) artifact-runtime-v2.3.4	2026-03-04 13:00:26 +00:00
jif-oai	df619474f5	nit: citation prompt (#13468 )	2026-03-04 13:00:11 +00:00
jif-oai	e07eaff0d3	feat: add metric for per-turn tool count and add tmp_mem flag (#13456 )	2026-03-04 11:25:58 +00:00
jif-oai	bda3c49dc4	feat: disable request input on sub agent (#13460 ) https://github.com/openai/codex/issues/13289	2026-03-04 11:25:49 +00:00
jif-oai	e6b2e3a9f7	fix: bad merge (#13461 )	2026-03-04 11:00:48 +00:00
jif-oai	e4a202ea52	fix: pending messages in `/agent` (#13240 )	2026-03-04 10:17:29 +00:00
jif-oai	49634b7f9c	add metric for per-turn token usage (#13454 )	2026-03-04 10:17:25 +00:00
jif-oai	a4ad101125	feat: ordinal nick name (#13412 )	2026-03-04 09:41:29 +00:00
jif-oai	932ff28183	feat: better multi-agent prompt (#13404 )	2026-03-04 09:41:20 +00:00
Won Park	fa2306b303	image-gen-core (#13290 ) Core tool-calling for image-gen, handles requesting and receiving logic for images using response API	2026-03-03 23:11:28 -08:00
Val Kharitonov	4f6c4bb143	support 'flex' tier in app-server in addition to 'fast' (#13391 )	2026-03-03 22:46:05 -08:00
Michael Bolin	7134220f3c	core: box wrapper futures to reduce stack pressure (#13429 ) Follow-up to [#13388](https://github.com/openai/codex/pull/13388). This uses the same general fix pattern as [#12421](https://github.com/openai/codex/pull/12421), but in the `codex-core` compact/resume/fork path. ## Why `compact_resume_after_second_compaction_preserves_history` started overflowing the stack on Windows CI after `#13388`. The important part is that this was not a compaction-recursion bug. The test exercises a path with several thin `async fn` wrappers around much larger thread-spawn, resume, and fork futures. When one `async fn` awaits another inline, the outer future stores the callee future as part of its own state machine. In a long wrapper chain, that means a caller can accidentally inline a lot more state than the source code suggests. That is exactly what was happening here: - `ThreadManager` convenience methods such as `start_thread`, `resume_thread_from_rollout`, and `fork_thread` were inlining the larger spawn/resume futures beneath them. - `core_test_support::test_codex` added another wrapper layer on top of those same paths. - `compact_resume_fork` adds a few more helpers, and this particular test drives the resume/fork path multiple times. On Windows, that was enough to push both the libtest thread and Tokio worker threads over the edge. The previous 8 MiB test-thread workaround proved the failure was stack-related, but it did not address the underlying future size. ## How This Was Debugged The useful debugging pattern here was to turn the CI-only failure into a local low-stack repro. 1. First, remove the explicit large-stack harness so the test runs on the normal `#[tokio::test]` path. 2. Build the test binary normally. 3. Re-run the already-built `tests/all` binary directly with progressively smaller `RUST_MIN_STACK` values. Running the built binary directly matters: it keeps the reduced stack size focused on the test process instead of also applying it to `cargo` and `rustc`. That made it possible to answer two questions quickly: - Does the failure still reproduce without the workaround? Yes. - Does boxing the wrapper futures actually buy back stack headroom? Also yes. After this change, the built test binary passes with `RUST_MIN_STACK=917504` and still overflows at `786432`, which is enough evidence to justify removing the explicit 8 MiB override while keeping a deterministic low-stack repro for future debugging. If we hit a similar issue again, the first places to inspect are thin `async fn` wrappers that mostly forward into a much larger async implementation. ## `Box::pin()` Primer `async fn` compiles into a state machine. If a wrapper does this: ```rust async fn wrapper() { inner().await; } ``` then `wrapper()` stores the full `inner()` future inline as part of its own state. If the wrapper instead does this: ```rust async fn wrapper() { Box::pin(inner()).await; } ``` then the child future lives on the heap, and the outer future only stores a pinned pointer to it. That usually trades one allocation for a substantially smaller outer future, which is exactly the tradeoff we want when the problem is stack pressure rather than raw CPU time. Useful references: - [`Box::pin`](https://doc.rust-lang.org/std/boxed/struct.Box.html#method.pin) - [Async book: Pinning](https://rust-lang.github.io/async-book/04_pinning/01_chapter.html) ## What Changed - Boxed the wrapper futures in `core/src/thread_manager.rs` around `start_thread`, `resume_thread_from_rollout`, `fork_thread`, and the corresponding `ThreadManagerState` spawn helpers so callers no longer inline the full spawn/resume state machine through multiple layers. - Boxed the matching test-only wrapper futures in `core/tests/common/test_codex.rs` and `core/tests/suite/compact_resume_fork.rs`, which sit directly on top of the same path. - Restored `compact_resume_after_second_compaction_preserves_history` in `core/tests/suite/compact_resume_fork.rs` to a normal `#[tokio::test]` and removed the explicit `TEST_STACK_SIZE_BYTES` thread/runtime sizing. - Simplified a tiny helper in `compact_resume_fork` by making `fetch_conversation_path()` synchronous, which removes one more unnecessary future layer from the test path. ## Verification - `cargo test -p codex-core --test all suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history -- --exact --nocapture` - `cargo test -p codex-core --test all suite::compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary directly with reduced stack sizes: - `RUST_MIN_STACK=917504` passes - `RUST_MIN_STACK=786432` still overflows - `cargo test -p codex-core` - Still fails locally in unrelated existing integration areas that expect the `codex` / `test_stdio_server` binaries or hit the existing `search_tool` wiremock mismatches.	2026-03-04 05:44:52 +00:00
Celia Chen	d622bff384	chore: Nest skill and protocol network permissions under `network.enabled` (#13427 ) ## Summary Changes the permission profile shape from a bare network boolean to a nested object. Before: ```yaml permissions: network: true ``` After: ```yaml permissions: network: enabled: true ``` This also updates the shared Rust and app-server protocol types so `PermissionProfile.network` is no longer `Option<bool>`, but `Option<NetworkPermissions>` with `enabled: Option<bool>`. ## What Changed - Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`: - `pub network: Option<bool>` -> `pub network: Option<NetworkPermissions>` - Added `NetworkPermissions` with: - `pub enabled: Option<bool>` - Changed emptiness semantics so `network` is only considered empty when `enabled` is `None` - Updated skill metadata parsing to accept `permissions.network.enabled` - Updated core permission consumers to read `network.enabled.unwrap_or(false)` where a concrete boolean is needed - Updated app-server v2 protocol types and regenerated schema/TypeScript outputs - Updated docs to mention `additionalPermissions.network.enabled`	2026-03-03 20:57:29 -08:00
gabec-openai	2e154a35bc	Add role-specific subagent nickname overrides (#13218 ) ## Summary - add `nickname_candidates` to agent role config - use role-specific nickname pools for spawned and resumed subagents - validate and schema-generate the new config surface ## Testing - `just fmt` - `just write-config-schema` - `just fix -p codex-core` - `cargo test -p codex-core` - `cargo test` --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-04 04:43:52 +00:00
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388 ) ## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. - Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. - Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
Celia Chen	e6773f856c	Feat: Preserve network access on read-only sandbox policies (#13409 ) ## Summary `PermissionProfile.network` could not be preserved when additional or compiled permissions resolved to `SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access field. This change makes read-only + network enabled representable directly and threads that through the protocol, app-server v2 mirror, and permission- merging logic. ## What changed - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core protocol and app-server v2 protocol. - Kept backward compatibility by defaulting the new field to false, so legacy read-only payloads still deserialize unchanged. - Updated `has_full_network_access()` and sandbox summaries to respect read-only network access. - Preserved PermissionProfile.network when: - compiling skill permission profiles into sandbox policies - normalizing additional permissions - merging additional permissions into existing sandbox policies - Updated the approval overlay to show network in the rendered permission rule when requested. - Regenerated app-server schema fixtures for the new v2 wire shape.	2026-03-04 02:41:57 +00:00
zbarsky-openai	2d8c1575b8	[bazel] Bump rules_rs and llvm (#13366 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-03-04 01:59:32 +00:00
iceweasel-oai	639a5f6c48	copy command-runner to CODEX_HOME so sandbox users can always execute it (#13413 ) • Keep Windows sandbox runner launches working from packaged installs by running the helper from a user-owned runtime location. On some Windows installs, the packaged helper location is difficult to use reliably for sandboxed runner launches even though the binaries are present. This change works around that by copying codex- command-runner.exe into CODEX_HOME/.sandbox-bin/, reusing that copy across launches, and falling back to the existing packaged-path lookup if anything goes wrong. The runtime copy lives in a dedicated directory with tighter ACLs than .sandbox: sandbox users can read and execute the runner there, but they cannot modify it. This keeps the workaround focused on the command runner, leaves the setup helper on its trusted packaged path, and adds logging so it is clear which runner path was selected at launch.	2026-03-04 01:31:37 +00:00
Owen Lin	52521a5e40	feat(app-server): propagate app-server trace context into core (#13368 ) ### Summary Propagate trace context originating at app-server RPC method handlers -> codex core submission loop (so this includes spans such as `run_turn`!). This implements PR 2 of the app-server tracing rollout. This also removes the old lower-level env-based reparenting in core so explicit request/submission ancestry wins instead of being overridden by ambient `TRACEPARENT` state. ### What changed - Added `trace: Option<W3cTraceContext>` to codex_protocol::Submission - Taught `Codex::submit()` / `submit_with_id()` to automatically capture the current span context when constructing or forwarding a submission - Wrapped the core submission loop in a submission_dispatch span parented from Submission.trace - Warn on invalid submission trace carriers and ignore them cleanly - Removed the old env-based downstream reparenting path in core task execution - Stopped OTEL provider init from implicitly attaching env trace context process-wide - Updated mcp-server Submission call sites for the new field Added focused unit tests for: - capturing trace context into Submission - preferring `Submission.trace` when building the core dispatch span ### Why PR 1 gave us consistent inbound request spans in app-server, but that only covered the transport boundary. For long-running work like turns and reviews, the important missing piece was preserving ancestry after the request handler returns and core continues work on a different async path. This change makes that handoff explicit and keeps the parentage rules simple: - app-server request span sets the current context - `Submission.trace` snapshots that context - core restores it once, at the submission boundary - deeper core spans inherit naturally That also lets us stop relying on env-based reparenting for this path, which was too ambient and could override explicit ancestry.	2026-03-04 01:03:45 +00:00
Owen Lin	0fbd84081b	feat(app-server): add a skills/changed v2 notification (#13414 ) This adds a first-class app-server v2 `skills/changed` notification for the existing skills live-reload signal. Before this change, clients only had the legacy raw `codex/event/skills_update_available` event. With this PR, v2 clients can listen for a typed JSON-RPC notification instead of depending on the legacy `codex/event/*` stream, which we want to remove soon.	2026-03-03 17:01:00 -08:00
rhan-oai	e951ef4374	[feedback] diagnostics (#13292 ) - added header logic to display diagnostics on cli - added logic for collecting env vars <img width="606" height="327" alt="Screenshot 2026-03-03 at 3 49 31 PM" src="https://github.com/user-attachments/assets/05e78c56-8cb3-47fa-abaf-3e57f1fdd8e2" /> <img width="690" height="353" alt="Screenshot 2026-03-02 at 6 47 54 PM" src="https://github.com/user-attachments/assets/e470b559-13f4-44d9-897f-bc398943c6d1" />	2026-03-03 16:34:11 -08:00
sayan-oai	082682a628	feat: load plugin apps (#13401 ) load plugin-apps from `.app.json`. make apps runtime-mentionable iff `codex_apps` MCP actually exposes tools for that `connector_id`. if the app isn't available, it's filtered out of runtime connector set, so no tools are added and no app-mentions resolve. right now we don't have a clean cli-side error for an app not being installed. can look at this after. ### Tests Added tests, tested locally that using a plugin that bundles an app picks up the app.	2026-03-03 16:29:15 -08:00
Curtis 'Fjord' Hawthorne	c4cb594e73	Make js_repl image output controllable (#13331 ) ## Summary Instead of always adding inner function call outputs to the model context, let js code decide which ones to return. - Stop auto-hoisting nested tool outputs from `codex.tool(...)` into the outer `js_repl` function output. - Keep `codex.tool(...)` return values unchanged as structured JS objects. - Add `codex.emitImage(...)` as the explicit path for attaching an image to the outer `js_repl` function output. - Support emitting from a direct image URL, a single `input_image` item, an explicit `{ bytes, mimeType }` object, or a raw tool response object containing exactly one image. - Preserve existing `view_image` original-resolution behavior when JS emits the raw `view_image` tool result. - Suppress the special `ViewImageToolCall` event for `js_repl`-sourced `view_image` calls so nested inspection stays side-effect free until JS explicitly emits. - Update the `js_repl` docs and generated project instructions with both recommended patterns: - `await codex.emitImage(codex.tool("view_image", { path }))` - `await codex.emitImage({ bytes: await page.screenshot({ type: "jpeg", quality: 85 }), mimeType: "image/jpeg" })` #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/13050 - 👉 `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 16:25:59 -08:00
alexsong-oai	1afbbc11c3	Ensure the env values of imported shell_environment_policy.set is string (#13402 )	2026-03-03 16:12:23 -08:00
Curtis 'Fjord' Hawthorne	b92146d48b	Add under-development original-resolution view_image support (#13050 ) ## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. - History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 15:56:54 -08:00
joeytrasatti-openai	935754baa3	Add thread metadata update endpoint to app server (#13280 ) ## Summary - add the v2 `thread/metadata/update` API, including protocol/schema/TypeScript exports and app-server docs - patch stored thread `gitInfo` in sqlite without resuming the thread, with validation plus support for explicit `null` clears - repair missing sqlite thread rows from rollout data before patching, and make those repairs safe by inserting only when absent and updating only git columns so newer metadata is not clobbered - keep sqlite authoritative for mutable thread git metadata by preserving existing sqlite git fields during reconcile/backfill and only using rollout `SessionMeta` git fields to fill gaps - add regression coverage for the endpoint, repair paths, concurrent sqlite writes, clearing git fields, and rollout/backfill reconciliation - fix the login server shutdown race so cancelling before the waiter starts still terminates `block_until_done()` correctly ## Testing - `cargo test -p codex-state apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-state update_thread_git_info_preserves_newer_non_git_metadata` - `cargo test -p codex-core backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-app-server thread_metadata_update` - `cargo test` - currently fails in existing `codex-core` grep-files tests with `unsupported call: grep_files`: - `suite::grep_files::grep_files_tool_collects_matches` - `suite::grep_files::grep_files_tool_reports_empty_results`	2026-03-03 15:56:11 -08:00
Charley Cunningham	299b8ac445	tui: align pending steers with core acceptance (#12868 ) ## Summary - submit `Enter` steers immediately while a turn is already running instead of routing them through `queued_user_messages` - keep those submitted steers visible in the footer as `pending_steers` until core records them as a user message or aborts the turn - reconcile pending steers on `ItemCompleted(UserMessage)`, not `RawResponseItem` - emit user-message item lifecycle for leftover pending input at task finish, then remove the TUI `TurnComplete` fallback - keep `queued_user_messages` for actual queued drafts, rendered below pending steers ## Problem While the assistant was generating, pressing `Enter` could send the input into `queued_user_messages`. That queue only drains after the turn ends, so ordinary steers behaved like queued drafts instead of landing at the next core sampling boundary. The first version of this fix also used `RawResponseItem` to decide when a steer had landed. Review feedback was that this is the wrong abstraction for client behavior. There was also a late edge case in core: if pending steer input was accepted after the final sampling decision but before `TurnComplete`, core would record that user message into history at task finish without emitting `ItemStarted(UserMessage)` / `ItemCompleted(UserMessage)`. TUI had a fallback to paper over that gap locally. ## Approach - `Enter` during an active turn now submits a normal `Op::UserTurn` immediately - TUI keeps a local pending-steer preview instead of rendering that user message into history immediately - when core records the steer as `ItemCompleted(UserMessage)`, TUI matches and removes the corresponding pending preview, then renders the committed user message - core now emits the same user-message lifecycle when `on_task_finished(...)` drains leftover pending user input, before `TurnComplete` - with that lifecycle gap closed in core, TUI no longer needs to flush pending steers into history on `TurnComplete` - if the turn is interrupted, pending steers and queued drafts are both restored into the composer, with pending steers first ## Notes - `Tab` still uses the real queued-message path - `queued_user_messages` and `pending_steers` are separate state with separate semantics - the pending-steer matching key is built directly from `UserInput` - this removes the new TUI dependency on `RawResponseItem` ## Validation - `just fmt` - `cargo test -p codex-core task_finish_emits_turn_item_lifecycle_for_leftover_pending_user_input -- --nocapture` - `cargo test -p codex-tui`	2026-03-03 15:31:52 -08:00
viyatb-oai	24a2d0c696	fix(network-proxy): reject mismatched host headers (#13275 ) ## Summary - reject plain HTTP absolute-form requests whose Host header does not match the request target authority - add host/port-aware Host header validation for non-default ports - add regression coverage for mismatched Host forwarding and validator edge cases	2026-03-03 15:12:06 -08:00
xl-openai	9b004e2db1	Refactor plugin config and cache path (#13333 ) Update config.toml plugin entries to use <plugin_name>@<marketplace_name> as the key. Plugin now stays in [plugins/cache/marketplace-name/plugin-name/$version/] Clean up the plugin code structure. Add plugin install functionality (not used yet).	2026-03-03 15:00:18 -08:00
Ahmed Ibrahim	041c896509	Revert "Revert "realtime prompt changes"" (#13398 ) Reverts openai/codex#13385	2026-03-03 14:41:26 -08:00
Eric Traut	bab32afa93	Require deduplicator success before commenting (#13399 ) Fixed recent regression in issue dedup action	2026-03-03 15:32:47 -07:00
Ahmed Ibrahim	6bee02a346	Build delegated realtime handoff text from all messages (#13395 ) ## Summary - Route delegated realtime handoff turns from all handoff message texts, preserving order - Fallback to input_transcript only when no messages are present - Add regression coverage for multi-message handoff requests	2026-03-03 14:07:51 -08:00
Owen Lin	d7eb195b62	chore(app-server): restore EventMsg TS types (#13397 ) Realized EventMsg generated types were unintentionally removed as part of this PR: https://github.com/openai/codex/pull/13375 Turns out our TypeScript export pipeline relied on transitively reaching `EventMsg`. We should still export `EventMsg` explicitly since we're still emitting `codex/event/*` events (for now, but getting dropped soon as well).	2026-03-03 13:37:40 -08:00
Owen Lin	167158f93c	chore(app-server): delete v1 RPC methods and notifications (#13375 ) ## Summary This removes the old app-server v1 methods and notifications we no longer need, while keeping the small set the main codex app client still depends on for now. The remaining legacy surface is: - `initialize` - `getConversationSummary` - `getAuthStatus` - `gitDiffToRemote` - `fuzzyFileSearch` - `fuzzyFileSearch/sessionStart` - `fuzzyFileSearch/sessionUpdate` - `fuzzyFileSearch/sessionStop` And the raw `codex/event/*` notifications emitted from core. These notifications will be removed in a followup PR. ## What changed - removed deprecated v1 request variants from the protocol and app-server dispatcher - removed deprecated typed notifications: `authStatusChange`, `loginChatGptComplete`, and `sessionConfigured` - updated the app-server test client to use v2 flows instead of deleted v1 flows - deleted legacy-only app-server test suites and added focused coverage for `getConversationSummary` - regenerated app-server schema fixtures and updated the MCP interface docs to match the remaining compatibility surface ## Testing - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server`	2026-03-03 13:18:25 -08:00
Ahmed Ibrahim	72d368e03a	fix (#13389 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-03-03 12:48:16 -08:00
Ahmed Ibrahim	8afe2127dc	Revert "realtime prompt changes" (#13385 ) Reverts openai/codex#13376	2026-03-03 12:30:37 -08:00

1 2 3 4 5 ...

4233 Commits