codex

mirror of https://github.com/openai/codex.git synced 2026-05-03 21:01:55 +03:00

Author	SHA1	Message	Date
Charles Cunningham	8a4e7dab76	Rebase resume model switch on hydrated history	2026-02-17 22:31:16 -08:00
Charles Cunningham	08bfb1e7f2	Preserve personality updates with resumed-model hydration	2026-02-17 22:31:16 -08:00
Charles Cunningham	418321e388	Refactor settings updates to diff TurnContextItem	2026-02-17 22:31:16 -08:00
Charley Cunningham	c16f9daaaf	Add model-visible context layout snapshot tests (#12073 ) ## Summary - add a dedicated `core/tests/suite/model_visible_layout.rs` snapshot suite to materialize model-visible request layout in high-value scenarios - add three reviewer-focused snapshot scenarios: - turn-level context updates (cwd / permissions / personality) - first post-resume turn with model hydration + personality change - first post-resume turn where pre-turn model override matches rollout model - wire the new suite into `core/tests/suite/mod.rs` - commit generated `insta` snapshots under `core/tests/suite/snapshots/` ## Why This creates a stable, reviewable baseline of model-visible context layout against `main` before follow-on context-management refactors. It lets subsequent PRs show focused snapshot diffs for behavior changes instead of introducing the test surface and behavior changes at once. ## Testing - `just fmt` - `INSTA_UPDATE=always cargo test -p codex-core model_visible_layout`	2026-02-17 22:30:29 -08:00
won-openai	189f592014	got rid of experimental_mode for configtoml (#12077 )	2026-02-17 21:10:30 -08:00
Owen Lin	edacbf7b6e	feat(core): zsh exec bridge (#12052 ) zsh fork PR stack: - https://github.com/openai/codex/pull/12051 - https://github.com/openai/codex/pull/12052 👈 ### Summary This PR introduces a feature-gated native shell runtime path that routes shell execution through a patched zsh exec bridge, removing MCP-specific behavior from the shell hot path while preserving existing CommandExecution lifecycle semantics. When shell_zsh_fork is enabled, shell commands run via patched zsh with per-`execve` interception through EXEC_WRAPPER. Core receives wrapper IPC requests over a Unix socket, applies existing approval policy, and returns allow/deny before the subcommand executes. ### What’s included 1) New zsh exec bridge runtime in core - Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for EXEC_WRAPPER invocations. - Per-execution Unix-socket IPC handling for wrapper requests/responses. - Approval callback integration using existing core approval orchestration. - Streaming stdout/stderr deltas to existing command output event pipeline. - Error handling for malformed IPC, denial/abort, and execution failures. 2) Session lifecycle integration SessionServices now owns a `ZshExecBridge`. Session startup initializes bridge state; shutdown tears it down cleanly. 3) Shell runtime routing (feature-gated) When `shell_zsh_fork` is enabled: - Build execution env/spec as usual. - Add wrapper socket env wiring. - Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of the regular shell path. - Non-zsh-fork behavior remains unchanged. 4) Config + feature wiring - Added `Feature::ShellZshFork` (under development). - Added config support for `zsh_path` (optional absolute path to patched zsh): - `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema. - Session startup validates that `zsh_path` exists/usable when zsh-fork is enabled. - Added startup test for missing `zsh_path` failure mode. 
5) Seatbelt/sandbox updates for wrapper IPC - Extended seatbelt policy generation to optionally allow outbound connection to explicitly permitted Unix sockets. - Wired sandboxing path to pass wrapper socket path through to seatbelt policy generation. - Added/updated seatbelt tests for explicit socket allow rule and argument emission. 6) Runtime entrypoint hooks - This allows the same binary to act as the zsh wrapper subprocess when invoked via `EXEC_WRAPPER`. 7) Tool selection behavior - ToolsConfig now prefers ShellCommand type when shell_zsh_fork is enabled. - Added test coverage for precedence with unified-exec enabled.	2026-02-17 20:19:53 -08:00
pakrym-oai	fc810ba045	Use V2 websockets if feature enabled (#12071 )	2026-02-17 18:32:16 -08:00
Charley Cunningham	eb68767f2f	Unify remote compaction snapshot mocks around default endpoint behavior (#12050 ) ## Summary - standardize remote compaction test mocking around one default behavior in shared helpers - make default remote compact mocks mirror production shape: keep `message/user` + `message/developer`, drop assistant/tool artifacts, then append a summary user message - switch non-special `compact_remote` tests to the shared default mock instead of ad-hoc JSON payloads ## Special-case tests that still use explicit mocks - remote compaction error payload / HTTP failure behavior - summary-only compact output behavior - manual `/compact` with no prior user messages - stale developer-instruction injection coverage ## Why This removes inconsistent manual remote compaction fixtures and gives us one source of truth for normal remote compact behavior, while preserving explicit mocks only where tests intentionally cover non-default behavior.	2026-02-17 18:18:47 -08:00
Owen Lin	db4d2599b5	feat(core): plumb distinct approval ids for command approvals (#12051 ) zsh fork PR stack: - https://github.com/openai/codex/pull/12051 👈 - https://github.com/openai/codex/pull/12052 With upcoming support for a fork of zsh that allows us to intercept `execve` and run execpolicy checks for each subcommand as part of a `CommandExecution`, it will be possible for there to be multiple approval requests for a shell command like `/path/to/zsh -lc 'git status && rg \"TODO\" src && make test'`. To support that, this PR introduces a new `approval_id` field across core, protocol, and app-server so that we can associate approvals properly for subcommands.	2026-02-18 01:55:57 +00:00
Shijie Rao	b3a8571219	Chore: remove response model check and rely on header model for downgrade (#12061 ) ### Summary Ensure that we use only the model value from the response header, so that we are guaranteed the correct slug name. We no longer check against the model value from the response, which makes false positives less likely. There are two different treatments: for SSE we use the header from the response, and for websocket we check top-level events.	2026-02-18 01:50:06 +00:00
gabec-openai	5341ad08f8	Use prompt-based co-author attribution with config override (#11617 )	2026-02-17 20:15:54 +00:00
dependabot[bot]	15cd796749	chore(deps): bump arc-swap from 1.8.0 to 1.8.2 in /codex-rs (#11890 ) Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.8.0 to 1.8.2. Changelog (sourced from [arc-swap's changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)): 1.8.2 - Proper gate of `Pin` (since 1.39 - we are not using only `Pin`, but also `Pin::into_inner`, [#197](https://redirect.github.com/vorner/arc-swap/issues/197)). 1.8.1 - Some more careful orderings ([#195](https://redirect.github.com/vorner/arc-swap/issues/195)). Commits: - `19f0d66` Version 1.8.2 - `c222a22` Release 1.8.1 - `cccf354` Upgrade the other ordering too, for transitivity - `e94df55` Merge pull request [#195](https://redirect.github.com/vorner/arc-swap/issues/195) from 0xfMel/master - `bd5d327` Fix Debt::pay failure ordering - `22431da` Merge pull request [#189](https://redirect.github.com/vorner/arc-swap/issues/189) from atouchet/rdm - `b142bd8` Update Readme - See full diff in [compare view](https://github.com/vorner/arc-swap/compare/v1.8.0...v1.8.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-17 11:45:45 -08:00
Matthew Zeng	16fa195fce	[apps] Expose more fields from apps listing endpoints. (#11706 ) - [x] Expose app_metadata, branding, and labels in AppInfo.	2026-02-17 11:45:04 -08:00
sayan-oai	41800fc876	chore: rm remote models fflag (#11699 ) rm `remote_models` feature flag. We see issues like #11527 when a user has `remote_models` disabled, as we always use the default fallback `ModelInfo`. This causes issues with model performance. Builds on #11690, which helps by warning the user when they are using the default fallback. This PR will make that happen much less frequently as an accidental consequence of disabling `remote_models`.	2026-02-17 11:43:16 -08:00
xl-openai	314029ffa3	Add remote skill scope/product_surface/enabled params and cleanup (#11801 ) skills/remote/list: params=hazelnutScope, productSurface, enabled; returns=data: { id, name, description }[] skills/remote/export: params=hazelnutId; returns={ id, path }	2026-02-17 11:05:22 -08:00
Shijie Rao	48018e9eac	Feat: add model reroute notification (#12001 ) ### Summary Building off `5c75aa7b89 (diff-058ae8f109a8b84b4b79bbfa45f522c2233b9d9e139696044ae374d50b6196e0)`, we have created a `model/rerouted` notification that captures the event so that consumers can render it as expected. Keep the `EventMsg::Warning` path in core so that this does not affect TUI rendering. `model/rerouted` is meant to be generic, to account for future uses such as capacity planning.	2026-02-17 11:02:23 -08:00
sayan-oai	a1b8e34938	chore: clarify web_search deprecation notices and consolidate tests (#11224 ) follow up to #10406, clarify default-enablement of web_search. also consolidate pseudo-redundant tests Tests pass	2026-02-17 18:20:24 +00:00
jif-oai	76283e6b4e	feat: move agents config to main config (#11982 )	2026-02-17 18:17:19 +00:00
Charley Cunningham	cab607befb	Centralize context update diffing logic (#11807 ) ## Summary This PR centralizes model-visible state diffing for turn context updates into one module, while keeping existing behavior and call sites stable. ### What changed - Added `core/src/context_updates.rs` with the consolidated diffing logic for: - environment context updates - permissions/policy updates - collaboration mode updates - model-instruction switch updates - personality updates - Added `BuildSettingsUpdateItemsParams` so required dependencies are passed explicitly. - Updated `Session::build_settings_update_items` in `core/src/codex.rs` to delegate to the centralized module. - Reused the same centralized `personality_message_for` helper from initial-context assembly to avoid duplicated logic. - Registered the new module in `core/src/lib.rs`. ## Why This is a minimal, shippable step toward the model-visible-state design: all state diff decisions for turn-context update items now live in one place, improving reviewability and reducing drift risk without expanding scope. ## Behavior - Intended to be behavior-preserving. - No protocol/schema changes. - No call-site behavior changes beyond routing through the new centralized logic. ## Testing Ran targeted tests in this worktree: - `cargo test -p codex-core build_settings_update_items_emits_environment_item_for_network_changes` - `cargo test -p codex-core collaboration_instructions --test all` Both passed. ## Codex author `codex resume 019c540f-3951-7352-a3fa-6f07b834d4ce`	2026-02-17 09:21:44 -08:00
Eric Traut	281b0eae8b	Don't allow model_supports_reasoning_summaries to disable reasoning (#11833 ) The `model_supports_reasoning_summaries` config option was originally added so users could enable reasoning for custom models (models that codex doesn't know about). This is how it was documented in the source, but its implementation didn't match. It was implemented such that it can also be used to disable reasoning for models that otherwise support reasoning. This leads to bad behavior for some reasoning models like `gpt-5.3-codex`. Diagnosing this is difficult, and it has led to many support issues. This PR changes the handling of `model_supports_reasoning_summaries` so it matches its original documented behavior. If it is set to false, it is a no-op. That is, it never disables reasoning for models that are known to support reasoning. It can still be used for its intended purpose -- to enable reasoning for unknown models.	2026-02-17 07:19:28 -08:00
jif-oai	56cd85cd4b	nit: wording multi-agent (#11986 )	2026-02-17 11:45:59 +00:00
jif-oai	77f74a5c17	fix: race in js repl (#11922 ) js_repl_reset previously raced with in-flight/new js_repl executions because reset() could clear exec_tool_calls without synchronizing with execute(). In that window, a running exec could lose its per-exec tool-call context, and subsequent kernel RunTool messages would fail with js_repl exec context not found. The fix serializes reset and execute on the same exec_lock, so reset cannot run concurrently with exec setup/teardown. We also keep the timeout path safe by performing reset steps inline while execute() already holds the lock, avoiding re-entrant lock acquisition. A regression test now verifies that reset waits for the exec lock and does not clear tool-call state early.	2026-02-17 11:06:14 +00:00
jif-oai	846464e869	fix: js_repl reset hang by clearing exec tool calls without waiting (#11932 ) Remove the waiting loop in `reset` so it no longer blocks on potentially hanging exec tool calls + add `clear_all_exec_tool_calls_map` to drain the map and notify waiters so `reset` completes immediately	2026-02-17 08:40:54 +00:00
Dylan Hurd	0fbe10a807	fix(core) exec_policy parsing fixes (#11951 ) ## Summary Fixes a few things in our exec_policy handling of prefix_rules: 1. Correctly match redirects specifically for exec_policy parsing. i.e. if you have `prefix_rule(["echo"], decision="allow")` then `echo hello > output.txt` should match - this should fix #10321 2. If there already exists any rule that would match our prefix rule (not just a prompt), then drop it, since it won't do anything. ## Testing - [x] Updated unit tests, added approvals ScenarioSpecs	2026-02-16 23:11:59 -08:00
Fouad Matin	02e9006547	add(core): safety check downgrade warning (#11964 ) Add per-turn notice when a request is downgraded to a fallback model due to cyber safety checks. Changes - codex-api: Emit a ServerModel event based on the openai-model response header and/or response payload (SSE + WebSocket), including when the model changes mid-stream. - core: When the server-reported model differs from the requested model, emit a single per-turn warning explaining the reroute to gpt-5.2 and directing users to Trusted Access verification and the cyber safety explainer. - app-server (v2): Surface these cyber model-routing warnings as synthetic userMessage items with text prefixed by Warning: (and document this behavior).	2026-02-16 22:13:36 -08:00
Dylan Hurd	19afbc35c1	chore(core) rm Feature::RequestRule (#11866 ) ## Summary This feature is now reasonably stable, let's remove it so we can simplify our upcoming iterations here. ## Testing - [x] Existing tests pass	2026-02-16 22:30:23 +00:00
Matthew Zeng	5b421bba34	[apps] Fix app mention syntax. (#11894 ) - [x] Fix app mention syntax.	2026-02-16 22:01:49 +00:00
jif-oai	beb5cb4f48	Rename collab modules to multi agents (#11939 ) Summary - rename the `collab` handlers and UI files to `multi_agents` to match the new naming - update module references and specs so the handlers and TUI widgets consistently use the renamed files - keep the existing functionality while aligning file and module names with the multi-agent terminology	2026-02-16 19:05:13 +00:00
jif-oai	af434b4f71	feat: drop MCP managing tools if no MCP servers (#11900 ) Drop the MCP tools when there are no MCP servers, to save context. For https://github.com/openai/codex/issues/11049	2026-02-16 18:40:45 +00:00
jif-oai	e47045c806	feat: add customizable roles for multi-agents (#11917 ) The idea is to have two families of agents: 1. Built-in agents that we package directly with Codex. 2. User-defined agents that are defined in the `agents_config.toml` file, which can reference config files that override the agent config. It looks like this: ``` version = 1 [agents.explorer] description = """Use `explorer` for all codebase questions. Explorers are fast and authoritative. Always prefer them over manual search or file reading. Rules: - Ask explorers first and precisely. - Do not re-read or re-search code they cover. - Trust explorer results without verification. - Run explorers in parallel when useful. - Reuse existing explorers for related questions.""" config_file = "explorer.toml" ```	2026-02-16 16:29:32 +00:00
jif-oai	50aea4b0dc	nit: memory storage (#11924 )	2026-02-16 16:18:53 +00:00
jif-oai	e41536944e	chore: rename collab feature flag key to multi_agent (#11918 ) Summary - rename the collab feature key to multi_agent while keeping the Feature enum unchanged - add legacy alias support so both "multi_agent" and "collab" map to the same feature - cover the alias behavior with a new unit test	2026-02-16 15:28:31 +00:00
gt-oai	b3095679ed	Allow hooks to error (#11615 ) Allow hooks to return errors. We should do this before introducing more hook types, or we'll have to migrate them all.	2026-02-16 14:11:05 +00:00
jif-oai	825a4af42f	feat: use shell policy in shell snapshot (#11759 ) Honor `shell_environment_policy.set` even after a shell snapshot	2026-02-16 09:11:00 +00:00
Anton Panasenko	1d95656149	bazel: fix snapshot parity for tests/.rs rust_test targets (#11893 ) ## Summary - make `rust_test` targets generated from `tests/.rs` use Cargo-style crate names (file stem) so snapshot names match Cargo (`all__...` instead of Bazel-derived names) - split lib vs `tests/.rs` test env wiring in `codex_rust_crate` to keep existing lib snapshot behavior while applying Bazel runfiles-compatible workspace root for `tests/.rs` - compute the `tests/*.rs` snapshot workspace root from package depth so `insta` resolves committed snapshots under Bazel `--noenable_runfiles` ## Validation - `bazelisk test //codex-rs/core:core-all-test --test_arg=suite::compact:: --cache_test_results=no` - `bazelisk test //codex-rs/core:core-all-test --test_arg=suite::compact_remote:: --cache_test_results=no`	2026-02-16 07:11:59 +00:00
sayan-oai	bdea9974d9	fix: only emit unknown model warning on user turns (#11884 ) ###### Context unknown model warning added in #11690 has [issues](https://github.com/openai/codex/actions/runs/22047424710/job/63700733887) on ubuntu runners because we potentially emit it on all new turns, including ones with intentionally fake models (i.e., `mock-model` in a test). ###### Fix change the warning to only emit on user turns/review turns. ###### Tests CI now passes on ubuntu, still passes locally	2026-02-15 21:18:35 -08:00
Anton Panasenko	02abd9a8ea	feat: persist and restore codex app's tools after search (#11780 ) ### What changed 1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`. 2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in `core/src/state/session.rs` for authoritative restore behavior (deduped, order-preserving, empty clears). 3. Added rollout parsing in `core/src/codex.rs` to recover `active_selected_tools` from prior `search_tool_bm25` outputs: - tracks matching `call_id`s - parses function output text JSON - extracts `active_selected_tools` - latest valid payload wins - malformed/non-matching payloads are ignored 4. Applied restore logic to resumed and forked startup paths in `core/src/codex.rs`. 5. Updated instruction text to session/thread scope in `core/templates/search_tool/tool_description.md`. 6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit coverage in: - `core/src/codex.rs` - `core/src/state/session.rs` ### Behavior after change 1. Search activates matched tools. 2. Additional searches union into active selection. 3. Selection survives new turns in the same thread. 4. Resume/fork restores selection from rollout history. 5. Separate threads do not inherit selection unless forked.	2026-02-15 19:18:41 -08:00
sayan-oai	060a320e7d	fix: show user warning when using default fallback metadata (#11690 ) ### What It's currently unclear when the harness falls back to the default, generic `ModelInfo`. This happens when the `remote_models` feature is disabled or the model is truly unknown, and can lead to bad performance and issues in the harness. Add a user-facing warning when this happens so they are aware when their setup is broken. ### Tests Added tests, tested locally.	2026-02-15 18:46:05 -08:00
Charley Cunningham	85034b189e	core: snapshot tests for compaction requests, post-compaction layout, some additional compaction tests (#11487 ) This PR keeps compaction context-layout test coverage separate from runtime compaction behavior changes, so runtime logic review can stay focused. ## Included - Adds reusable context snapshot helpers in `core/tests/common/context_snapshot.rs` for rendering model-visible request/history shapes. - Standardizes helper naming for readability: - `format_request_input_snapshot` - `format_response_items_snapshot` - `format_labeled_requests_snapshot` - `format_labeled_items_snapshot` - Expands snapshot coverage for both local and remote compaction flows: - pre-turn auto-compaction - pre-turn failure/context-window-exceeded paths - mid-turn continuation compaction - manual `/compact` with and without prior user turns - Captures both sides where relevant: - compaction request shape - post-compaction history layout shape - Adds/uses shared request-inspection helpers so assertions target structured request content instead of ad-hoc JSON string parsing. - Aligns snapshots/assertions to current behavior and leaves explicit `TODO(ccunningham)` notes where behavior is known and intentionally deferred. ## Not Included - No runtime compaction logic changes. - No model-visible context/state behavior changes.	2026-02-14 19:57:10 -08:00
viyatb-oai	db6aa80195	fix(core): add linux bubblewrap sandbox tag (#11767 ) ## Summary - add a distinct `linux_bubblewrap` sandbox tag when the Linux bubblewrap pipeline feature is enabled - thread the bubblewrap feature flag into sandbox tag generation for: - turn metadata header emission - tool telemetry metric tags and after-tool-use hooks - add focused unit tests for `sandbox_tag` precedence and Linux bubblewrap behavior ## Validation - `just fmt` - `cargo clippy -p codex-core --all-targets` - `cargo test -p codex-core sandbox_tags::tests` - started `cargo test -p codex-core` and stopped it per request Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>	2026-02-14 19:00:01 +00:00
viyatb-oai	b527ee2890	feat(core): add structured network approval plumbing and policy decision model (#11672 ) ### Description #### Summary Introduces the core plumbing required for structured network approvals #### What changed - Added structured network policy decision modeling in core. - Added approval payload/context types needed for network approval semantics. - Wired shell/unified-exec runtime plumbing to consume structured decisions. - Updated related core error/event surfaces for structured handling. - Updated protocol plumbing used by core approval flow. - Included small CLI debug sandbox compatibility updates needed by this layer. #### Why establishes the minimal backend foundation for network approvals without yet changing high-level orchestration or TUI behavior. #### Notes - Behavior remains constrained by existing requirements/config gating. - Follow-up PRs in the stack handle orchestration, UX, and app-server integration. --------- Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>	2026-02-14 04:18:12 +00:00
Charley Cunningham	67e577da53	Handle model-switch base instructions after compaction (#11659 ) Strip trailing <model_switch> during model-switch compaction request, and append <model_switch> after model switch compaction	2026-02-13 19:02:53 -08:00
alexsong-oai	8156c57234	add perf metrics for connectors load (#11803 )	2026-02-13 18:15:07 -08:00
Celia Chen	5b6911cb1b	feat(skills): add permission profiles from openai.yaml metadata (#11658 ) ## Summary This PR adds support for skill-level permissions in .codex/openai.yaml and wires that through the skill loading pipeline. ## What’s included 1. Added a new permissions section for skills (network, filesystem, and macOS-related access). 2. Implemented permission parsing/normalization and translation into runtime permission profiles. 3. Threaded the new permission profile through SkillMetadata and loader flow. ## Follow-up A follow-up PR will connect these permission profiles to actual sandbox enforcement and add user approval prompts for executing binaries/scripts from skill directories. ## Example `openai.yaml` snippet: ``` permissions: network: true fs_read: - "./data" - "./data" fs_write: - "./output" macos_preferences: "readwrite" macos_automation: - "com.apple.Notes" macos_accessibility: true macos_calendar: true ``` compiled skill permission profile metadata (macOS): ``` SkillPermissionProfile { sandbox_policy: SandboxPolicy::WorkspaceWrite { writable_roots: vec![ AbsolutePathBuf::try_from("/ABS/PATH/TO/SKILL/output").unwrap(), ], read_only_access: ReadOnlyAccess::Restricted { include_platform_defaults: true, readable_roots: vec![ AbsolutePathBuf::try_from("/ABS/PATH/TO/SKILL/data").unwrap(), ], }, network_access: true, exclude_tmpdir_env_var: false, exclude_slash_tmp: false, }, // Truncated for readability; actual generated profile is longer. macos_seatbelt_permission_file: r#" (allow user-preference-write) (allow appleevent-send (appleevent-destination "com.apple.Notes")) (allow mach-lookup (global-name "com.apple.axserver")) (allow mach-lookup (global-name "com.apple.CalendarAgent")) ... "#.to_string(), ```	2026-02-14 01:43:44 +00:00
Curtis 'Fjord' Hawthorne	0d76d029b7	Fix js_repl in-flight tool-call waiter race (#11800 ) ## Summary This PR fixes a race in `js_repl` tool-call draining that could leave an exec waiting indefinitely for in-flight tool calls to finish. The fix is in: - `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs` ## Problem `js_repl` tracks in-flight tool calls per exec and waits for them to drain on completion/timeout/cancel paths. The previous wait logic used a check-then-wait pattern with `Notify` that could miss a wakeup: 1. Observe `in_flight > 0` 2. Drop lock 3. Register wait (`notified().await`) If `notify_waiters()` happened between (2) and (3), the waiter could sleep until another notification that never comes. ## What changed - Updated all exec-tool-call wait loops to create an owned notification future while holding the lock: - use `Arc<Notify>::notified_owned()` instead of cloning notify and awaiting later. - Applied this consistently to: - `wait_for_exec_tool_calls` - `wait_for_all_exec_tool_calls` - `wait_for_exec_tool_calls_map` This preserves existing behavior while eliminating the lost-wakeup window. ## Test coverage Added a regression test: - `wait_for_exec_tool_calls_map_drains_inflight_calls_without_hanging` The test repeatedly races waiter/finisher tasks and asserts bounded completion to catch hangs. ## Impact - No API changes. - No user-facing behavior changes intended. - Improves reliability of exec lifecycle boundaries when tool calls are still in flight. #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/11796 - 👉 `2` https://github.com/openai/codex/pull/11800 - ⏳ `3` https://github.com/openai/codex/pull/10673 - ⏳ `4` https://github.com/openai/codex/pull/10670	2026-02-14 01:24:52 +00:00
Curtis 'Fjord' Hawthorne	6cbb489e6e	Fix js_repl view_image test runtime panic (#11796 ) ## Summary Fixes a flaky/panicking `js_repl` image-path test by running it on a multi-thread Tokio runtime and tightening assertions to focus on real behavior. ## Problem `js_repl_can_attach_image_via_view_image_tool` in `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs` can panic under single-thread test runtime with: `can call blocking only when running on the multi-threaded runtime` It also asserted a brittle user-facing text string. ## Changes 1. Updated the test runtime to: `#[tokio::test(flavor = "multi_thread", worker_threads = 2)]` 2. Removed the brittle `"attached local image path"` string assertion. 3. Kept the concrete side-effect assertions: - tool call succeeds - image is actually injected into pending input (`InputImage` with `data:image/png;base64,...`) ## Why this is safe This is test-only behavior. No production runtime code paths are changed. ## Validation - Ran: `cargo test -p codex-core tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool -- --nocapture` - Result: pass #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/11796 - ⏳ `2` https://github.com/openai/codex/pull/11800 - ⏳ `3` https://github.com/openai/codex/pull/10673 - ⏳ `4` https://github.com/openai/codex/pull/10670	2026-02-14 01:11:13 +00:00
pash-openai	a5e8e69d18	turn metadata followups (#11782 ) some trivial simplifications from #11677	2026-02-13 14:59:16 -08:00
pash-openai	6c0a924203	turn metadata: per-turn non-blocking (#11677 )	2026-02-13 12:48:29 -08:00
alexsong-oai	e71760fc64	support app usage analytics (#11687 ) Emit app mentioned and app used events. Dedup by (turn_id, connector_id) Example event params: { "event_type": "codex_app_used", "connector_id": "asdk_app_xxx", "thread_id": "019c5527-36d4-xxx", "turn_id": "019c552c-cd17-xxx", "app_name": "Slack (OpenAI Internal)", "product_client_id": "codex_cli_rs", "invoke_type": "explicit", "model_slug": "gpt-5.3-codex" }	2026-02-13 12:00:16 -08:00
Curtis 'Fjord' Hawthorne	a02342c9e1	Add js_repl kernel crash diagnostics (#11666 ) ## Summary This PR improves `js_repl` crash diagnostics so kernel failures are debuggable without weakening timeout/reset guarantees. ## What Changed - Added bounded kernel stderr capture and truncation logic (line + byte caps). - Added structured kernel snapshots (`pid`, exit status, stderr tail) for failure paths. - Enriched model-visible kernel-failure errors with a structured diagnostics payload: - `js_repl diagnostics: {...}` - Included only for likely kernel-failure write/EOF cases. - Improved logging around kernel write failures, unexpected exits, and kill/wait paths. - Added/updated unit tests for: - UTF-8-safe truncation - stderr tail bounds - structured diagnostics shape/truncation - conditional diagnostics emission - timeout kill behavior - forced kernel-failure diagnostics ## Why Before this, failures like broken pipe / unexpected kernel exit often surfaced as generic errors with little context. This change preserves existing behavior but adds actionable diagnostics while keeping output bounded. ## Scope - Code changes are limited to: - `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs` ## Validation - `cargo clippy -p codex-core --all-targets -- -D warnings` - Targeted `codex-core` js_repl unit tests (including new diagnostics/timeout coverage) - Tried starting a long running js_repl command (sleep for 10 minutes), verified error output was as expected after killing the node process. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/11666 - ⏳ `2` https://github.com/openai/codex/pull/10673 - ⏳ `3` https://github.com/openai/codex/pull/10670	2026-02-13 11:57:11 -08:00
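
The last entry above (#11666) mentions bounded stderr capture with line and byte caps and UTF-8-safe truncation. The following is a minimal sketch of that general technique using only the Rust standard library. It is not the code from the PR: the names `truncate_utf8` and `stderr_tail` and their parameters are hypothetical, chosen for illustration.

```rust
/// Hypothetical helper (not from the PR): return the longest prefix of `s`
/// that fits in `max_bytes` bytes without splitting a UTF-8 code point.
pub fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Walk back to the nearest char boundary; index 0 is always a boundary,
    // so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Hypothetical helper (not from the PR): keep only the last `max_lines`
/// lines of `stderr`, then cap the result at `max_bytes` bytes, keeping the
/// tail and never cutting through a multi-byte character.
pub fn stderr_tail(stderr: &str, max_lines: usize, max_bytes: usize) -> String {
    let lines: Vec<&str> = stderr.lines().collect();
    let start = lines.len().saturating_sub(max_lines);
    let joined = lines[start..].join("\n");
    if joined.len() <= max_bytes {
        return joined;
    }
    // Move the cut point forward to a char boundary; `joined.len()` is
    // always a boundary, so this loop terminates.
    let mut begin = joined.len() - max_bytes;
    while !joined.is_char_boundary(begin) {
        begin += 1;
    }
    joined[begin..].to_string()
}
```

Under these assumptions, `truncate_utf8("héllo", 2)` yields `"h"` rather than panicking on a split `é`, and `stderr_tail("a\nb\nc", 2, 100)` yields `"b\nc"`. Rounding toward the shorter result keeps the output within the cap at the cost of occasionally dropping up to three extra bytes.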


1773 Commits