codex

mirror of https://github.com/openai/codex.git synced 2026-04-30 11:21:34 +03:00

Author	SHA1	Message	Date
Yaroslav Volovich	70de95b7dc	codex: address PR review findings	2026-02-26 11:24:06 +00:00
Yaroslav Volovich	1eac244927	Add live skill refresh notifications	2026-02-26 11:24:05 +00:00
Curtis 'Fjord' Hawthorne	7326c097e3	Reduce js_repl Node version requirement to 22.22.0 (#12857 ) ## Summary Lower the `js_repl` minimum Node version from `24.13.1` to `22.22.0`. This updates the enforced minimum in `codex-rs/node-version.txt` and the corresponding user-facing `/experimental` description for the JavaScript REPL feature. ## Rationale The previous `24.13.1` floor was stricter than necessary for `js_repl`. I validated the REPL kernel behavior under Node `22.22.0` still works. ## Why `22.22.0` `22.22.0` is a current, widely packaged Node 22 release across common developer environments and distros, including Homebrew `node@22`, Fedora `nodejs22`, Arch `nodejs-lts-jod`, and Debian testing. That makes it a better exact floor than guessing at an older `22.x` patch we have not validated. `22.x` is also a maintenance branch that will be supported through April 2027, where the previous maintenance branch of `20.x` is only supported through April of this year. ## Changes - Update `codex-rs/node-version.txt` from `24.13.1` to `22.22.0` - Update the `/experimental` JavaScript REPL description to say `Requires Node >= v22.22.0 installed.`	2026-02-26 04:09:30 +00:00
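The version floor described in the commit above can be illustrated with a small comparison helper. This is only a sketch under stated assumptions: `parse_node_version` and `meets_minimum` are hypothetical names, not functions from the Codex codebase, and the sketch assumes plain `major.minor.patch` strings with an optional leading `v`.

```rust
/// Parse "v22.22.0" or "22.22.0" into (major, minor, patch).
/// Hypothetical helper for illustration; not the Codex implementation.
fn parse_node_version(raw: &str) -> Option<(u64, u64, u64)> {
    let trimmed = raw.trim().trim_start_matches('v');
    let mut parts = trimmed.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    // Reject anything with extra components, e.g. "22.22.0.1".
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// True when `found` is at least `minimum`. Tuples compare
/// lexicographically, which matches major/minor/patch ordering.
/// Unparseable input is treated as "does not meet the minimum".
fn meets_minimum(found: &str, minimum: &str) -> bool {
    match (parse_node_version(found), parse_node_version(minimum)) {
        (Some(f), Some(m)) => f >= m,
        _ => false,
    }
}
```

With a floor of `22.22.0`, both `v22.22.0` and `v24.13.1` pass, while `v22.21.9` does not.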
xl-openai	8cdee988f9	Skip system skills for extra roots (#12744) When extra roots are set, do not load system skills.	2026-02-25 19:55:28 -08:00
Curtis 'Fjord' Hawthorne	40ab71a985	Disable js_repl when Node is incompatible at startup (#12824 ) ## Summary - validate `js_repl` Node compatibility during session startup when the experiment is enabled - if Node is missing or too old, disable `js_repl` and `js_repl_tools_only` for the session before tools and instructions are built - surface that startup disablement to users through the existing startup warning flow instead of only logging it - reuse the same compatibility check in js_repl kernel startup so startup gating and runtime behavior stay aligned - add a regression test that verifies the warning is emitted and that the first advertised tool list omits `js_repl` and `js_repl_reset` when Node is incompatible ## Why Today `js_repl` can be advertised based only on the feature flag, then fail later when the kernel starts. That makes the available tool list inaccurate at the start of a conversation, and users do not get a clear explanation for why the tool is unavailable. This change makes tool availability reflect real startup checks, keeps the advertised tool set stable for the lifetime of the session, and gives users a visible warning when `js_repl` is disabled. ## Testing - `just fmt` - `cargo test -p codex-core --test all js_repl_is_not_advertised_when_startup_node_is_incompatible`	2026-02-26 01:14:51 +00:00
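The startup gating described in the commit above might look roughly like the following. It is a simplified sketch: `NodeCheck`, `SessionTools`, and `gate_js_repl` are hypothetical names, and only the tool names `js_repl` and `js_repl_reset` come from the commit message.

```rust
/// Outcome of a startup Node compatibility check (hypothetical type).
#[derive(Debug, PartialEq)]
enum NodeCheck {
    Compatible,
    Missing,
    TooOld { found: String, required: String },
}

/// What the session advertises after gating (hypothetical type).
#[derive(Debug, PartialEq)]
struct SessionTools {
    advertised: Vec<String>,
    startup_warnings: Vec<String>,
}

/// Remove the js_repl tools before the tool list is built, and record a
/// user-visible warning instead of only logging the failure.
fn gate_js_repl(mut advertised: Vec<String>, check: &NodeCheck) -> SessionTools {
    let reason = match check {
        NodeCheck::Compatible => None,
        NodeCheck::Missing => Some("Node was not found".to_string()),
        NodeCheck::TooOld { found, required } => {
            Some(format!("found Node {found}, need >= {required}"))
        }
    };
    let mut startup_warnings = Vec::new();
    if let Some(reason) = reason {
        advertised.retain(|tool| tool != "js_repl" && tool != "js_repl_reset");
        startup_warnings.push(format!("js_repl disabled for this session: {reason}"));
    }
    SessionTools { advertised, startup_warnings }
}
```

Running the check once at startup keeps the advertised tool set stable for the rest of the session, which is the property the commit calls out.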
Michael Bolin	14116ade8d	feat: include available decisions in command approval requests (#12758) Command-approval clients currently infer which choices to show from side-channel fields like `networkApprovalContext`, `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes the request shape harder to evolve, and it forces each client to replicate the server's heuristics instead of receiving the exact decision list for the prompt. This PR introduces a mapping between `CommandExecutionApprovalDecision` and `codex_protocol::protocol::ReviewDecision`:
```rust
impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
    fn from(value: CoreReviewDecision) -> Self {
        match value {
            CoreReviewDecision::Approved => Self::Accept,
            CoreReviewDecision::ApprovedExecpolicyAmendment {
                proposed_execpolicy_amendment,
            } => Self::AcceptWithExecpolicyAmendment {
                execpolicy_amendment: proposed_execpolicy_amendment.into(),
            },
            CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
            CoreReviewDecision::NetworkPolicyAmendment {
                network_policy_amendment,
            } => Self::ApplyNetworkPolicyAmendment {
                network_policy_amendment: network_policy_amendment.into(),
            },
            CoreReviewDecision::Abort => Self::Cancel,
            CoreReviewDecision::Denied => Self::Decline,
        }
    }
}
```
And updates `CommandExecutionRequestApprovalParams` to have a new field:
```rust
available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
```
which, if specified, should make it easier for clients to display an appropriate list of options in the UI. This makes it possible for `CoreShellActionProvider::prompt()` in `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly, adding support for `ApprovedForSession` when approving a skill script, which was previously missing in the TUI. Note this results in a significant change to `exec_options()` in `approval_overlay.rs`, as the displayed options are now derived from `available_decisions: &[ReviewDecision]`.
## What Changed - Add `available_decisions` to [`ExecApprovalRequestEvent`](`de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)`), including helpers to derive the legacy default choices when older senders omit the field. - Map `codex_protocol::protocol::ReviewDecision` to app-server `CommandExecutionApprovalDecision` and expose the ordered list as experimental `availableDecisions` in [`CommandExecutionRequestApprovalParams`](`de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)`). - Thread optional `available_decisions` through the core approval path so Unix shell escalation can explicitly request `ApprovedForSession` for session-scoped approvals instead of relying on client heuristics. [`unix_escalation.rs`](`de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214)`) - Update the TUI approval overlay to build its buttons from the ordered decision list, while preserving the legacy fallback when `available_decisions` is missing. - Update the app-server README, test client output, and generated schema artifacts to document and surface the new field. ## Testing - Add `approval_overlay.rs` coverage for explicit decision lists, including the generic `ApprovedForSession` path and network approval options. - Update `chatwidget/tests.rs` and app-server protocol tests to populate the new optional field and keep older event shapes working. ## Developers Docs - If we document `item/commandExecution/requestApproval` on [developers.openai.com/codex](https://developers.openai.com/codex), add experimental `availableDecisions` as the preferred source of approval choices and note that older servers may omit it.	2026-02-26 01:10:46 +00:00
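A client consuming the optional decision list could fall back to a legacy default when the field is absent, along the lines of the sketch below. The reduced `ReviewDecision` enum, the contents of `legacy_default_decisions`, and the button labels are assumptions made for illustration; the commit does not spell out the actual legacy defaults.

```rust
/// Reduced stand-in for the protocol enum; the real type has more variants.
#[derive(Debug, Clone, PartialEq)]
enum ReviewDecision {
    Approved,
    ApprovedForSession,
    Denied,
    Abort,
}

/// ASSUMPTION: a plausible legacy default list, used only when an older
/// sender omits `available_decisions`. The real defaults may differ.
fn legacy_default_decisions() -> Vec<ReviewDecision> {
    vec![ReviewDecision::Approved, ReviewDecision::Abort]
}

/// Prefer the explicit, ordered list from the server; otherwise fall back.
fn effective_decisions(available: Option<Vec<ReviewDecision>>) -> Vec<ReviewDecision> {
    available.unwrap_or_else(legacy_default_decisions)
}

/// Hypothetical button labels built from the ordered decision list.
fn button_labels(decisions: &[ReviewDecision]) -> Vec<&'static str> {
    decisions
        .iter()
        .map(|d| match d {
            ReviewDecision::Approved => "Yes",
            ReviewDecision::ApprovedForSession => "Yes, for this session",
            ReviewDecision::Denied => "No",
            ReviewDecision::Abort => "Cancel",
        })
        .collect()
}
```

Because the list is ordered, the UI can render buttons in the order received without re-deriving server heuristics.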
Celia Chen	4f45668106	Revert "Add skill approval event/response (#12633)" (#12811) This reverts https://github.com/openai/codex/pull/12633. That PR is no longer needed, because we now prefer sending a normal exec command approval server request that carries the skill permissions in `additional_permissions`.	2026-02-26 01:02:42 +00:00
pakrym-oai	4fedef88e0	Use websocket v2 as model-preferred websocket protocol (#12838 )	2026-02-25 16:35:53 -08:00
Ahmed Ibrahim	e76b1a2853	Remove steer feature flag (#12026 ) All code should go in the direction that steer is enabled --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 15:41:42 -08:00
Michael Bolin	a6a5976c5a	feat: scope execve session approvals by approved skill metadata (#12814 ) Previous to this change, `determine_action()` would 1. check if `program` is associated with a skill 2. if so, check if `program` is in `execve_session_approvals` to see whether the user needs to be prompted This PR flips the order of these checks to try to set us up so that "session approvals" are always consulted first (which should soon extend to include session approvals derived from `prefix_rule()`s, as well). Though to make the new ordering work, we need to record any relevant metadata to associate with the approval, which in the case of a skill-based approval is the `SkillMetadata` so that we can derive the `PermissionProfile` to include with the escalation. (Though as noted by the `TODO`, this `PermissionProfile` is not honored yet.) The new `ExecveSessionApproval` struct is used to retain the necessary metadata. ## What Changed - Replace the `execve_session_approvals` `HashSet` with a map that stores an `ExecveSessionApproval` alongside each approved `program`. - When a user chooses `ApprovedForSession` for a skill script, capture the matched `SkillMetadata` in the session approval entry. - Consult that cache before re-running `find_skill()`, and reuse the originally approved skill metadata and permission profile when allowing later execve callbacks in the same session.	2026-02-25 15:30:24 -08:00
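The reordered lookup described in the commit above can be sketched as a map keyed by program path that is consulted before any skill lookup. Everything here is a simplified assumption: `SkillMetadata` is reduced to a name, `SessionApprovals` is a hypothetical wrapper, and the non-skill branch returns `Allow` only to keep the example short (the real code would consult the exec policy).

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Reduced stand-in for the real skill metadata.
#[derive(Debug, Clone, PartialEq)]
struct SkillMetadata {
    name: String,
}

/// Metadata retained alongside each session-approved program.
#[derive(Debug, Clone, PartialEq)]
struct ExecveSessionApproval {
    skill: Option<SkillMetadata>,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Prompt,
}

/// Hypothetical holder for the session-scoped approval cache.
struct SessionApprovals {
    approvals: HashMap<PathBuf, ExecveSessionApproval>,
}

impl SessionApprovals {
    fn new() -> Self {
        Self { approvals: HashMap::new() }
    }

    /// Called when the user picks "approved for session".
    fn record_approved_for_session(&mut self, program: PathBuf, skill: Option<SkillMetadata>) {
        self.approvals.insert(program, ExecveSessionApproval { skill });
    }

    /// Session approvals are consulted first; the skill lookup only runs
    /// on a cache miss, and a hit reuses the originally approved metadata.
    fn determine_action(
        &self,
        program: &Path,
        find_skill: impl Fn(&Path) -> Option<SkillMetadata>,
    ) -> (Decision, Option<SkillMetadata>) {
        if let Some(approval) = self.approvals.get(program) {
            return (Decision::Allow, approval.skill.clone());
        }
        match find_skill(program) {
            Some(skill) => (Decision::Prompt, Some(skill)),
            // Simplification: non-skill programs are allowed in this sketch.
            None => (Decision::Allow, None),
        }
    }
}
```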
Charley Cunningham	2f4d6ded1d	Enable request_user_input in Default mode (#12735 ) ## Summary - allow `request_user_input` in Default collaboration mode as well as Plan - update the Default-mode instructions to prefer assumptions first and use `request_user_input` only when a question is unavoidable - update request_user_input and app-server tests to match the new Default-mode behavior - refactor collaboration-mode availability plumbing into `CollaborationModesConfig` for future mode-related flags ## Codex author `codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`	2026-02-25 15:20:46 -08:00
Ahmed Ibrahim	2bd87d1a75	only use preambles for realtime (#12831 ) Reverts openai/codex#12830	2026-02-25 14:54:54 -08:00
Celia Chen	b6d20748e0	Revert "Ensure shell command skills trigger approval (#12697)" (#12721) This reverts commit `daf0f03ac8`.	2026-02-25 22:49:53 +00:00
Ahmed Ibrahim	f86087eaa8	Revert "only use preambles for realtime" (#12830 ) Reverts openai/codex#12806	2026-02-25 14:30:48 -08:00
Ahmed Ibrahim	c1851be1ed	only use preambles for realtime (#12806) --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 13:41:54 -08:00
Ahmed Ibrahim	3f30746237	Add simple realtime text logs (#12807 ) Update realtime debug logs to include the actual text payloads in both input and output paths. - In `core/src/realtime_conversation.rs`: - `handle_start`: add extracted assistant text output to the `[realtime-text]` debug log. - `handle_text`: add incoming text input (`params.text`) to the `[realtime-text]` debug log. No tests were run (per request).	2026-02-25 12:01:48 -08:00
Owen Lin	a0fd94bde6	feat(app-server): add ThreadItem::DynamicToolCall (#12732 ) Previously, clients would call `thread/start` with dynamic_tools set, and when a model invokes a dynamic tool, it would just make the server->client `item/tool/call` request and wait for the client's response to complete the tool call. This works, but it doesn't have an `item/started` or `item/completed` event. Now we are doing this: - [new] emit `item/started` with `DynamicToolCall` populated with the call arguments - send an `item/tool/call` server request - [new] once the client responds, emit `item/completed` with `DynamicToolCall` populated with the response. Also, with `persistExtendedHistory: true`, dynamic tool calls are now reconstructable in `thread/read` and `thread/resume` as `ThreadItem::DynamicToolCall`.	2026-02-25 12:00:10 -08:00
Rasmus Rygaard	73eaebbd1c	Propagate session ID when compacting (#12802 ) We propagate the session ID when sending requests for inference but we don't do the same for compaction requests. This makes it hard to link compaction requests to their session for debugging purposes	2026-02-25 19:17:38 +00:00
Michael Bolin	648a420cbf	fix: enforce sandbox envelope for zsh fork execution (#12800 ) ## Why Zsh fork execution was still able to bypass the `WorkspaceWrite` model in edge cases because the fork path reconstructed command execution without preserving sandbox wrappers, and command extraction only accepted shell invocations in a narrow positional shape. This can allow commands to run with broader filesystem access than expected, which breaks the sandbox safety model. ## What changed - Preserved the sandboxed `ExecRequest` produced by `attempt.env_for(...)` when entering the zsh fork path in [`unix_escalation.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs). - Updated `CoreShellCommandExecutor` to execute the sandboxed command and working directory captured from `attempt.env_for(...)`, instead of re-running a freshly reconstructed shell command. - Made zsh-fork script extraction robust to wrapped invocations by scanning command arguments for `-c`/`-lc` rather than only matching the first positional form. - Added unit tests in `unix_escalation.rs` to lock in wrapper-tolerant parsing behavior and keep unsupported shell forms rejected. - Tightened the regression in [`skill_approval.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/tests/suite/skill_approval.rs): - `shell_zsh_fork_still_enforces_workspace_write_sandbox` now uses an explicit `WorkspaceWrite` policy with `exclude_tmpdir_env_var: true` and `exclude_slash_tmp: true`. - The test attempts to write to `/tmp/...`, which is only reliably outside writable roots with those explicit exclusions set. ## Verification - Added and passed the new unit tests around `extract_shell_script` parsing behavior with wrapped command shapes. 
- `extract_shell_script_supports_wrapped_command_prefixes` - `extract_shell_script_rejects_unsupported_shell_invocation` - Verified the regression with the focused integration test: `shell_zsh_fork_still_enforces_workspace_write_sandbox`. ## Manual Testing Prior to this change, if I ran Codex via:
```
just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork
```
and asked:
```
what is the output of /bin/ps
```
it would run it, even though the default sandbox should prevent the agent from running `/bin/ps` because it is setuid on macOS. But with this change, I now see the expected failure because it is blocked by the sandbox:
```
/bin/ps exited with status 1 and produced no output in this environment.
```
	2026-02-25 11:05:27 -08:00
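The wrapper-tolerant extraction described in the commit above (scanning the arguments for `-c`/`-lc` rather than matching only the first positional form) can be approximated as follows. This is a sketch, not the real `extract_shell_script`: the function name `find_shell_script` and the accepted shell basenames are assumptions.

```rust
use std::path::Path;

/// Scan `argv` for `<shell> -c <script>` or `<shell> -lc <script>` at any
/// position, so wrapped invocations such as `env FOO=1 /bin/zsh -lc "..."`
/// are still recognised. Returns the script text if found.
/// ASSUMPTION: only zsh, bash, and sh basenames count as shells here.
fn find_shell_script<'a>(argv: &[&'a str]) -> Option<&'a str> {
    argv.windows(3).find_map(|w| {
        let shell_name = Path::new(w[0]).file_name().and_then(|n| n.to_str());
        let is_shell = matches!(shell_name, Some("zsh" | "bash" | "sh"));
        let is_flag = w[1] == "-c" || w[1] == "-lc";
        (is_shell && is_flag).then_some(w[2])
    })
}
```

Requiring the flag to directly follow a known shell keeps unrelated uses of `-c` (for example `python -c`) rejected.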
jif-oai	7b39e76a66	Revert "fix(bazel): replace askama templates with include_str! in memories" (#12795 ) Reverts openai/codex#11778	2026-02-25 18:06:17 +00:00
Curtis 'Fjord' Hawthorne	0543d0a022	Promote js_repl to experimental with Node requirement (#12712 ) ## Summary - Promote `js_repl` to an experimental feature that users can enable from `/experimental`. - Add `js_repl` experimental metadata, including the Node prerequisite and activation guidance. - Add regression coverage for the feature metadata and the `/experimental` popup. ## What Changed - Changed `Feature::JsRepl` from `Stage::UnderDevelopment` to `Stage::Experimental`. - Added experimental metadata for `js_repl` in `core/src/features.rs`: - name: `JavaScript REPL` - description: calls out interactive website debugging, inline JavaScript execution, and the required Node version (`>= v24.13.1`) - announcement: tells users to enable it, then start a new chat or restart Codex - Added a core unit test that verifies: - `js_repl` is experimental - `js_repl` is disabled by default - the hardcoded Node version in the description matches `node-version.txt` - Added a TUI test that opens the `/experimental` popup and verifies the rendered `js_repl` entry includes the Node requirement text. ## Testing - `just fmt` - `cargo test -p codex-tui` - `cargo test -p codex-core` (unit-test phase passed; stopped during the long `tests/all.rs` integration suite)	2026-02-25 09:44:52 -08:00
mcgrew-oai	9a393c9b6f	feat(network-proxy): add embedded OTEL policy audit logging (#12046 ) PR Summary This PR adds embedded-only OTEL policy audit logging for `codex-network-proxy` and threads audit metadata from `codex-core` into managed proxy startup. ### What changed - Added structured audit event emission in `network_policy.rs` with target `codex_otel.network_proxy`. - Emitted: - `codex.network_proxy.domain_policy_decision` once per domain-policy evaluation. - `codex.network_proxy.block_decision` for non-domain denies. - Added required policy/network fields, RFC3339 UTC millisecond `event.timestamp`, and fallback defaults (`http.request.method="none"`, `client.address="unknown"`). - Added non-domain deny audit emission in HTTP/SOCKS handlers for mode-guard and proxy-state denies, including unix-socket deny paths. - Added `REASON_UNIX_SOCKET_UNSUPPORTED` and used it for unsupported unix-socket auditing. - Added `NetworkProxyAuditMetadata` to runtime/state, re-exported from `lib.rs` and `state.rs`. - Added `start_proxy_with_audit_metadata(...)` in core config, with `start_proxy()` delegating to default metadata. - Wired metadata construction in `codex.rs` from session/auth context, including originator sanitization for OTEL-safe tagging. - Updated `network-proxy/README.md` with embedded-mode audit schema and behavior notes. - Refactored HTTP block-audit emission to a small local helper to reduce duplication. - Preserved existing unix-socket proxy-disabled host/path behavior for responses and blocked history while using an audit-only endpoint override (`server.address="unix-socket"`, `server.port=0`). ### Explicit exclusions - No standalone proxy OTEL startup work. - No `main.rs` binary wiring. - No `standalone_otel.rs`. - No standalone docs/tests. ### Tests - Extended `network_policy.rs` tests for event mapping, metadata propagation, fallbacks, timestamp format, and target prefix. - Extended HTTP tests to assert unix-socket deny block audit events. 
- Extended SOCKS tests to cover deny emission from handler deny branches. - Added/updated core tests to verify audit metadata threading into managed proxy state. ### Validation run - `just fmt` - `cargo test -p codex-network-proxy` ✅ - `cargo test -p codex-core` ran with one unrelated flaky timeout (`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`), and the test passed when rerun directly ✅ --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-02-25 11:46:37 -05:00
jif-oai	8362b79cb4	feat: fix sqlite home (#12787 )	2026-02-25 15:52:55 +00:00
jif-oai	01f25a7b96	chore: unify max depth parameter (#12770 ) Users were confused	2026-02-25 15:20:24 +00:00
jif-oai	e4bfa763f6	feat: record memory usage (#12761 )	2026-02-25 13:48:40 +00:00
jif-oai	5441130e0a	feat: adding stream parser (#12666) Add a stream parser to extract citations (and others) from a stream. This supports cases where markers are split across different tokens. Codex never managed to make this code work, so everything was done manually. Please review carefully and do not touch this part of the code without a very clear understanding of it.	2026-02-25 13:27:58 +00:00
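To illustrate why markers split across tokens need buffering, here is a minimal sketch of an incremental parser. It is not the parser added in this commit: the `[[`/`]]` delimiters, `MarkerStreamParser`, and `Parsed` are all hypothetical, chosen only to show the hold-back technique.

```rust
// ASSUMPTION: illustrative delimiters; the real marker syntax is not
// given in the commit message.
const OPEN: &str = "[[";
const CLOSE: &str = "]]";

/// Text and markers that became safe to emit after one token.
#[derive(Debug, Default, PartialEq)]
struct Parsed {
    text: String,
    markers: Vec<String>,
}

#[derive(Default)]
struct MarkerStreamParser {
    /// Bytes that cannot be classified yet (partial or unclosed marker).
    pending: String,
}

impl MarkerStreamParser {
    fn push(&mut self, token: &str) -> Parsed {
        self.pending.push_str(token);
        let mut out = Parsed::default();
        loop {
            let Some(start) = self.pending.find(OPEN) else {
                // No opener: emit everything except a trailing fragment
                // that could still grow into OPEN with the next token.
                let held = partial_open_len(&self.pending);
                let emit = self.pending.len() - held;
                out.text.push_str(&self.pending[..emit]);
                self.pending.drain(..emit);
                return out;
            };
            out.text.push_str(&self.pending[..start]);
            let body_start = start + OPEN.len();
            let Some(body_len) = self.pending[body_start..].find(CLOSE) else {
                // Opener seen but not closed: keep the marker buffered.
                self.pending.drain(..start);
                return out;
            };
            out.markers
                .push(self.pending[body_start..body_start + body_len].to_string());
            self.pending.drain(..body_start + body_len + CLOSE.len());
        }
    }

    /// Whatever is still buffered when the stream ends.
    fn finish(self) -> String {
        self.pending
    }
}

/// Length of the longest proper prefix of OPEN that `buf` ends with.
fn partial_open_len(buf: &str) -> usize {
    (1..OPEN.len())
        .rev()
        .find(|&k| buf.ends_with(&OPEN[..k]))
        .unwrap_or(0)
}
```

Feeding the tokens `"see ["`, `"[doc-1]"`, and `"] ok"` yields the text `"see "` and `" ok"` plus the single marker `doc-1`, even though both delimiters straddle token boundaries.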
jif-oai	bcd6e68054	Display pending child-thread approvals in TUI (#12767) Summary - propagate approval policy from parent to spawned agents and drop the Never override so sub-agents respect the caller’s request - refresh the pending-approval list whenever events arrive or the active thread changes and surface the list above the composer for inactive threads - add widgets, helpers, and tests covering the new pending-thread approval UI state	2026-02-25 11:40:11 +00:00
Michael Bolin	93efcfd50d	feat: record whether a skill script is approved for the session (#12756 ) ## Why `unix_escalation.rs` checks a session-scoped approval cache before prompting again for an execve-intercepted skill script. Without also recording `ReviewDecision::ApprovedForSession`, that cache never gets populated, so the same skill script can still trigger repeated approval prompts within one session. ## What Changed - Add `execve_session_approvals` to `SessionServices` so the session can track approved skill script paths. - Record the script path when a skill-script prompt returns `ReviewDecision::ApprovedForSession`, but only for the skill-script path rather than broader prefix-rule approvals. - Reuse the cached approval on later execve callbacks by treating an already-approved skill script as `Decision::Allow`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12756). * #12758 * __->__ #12756	2026-02-25 10:17:22 +00:00
alexsong-oai	6d6570d89d	Support external agent config detect and import (#12660 ) Migration Behavior * Config * Migrates settings.json into config.toml * Only adds fields when config.toml is missing, or when those fields are missing from the existing file * Supported mappings: env -> shell_environment_policy sandbox.enabled = true -> sandbox_mode = "workspace-write" * Skills * Copies home and repo .claude/skills into .agents/skills * Existing skill directories are not overwritten * SKILL.md content is rewritten from Claude-related terms to Codex * AgentsMd * Repo only * Migrates CLAUDE.md into AGENTS.md * Detect/import only proceed when AGENTS.md is missing or present but empty * Content is rewritten from Claude-related terms to Codex	2026-02-25 02:11:51 -08:00
jif-oai	f46b767b7e	feat: add search term to thread list (#12578 ) Add `searchTerm` to `thread/list` that will search for a match in the titles (the condition being `searchTerm` $$\in$$ `title`)	2026-02-25 09:59:41 +00:00
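The containment condition in the commit above amounts to a substring test on each title. A minimal sketch, assuming a case-sensitive match and an optional term (neither detail is stated in the commit); `matches_search_term` and `filter_titles` are hypothetical names:

```rust
/// A thread matches when no term is given or the title contains the term.
/// ASSUMPTION: the comparison is case-sensitive.
fn matches_search_term(title: &str, search_term: Option<&str>) -> bool {
    match search_term {
        Some(term) => title.contains(term),
        None => true,
    }
}

/// Keep only the titles that satisfy the search term, preserving order.
fn filter_titles<'a>(titles: &[&'a str], search_term: Option<&str>) -> Vec<&'a str> {
    titles
        .iter()
        .copied()
        .filter(|title| matches_search_term(title, search_term))
        .collect()
}
```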
jif-oai	10c04e11b8	feat: add service name to app-server (#12319) Add a service name to the app-server so that the app can use its own service name. This is set at the thread level because we might later make the app-server a singleton on the computer.	2026-02-25 09:51:42 +00:00
Celia Chen	6a3233da64	Surface skill permission profiles in zsh-fork exec approvals (#12753 ) ## Summary - Preserve each skill’s raw permissions block as a permission_profile on SkillMetadata during skill loading. - Keep compiling that same metadata into the existing runtime Permissions object, so current enforcement behavior stays intact. - When zsh-fork intercepts execution of a script that belongs to a skill, include the skill’s permission_profile in the exec approval request. - This lets approval UIs show the extra filesystem access the skill declared when prompting for approval.	2026-02-25 01:23:10 -08:00
Michael Bolin	c4ec6be4ab	fix: keep shell escalation exec paths absolute (#12750 ) ## Why In the `shell_zsh_fork` flow, `codex-shell-escalation` receives the executable path exactly as the shell passed it to `execve()`. That path is not guaranteed to be absolute. For commands such as `./scripts/hello-mbolin.sh`, if the shell was launched with a different `workdir`, resolving the intercepted `file` against the server process working directory makes policy checks and skill matching inspect the wrong executable. This change pushes that fix a step further by keeping the normalized path typed as `AbsolutePathBuf` throughout the rest of the escalation pipeline. That makes the absolute-path invariant explicit, so later code cannot accidentally treat the resolved executable path as an arbitrary `PathBuf`. ## What Changed - record the wrapper process working directory as an `AbsolutePathBuf` - update the escalation protocol so `workdir` is explicitly absolute while `file` remains the raw intercepted exec path - resolve a relative intercepted `file` against the request `workdir` as soon as the server receives the request - thread `AbsolutePathBuf` through `EscalationPolicy`, `CoreShellActionProvider`, and command normalization helpers so the resolved executable path stays type-checked as absolute - replace the `path-absolutize` dependency in `codex-shell-escalation` with `codex-utils-absolute-path` - add a regression test that covers a relative `file` with a distinct `workdir` ## Verification - `cargo test -p codex-shell-escalation`	2026-02-24 23:52:36 -08:00
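The path handling described in the commit above (resolve a relative intercepted `file` against the request `workdir`, not the server's own working directory) can be sketched with the standard library alone. `resolve_exec_path` is a hypothetical helper; the real change uses `AbsolutePathBuf` from `codex-utils-absolute-path`, which this sketch does not attempt to reproduce.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve the raw `file` passed to execve() against the wrapper's
/// `workdir`. Returns None when `workdir` itself is not absolute, which
/// mirrors the invariant that the protocol's workdir must be absolute.
/// `.` components are dropped; `..` is left untouched because collapsing
/// it lexically could be wrong in the presence of symlinks.
fn resolve_exec_path(workdir: &Path, file: &Path) -> Option<PathBuf> {
    if !workdir.is_absolute() {
        return None;
    }
    let joined = if file.is_absolute() {
        file.to_path_buf()
    } else {
        workdir.join(file)
    };
    Some(
        joined
            .components()
            .filter(|c| !matches!(c, Component::CurDir))
            .collect(),
    )
}
```

For example, `./scripts/hello-mbolin.sh` with a `workdir` of `/repo/project` resolves to `/repo/project/scripts/hello-mbolin.sh`, regardless of where the server process was started.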
Michael Bolin	59398125f6	feat: zsh-fork forces scripts/*/ for skills to trigger a prompt (#12730 ) Direct skill-script matches force `Decision::Prompt`, so skill-backed scripts require explicit approval before they run. (Note "allow for session" is not supported in this PR, but will be done in a follow-up.) In the process of implementing this, I fixed an important bug: `ShellZshFork` is supposed to keep ordinary allowed execs on the client-side `Run` path so later `execve()` calls are still intercepted and reviewed. After the shell-escalation port, `Decision::Allow` still mapped to `Escalate`, which moved `zsh` to server-side execution too early. That broke the intended flow for skill-backed scripts and made the approval prompt depend on the wrong execution path. ## What changed - In `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs`, `Decision::Allow` now returns `Run` unless escalation is actually required. - Removed the zsh-specific `argv[0]` fallback. With the `Allow -> Run` fix in place, zsh's later `execve()` of the script is intercepted normally, so the skill match happens on the script path itself. - Kept the skill-path handling in `determine_action()` focused on the direct `program` match path. 
## Verification - Updated `shell_zsh_fork_prompts_for_skill_script_execution` in `codex-rs/core/tests/suite/skill_approval.rs` (gated behind `cfg(unix)`) to: - run under `SandboxPolicy::new_workspace_write_policy()` instead of `DangerFullAccess` - assert the approval command contains only the script path - assert the approved run returns both stdout and stderr markers in the shell output - Ran `cargo test -p codex-core shell_zsh_fork_prompts_for_skill_script_execution -- --nocapture` ## Manual Testing Run the dev build:
```
just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork
```
I have created `/Users/mbolin/.agents/skills/mbolin-test-skill` with:
```
├── scripts
│   └── hello-mbolin.sh
└── SKILL.md
```
The skill:
```
---
name: mbolin-test-skill
description: Used to exercise various features of skills.
---
When this skill is invoked, run the `hello-mbolin.sh` script and report the output.
```
The script:
```
set -e
# Note this script will fail if run with network disabled.
curl --location openai.com
```
Use `$mbolin-test-skill` to invoke the skill manually and verify that I get prompted to run `hello-mbolin.sh`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12730). * #12750 * __->__ #12730	2026-02-24 23:51:26 -08:00
Curtis 'Fjord' Hawthorne	9501669a24	tests(js_repl): remove node-related skip paths from js_repl tests (#12185 ) ## Summary Remove js_repl/node test-skip paths and make Node setup explicit in CI so js_repl tests always run instead of silently skipping. ## Why We had multiple “expediency” skip paths that let js_repl tests pass without actually exercising Node-backed behavior. This reduced CI signal and hid runtime/environment regressions. ## What changed ### CI - Added Node setup using `codex-rs/node-version.txt` in: - `.github/workflows/rust-ci.yml` - `.github/workflows/bazel.yml` - Added a Unix PATH copy step in Bazel workflow to expose the setup-node binary in common paths. ### js_repl test harness - Added explicit js_repl sandbox test configuration helpers in: - `codex-rs/core/src/tools/js_repl/mod.rs` - `codex-rs/core/src/tools/handlers/js_repl.rs` - Added Linux arg0 dispatch glue for js_repl tests so sandbox subprocess entrypoint behavior is correct under Linux test execution. ### Removed skip behavior - Deleted runtime guard function and early-return skips in js_repl tests (`can_run_js_repl_runtime_tests` and related per-test short-circuits). - Removed view_image integration test skip behavior: - dropped `skip_if_no_network!(Ok(()))` - removed “skip on Node missing/too old” branch after js_repl output inspection. ## Impact - js_repl/node tests now consistently execute and fail loudly when the environment is not correctly provisioned. - CI has stronger signal for js_repl regressions instead of false green from conditional skips. ## Testing - `cargo test -p codex-core` (locally) to validate js_repl unit/integration behavior with skips removed. - CI expected to surface any remaining environment/runtime gaps directly (rather than masking them). 
#### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12300 - ✅ `2` https://github.com/openai/codex/pull/12275 - ✅ `3` https://github.com/openai/codex/pull/12205 - ✅ `4` https://github.com/openai/codex/pull/12407 - ✅ `5` https://github.com/openai/codex/pull/12372 - 👉 `6` https://github.com/openai/codex/pull/12185 - ⏳ `7` https://github.com/openai/codex/pull/10673	2026-02-24 22:52:14 -08:00
Curtis 'Fjord' Hawthorne	8f3f2c3c02	tests(js_repl): stabilize CI runtime test execution (#12407 ) ## Summary Stabilize `js_repl` runtime test setup in CI and move tool-facing `js_repl` behavior coverage into integration tests. This is a test/CI change only. No production `js_repl` behavior change is intended. ## Why - Bazel test sandboxes (especially on macOS) could resolve a different `node` than the one installed by `actions/setup-node`, which caused `js_repl` runtime/version failures. - `js_repl` runtime tests depend on platform-specific sandbox/test-harness behavior, so they need explicit gating in a base-stability commit. - Several tests in the `js_repl` unit test module were actually black-box/tool-level behavior tests and fit better in the integration suite. ## Changes - Add `actions/setup-node` to the Bazel and Rust `Tests` workflows, using the exact version pinned in the repo’s Node version file. - In Bazel (non-Windows), pass `CODEX_JS_REPL_NODE_PATH=$(which node)` into test env so `js_repl` uses the `actions/setup-node` runtime inside Bazel tests. - Add a new integration test suite for `js_repl` tool behavior and register it in the core integration test suite module. - Move black-box `js_repl` behavior tests into the integration suite (persistence/TLA, builtin tool invocation, recursive self-call rejection, `process` isolation, blocked builtin imports). - Keep white-box manager/kernel tests in the `js_repl` unit test module. - Gate `js_repl` runtime tests to run only on macOS and only when a usable Node runtime is available (skip on other platforms / missing Node in this commit). ## Impact - Reduces `js_repl` CI failures caused by Node resolution drift in Bazel. - Improves test organization by separating tool-facing behavior tests from white-box manager/kernel tests. - Keeps the base commit stable while expanding `js_repl` runtime coverage. 
#### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12372 - 👉 `2` https://github.com/openai/codex/pull/12407 - ⏳ `3` https://github.com/openai/codex/pull/12185 - ⏳ `4` https://github.com/openai/codex/pull/10673	2026-02-24 21:04:34 -08:00
Celia Chen	16ca527c80	chore: migrate additional permissions to PermissionProfile (#12731 ) This PR replaces the old `additional_permissions.fs_read/fs_write` shape with a shared `PermissionProfile` model and wires it through the command approval, sandboxing, protocol, and TUI layers. The schema is adopted from the `SkillManifestPermissions`, which is also refactored to use this unified struct. This helps us easily expose permission profiles in app server/core as a follow-up.	2026-02-25 03:35:28 +00:00
sayan-oai	e6bb5d8553	chore: change catalog mode to enum (#12656) Make the presence of a custom catalog clearer by using an enum instead of a bool.	2026-02-24 19:33:32 -08:00
Curtis 'Fjord' Hawthorne	125fbec317	Fix js_repl view_image attachments in nested tool calls (#12725 ) ## Summary - Fix `js_repl` so `await codex.tool("view_image", { path })` actually attaches the image to the active turn when called from inside the JS REPL. - Restore the behavior expected by the existing `js_repl` image-attachment test. - This is a follow-up to [#12553](https://github.com/openai/codex/pull/12553), which changed `view_image` to return structured image content. ## Root Cause - [#12553](https://github.com/openai/codex/pull/12553) changed `view_image` from directly injecting a pending user image message to returning structured `function_call_output` content items. - The nested tool-call bridge inside `js_repl` serialized that tool response back to the JS runtime, but it did not mirror returned image content into the active turn. - As a result, `view_image` appeared to succeed inside `js_repl`, but no `input_image` was actually attached for the outer turn. ## What Changed - Updated the nested tool-call path in `js_repl` to inspect function tool responses for structured content items. - When a nested tool response includes `input_image` content, `js_repl` now injects a corresponding user `Message` into the active turn before returning the raw tool result back to the JS runtime. - Kept the normal JSON result flow intact, so `codex.tool(...)` still returns the original tool output object to JavaScript. ## Why - `js_repl` documentation and tests already assume that `view_image` can be used from inside the REPL to attach generated images to the model. - Without this fix, the nested call path silently dropped that attachment behavior.	2026-02-24 18:23:53 -08:00
Michael Bolin	e88f74d140	feat: pass helper executable paths via Arg0DispatchPaths (#12719 ) ## Why `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` previously located `codex-execve-wrapper` by scanning `PATH` and sibling directories. That lookup is brittle and can select the wrong binary when the runtime environment differs from startup assumptions. We already pass `codex-linux-sandbox` from `codex-arg0`; `codex-execve-wrapper` should use the same startup-driven path plumbing. ## What changed - Introduced `Arg0DispatchPaths` in `codex-arg0` to carry both helper executable paths: - `codex_linux_sandbox_exe` - `main_execve_wrapper_exe` - Updated `arg0_dispatch_or_else()` to pass `Arg0DispatchPaths` to top-level binaries and preserve helper paths created in `prepend_path_entry_for_codex_aliases()`. - Threaded `Arg0DispatchPaths` through entrypoints in `cli`, `exec`, `tui`, `app-server`, and `mcp-server`. - Added `main_execve_wrapper_exe` to core configuration plumbing (`Config`, `ConfigOverrides`, and `SessionServices`). - Updated zsh-fork shell escalation to consume the configured `main_execve_wrapper_exe` and removed path-sniffing fallback logic. - Updated app-server config reload paths so reloaded configs keep the same startup-provided helper executable paths. ## References - [`Arg0DispatchPaths` definition](`e355b43d5c/codex-rs/arg0/src/lib.rs (L20-L24)`) - [`arg0_dispatch_or_else()` forwarding both paths](`e355b43d5c/codex-rs/arg0/src/lib.rs (L145-L176)`) - [zsh-fork escalation using configured wrapper path](`e355b43d5c/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L109-L150)`) ## Testing - `cargo check -p codex-arg0 -p codex-core -p codex-exec -p codex-tui -p codex-mcp-server -p codex-app-server` - `cargo test -p codex-arg0` - `cargo test -p codex-core tools::runtimes::shell::unix_escalation:: -- --nocapture`	2026-02-24 17:44:38 -08:00
Michael Bolin	448fb6ac22	fix: clarify the value of SkillMetadata.path (#12729 ) Rename `SkillMetadata.path` to `SkillMetadata.path_to_skills_md` for clarity. Would ideally change the type to `AbsolutePathBuf`, but that can be done later.	2026-02-24 17:15:54 -08:00
Curtis 'Fjord' Hawthorne	63c2ac96cd	fix(js_repl): surface uncaught kernel errors and reset cleanly (#12636 ) ## Summary Improve `js_repl` behavior when the Node kernel hits a process-level failure (for example, an uncaught exception or unhandled Promise rejection). Instead of only surfacing a generic `js_repl kernel exited unexpectedly` after stdout EOF, `js_repl` now returns a clearer exec error for the active request, then resets the kernel cleanly. ## Why Some sandbox-denied operations can trigger Node errors that become process-level failures (for example, an unhandled EventEmitter `'error'` event). In that case: - the kernel process exits, - the host sees stdout EOF, - the user gets a generic kernel-exit error, - and the next request can briefly race with stale kernel state. This change improves that failure mode without monkeypatching Node APIs. ## Changes ### Kernel-side (`js_repl` Node process) - Add process-level handlers for: - `uncaughtException` - `unhandledRejection` - When one of these fires: - best-effort emit a normal `exec_result` error for the active exec - include actionable guidance to catch/handle async errors (including Promise rejections and EventEmitter `'error'` events) - exit intentionally so the host can reset/restart the kernel ### Host-side (`JsReplManager`) - Clear dead kernel state as soon as the stdout reader observes unexpected kernel exit/EOF. - This lets the next `js_repl` exec start a fresh kernel instead of hitting a stale broken-pipe path. ### Tests - Add regression coverage for: - uncaught async exception -> exec error + kernel recovery on next exec - Update forced-kernel-exit test to validate recovery behavior (next exec restarts cleanly) ## Impact - Better user-facing error for kernel crashes caused by uncaught/unhandled async failures. - Cleaner recovery behavior after kernel exit. 
## Validation - `cargo test -p codex-core --lib tools::js_repl::tests::js_repl_uncaught_exception_returns_exec_error_and_recovers -- --exact` - `cargo test -p codex-core --lib tools::js_repl::tests::js_repl_forced_kernel_exit_recovers_on_next_exec -- --exact` - `just fmt`	2026-02-24 17:12:02 -08:00
Michael Bolin	3d356723c4	fix: make EscalateServer public and remove shell escalation wrappers (#12724 ) ## Why `codex-shell-escalation` exposed a `codex-core`-specific adapter layer (`ShellActionProvider`, `ShellPolicyFactory`, and `run_escalate_server`) that existed only to bridge `codex-core` to `EscalateServer`. That indirection increased API surface and obscured crate ownership without adding behavior. This change moves orchestration into `codex-core` so boundaries are clearer: `codex-shell-escalation` provides reusable escalation primitives, and `codex-core` provides shell-tool policy decisions. Admittedly, @pakrym rightfully requested this sort of cleanup as part of https://github.com/openai/codex/pull/12649, though this avoids moving all of `codex-shell-escalation` into `codex-core`. ## What changed - Made `EscalateServer` public and exported it from `shell-escalation`. - Removed the adapter layer from `shell-escalation`: - deleted `shell-escalation/src/unix/core_shell_escalation.rs` - removed exports for `ShellActionProvider`, `ShellPolicyFactory`, `EscalationPolicyFactory`, and `run_escalate_server` - Updated `core/src/tools/runtimes/shell/unix_escalation.rs` to: - create `Stopwatch`/cancellation in `codex-core` - instantiate `EscalateServer` directly - implement `EscalationPolicy` directly on `CoreShellActionProvider` Net effect: same escalation flow with fewer wrappers and a smaller public API. ## Verification - Manually reviewed the old vs. new escalation call flow to confirm timeout/cancellation behavior and approval policy decisions are preserved while removing wrapper types.	2026-02-24 16:20:08 -08:00
Eric Traut	8da40c9251	Raise image byte estimate for compaction token accounting (#12717 ) Increase `IMAGE_BYTES_ESTIMATE` from 340 bytes to 7,373 bytes so the existing 4-bytes/token heuristic yields an image estimate of ~1,844 tokens instead of ~85. This makes auto-compaction more conservative for image-heavy transcripts and avoids underestimating context usage, which can otherwise cause compaction to fail when there is not enough free context remaining. The new value was chosen because that's the image resolution cap used for our latest models. Follow-up to [#12419](https://github.com/openai/codex/pull/12419). Refs [#11845](https://github.com/openai/codex/issues/11845).	2026-02-24 16:11:38 -08:00
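The arithmetic in the commit above can be checked with a short sketch. Only `IMAGE_BYTES_ESTIMATE`, the two byte values, and the 4-bytes/token heuristic come from the commit message; `OLD_IMAGE_BYTES_ESTIMATE`, `BYTES_PER_TOKEN`, the function name, and the choice to round up are assumptions made for illustration, not the actual `codex-core` code.

```rust
/// Old and new values quoted in the commit message.
/// `OLD_IMAGE_BYTES_ESTIMATE` is a hypothetical name for the previous value.
const OLD_IMAGE_BYTES_ESTIMATE: u64 = 340;
const IMAGE_BYTES_ESTIMATE: u64 = 7_373;

/// The 4-bytes/token heuristic from the commit message (constant name is hypothetical).
const BYTES_PER_TOKEN: u64 = 4;

/// Approximate token count for a byte length.
/// Rounding up is an assumption; the real code may round differently.
fn approx_tokens_from_bytes(bytes: u64) -> u64 {
    (bytes + BYTES_PER_TOKEN - 1) / BYTES_PER_TOKEN
}
```

Under these assumptions the old estimate works out to 85 tokens per image and the new one to 1,844, matching the "~85" and "~1,844" figures in the commit message.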
daveaitel-openai	dcab40123f	Agent jobs (spawn_agents_on_csv) + progress UI (#10935) ## Summary - Add agent job support: spawn a batch of sub-agents from CSV, auto-run, auto-export, and store results in SQLite. - Simplify workflow: remove run/resume/get-status/export tools; spawn is deterministic and completes in one call. - Improve exec UX: stable, single-line progress bar with ETA; suppress sub-agent chatter in exec. ## Why Enables map-reduce style workflows over arbitrarily large repos using the existing Codex orchestrator. This addresses review feedback about overly complex job controls and non-deterministic monitoring. ## Demo (progress bar)
```
./codex-rs/target/debug/codex exec \
  --enable collab \
  --enable sqlite \
  --full-auto \
  --progress-cursor \
  -c agents.max_threads=16 \
  -C /Users/daveaitel/code/codex \
  - <<'PROMPT'
Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
path = item-01..item-30, area = test.
Then call spawn_agents_on_csv with:
- csv_path: /tmp/agent_job_progress_demo.csv
- instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
- output_csv_path: /tmp/agent_job_progress_demo_out.csv
PROMPT
```
## Review feedback addressed - Auto-start jobs on spawn; removed run/resume/status/export tools. - Auto-export on success. - More descriptive tool spec + clearer prompts. - Avoid deadlocks on spawn failure; pending/running handled safely. - Progress bar no longer scrolls; stable single-line redraw. ## Tests - `cd codex-rs && cargo test -p codex-exec` - `cd codex-rs && cargo build -p codex-cli`	2026-02-24 21:00:19 +00:00
Eric Traut	bd192b54cd	Honor `project_root_markers` when discovering `AGENTS.md` (#12639) Fixes #12128 The docs indicate that `project_root_markers` are used to discover the project root for local config as well as `AGENTS.md`. It looks like it was never wired up to support the latter. Summary - resolve project docs by walking to the configured `project_root_markers` (or defaults) instead of assuming the Git root, while honoring CLI overrides and handling malformed configs - fall back to the project’s canonical path chain and add a test that makes sure custom markers upstream of `.git` are respected	2026-02-24 12:55:48 -08:00
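The marker-based discovery described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name, the injected `exists` predicate, and the "nearest ancestor containing any marker wins" rule are all assumptions.

```rust
use std::path::{Path, PathBuf};

/// Walk from `start` toward the filesystem root and return the first ancestor
/// directory that contains any of `markers` (hypothetical helper).
/// `exists` is injected so the sketch can be exercised without touching the
/// real filesystem; a caller could pass `|p| p.exists()`.
fn find_project_root(
    start: &Path,
    markers: &[&str],
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| markers.iter().any(|marker| exists(&dir.join(marker))))
        .map(Path::to_path_buf)
}
```

With a custom marker that lives in a directory above the one holding `.git`, this sketch returns the higher directory, which is the situation the commit's new test is described as covering.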
Ahmed Ibrahim	b6ab2214e3	Add TUI realtime conversation mode (#12687) - Add a hidden `realtime_conversation` feature flag and a `/realtime` slash command to start and stop live voice sessions. - Reuse the transcription composer/footer UI for live metering, stream mic audio, play assistant audio, render realtime user text events, and force-close on feature disable. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-24 12:54:30 -08:00
Michael Bolin	3b5fc7547e	refactor: remove unused seatbelt unix socket arg (#12707 ) https://github.com/openai/codex/pull/12052 introduced an `allowed_unix_socket_paths` parameter to `create_seatbelt_command_args()`, but https://github.com/openai/codex/pull/12649 removed the abstraction that #12052 introduced, so this parameter is no longer necessary as it is always an empty slice.	2026-02-24 12:30:26 -08:00
pakrym-oai	daf0f03ac8	Ensure shell command skills trigger approval (#12697 ) Summary - detect skill-invoking shell commands based on the original command string, request approvals when needed, and cache positive decisions per session - keep implicit skill invocation emitted after approval and keep skill approval decline messaging centralized to the shell handler - expand and adjust skill approval tests to cover shell-based skill scripts while matching the new detection expectations Testing - Not run (not requested)	2026-02-24 12:13:20 -08:00
Yaroslav Volovich	67d9261e2c	feat(sleep-inhibitor): add Linux and Windows idle-sleep prevention (#11766 ) ## Background - follow-up to previous macOS-only PR: https://github.com/openai/codex/pull/11711 - follow-up macOS refactor PR (current structural approach used here): https://github.com/openai/codex/pull/12340 ## Summary - extend `codex-utils-sleep-inhibitor` with Linux and Windows backends while preserving existing macOS behavior - Linux backend: - use `systemd-inhibit` (`--what=idle --mode=block`) when available - fall back to `gnome-session-inhibit` (`--inhibit idle`) when available - keep no-op behavior if neither backend exists on host - Windows backend: - use Win32 power request handles (`PowerCreateRequest` + `PowerSetRequest` / `PowerClearRequest`) with `PowerRequestSystemRequired` - make `prevent_idle_sleep` Experimental on macOS/Linux/Windows; keep under development on other targets ## Testing - `just fmt` - `cargo test -p codex-utils-sleep-inhibitor` - `cargo test -p codex-core features::tests::` - `cargo test -p codex-tui chatwidget::tests::` - `just fix -p codex-utils-sleep-inhibitor` - `just fix -p codex-core` ## Semantics and API references - Goal remains: prevent idle system sleep while a turn is running. 
- Linux: - `systemd-inhibit` / login1 inhibitor model: - https://www.freedesktop.org/software/systemd/man/latest/systemd-inhibit.html - https://www.freedesktop.org/software/systemd/man/org.freedesktop.login1.html - https://systemd.io/INHIBITOR_LOCKS/ - xdg-desktop-portal Inhibit (relevant for sandboxed apps): - https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.portal.Inhibit.html - Windows: - `PowerCreateRequest`: - https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-powercreaterequest - `PowerSetRequest`: - https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-powersetrequest - `PowerClearRequest`: - https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-powerclearrequest - `SetThreadExecutionState` (alternative baseline API): - https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setthreadexecutionstate ## Chromium vs this PR - Chromium Linux backend: - https://github.com/chromium/chromium/blob/main/services/device/wake_lock/power_save_blocker/power_save_blocker_linux.cc - Chromium Windows backend: - https://github.com/chromium/chromium/blob/main/services/device/wake_lock/power_save_blocker/power_save_blocker_win.cc - Electron powerSaveBlocker entry point: - https://github.com/electron/electron/blob/main/shell/browser/api/electron_api_power_save_blocker.cc ## Why we differ from Chromium - Linux implementation mechanism: - Chromium uses in-process D-Bus APIs plus UI-integrated screen-saver suspension. - This PR uses command-based inhibitor backends (`systemd-inhibit`, `gnome-session-inhibit`) instead of linking a Linux D-Bus client in this crate. - Reason: keep `codex-utils-sleep-inhibitor` dependency-light and avoid Linux CI/toolchain fragility from new native D-Bus linkage, while preserving the same runtime intent (hold an inhibitor while a turn runs). - Linux UI integration scope: - Chromium also uses `display::Screen::SuspendScreenSaver()` in its UI stack. 
- Codex `codex-rs` does not have that display abstraction in this crate, so this PR scopes Linux behavior to process-level sleep inhibition only. - Windows wake-lock type breadth: - Chromium supports both display/system wake-lock types and extra display-specific handling for some pre-Win11 scenarios. - Codex’s feature is scoped to turn execution continuity (not forcing display on), so this PR uses `PowerRequestSystemRequired` only.	2026-02-24 11:51:44 -08:00
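The Linux fallback order described in the commit above (prefer `systemd-inhibit`, then `gnome-session-inhibit`, otherwise a no-op) can be sketched as below. The enum, function names, and boolean availability inputs are hypothetical; only the program names and flags are taken from the commit message. The child command that the inhibitor would wrap is deliberately left out, since the commit message does not state it.

```rust
/// Command-based inhibitor backends named in the commit (type name is hypothetical).
#[derive(Debug, PartialEq, Eq)]
enum LinuxBackend {
    SystemdInhibit,
    GnomeSessionInhibit,
    Noop,
}

/// Pick a backend in the order the commit describes.
/// How availability is detected is not shown here and is an assumption.
fn choose_backend(has_systemd_inhibit: bool, has_gnome_session_inhibit: bool) -> LinuxBackend {
    if has_systemd_inhibit {
        LinuxBackend::SystemdInhibit
    } else if has_gnome_session_inhibit {
        LinuxBackend::GnomeSessionInhibit
    } else {
        LinuxBackend::Noop
    }
}

/// Program name plus the flags quoted in the commit message.
/// The wrapped child command is omitted on purpose.
fn inhibitor_args(backend: &LinuxBackend) -> Vec<&'static str> {
    match backend {
        LinuxBackend::SystemdInhibit => vec!["systemd-inhibit", "--what=idle", "--mode=block"],
        LinuxBackend::GnomeSessionInhibit => vec!["gnome-session-inhibit", "--inhibit", "idle"],
        LinuxBackend::Noop => Vec::new(),
    }
}
```

Keeping the selection logic separate from process spawning makes the fallback order easy to test without either tool installed.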

1709 Commits