Commit Graph

4219 Commits

Author SHA1 Message Date
pakrym-oai
ba41e84a50 Use model catalog default for reasoning summary fallback (#12873)
## Summary
- make `Config.model_reasoning_summary` optional so unset means use
model default
- resolve the optional config value to a concrete summary when building
`TurnContext`
- add protocol support for `default_reasoning_summary` in model metadata

## Validation
- `cargo test -p codex-core --lib client::tests -- --nocapture`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-26 09:31:13 -08:00
jif-oai
f0a85ded18 fix: ctrl c sub agent (#12911) 2026-02-26 17:06:20 +00:00
jif-oai
739d4b52de fix: do not apply turn cwd to metadata (#12887)
Details here:
https://openai.slack.com/archives/C09NZ54M4KY/p1772056758227339
2026-02-26 17:05:58 +00:00
jif-oai
c528f32acb feat: use memory usage for selection (#12909) 2026-02-26 16:44:02 +00:00
pakrym-oai
1503a8dad7 split-debuginfo (#12871)
Attempt to reduce disk usage in mac ci.

>off - This is the default for platforms with ELF binaries and
windows-gnu (not Windows MSVC and not macOS). This typically means that
DWARF debug information can be found in the final artifact in sections
of the executable. This option is not supported on Windows MSVC. On
macOS this options prevents the final execution of dsymutil to generate
debuginfo.
2026-02-26 16:39:24 +00:00
daveaitel-openai
79cbca324a Skip history metadata scan for subagents (#12918)
Summary
- Skip `history_metadata` scanning when spawning subagents to avoid
expensive per-spawn history scans.
- Keeps behavior unchanged for normal sessions.

Testing
  - `cd codex-rs && cargo test -p codex-core` 
- Failing in this environment (pre-existing and I don't think something
I did?):
- `suite::cli_stream::responses_mode_stream_cli` (SIGKILL + OTEL export
error to http://localhost:14318/v1/logs)
- `suite::grep_files::grep_files_tool_collects_matches` (unsupported
call: grep_files)
- `suite::grep_files::grep_files_tool_reports_empty_results`
(unsupported call: grep_files)

Co-authored-by: Codex <noreply@openai.com>
2026-02-26 16:21:26 +00:00
jif-oai
79d6f80e41 chore: clean DB runtime (#12905) 2026-02-26 14:11:10 +00:00
jif-oai
382fa338b3 feat: memories forgetting (#12900)
Add diff based memory forgetting
2026-02-26 13:19:57 +00:00
jif-oai
81ce645733 chore: better awaiter description (#12901) 2026-02-26 12:07:13 +00:00
Wendy Jiao
52aa49db1b Add rollout path to memory files and search for them during read (#12684)
Co-authored-by: jif-oai <jif@openai.com>
2026-02-26 10:57:01 +00:00
pash-openai
6acede5a28 tui: restore visible line numbers for hidden file links (#12870)
we recently changed file linking so the model uses markdown links when
it wants something to be clickable.

This works well across the GUI surfaces because they can render markdown
cleanly and use the full absolute path in the anchor target.

A previous pass hid the absolute path in the TUI (and only showed the
label), but that also meant we could lose useful location info when the
model put the line number or range in the anchor target instead of the
label.

This follow-up keeps the TUI behavior simple while making local file
links feel closer to the old TUI file reference style.

key changes:
- Local markdown file links in the TUI keep the old file-ref feel: code
styling, no underline, no visible absolute path.
- If the hidden local anchor target includes a location suffix and the
label does not already include one, we append that suffix to the visible
label.
- This works for single lines, line/column references, and ranges.
- If the label already includes the location, we leave it alone.
- normal web links keep the old TUI markdown-link behavior

some examples:
- `[foo.rs](/abs/path/foo.rs)` renders as `foo.rs`
- `[foo.rs](/abs/path/foo.rs:45)` renders as `foo.rs:45`
- `[foo.rs](/abs/path/foo.rs:45:3-48:9)` renders as `foo.rs:45:3-48:9`
- `[foo.rs:45](/abs/path/foo.rs:45)` stays `foo.rs:45`
- `[docs](https://example.com/docs)` still renders like a normal web
link

how it looks:
<img width="732" height="813" alt="Screenshot 2026-02-26 at 9 27 55 AM"
src="https://github.com/user-attachments/assets/d51bf236-653a-4e83-96e4-9427f0804471"
/>
2026-02-26 10:29:54 +00:00
jif-oai
14a08d6c14 nit: captial (#12885) 2026-02-26 09:36:13 +00:00
jif-oai
51cf3977d4 chore: new agents name (#12884) 2026-02-26 09:36:09 +00:00
Charley Cunningham
07aefffb1f core: bundle settings diff updates into one dev/user envelope (#12417)
## Summary
- bundle contextual prompt injection into at most one developer message
plus one contextual user message in both:
  - per-turn settings updates
  - initial context insertion
- preserve `<model_switch>` across compaction by rebuilding it through
canonical initial-context injection, instead of relying on
strip/reattach hacks
- centralize contextual user fragment detection in one shared definition
table and reuse it for parsing/compaction logic
- keep `AGENTS.md` in its natural serialized format:
  - `# AGENTS.md instructions for {dirname}`
  - `<INSTRUCTIONS>...</INSTRUCTIONS>`
- simplify related tests/helpers and accept the expected snapshot/layout
updates from bundled multi-part messages

## Why
The goal is to converge toward a simpler, more intentional prompt shape
where contextual updates are consistently represented as one developer
envelope plus one contextual user envelope, while keeping parsing and
compaction behavior aligned with that representation.

## Notable details
- the temporary `SettingsUpdateEnvelope` wrapper was removed; these
paths now return `Vec<ResponseItem>` directly
- local/remote compaction no longer rely on model-switch strip/restore
helpers
- contextual user detection is now driven by shared fragment definitions
instead of ad hoc matcher assembly
- AGENTS/user instructions are still the same logical context; only the
synthetic `<user_instructions>` wrapper was replaced by the natural
AGENTS text format

## Testing
- `just fmt`
- `cargo test -p codex-app-server
codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages
-- --exact`
- `cargo test -p codex-core
compact::tests::collect_user_messages_filters_session_prefix_entries
--lib -- --exact`
- `cargo test -p codex-core --test all
'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_apps_guidance_as_developer_message_when_enabled'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_developer_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::includes_user_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::resume_includes_initial_messages_and_sends_prior_items'
-- --exact`
- `cargo test -p codex-core --test all
'suite::review::review_input_isolated_from_parent_history' -- --exact`
- `cargo test -p codex-exec --test all
'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' --
--exact`
- `cargo test -p core_test_support
context_snapshot::tests::full_text_mode_preserves_unredacted_text --
--exact`

## Notes
- I also ran several targeted `compact`, `compact_remote`,
`prompt_caching`, `model_visible_layout`, and `event_mapping` tests
while iterating on prompt-shape changes.
- I have not claimed a clean full-workspace `cargo test` from this
environment because local sandbox/resource conditions have previously
produced unrelated failures in large workspace runs.
2026-02-26 00:12:08 -08:00
Eric Traut
28bfbb8f2b Enforce user input length cap (#12823)
Currently there is no bound on the length of a user message submitted in
the TUI or through the app server interface. That means users can paste
many megabytes of text, which can lead to bad performance, hangs, and
crashes. In extreme cases, it can lead to a [kernel
panic](https://github.com/openai/codex/issues/12323).

This PR limits the length of a user input to 2**20 (about 1M)
characters. This value was chosen because it fills the entire context
window on the latest models, so accepting longer inputs wouldn't make
sense anyway.

Summary
- add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol
and surface it in TUI and app server code
- block oversized submissions in the TUI submit flow and emit error
history cells when validation fails
- reject heavy app-server requests with JSON-RPC `-32602` and structured
`input_too_large` data, plus document the behavior

Testing
- ran the IDE extension with this change and verified that when I
attempt to paste a user message that's several MB long, it correctly
reports an error instead of crashing or making my computer hot.
2026-02-25 22:23:51 -08:00
pash-openai
9a96b6f509 Hide local file link destinations in TUI markdown (#12705)
## Summary
- hide appended destinations for local path-style markdown links in the
TUI renderer
- keep web links rendering with their visible destination and style link
labels consistently
- add markdown renderer tests and a snapshot for the new file-link
output

## Testing
- just fmt
- cargo test -p codex-tui
<img width="1120" height="968" alt="image"
src="https://github.com/user-attachments/assets/490e8eda-ae47-4231-89fa-b254a1f83eed"
/>
2026-02-26 05:28:37 +00:00
pakrym-oai
cbbf302f5f Fix release build take (#12865) 2026-02-25 20:59:07 -08:00
Curtis 'Fjord' Hawthorne
7326c097e3 Reduce js_repl Node version requirement to 22.22.0 (#12857)
## Summary

Lower the `js_repl` minimum Node version from `24.13.1` to `22.22.0`.

This updates the enforced minimum in `codex-rs/node-version.txt` and the
corresponding user-facing `/experimental` description for the JavaScript
REPL feature.

## Rationale

The previous `24.13.1` floor was stricter than necessary for `js_repl`.
I validated the REPL kernel behavior under Node `22.22.0` still works.

## Why `22.22.0`

`22.22.0` is a current, widely packaged Node 22 release across common
developer environments and distros, including Homebrew `node@22`, Fedora
`nodejs22`, Arch `nodejs-lts-jod`, and Debian testing. That makes it a
better exact floor than guessing at an older `22.x` patch we have not
validated.

`22.x` is also a maintenance branch that will be supported through April
2027, where the previous maintenance branch of `20.x` is only supported
through April of this year.

## Changes

- Update `codex-rs/node-version.txt` from `24.13.1` to `22.22.0`
- Update the `/experimental` JavaScript REPL description to say
`Requires Node >= v22.22.0 installed.`
2026-02-26 04:09:30 +00:00
xl-openai
8cdee988f9 Skip system skills for extra roots (#12744)
When extra roots is set do not load system skills.
2026-02-25 19:55:28 -08:00
pakrym-oai
b65205fb3d Attempt 2 to fix release (#12856) 2026-02-25 19:12:19 -08:00
pakrym-oai
ea621ae152 Try fixing windows pipeline (#12848)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-25 18:18:00 -08:00
xl-openai
2c1f225427 Clarify device auth login hint (#12813)
Mention device auth for remote login
2026-02-25 18:15:27 -08:00
Curtis 'Fjord' Hawthorne
40ab71a985 Disable js_repl when Node is incompatible at startup (#12824)
## Summary
- validate `js_repl` Node compatibility during session startup when the
experiment is enabled
- if Node is missing or too old, disable `js_repl` and
`js_repl_tools_only` for the session before tools and instructions are
built
- surface that startup disablement to users through the existing startup
warning flow instead of only logging it
- reuse the same compatibility check in js_repl kernel startup so
startup gating and runtime behavior stay aligned
- add a regression test that verifies the warning is emitted and that
the first advertised tool list omits `js_repl` and `js_repl_reset` when
Node is incompatible

## Why
Today `js_repl` can be advertised based only on the feature flag, then
fail later when the kernel starts. That makes the available tool list
inaccurate at the start of a conversation, and users do not get a clear
explanation for why the tool is unavailable.

This change makes tool availability reflect real startup checks, keeps
the advertised tool set stable for the lifetime of the session, and
gives users a visible warning when `js_repl` is disabled.

## Testing
- `just fmt`
- `cargo test -p codex-core --test all
js_repl_is_not_advertised_when_startup_node_is_incompatible`
2026-02-26 01:14:51 +00:00
Michael Bolin
14116ade8d feat: include available decisions in command approval requests (#12758)
Command-approval clients currently infer which choices to show from
side-channel fields like `networkApprovalContext`,
`proposedExecpolicyAmendment`, and `additionalPermissions`. That makes
the request shape harder to evolve, and it forces each client to
replicate the server's heuristics instead of receiving the exact
decision list for the prompt.

This PR introduces a mapping between `CommandExecutionApprovalDecision`
and `codex_protocol::protocol::ReviewDecision`:

```rust
impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
    fn from(value: CoreReviewDecision) -> Self {
        match value {
            CoreReviewDecision::Approved => Self::Accept,
            CoreReviewDecision::ApprovedExecpolicyAmendment {
                proposed_execpolicy_amendment,
            } => Self::AcceptWithExecpolicyAmendment {
                execpolicy_amendment: proposed_execpolicy_amendment.into(),
            },
            CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
            CoreReviewDecision::NetworkPolicyAmendment {
                network_policy_amendment,
            } => Self::ApplyNetworkPolicyAmendment {
                network_policy_amendment: network_policy_amendment.into(),
            },
            CoreReviewDecision::Abort => Self::Cancel,
            CoreReviewDecision::Denied => Self::Decline,
        }
    }
}
```

And updates `CommandExecutionRequestApprovalParams` to have a new field:

```rust
available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
```

when, if specified, should make it easier for clients to display an
appropriate list of options in the UI.

This makes it possible for `CoreShellActionProvider::prompt()` in
`unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly,
adding support for `ApprovedForSession` when approving a skill script,
which was previously missing in the TUI.

Note this results in a significant change to `exec_options()` in
`approval_overlay.rs`, as the displayed options are now derived from
`available_decisions: &[ReviewDecision]`.

## What Changed

- Add `available_decisions` to
[`ExecApprovalRequestEvent`](de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)),
including helpers to derive the legacy default choices when older
senders omit the field.
- Map `codex_protocol::protocol::ReviewDecision` to app-server
`CommandExecutionApprovalDecision` and expose the ordered list as
experimental `availableDecisions` in
[`CommandExecutionRequestApprovalParams`](de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)).
- Thread optional `available_decisions` through the core approval path
so Unix shell escalation can explicitly request `ApprovedForSession` for
session-scoped approvals instead of relying on client heuristics.
[`unix_escalation.rs`](de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214))
- Update the TUI approval overlay to build its buttons from the ordered
decision list, while preserving the legacy fallback when
`available_decisions` is missing.
- Update the app-server README, test client output, and generated schema
artifacts to document and surface the new field.

## Testing

- Add `approval_overlay.rs` coverage for explicit decision lists,
including the generic `ApprovedForSession` path and network approval
options.
- Update `chatwidget/tests.rs` and app-server protocol tests to populate
the new optional field and keep older event shapes working.

## Developers Docs

- If we document `item/commandExecution/requestApproval` on
[developers.openai.com/codex](https://developers.openai.com/codex), add
experimental `availableDecisions` as the preferred source of approval
choices and note that older servers may omit it.
2026-02-26 01:10:46 +00:00
Celia Chen
4f45668106 Revert "Add skill approval event/response (#12633)" (#12811)
This reverts commit https://github.com/openai/codex/pull/12633. We no
longer need this PR, because we favor sending normal exec command
approval server request with `additional_permissions` of skill
permissions instead
2026-02-26 01:02:42 +00:00
pakrym-oai
4fedef88e0 Use websocket v2 as model-preferred websocket protocol (#12838) 2026-02-25 16:35:53 -08:00
EFRAZER-oai
a1cd78c818 Add macOS and Linux direct install script (#12740)
## Summary
- add a direct install script for macOS and Linux at
`scripts/install/install.sh`
- stage `install.sh` into `dist/` during release so it is published as a
GitHub release asset
- reuse the existing platform npm payload so the installer includes both
`codex` and `rg`

## Testing
- `bash -n scripts/install/install.sh`
- local macOS `curl | sh` smoke test against a locally served copy of
the script
2026-02-26 00:33:50 +00:00
Ahmed Ibrahim
e76b1a2853 Remove steer feature flag (#12026)
All code should go in the direction that steer is enabled

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-25 15:41:42 -08:00
Michael Bolin
a6a5976c5a feat: scope execve session approvals by approved skill metadata (#12814)
Previous to this change, `determine_action()` would

1. check if `program` is associated with a skill
2. if so, check if `program` is in `execve_session_approvals` to see
whether the user needs to be prompted

This PR flips the order of these checks to try to set us up so that
"session approvals" are always consulted first (which should soon extend
to include session approvals derived from `prefix_rule()`s, as well).

Though to make the new ordering work, we need to record any relevant
metadata to associate with the approval, which in the case of a
skill-based approval is the `SkillMetadata` so that we can derive the
`PermissionProfile` to include with the escalation. (Though as noted by
the `TODO`, this `PermissionProfile` is not honored yet.)

The new `ExecveSessionApproval` struct is used to retain the necessary
metadata.

## What Changed

- Replace the `execve_session_approvals` `HashSet` with a map that
stores an `ExecveSessionApproval` alongside each approved `program`.
- When a user chooses `ApprovedForSession` for a skill script, capture
the matched `SkillMetadata` in the session approval entry.
- Consult that cache before re-running `find_skill()`, and reuse the
originally approved skill metadata and permission profile when allowing
later execve callbacks in the same session.
2026-02-25 15:30:24 -08:00
Charley Cunningham
2f4d6ded1d Enable request_user_input in Default mode (#12735)
## Summary
- allow `request_user_input` in Default collaboration mode as well as
Plan
- update the Default-mode instructions to prefer assumptions first and
use `request_user_input` only when a question is unavoidable
- update request_user_input and app-server tests to match the new
Default-mode behavior
- refactor collaboration-mode availability plumbing into
`CollaborationModesConfig` for future mode-related flags

## Codex author
`codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`
2026-02-25 15:20:46 -08:00
Ahmed Ibrahim
2bd87d1a75 only use preambles for realtime (#12831)
Reverts openai/codex#12830
2026-02-25 14:54:54 -08:00
Celia Chen
b6d20748e0 Revert "Ensure shell command skills trigger approval (#12697)" (#12721)
This reverts commit daf0f03ac8.

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-25 22:49:53 +00:00
Ahmed Ibrahim
f86087eaa8 Revert "only use preambles for realtime" (#12830)
Reverts openai/codex#12806
2026-02-25 14:30:48 -08:00
Ahmed Ibrahim
c1851be1ed only use preambles for realtime (#12806)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-25 13:41:54 -08:00
Owen Lin
21f7032dbb feat(app-server): thread/unsubscribe API (#10954)
Adds a new v2 app-server API for a client to be able to unsubscribe to a
thread:
- New RPC method: `thread/unsubscribe`
- New server notification: `thread/closed`

Today clients can start/resume/archive threads, but there wasn’t a way
to explicitly unload a live thread from memory without archiving it.
With `thread/unsubscribe`, a client can indicate it is no longer
actively working with a live Thread. If this is the only client
subscribed to that given thread, the thread will be automatically closed
by app-server, at which point the server will send `thread/closed` and
`thread/status/changed` with `status: notLoaded` notifications.

This gives clients a way to prevent long-running app-server processes
from accumulating too many thread (and related) objects in memory.

Closed threads will also be removed from `thread/loaded/list`.
2026-02-25 13:14:30 -08:00
sayan-oai
d45ffd5830 make 5.3-codex visible in cli for api users (#12808)
5.3-codex released in api, mark it visible for API users via bundled
`models.json`.
2026-02-25 13:01:40 -08:00
Michael Bolin
be5bca6f8d fix: harden zsh fork tests and keep subcommand approvals deterministic (#12809)
## Why
The prior
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
assertion was brittle under Bazel: command approval payloads in the test
could include environment-dependent wrapper/command formatting
differences, which makes exact command-string matching flaky even when
behavior is correct.

(This regression was knowingly introduced in
https://github.com/openai/codex/pull/12800, but it was urgent to land
that PR.)

## What changed
- Hardened
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
in
[`turn_start_zsh_fork.rs`](https://github.com/openai/codex/blob/main/codex-rs/app-server/tests/suite/v2/turn_start_zsh_fork.rs):
- Replaced strict `approval_command.starts_with("/bin/rm")` checks with
intent-based subcommand matching.
- Subcommand approvals are now recognized by file-target semantics
(`first.txt` or `second.txt`) plus `rm` intent.
- Parent approval recognition is now more tolerant of command-format
differences while still requiring a definitive parent command context.
- Uses a defensive loop that waits for all target subcommand decisions
and the parent approval request.
- Preserved the existing regression and unit test fixes from earlier
commits in `unix_escalation.rs` and `skill_approval.rs`.

## Verification
- Ran the zsh fork subcommand decline regression under this change:
-
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
- Confirmed the test is now robust against approval-command-string
variation instead of hardcoding one expected command shape.
2026-02-25 12:23:30 -08:00
Eric Traut
f6fdfbeb98 Update Codex docs success link (#12805)
Fix a stale documentation link in the sign-in flow
2026-02-25 12:02:41 -08:00
Ahmed Ibrahim
3f30746237 Add simple realtime text logs (#12807)
Update realtime debug logs to include the actual text payloads in both
input and output paths.

- In `core/src/realtime_conversation.rs`:
- `handle_start`: add extracted assistant text output to the
`[realtime-text]` debug log.
- `handle_text`: add incoming text input (`params.text`) to the
`[realtime-text]` debug log.

No tests were run (per request).
2026-02-25 12:01:48 -08:00
Owen Lin
a0fd94bde6 feat(app-server): add ThreadItem::DynamicToolCall (#12732)
Previously, clients would call `thread/start` with dynamic_tools set,
and when a model invokes a dynamic tool, it would just make the
server->client `item/tool/call` request and wait for the client's
response to complete the tool call. This works, but it doesn't have an
`item/started` or `item/completed` event.

Now we are doing this:
- [new] emit `item/started` with `DynamicToolCall` populated with the
call arguments
- send an `item/tool/call` server request
- [new] once the client responds, emit `item/completed` with
`DynamicToolCall` populated with the response.

Also, with `persistExtendedHistory: true`, dynamic tool calls are now
reconstructable in `thread/read` and `thread/resume` as
`ThreadItem::DynamicToolCall`.
2026-02-25 12:00:10 -08:00
Rasmus Rygaard
73eaebbd1c Propagate session ID when compacting (#12802)
We propagate the session ID when sending requests for inference but we
don't do the same for compaction requests. This makes it hard to link
compaction requests to their session for debugging purposes
2026-02-25 19:17:38 +00:00
Michael Bolin
648a420cbf fix: enforce sandbox envelope for zsh fork execution (#12800)
## Why
Zsh fork execution was still able to bypass the `WorkspaceWrite` model
in edge cases because the fork path reconstructed command execution
without preserving sandbox wrappers, and command extraction only
accepted shell invocations in a narrow positional shape. This can allow
commands to run with broader filesystem access than expected, which
breaks the sandbox safety model.

## What changed
- Preserved the sandboxed `ExecRequest` produced by
`attempt.env_for(...)` when entering the zsh fork path in
[`unix_escalation.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs).
- Updated `CoreShellCommandExecutor` to execute the sandboxed command
and working directory captured from `attempt.env_for(...)`, instead of
re-running a freshly reconstructed shell command.
- Made zsh-fork script extraction robust to wrapped invocations by
scanning command arguments for `-c`/`-lc` rather than only matching the
first positional form.
- Added unit tests in `unix_escalation.rs` to lock in wrapper-tolerant
parsing behavior and keep unsupported shell forms rejected.
- Tightened the regression in
[`skill_approval.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/tests/suite/skill_approval.rs):
- `shell_zsh_fork_still_enforces_workspace_write_sandbox` now uses an
explicit `WorkspaceWrite` policy with `exclude_tmpdir_env_var: true` and
`exclude_slash_tmp: true`.
- The test attempts to write to `/tmp/...`, which is only reliably
outside writable roots with those explicit exclusions set.

## Verification
- Added and passed the new unit tests around `extract_shell_script`
parsing behavior with wrapped command shapes.
  - `extract_shell_script_supports_wrapped_command_prefixes`
  - `extract_shell_script_rejects_unsupported_shell_invocation`
- Verified the regression with the focused integration test:
`shell_zsh_fork_still_enforces_workspace_write_sandbox`.

## Manual Testing

Prior to this change, if I ran Codex via:

```
just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork
```

and asked:

```
what is the output of /bin/ps
```

it would run it, even though the default sandbox should prevent the
agent from running `/bin/ps` because it is setuid on MacOS.

But with this change, I now see the expected failure because it is
blocked by the sandbox:

```
/bin/ps exited with status 1 and produced no output in this environment.
```
2026-02-25 11:05:27 -08:00
pakrym-oai
9d7013eab0 Handle websocket timeout (#12791)
Sometimes websockets will timeout with 400 error, ensure we retry it.
2026-02-25 10:31:37 -08:00
jif-oai
7b39e76a66 Revert "fix(bazel): replace askama templates with include_str! in memories" (#12795)
Reverts openai/codex#11778
2026-02-25 18:06:17 +00:00
Ahmed Ibrahim
947092283a Add app-server v2 thread realtime API (#12715)
Add experimental `thread/realtime/*` v2 requests and notifications, then
route app-server realtime events through that thread-scoped surface with
integration coverage.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-25 09:59:10 -08:00
Curtis 'Fjord' Hawthorne
0543d0a022 Promote js_repl to experimental with Node requirement (#12712)
## Summary

- Promote `js_repl` to an experimental feature that users can enable
from `/experimental`.
- Add `js_repl` experimental metadata, including the Node prerequisite
and activation guidance.
- Add regression coverage for the feature metadata and the
`/experimental` popup.

## What Changed

- Changed `Feature::JsRepl` from `Stage::UnderDevelopment` to
`Stage::Experimental`.
- Added experimental metadata for `js_repl` in `core/src/features.rs`:
  - name: `JavaScript REPL`
- description: calls out interactive website debugging, inline
JavaScript execution, and the required Node version (`>= v24.13.1`)
- announcement: tells users to enable it, then start a new chat or
restart Codex
- Added a core unit test that verifies:
  - `js_repl` is experimental
  - `js_repl` is disabled by default
- the hardcoded Node version in the description matches
`node-version.txt`
- Added a TUI test that opens the `/experimental` popup and verifies the
rendered `js_repl` entry includes the Node requirement text.

## Testing

- `just fmt`
- `cargo test -p codex-tui`
- `cargo test -p codex-core` (unit-test phase passed; stopped during the
long `tests/all.rs` integration suite)
2026-02-25 09:44:52 -08:00
mcgrew-oai
9a393c9b6f feat(network-proxy): add embedded OTEL policy audit logging (#12046)
**PR Summary**

This PR adds embedded-only OTEL policy audit logging for
`codex-network-proxy` and threads audit metadata from `codex-core` into
managed proxy startup.

### What changed
- Added structured audit event emission in `network_policy.rs` with
target `codex_otel.network_proxy`.
- Emitted:
- `codex.network_proxy.domain_policy_decision` once per domain-policy
evaluation.
  - `codex.network_proxy.block_decision` for non-domain denies.
- Added required policy/network fields, RFC3339 UTC millisecond
`event.timestamp`, and fallback defaults (`http.request.method="none"`,
`client.address="unknown"`).
- Added non-domain deny audit emission in HTTP/SOCKS handlers for
mode-guard and proxy-state denies, including unix-socket deny paths.
- Added `REASON_UNIX_SOCKET_UNSUPPORTED` and used it for unsupported
unix-socket auditing.
- Added `NetworkProxyAuditMetadata` to runtime/state, re-exported from
`lib.rs` and `state.rs`.
- Added `start_proxy_with_audit_metadata(...)` in core config, with
`start_proxy()` delegating to default metadata.
- Wired metadata construction in `codex.rs` from session/auth context,
including originator sanitization for OTEL-safe tagging.
- Updated `network-proxy/README.md` with embedded-mode audit schema and
behavior notes.
- Refactored HTTP block-audit emission to a small local helper to reduce
duplication.
- Preserved existing unix-socket proxy-disabled host/path behavior for
responses and blocked history while using an audit-only endpoint
override (`server.address="unix-socket"`, `server.port=0`).

### Explicit exclusions
- No standalone proxy OTEL startup work.
- No `main.rs` binary wiring.
- No `standalone_otel.rs`.
- No standalone docs/tests.

### Tests
- Extended `network_policy.rs` tests for event mapping, metadata
propagation, fallbacks, timestamp format, and target prefix.
- Extended HTTP tests to assert unix-socket deny block audit events.
- Extended SOCKS tests to cover deny emission from handler deny
branches.
- Added/updated core tests to verify audit metadata threading into
managed proxy state.

### Validation run
- `just fmt`
- `cargo test -p codex-network-proxy` 
- `cargo test -p codex-core` ran with one unrelated flaky timeout
(`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`), and
the test passed when rerun directly 

---------

Co-authored-by: viyatb-oai <viyatb@openai.com>
2026-02-25 11:46:37 -05:00
jif-oai
8362b79cb4 feat: fix sqlite home (#12787) 2026-02-25 15:52:55 +00:00
jif-oai
01f25a7b96 chore: unify max depth parameter (#12770)
Users were confused
2026-02-25 15:20:24 +00:00
mcgrew-oai
bccce0d75f otel: add host.name resource attribute to logs/traces via gethostname (#12352)
**PR Summary**

This PR adds the OpenTelemetry `host.name` resource attribute to Codex
OTEL exports so every OTEL log (and trace, via the shared resource)
carries the machine hostname.

**What changed**

- Added `host.name` to the shared OTEL `Resource` in
`/Users/michael.mcgrew/code/codex/codex-rs/otel/src/otel_provider.rs`
  - This applies to both:
    - OTEL logs (`SdkLoggerProvider`)
    - OTEL traces (`SdkTracerProvider`)
- Hostname is now resolved via `gethostname::gethostname()`
(best-effort)
  - Value is trimmed
  - Empty values are omitted (non-fatal)
- Added focused unit tests for:
  - including `host.name` when present
  - omitting `host.name` when missing/empty

**Why**

- `host.name` is host/process metadata and belongs on the OTEL
`resource`, not per-event attributes.
- Attaching it in the shared resource is the smallest change that
guarantees coverage across all exported OTEL logs/traces.

**Scope / Non-goals**

- No public API changes
- No changes to metrics behavior (this PR only updates log/trace
resource metadata)

**Dependency updates**

- Added `gethostname` as a workspace dependency and `codex-otel`
dependency
- `Cargo.lock` updated accordingly
- `MODULE.bazel.lock` unchanged after refresh/check

**Validation**

- `just fmt`
- `cargo test -p codex-otel`
- `just bazel-lock-update`
- `just bazel-lock-check`
2026-02-25 09:54:45 -05:00