Commit Graph

884 Commits

Author SHA1 Message Date
jif-oai
42ae738f67 feat: model warning in case of apply patch (#7494) 2025-12-03 09:07:31 +00:00
Robby He
f3989f6092 fix(unified_exec): use platform default shell when unified_exec shell… (#7486)
# Unified Exec Shell Selection on Windows

## Problem

Reference issue: #7466

The `unified_exec` handler currently deserializes model-provided tool
calls into the `ExecCommandArgs` struct:

```rust
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
    cmd: String,
    #[serde(default)]
    workdir: Option<String>,
    #[serde(default = "default_shell")]
    shell: String,
    #[serde(default = "default_login")]
    login: bool,
    #[serde(default = "default_exec_yield_time_ms")]
    yield_time_ms: u64,
    #[serde(default)]
    max_output_tokens: Option<usize>,
    #[serde(default)]
    with_escalated_permissions: Option<bool>,
    #[serde(default)]
    justification: Option<String>,
}
```

The `shell` field uses a hard-coded default:

```rust
fn default_shell() -> String {
    "/bin/bash".to_string()
}
```

When the model returns a tool call JSON that only contains `cmd` (which
is the common case), Serde fills in `shell` with this default value.
Later, `get_command` uses that value as if it were a model-provided
shell path:

```rust
fn get_command(args: &ExecCommandArgs) -> Vec<String> {
    let shell = get_shell_by_model_provided_path(&PathBuf::from(args.shell.clone()));
    shell.derive_exec_args(&args.cmd, args.login)
}
```

On Unix, this usually resolves to `/bin/bash` and works as expected.
However, on Windows this behavior is problematic:

- The hard-coded `"/bin/bash"` is not a valid Windows path.
- `get_shell_by_model_provided_path` treats this as a model-specified
shell, and tries to resolve it (e.g. via `which::which("bash")`), which
may or may not exist and may not behave as intended.
- In practice, this leads to commands being executed under a non-default
or non-existent shell on Windows (for example, WSL bash), instead of the
expected Windows PowerShell or `cmd.exe`.

The core of the issue is that **"model did not specify `shell`" is
currently interpreted as "the model explicitly requested `/bin/bash`"**,
which is both Unix-specific and wrong on Windows.

## Proposed Solution

Instead of hard-coding `"/bin/bash"` into `ExecCommandArgs`, we should
distinguish between:

1. **The model explicitly specifying a shell**, e.g.:

   ```json
   {
     "cmd": "echo hello",
     "shell": "pwsh"
   }
   ```

   In this case, we *do* want to respect the model’s choice and use
   `get_shell_by_model_provided_path`.

2. **The model omitting the `shell` field entirely**, e.g.:

   ```json
   {
     "cmd": "echo hello"
   }
   ```

   In this case, we should *not* assume `/bin/bash`. Instead, we should use
   `default_user_shell()` and let the platform decide.

To express this distinction, we can:

1. Change `shell` to be optional in `ExecCommandArgs`:

   ```rust
   #[derive(Debug, Deserialize)]
   struct ExecCommandArgs {
       cmd: String,
       #[serde(default)]
       workdir: Option<String>,
       #[serde(default)]
       shell: Option<String>,
       #[serde(default = "default_login")]
       login: bool,
       #[serde(default = "default_exec_yield_time_ms")]
       yield_time_ms: u64,
       #[serde(default)]
       max_output_tokens: Option<usize>,
       #[serde(default)]
       with_escalated_permissions: Option<bool>,
       #[serde(default)]
       justification: Option<String>,
   }
   ```

   Here, the absence of `shell` in the JSON is represented as `shell:
   None`, rather than a hard-coded string value.
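
With `shell` optional, the call site can branch on presence rather than on a string value. The following is a minimal sketch, not the actual change: `ShellChoice` and `resolve_shell` are hypothetical names, and the two arms stand in for calls to `get_shell_by_model_provided_path` and `default_user_shell()`, whose exact signatures are assumed rather than shown.

```rust
/// Hypothetical stand-in for the resolved shell; not the real Codex type.
#[derive(Debug, PartialEq)]
enum ShellChoice {
    /// The model named a shell explicitly, so its path is honored.
    ModelProvided(String),
    /// The model omitted `shell`, so the platform default applies.
    PlatformDefault,
}

/// Decide which resolution path to take based on whether `shell` was present.
fn resolve_shell(shell: Option<&str>) -> ShellChoice {
    match shell {
        // The real handler would call `get_shell_by_model_provided_path` here.
        Some(path) => ShellChoice::ModelProvided(path.to_string()),
        // The real handler would call `default_user_shell()` here.
        None => ShellChoice::PlatformDefault,
    }
}
```

The point of the sketch is that `None` never turns into `/bin/bash`; only an explicit value reaches the model-provided path.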
2025-12-02 21:49:25 -08:00
Michael Bolin
06e7667d0e fix: inline function marked as dead code (#7508)
I was debugging something else and noticed we could eliminate an
instance of `#[allow(dead_code)]` pretty easily.
2025-12-03 00:50:34 +00:00
Ahmed Ibrahim
1ef1fe67ec improve resume performance (#7303)
Reading the tail can be costly if we have a very big rollout item. We
can just read the file metadata instead.
2025-12-02 16:39:40 -08:00
Michael Bolin
ec93b6daf3 chore: make create_approval_requirement_for_command an async fn (#7501)
I think this might help with https://github.com/openai/codex/pull/7033
because `create_approval_requirement_for_command()` will soon need
access to `Session.state`, which is a `tokio::sync::Mutex` that needs to
be accessed via `async`.
2025-12-02 15:01:15 -08:00
liam
4d4778ec1c Trim history.jsonl when history.max_bytes is set (#6242)
This PR honors the `history.max_bytes` configuration parameter by
trimming `history.jsonl` whenever it grows past the configured limit.
While appending new entries we retain the newest record, drop the oldest
lines to stay within the byte budget, and serialize the compacted file
back to disk under the same lock to keep writers safe.
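
A minimal sketch of the trimming rule described above, assuming newline-terminated records held in memory. `trim_history` is a hypothetical helper; file locking and the write back to disk are left out.

```rust
/// Trim newline-terminated records so their total size fits in `max_bytes`.
/// Oldest lines are dropped first. The newest record is always kept, even
/// if it alone exceeds the budget.
fn trim_history(contents: &str, max_bytes: usize) -> String {
    let lines: Vec<&str> = contents.lines().collect();
    if lines.is_empty() {
        return String::new();
    }

    // Each record costs its length plus one byte for the trailing newline.
    let mut total: usize = lines.iter().map(|l| l.len() + 1).sum();
    let mut start = 0;

    // Drop from the front, but never drop the last (newest) line.
    while total > max_bytes && start + 1 < lines.len() {
        total -= lines[start].len() + 1;
        start += 1;
    }

    let mut out = String::with_capacity(total);
    for line in &lines[start..] {
        out.push_str(line);
        out.push('\n');
    }
    out
}
```

For example, `trim_history("a\nbb\nccc\n", 7)` keeps `"bb\nccc\n"`, while a budget of 1 byte still keeps `"ccc\n"`.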
2025-12-02 14:01:05 -08:00
zhao-oai
5ebdc9af1b persisting credits if new snapshot does not contain credit info (#7490)
In response to incoming changes to responses headers, where the header
may sometimes not contain credits info (a credit check is no longer
forced).
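
A minimal sketch of the merge rule, assuming a simplified snapshot shape. `Snapshot`, its fields, and `merge_snapshot` are hypothetical and only illustrate carrying the previous credits forward when the new snapshot has none.

```rust
/// Hypothetical, simplified snapshot; the real type has more fields.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    used_percent: f64,
    /// `None` when the response headers carried no credits info.
    credits: Option<u64>,
}

/// Take the incoming snapshot, but keep the previously known credits
/// if the incoming one does not contain any.
fn merge_snapshot(previous: Option<&Snapshot>, mut incoming: Snapshot) -> Snapshot {
    if incoming.credits.is_none() {
        incoming.credits = previous.and_then(|p| p.credits);
    }
    incoming
}
```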
2025-12-02 16:23:24 -05:00
Michael Bolin
f6a7da4ac3 fix: drop lock once it is no longer needed (#7500)
I noticed this while doing a post-commit review of https://github.com/openai/codex/pull/7467.
2025-12-02 20:46:26 +00:00
Ahmed Ibrahim
127e307f89 Show token used when context window is unknown (#7497)
- Show context window usage in tokens instead of percentage when the
window length is unknown.
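
A minimal sketch of the display rule, assuming the window length is modeled as an `Option`. `format_context_usage` and the exact wording of the strings are hypothetical.

```rust
/// Render context usage as a percentage when the window length is known,
/// and as a raw token count when it is not.
fn format_context_usage(tokens_used: u64, context_window: Option<u64>) -> String {
    match context_window {
        Some(window) if window > 0 => {
            // Clamp so the percentage never exceeds 100.
            let percent = tokens_used.min(window) * 100 / window;
            format!("{}% of context used", percent)
        }
        // Unknown (or zero) window: fall back to the token count.
        _ => format!("{} tokens used", tokens_used),
    }
}
```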
2025-12-02 11:45:50 -08:00
Ahmed Ibrahim
21ad1c1c90 Use non-blocking mutex (#7467) 2025-12-02 10:50:46 -08:00
jif-oai
72b95db12f feat: intercept apply_patch for unified_exec (#7446) 2025-12-02 17:54:02 +00:00
jif-oai
9ee855ec57 feat: add warning message for the model (#7445)
Add a warning message as a user turn to the model if the model does not
behave as expected (here, for example, if the model opens too many
`unified_exec` sessions)
2025-12-02 11:56:00 +00:00
jif-oai
4b78e2ab09 chore: review everywhere (#7444) 2025-12-02 11:26:27 +00:00
Thibault Sottiaux
a8d5ad37b8 feat: experimental support for skills.md (#7412)
This change prototypes support for Skills with the CLI. This is an
**experimental** feature for internal testing.

---------

Co-authored-by: Gav Verma <gverma@openai.com>
2025-12-01 20:22:35 -08:00
Steve Mostovoy
f443555728 fix(core): enable history lookup on windows (#7457)
- Add portable history log id helper to support inode-like tracking on
Unix and creation time on Windows
- Refactor history metadata and lookup to share code paths and allow
nonzero log ids across platforms
- Add coverage for lookup stability after appends
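
A minimal sketch of what such a portable helper could look like, using only the standard library. `history_log_id` is a hypothetical name, and the real helper may differ in signature and fallback behavior.

```rust
use std::fs::Metadata;

/// On Unix, identify the log file by its inode number.
#[cfg(unix)]
fn history_log_id(metadata: &Metadata) -> Option<u64> {
    use std::os::unix::fs::MetadataExt;
    Some(metadata.ino())
}

/// On Windows, identify the log file by its creation time.
#[cfg(windows)]
fn history_log_id(metadata: &Metadata) -> Option<u64> {
    use std::os::windows::fs::MetadataExt;
    Some(metadata.creation_time())
}

/// Other platforms: no stable identifier in this sketch.
#[cfg(not(any(unix, windows)))]
fn history_log_id(_metadata: &Metadata) -> Option<u64> {
    None
}
```

Both identifiers stay the same when the file is appended to, which is the property the lookup relies on.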
2025-12-01 16:29:01 -08:00
Dylan Hurd
5b25915d7e fix(apply_patch) tests for shell_command (#7307)
## Summary
Adds test coverage for invocations of apply_patch via shell_command with
heredoc, to validate behavior.

## Testing
- [x] These are tests
2025-12-01 15:09:22 -08:00
Ali Towaiji
0cc3b50228 Fix recent_commits(limit=0) returning 1 commit instead of 0 (#7334)
Fixes #7333

This is a small bug fix.

This PR fixes an inconsistency in `recent_commits` where `limit == 0`
still returns 1 commit due to the use of `limit.max(1)` when
constructing the `git log -n` argument.

Expected behavior: requesting 0 commits should return an empty list.

This PR:
- returns an empty `Vec` when `limit == 0`
- adds a test for `recent_commits(limit == 0)` that fails before the
change and passes afterwards
- maintains existing behavior for `limit > 0`

This aligns behavior with API expectations and avoids downstream
consumers misinterpreting the repository as having commit history when
`limit == 0` is used to explicitly request none.

Happy to adjust if the current behavior is intentional.
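
For illustration, a minimal sketch of the guard, reduced to building the `git log` arguments. `recent_commits_args` is a hypothetical helper and not the actual `recent_commits` signature.

```rust
/// Build the `git log` arguments for the `limit` most recent commits.
/// Returns `None` when `limit == 0`, so the caller can return an empty
/// `Vec` without invoking git at all.
fn recent_commits_args(limit: usize) -> Option<Vec<String>> {
    if limit == 0 {
        return None;
    }
    // Previously the count was `limit.max(1)`, which turned 0 into 1.
    Some(vec!["log".to_string(), "-n".to_string(), limit.to_string()])
}
```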
2025-12-01 10:14:36 -08:00
jif-oai
a421eba31f fix: disable review rollout filtering (#7371) 2025-12-01 09:04:13 +00:00
jif-oai
457c9fdb87 chore: better session recycling (#7368) 2025-11-30 12:42:26 -08:00
jif-oai
aaec8abf58 feat: detached review (#7292) 2025-11-28 11:34:57 +00:00
Job Chong
cbd7d0d543 chore: improve rollout session init errors (#7336)
Title: Improve rollout session initialization error messages

Issue: https://github.com/openai/codex/issues/7283

What: add targeted mapping for rollout/session initialization errors so
users get actionable messages when Codex cannot access session files.

Why: session creation previously returned a generic internal error,
hiding permissions/FS issues and making support harder.

How:
- Added rollout::error::map_session_init_error to translate the more
common io::Error kinds into user-facing hints (permission, missing dir,
file blocking, corruption). Others are passed through directly with
`CodexErr::Fatal`.
- Reused the mapper in Codex session creation to preserve root causes
instead of returning InternalAgentDied.
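
A minimal sketch of the mapping idea, assuming one `io::ErrorKind` per hint category. `session_init_hint`, the choice of kinds, and the message text are hypothetical; the real `map_session_init_error` returns Codex error types rather than strings.

```rust
use std::io;

/// Translate common I/O error kinds into a user-facing hint.
/// Returns `None` for kinds that should be passed through unchanged.
fn session_init_hint(err: &io::Error) -> Option<&'static str> {
    match err.kind() {
        io::ErrorKind::PermissionDenied => {
            Some("permission denied: check ownership of the sessions directory")
        }
        io::ErrorKind::NotFound => Some("the sessions directory is missing"),
        io::ErrorKind::AlreadyExists => {
            Some("a file is blocking the path where the session should be created")
        }
        io::ErrorKind::InvalidData => Some("the session file appears to be corrupted"),
        _ => None,
    }
}
```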
2025-11-27 00:20:33 -08:00
Eric Traut
e953092949 Fixed regression in experimental "sandbox command assessment" feature (#7308)
Recent model updates caused the experimental "sandbox command assessment"
to time out most of the time, leaving the user without any risk
assessment or tool summary. This change explicitly sets the reasoning
effort to medium and bumps the timeout.

This change has no effect if the user hasn't enabled the
`experimental_sandbox_command_assessment` feature flag.
2025-11-25 16:15:13 -08:00
jif-oai
28ff364c3a feat: update process ID for event handling (#7261) 2025-11-25 14:21:05 -08:00
jif-oai
4502b1b263 chore: proper client extraction (#6996) 2025-11-25 18:06:12 +00:00
jif-oai
2845e2c006 fix: drop conversation when /new (#7297) 2025-11-25 17:20:25 +00:00
jif-oai
9ba27cfa0a feat: add compaction event (#7289) 2025-11-25 16:12:14 +00:00
jif-oai
37d83e075e feat: add custom env for unified exec process (#7286) 2025-11-25 10:35:35 +00:00
jif-oai
523b40a129 feat[app-serve]: config management (#7241) 2025-11-25 09:29:38 +00:00
Clifford Ressel
3308dc5e48 fix: Correct the stream error message (#7266)
Fixes a copy-paste bug in the error handling in `try_run_turn`.

I have read the CLA Document and I hereby sign the CLA
2025-11-24 20:16:29 -08:00
jif-oai
fc2ff624ac fix: don't store early exit sessions (#7263) 2025-11-24 21:14:24 +00:00
Josh McKinney
ec49b56874 chore: add cargo-deny configuration (#7119)
- add GitHub workflow running cargo-deny on push/PR
- document cargo-deny allowlist with workspace-dep notes and advisory
ignores
- align workspace crates to inherit version/edition/license for
consistent checks
2025-11-24 12:22:18 -08:00
Gabriel Peal
3741f387e9 Allow enterprises to skip upgrade checks and messages (#7213)
This is a feature primarily for enterprises who centrally manage Codex
updates.
2025-11-24 15:04:49 -05:00
Dylan Hurd
1e832b1438 fix(windows) support apply_patch parsing in powershell (#7221)
## Summary
Support PowerShell parsing of apply_patch.

## Testing
- [x] Enable apply_patch unit tests

---------

Co-authored-by: jif-oai <jif@openai.com>
2025-11-24 19:32:47 +00:00
Matthew Zeng
c31663d745 [feedback] Add source info into feedback metadata. (#7140)
Verified the source info is correctly attached based on whether it's cli
or vscode.
2025-11-24 19:05:37 +00:00
jif-oai
35d89e820f fix: flaky test (#7257) 2025-11-24 18:45:41 +00:00
jif-oai
b2cddec3d7 feat: unified exec basic pruning strategy (#7239)
LRU + exited sessions first
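
A minimal sketch of that ordering, assuming each session tracks an exited flag and a last-used counter. `SessionInfo` and `pick_session_to_prune` are hypothetical names.

```rust
/// Hypothetical view of a unified exec session for pruning purposes.
#[derive(Debug, Clone, PartialEq)]
struct SessionInfo {
    id: u32,
    exited: bool,
    /// Monotonic counter or timestamp; smaller means used longer ago.
    last_used: u64,
}

/// Pick the session to evict: exited sessions first, then the least
/// recently used one within the same group.
fn pick_session_to_prune(sessions: &[SessionInfo]) -> Option<u32> {
    sessions
        .iter()
        // `!exited` is `false` for exited sessions, and `false < true`,
        // so exited sessions sort ahead of running ones.
        .min_by_key(|s| (!s.exited, s.last_used))
        .map(|s| s.id)
}
```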
2025-11-24 17:22:32 +00:00
jif-oai
920239f272 fix: codex delegate cancellation (#7092) 2025-11-24 16:59:09 +00:00
jif-oai
99bcb90353 chore: use proxy for encrypted summary (#7252) 2025-11-24 16:51:47 +00:00
Ahmed Ibrahim
b519267d05 Account for encrypted reasoning for auto compaction (#7113)
- The total token usage returned by the API doesn't account for the
reasoning items before the assistant message
- Account for those in auto compaction
- Add the encrypted reasoning effort to the common test utils
- Add a test to make sure it works as expected
2025-11-22 03:06:45 +00:00
Michael Bolin
c6f68c9df8 feat: declare server capability in shell-tool-mcp (#7112)
This introduces a new feature to Codex when it operates as an MCP
_client_ where if an MCP _server_ replies that it has an entry named
`"codex/sandbox-state"` in its _server capabilities_, then Codex will
send it an MCP notification with the following structure:

```json
{
  "method": "codex/sandbox-state/update",
  "params": {
    "sandboxPolicy": {
      "type": "workspace-write",
      "network-access": false,
      "exclude-tmpdir-env-var": false,
      "exclude-slash-tmp": false
    },
    "codexLinuxSandboxExe": null,
    "sandboxCwd": "/Users/mbolin/code/codex2"
  }
}
```

or with whatever values are appropriate for the initial `sandboxPolicy`.

**NOTE:** Codex _should_ continue to send the MCP server notifications
of the same format if these things change over the lifetime of the
thread, but that isn't wired up yet.

The result is that `shell-tool-mcp` can consume these values so that
when it calls `codex_core::exec::process_exec_tool_call()` in
`codex-rs/exec-server/src/posix/escalate_server.rs`, it is now sure to
call it with the correct values (whereas previously we relied on
hardcoded values).

While I would argue this is a supported use case within the MCP
protocol, the `rmcp` crate that we are using today does not support
custom notifications. As such, I had to patch it and I submitted it for
review, so hopefully it will be accepted in some form:

https://github.com/modelcontextprotocol/rust-sdk/pull/556

To test out this change from end-to-end:

- I ran `cargo build` in `~/code/codex2/codex-rs/exec-server`
- I built the fork of Bash in `~/code/bash/bash`
- I added the following to my `~/.codex/config.toml`:

```toml
# Use with `codex --disable shell_tool`.
[mcp_servers.execshell]
args = ["--bash", "/Users/mbolin/code/bash/bash"]
command = "/Users/mbolin/code/codex2/codex-rs/target/debug/codex-exec-mcp-server"
```

- From `~/code/codex2/codex-rs`, I ran `just codex --disable shell_tool`
- When the TUI started up, I verified that the sandbox mode is
`workspace-write`
- I ran `/mcp` to verify that the shell tool from the MCP is there:

<img width="1387" height="1400" alt="image"
src="https://github.com/user-attachments/assets/1a8addcc-5005-4e16-b59f-95cfd06fd4ab"
/>

- Then I asked it:

> what is the output of `gh issue list`

because this should be auto-approved with our existing dummy policy:


af63e6eccc/codex-rs/exec-server/src/posix.rs (L157-L164)

And it worked:

<img width="1387" height="1400" alt="image"
src="https://github.com/user-attachments/assets/7568d2f7-80da-4d68-86d0-c265a6f5e6c1"
/>
2025-11-21 16:11:01 -08:00
zhao-oai
87b211709e bypass sandbox for policy approved commands (#7110)
Allow commands greenlit by execpolicy to bypass the sandbox, plus a
minor refactor for a world where we have execpolicy rules with specific
sandbox requirements.
2025-11-21 18:03:23 -05:00
Michael Bolin
67975ed33a refactor: inline sandbox type lookup in process_exec_tool_call (#7122)
`process_exec_tool_call()` was taking `SandboxType` as a param, but in
practice, the only place it was constructed was in
`codex_message_processor.rs` where it was derived from the other
`sandbox_policy` param, so this PR inlines the logic that decides the
`SandboxType` into `process_exec_tool_call()`.



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/7122).
* #7112
* __->__ #7122
2025-11-21 22:53:05 +00:00
Jeremy Rose
7561a6aaf0 support MCP elicitations (#6947)
No support for request schema yet, but we'll at least show the message
and allow accept/decline.

<img width="823" height="551" alt="Screenshot 2025-11-21 at 2 44 05 PM"
src="https://github.com/user-attachments/assets/6fbb892d-ca12-4765-921e-9ac4b217534d"
/>
2025-11-21 14:44:53 -08:00
pakrym-oai
e52cc38dfd Use use_model (#7121) 2025-11-21 22:10:52 +00:00
iceweasel-oai
3bdcbc7292 Windows: flag some invocations that launch browsers/URLs as dangerous (#7111)
Prevent certain PowerShell/cmd invocations from reaching the sandbox
when they are trying to launch a browser, or run a command with a URL,
etc.
2025-11-21 13:36:17 -08:00
Ahmed Ibrahim
d5f661c91d enable unified exec for experiments (#7118) 2025-11-21 13:10:01 -08:00
Ahmed Ibrahim
8ecaad948b feat: Add exp model to experiment with the tools (#7115) 2025-11-21 12:44:47 -08:00
jif-oai
af65666561 chore: drop model_max_output_tokens (#7100) 2025-11-21 17:42:54 +00:00
jif-oai
bce030ddb5 Revert "fix: read max_output_tokens param from config" (#7088)
Reverts openai/codex#4139
2025-11-21 11:40:02 +01:00
Yorling
c9e149fd5c fix: read max_output_tokens param from config (#4139)
The request param `max_output_tokens` is documented in
`https://github.com/openai/codex/blob/main/docs/config.md`,
but nothing reads the item from config. This commit reads it from config
for the GPT responses API.

See https://github.com/openai/codex/issues/4138 for the issue report.

Signed-off-by: Yorling <shallowcloud@yeah.net>
2025-11-20 22:46:34 -08:00