Commit Graph

1721 Commits

Author SHA1 Message Date
zbarsky-openai
8497163363 [bazel] Improve runfiles handling (#10098)
we can't use runfiles directory on Windows due to path lengths, so swap
to manifest strategy. Parsing the manifest is a bit complex and the
format is changing in Bazel upstream, so pull in the official Rust
library (via a small hack to make it importable...) and cleanup all the
associated logic to work cleanly in both bazel and cargo without extra
confusion
2026-01-29 00:15:44 +00:00
sayan-oai
ff9fa56368 default enable compression, update test helpers (#10102)
set `enable_request_compression` flag to default-enabled.

update integration test helpers to decompress `zstd` if flag set.
2026-01-28 12:25:40 -08:00
Eric Traut
147e7118e0 Added tui.notifications_method config option (#10043)
This PR adds a new `tui.notifications_method` config option that accepts
values of "auto", "osc9" and "bel". It defaults to "auto", which
attempts to auto-detect whether the terminal supports OSC 9 escape
sequences and falls back to BEL if not.

The PR also removes the inconsistent handling of notifications on
Windows when WSL was used.
2026-01-28 12:00:32 -08:00
iceweasel-oai
66de985e4e allow elevated sandbox to be enabled without base experimental flag (#10028)
elevated flag = elevated sandbox
experimental flag = non-elevated sandbox
both = elevated
2026-01-28 11:38:29 -08:00
Ahmed Ibrahim
b7edeee8ca compaction (#10034)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-28 11:36:11 -08:00
sayan-oai
851617ff5a chore: deprecate old web search feature flags (#10097)
deprecate all old web search flags and aliases, including:
- `[features].web_search_request` and `[features].web_search_cached`
- `[tools].web_search`
- `[features].web_search`

slightly rework `legacy_usages` to enable pointing to non-features from
deprecated features; we need to point to `web_search` (not under
`[features]`) from things like `[features].web_search_cached` and
`[features].web_search_request`.

Added integration tests to confirm deprecation notice is shown on
explicit enablement and disablement of deprecated flags.
2026-01-28 10:55:57 -08:00
Jeremy Rose
b8156706e6 file-search: improve file query perf (#9939)
switch nucleo-matcher for nucleo and use a "file search session" w/ live
updating query instead of a single hermetic run per query.
2026-01-28 10:54:43 -08:00
jif-oai
231406bd04 feat: sort metadata by date (#10083) 2026-01-28 16:19:08 +01:00
jif-oai
3878c3dc7c feat: sqlite 1 (#10004)
Add a `.sqlite` database to be used to store rollout metatdata (and
later logs)
This PR is phase 1:
* Add the database and the required infrastructure
* Add a backfill of the database
* Persist the newly created rollout both in files and in the DB
* When we need to get metadata or a rollout, consider the `JSONL` as the
source of truth but compare the results with the DB and show any errors
2026-01-28 15:29:14 +01:00
gt-oai
71b8d937ed Add exec policy TOML representation (#10026)
We'd like to represent these in `requirements.toml`. This just adds the
representation and the tests, doesn't wire it up anywhere yet.
2026-01-28 12:00:10 +00:00
Dylan Hurd
996e09ca24 feat(core) RequestRule (#9489)
## Summary
Instead of trying to derive the prefix_rule for a command mechanically,
let's let the model decide for us.

## Testing
- [x] tested locally
2026-01-28 08:43:17 +00:00
iceweasel-oai
9f79365691 error code/msg details for failed elevated setup (#9941) 2026-01-27 23:06:10 -08:00
Dylan Hurd
fef3e36f67 fix(core) info cleanup (#9986)
## Summary
Simplify this logic a bit.
2026-01-27 21:15:15 -07:00
Matthew Zeng
3bb8e69dd3 [skills] Auto install MCP dependencies when running skils with dependency specs. (#9982)
Auto install MCP dependencies when running skils with dependency specs.
2026-01-27 19:02:45 -08:00
sayan-oai
1609f6aa81 fix: allow unknown fields on Notice in schema (#10041)
the `notice` field didn't allow unknown fields in the schema, leading to
issues where they shouldn't be.

Now we allow unknown fields.

<img width="2260" height="720" alt="image"
src="https://github.com/user-attachments/assets/1de43b60-0d50-4a96-9c9c-34419270d722"
/>
2026-01-27 18:24:24 -08:00
sayan-oai
a90ab789c2 fix: enable per-turn updates to web search mode (#10040)
web_search can now be updated per-turn, for things like changes to
sandbox policy.

`SandboxPolicy::DangerFullAccess` now sets web_search to `live`, and the
default is still `cached`.

Added integration tests.
2026-01-27 18:09:29 -08:00
sayan-oai
28051d18c6 enable live web search for DangerFullAccess sandbox policy (#10008)
Auto-enable live `web_search` tool when sandbox policy is
`DangerFullAccess`.

Explicitly setting `web_search` (canonical setting), or enabling
`web_search_cached` or `web_search_request` still takes precedence over
this sandbox-policy-driven enablement.
2026-01-27 20:09:05 +00:00
alexsong-oai
2f8a44baea Remove load from SKILL.toml fallback (#10007) 2026-01-27 12:06:40 -08:00
iceweasel-oai
c40ad65bd8 remove sandbox globals. (#9797)
Threads sandbox updates through OverrideTurnContext for active turn
Passes computed sandbox type into safety/exec
2026-01-27 11:04:23 -08:00
Owen Lin
fc0fd85349 fix(app-server, core): defer initial context write to rollout file until first turn (#9950)
### Overview
Currently calling `thread/resume` will always bump the thread's
`updated_at` timestamp. This PR makes it the `updated_at` timestamp
changes only if a turn is triggered.

### Additonal context
What we typically do on resuming a thread is **always** writing “initial
context” to the rollout file immediately. This initial context includes:
- Developer instructions derived from sandbox/approval policy + cwd
- Optional developer instructions (if provided)
- Optional collaboration-mode instructions
- Optional user instructions (if provided)
- Environment context (cwd, shell, etc.)

This PR defers writing the “initial context” to the rollout file until
the first `turn/start`, so we don't inadvertently bump the thread's
`updated_at` timestamp until a turn is actually triggered.

This works even though both `thread/resume` and `turn/start` accept
overrides (such as `model`, `cwd`, etc.) because the initial context is
seeded from the effective `TurnContext` in memory, computed at
`turn/start` time, after both sets of overrides have been applied.

**NOTE**: This is a very short-lived solution until we introduce sqlite.
Then we can remove this.
2026-01-27 10:41:54 -08:00
jif-oai
067922a734 description in role type (#9993) 2026-01-27 17:20:07 +00:00
jif-oai
3b726d9550 chore: clean orchestrator prompt (#9994) 2026-01-27 16:32:05 +00:00
jif-oai
74ffbbe7c1 nit: better unused prompt (#9991) 2026-01-27 13:03:12 +00:00
jif-oai
742f086ee6 nit: better tool description (#9988) 2026-01-27 12:46:51 +00:00
K Bediako
ab99df0694 Fix: cap aggregated exec output consistently (#9759)
## WHAT?
- Bias aggregated output toward stderr under contention (2/3 stderr, 1/3
stdout) while keeping the 1 MiB cap.
- Rebalance unused stderr share back to stdout when stderr is tiny to
avoid underfilling.
- Add tests for contention, small-stderr rebalance, and under-cap
ordering (stdout then stderr).

## WHY?
- Review feedback requested stderr priority under contention.
- Avoid underfilled aggregated output when stderr is small while
preserving a consistent cap across exec paths.

## HOW?
- Update `aggregate_output` to compute stdout/stderr shares, then
reassign unused capacity to the other stream.
- Use the helper in both Windows and async exec paths.
- Add regression tests for contention/rebalance and under-cap ordering.

## BEFORE
```rust
// Best-effort aggregate: stdout then stderr (capped).
let mut aggregated = Vec::with_capacity(
    stdout
        .text
        .len()
        .saturating_add(stderr.text.len())
        .min(EXEC_OUTPUT_MAX_BYTES),
);
append_capped(&mut aggregated, &stdout.text, EXEC_OUTPUT_MAX_BYTES);
append_capped(&mut aggregated, &stderr.text, EXEC_OUTPUT_MAX_BYTES);
let aggregated_output = StreamOutput {
    text: aggregated,
    truncated_after_lines: None,
};
```

## AFTER
```rust
fn aggregate_output(
    stdout: &StreamOutput<Vec<u8>>,
    stderr: &StreamOutput<Vec<u8>>,
) -> StreamOutput<Vec<u8>> {
    let total_len = stdout.text.len().saturating_add(stderr.text.len());
    let max_bytes = EXEC_OUTPUT_MAX_BYTES;
    let mut aggregated = Vec::with_capacity(total_len.min(max_bytes));

    if total_len <= max_bytes {
        aggregated.extend_from_slice(&stdout.text);
        aggregated.extend_from_slice(&stderr.text);
        return StreamOutput {
            text: aggregated,
            truncated_after_lines: None,
        };
    }

    // Under contention, reserve 1/3 for stdout and 2/3 for stderr; rebalance unused stderr to stdout.
    let want_stdout = stdout.text.len().min(max_bytes / 3);
    let want_stderr = stderr.text.len();
    let stderr_take = want_stderr.min(max_bytes.saturating_sub(want_stdout));
    let remaining = max_bytes.saturating_sub(want_stdout + stderr_take);
    let stdout_take = want_stdout + remaining.min(stdout.text.len().saturating_sub(want_stdout));

    aggregated.extend_from_slice(&stdout.text[..stdout_take]);
    aggregated.extend_from_slice(&stderr.text[..stderr_take]);

    StreamOutput {
        text: aggregated,
        truncated_after_lines: None,
    }
}
```

## TESTS
- [x] `just fmt`
- [x] `just fix -p codex-core`
- [x] `cargo test -p codex-core aggregate_output_`
- [x] `cargo test -p codex-core`
- [x] `cargo test --all-features`

## FIXES
Fixes #9758
2026-01-27 09:29:12 +00:00
Ahmed Ibrahim
509ff1c643 Fixing main and make plan mode reasoning effort medium (#9980)
It's overthinking so much on high and going over the context window.
2026-01-26 22:30:24 -08:00
Ahmed Ibrahim
cabb2085cc make plan prompt less detailed (#9977)
This was too much to ask for
2026-01-26 21:42:01 -08:00
Ahmed Ibrahim
4db6da32a3 tui: wrapping user input questions (#9971) 2026-01-26 21:30:09 -08:00
sayan-oai
0adcd8aa86 make cached web_search client-side default (#9974)
[Experiment](https://console.statsig.com/50aWbk2p4R76rNX9lN5VUw/experiments/codex_web_search_rollout/summary)
for default cached `web_search` completed; cached chosen as default.

Update client to reflect that.
2026-01-26 21:25:40 -08:00
Ahmed Ibrahim
28bd7db14a plan prompt (#9975)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-26 21:14:05 -08:00
Ahmed Ibrahim
0c72d8fd6e prompt (#9970)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-26 20:27:57 -08:00
Ahmed Ibrahim
f45a8733bf prompt final (#9969)
hopefully final this time (at least tonight) >_<
2026-01-26 20:12:43 -08:00
Ahmed Ibrahim
b655a092ba Improve plan mode prompt (#9968)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-26 19:56:16 -08:00
Ahmed Ibrahim
b7bba3614e plan prompt v7 (#9966)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-26 19:34:18 -08:00
sayan-oai
86adf53235 fix: handle all web_search actions and in progress invocations (#9960)
### Summary
- Parse all `web_search` tool actions (`search`, `find_in_page`,
`open_page`).
- Previously we only parsed + displayed `search`, which made the TUI
appear to pause when the other actions were being used.
- Show in progress `web_search` calls as `Searching the web`
  - Previously we only showed completed tool calls

<img width="308" height="149" alt="image"
src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
/>

### Tests
Added + updated tests, tested locally

### Follow ups
Update VSCode extension to display these as well
2026-01-27 03:33:48 +00:00
pakrym-oai
998e88b12a Use test_codex more (#9961)
Reduces boilderplate.
2026-01-26 18:52:10 -08:00
Ahmed Ibrahim
c900de271a Warn users on enabling underdevelopment features (#9954)
<img width="938" height="73" alt="image"
src="https://github.com/user-attachments/assets/a2d5ac46-92c5-4828-b35e-0965c30cdf36"
/>
2026-01-27 01:58:05 +00:00
alexsong-oai
a641a6427c feat: load interface metadata from SKILL.json (#9953) 2026-01-27 01:38:06 +00:00
Charley Cunningham
47aa1f3b6a Reject request_user_input outside Plan/Pair (#9955)
## Context

Previous work in https://github.com/openai/codex/pull/9560 only rejected
`request_user_input` in Execute and Custom modes. Since then, additional
modes
(e.g., Code) were added, so the guard should be mode-agnostic.

## What changed

- Switch the handler to an allowlist: only Plan and PairProgramming are
allowed
- Return the same error for any other mode (including Code)
- Add a Code-mode rejection test alongside the existing Execute/Custom
tests

## Why

This prevents `request_user_input` from being used in modes where it is
not
intended, even as new modes are introduced.
2026-01-26 17:12:17 -08:00
Ahmed Ibrahim
a8f195828b Add composer config and shared menu surface helpers (#9891)
Centralize built-in slash-command gating and extract shared menu-surface
helpers.

- Add bottom_pane::slash_commands and reuse it from composer + command
popup.
- Introduce ChatComposerConfig + shared menu surface rendering without
changing default behavior.
2026-01-26 23:16:29 +00:00
Ahmed Ibrahim
159ff06281 plan prompt (#9943)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-26 14:48:54 -08:00
blevy-oai
bdc4742bfc Add MCP server scopes config and use it as fallback for OAuth login (#9647)
### Motivation
- Allow MCP OAuth flows to request scopes defined in `config.toml`
instead of requiring users to always pass `--scopes` on the CLI.
CLI/remote parameters should still override config values.

### Description
- Add optional `scopes: Option<Vec<String>>` to `McpServerConfig` and
`RawMcpServerConfig`, and propagate it through deserialization and the
built config types.
- Serialize `scopes` into the MCP server TOML via
`serialize_mcp_server_table` in `core/src/config/edit.rs` and include
`scopes` in the generated config schema (`core/config.schema.json`).
- CLI: update `codex-rs/cli/src/mcp_cmd.rs` `run_login` to fall back to
`server.scopes` when the `--scopes` flag is empty, with explicit CLI
scopes still taking precedence.
- App server: update
`codex-rs/app-server/src/codex_message_processor.rs`
`mcp_server_oauth_login` to use `params.scopes.or_else(||
server.scopes.clone())` so the RPC path also respects configured scopes.
- Update many test fixtures to initialize the new `scopes` field (set to
`None`) so test code builds with the new struct field.

### Testing
- Ran config tooling and formatters: `just write-config-schema`
(succeeded), `just fmt` (succeeded), and `just fix -p codex-core`, `just
fix -p codex-cli`, `just fix -p codex-app-server` (succeeded where
applicable).
- Ran unit tests for the CLI: `cargo test -p codex-cli` (passed).
- Ran unit tests for core: `cargo test -p codex-core` (ran; many tests
passed but several failed, including model refresh/403-related tests,
shell snapshot/timeouts, and several `unified_exec` expectations).
- Ran app-server tests: `cargo test -p codex-app-server` (ran; many
integration-suite tests failed due to mocked/remote HTTP 401/403
responses and wiremock expectations).

If you want, I can split the tests into smaller focused runs or help
debug the failing integration tests (they appear to be unrelated to the
config change and stem from external HTTP/mocking behaviors encountered
during the test runs).

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69718f505914832ea1f334b3ba064553)
2026-01-26 14:13:04 -08:00
Eric Traut
b77bf4d36d Aligned feature stage names with public feature maturity stages (#9929)
We've recently standardized a [feature maturity
model](https://developers.openai.com/codex/feature-maturity) that we're
using in our docs and support forums to communicate expectations to
users. This PR updates the internal stage names and descriptions to
match.

This change involves a simple internal rename and updates to a few
user-visible strings. No functional change.
2026-01-26 11:43:36 -08:00
Charley Cunningham
62266b13f8 Add thread/unarchive to restore archived rollouts (#9843)
## Summary
- Adds a new `thread/unarchive` RPC to move archived thread rollouts
back into the active `sessions/` tree.

## What changed
- **Protocol**
  - Adds `thread/unarchive` request/response types and wiring.
- **Server**
  - Implements `thread_unarchive` in the app server.
  - Validates the archived rollout path and thread ID.
- Restores the rollout to `sessions/YYYY/MM/DD/...` based on the rollout
filename timestamp.
- **Core**
- Adds `find_archived_thread_path_by_id_str` helper for archived
rollouts.
- **Docs**
  - Documents the new RPC and usage example.
- **Tests**
  - Adds an end-to-end server test that:
    1) starts a thread,
    2) archives it,
    3) unarchives it,
    4) asserts the file is restored to `sessions/`.

## How to use
```json
{ "method": "thread/unarchive", "id": 24, "params": { "threadId": "<thread-id>" } }
```

## Author Codex Session

`codex resume 019bf158-54b6-7960-a696-9d85df7e1bc1` (soon I'll make this
kind of session UUID forkable by anyone with the right
`session_object_storage_url` line in their config, but for now just
pasting it here for my reference)
2026-01-26 11:24:36 -08:00
jif-oai
09251387e0 chore: update interrupt message (#9925) 2026-01-26 19:07:54 +00:00
Ahmed Ibrahim
e471ebc5d2 prompt (#9928)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-26 10:27:18 -08:00
Gene Oden
375a5ef051 fix: attempt to reduce high cpu usage when using collab (#9776)
Reproduce with a prompt like this with collab enabled:
```
Examine the code at <some subdirectory with a deeply nested project>.  Find the most urgent issue to resolve and describe it to me.
```

Existing behavior causes the top-level agent to busy wait on subagents.
2026-01-26 10:07:25 -08:00
gt-oai
fdc69df454 Fix flakey shell snapshot test (#9919)
Sometimes fails with:

```
failures:

  ---- shell_snapshot::tests::timed_out_snapshot_shell_is_terminated stdout ----

  thread 'shell_snapshot::tests::timed_out_snapshot_shell_is_terminated' panicked at codex-rs/core/src/shell_snapshot.rs:588:9:
  expected timeout error, got Failed to execute sh

  Caused by:
      Text file busy (os error 26)


  failures:
      shell_snapshot::tests::timed_out_snapshot_shell_is_terminated

  test result: FAILED. 815 passed; 1 failed; 4 ignored; 0 measured; 0 filtered out; finished in 18.00s
```
2026-01-26 18:05:30 +00:00
Shijie Rao
3ba702c5b6 Feat: add isOther to question returned by request user input tool (#9890)
### Summary
Add `isOther` to question object from request_user_input tool input and
remove `other` option from the tool prompt to better handle tool input.
2026-01-26 09:52:38 -08:00
gt-oai
6316e57497 Fix up config disabled err msg (#9916)
**Before:**
<img width="745" height="375" alt="image"
src="https://github.com/user-attachments/assets/d6c23562-b87f-4af9-8642-329aab8e594d"
/>

**After:**
<img width="1042" height="354" alt="image"
src="https://github.com/user-attachments/assets/c9a2413c-c945-4c34-8b7e-c6c9b8fbf762"
/>

Two changes:
1. only display if there is a `config.toml` that is skipped (i.e. if
there is just `.codex/skills` but no `.codex/config.toml` we do not
display the error)
2. clarify the implications and the fix in the error message.
2026-01-26 17:49:31 +00:00