Commit Graph

3783 Commits

Author SHA1 Message Date
Cooper Gamble
968aa4e7c6 [codex-core] Simplify pre-turn compaction flow [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 03:16:05 +00:00
Cooper Gamble
eb65671518 [codex-core] Stop threading inline compaction threshold [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 03:01:33 +00:00
Cooper Gamble
8be76fe7b8 [codex-core] Clean up server-side compaction handling [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:49:05 +00:00
Cooper Gamble
4956a47b5b [codex-core] fix stale compaction tests [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
4427ba8452 [core] inline server-side compaction history rewrite [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
413b27501d [core] simplify inline compaction metrics [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
0c88f58519 [core] refresh inline compaction comments [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
bd6733a68c [core] simplify inline server-side compaction handling [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
c4b2ba0ba3 [codex-core] Address compaction review feedback [ci changed_files]
Remove the redundant inline compaction request trace and clarify that streamed server-side compaction rebuilds replacement history on response.completed from the checkpoint snapshot.

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
46615cc195 [codex-core] Trim inline compaction coverage [ci changed_files]
- reduce the server-side compaction test matrix to the highest-signal cases
- add comments around the deferred checkpoint rewrite and inline/preflight split

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
d298fbd6bb [codex-core] Prefer full inline compaction prefix [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
9337146281 [codex-core] Avoid duplicate inline compaction context [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
488b53bceb [codex-core] Fix inline compaction commit semantics [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
7065f6fa1d [codex-core] Preserve raw inline compaction event order [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
29ddeb71ad [codex-core] Fix inline compaction event ordering and token accounting [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
ab9a474c86 [codex-core] Preserve inline compaction checkpoint ordering [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
8da9b522f5 [codex-core] Preserve inline compaction turn prompt state [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
839143c6c4 [codex-core] Fix inline compaction checkpoint history layout [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
d9916b5e6d [codex-core] Fix inline compaction retry and model-switch preflight [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
9d20427fc2 [core] Fix empty-history inline summary replacement [ci changed_files]
Handle repeated inline compactions on turns that started from empty history by stripping leading compaction items after prefix calculation, and add regression coverage for the fresh-session case.

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
bdadb141f2 [core] Always inline OpenAI auto-compaction [ci changed_files]
Ignore compact_prompt for OpenAI inline auto-compaction, remove the legacy compat downgrade path, and keep /compact on the point-in-time endpoint. Also skip previous-model preflight remote compaction when inline server-side compaction is available.

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
90ea0076b9 [codex-core] Remove redundant ThreadId clone in compaction test [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
2bcf6ebaa3 [core] Preserve inline compaction turn state [ci changed_files]
Preserve current-turn history when inline compaction downgrades fail and replace prior same-turn compaction checkpoints instead of stacking them.

Tests:
- cargo test -p codex-core codex::tests::build_server_side_compaction_replacement_history_keeps_current_turn_inputs -- --exact
- cargo test -p codex-core codex::tests::build_server_side_compaction_replacement_history_replaces_prior_same_turn_summary -- --exact
- cargo test -p codex-core codex::tests::downgrade_known_inline_compaction_error_restores_current_turn_when_fallback_fails -- --exact

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
e54a4ae5e2 [codex-core] Refresh config schema for server-side compaction [ci changed_files]
Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
c3a7189a0e [codex-core] Harden inline compaction checkpoint handling [ci changed_files]
Keep current-turn inputs in local inline compaction checkpoints and remember known backend incompatibilities after a compat downgrade so later turns skip the failed inline request path.

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
dea81dba3a [codex-core] Preserve ghost snapshots in compaction fallback [ci changed_files]
Keep same-turn ghost snapshots when pre-turn inline compaction downgrades to the legacy client-side path so undo state survives compatibility fallback.

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
Cooper Gamble
ea2a4be3d8 [codex-core] Add feature-flagged server-side compaction [ci changed_files]
Move normal auto-compaction onto inline Responses API compaction behind a feature flag, keep the legacy path for manual and compatibility cases, and add observability plus integration coverage.

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 02:48:35 +00:00
gabec-openai
052ec629b1 Add keyboard based fast switching between agents in TUI (#13923)
2026-03-10 19:41:51 -07:00
pakrym-oai
816e447ead Add snippets annotated with types to tools when code mode enabled (#14284)
Main purpose is for code mode to understand the return type.
2026-03-10 19:20:15 -07:00
Ahmed Ibrahim
cc417c39a0 Split spawn_csv from multi_agent (#14282)
- make `spawn_csv` a standalone feature for CSV agent jobs
- keep `spawn_csv -> multi_agent` one-way and preserve restricted
subagent disable paths
2026-03-11 01:42:50 +00:00
Ahmed Ibrahim
5b10b93ba2 Add realtime start instructions config override (#14270)
- add `realtime_start_instructions` config support
- thread it into realtime context updates, schema, docs, and tests
2026-03-10 18:42:05 -07:00
pakrym-oai
566897d427 Make unified exec session_id numeric (#14279)
It's a number on the write_stdin input, so make it a number on the output
and also internally.
2026-03-10 18:38:39 -07:00
pakrym-oai
24b8d443b8 Prefix code mode output with success or failure message and include error stack (#14272)
2026-03-10 18:33:52 -07:00
pash-openai
cec211cabc render local file links from target paths (#13857)
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-03-10 18:00:48 -07:00
Ahmed Ibrahim
3f7cb03043 Stabilize websocket response.failed error delivery (#14017)
## What changed
- Drop failed websocket connections immediately after a terminal stream
error instead of awaiting a graceful close handshake before forwarding
the error to the caller.
- Keep the success path and the closed-connection guard behavior
unchanged.

## Why this fixes the flake
- The failing integration test waits for the second websocket stream to
surface the model error before issuing a follow-up request.
- On slower runners, the old error path awaited
`ws_stream.close().await` before sending the error downstream. If that
close handshake stalled, the test kept waiting for an error that had
already happened server-side and nextest timed it out.
- Dropping the failed websocket immediately makes the terminal error
observable right away and marks the session closed so the next request
reconnects cleanly instead of depending on a best-effort close
handshake.

## Code or test?
- This is a production logic fix in `codex-api`. The existing websocket
integration test already exercises the regression path.
2026-03-10 17:59:41 -07:00
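
The error path described in 3f7cb03043 can be sketched roughly as follows. This is a minimal, std-only illustration, not the `codex-api` code: `Connection`, `Session`, `fail`, and `connection` are hypothetical names, and a plain channel stands in for the downstream caller. It shows the ordering the commit describes: drop the failed connection first, forward the error immediately, and let the next request reconnect.

```rust
use std::sync::mpsc;

/// Hypothetical stand-in for a websocket connection.
struct Connection {
    id: u32,
}

/// Hypothetical session wrapper; `conn` is `None` once the session is closed.
struct Session {
    conn: Option<Connection>,
}

impl Session {
    /// On a terminal stream error, drop the connection (no close handshake)
    /// and forward the error to the caller right away.
    fn fail(&mut self, error: String, tx: &mpsc::Sender<Result<String, String>>) {
        // Dropping the connection marks the session closed immediately.
        drop(self.conn.take());
        // The receiver may already be gone; that is not an error here.
        let _ = tx.send(Err(error));
    }

    /// Reconnect lazily: a closed session gets a fresh connection.
    fn connection(&mut self, next_id: u32) -> &Connection {
        self.conn.get_or_insert(Connection { id: next_id })
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let mut session = Session { conn: Some(Connection { id: 1 }) };

    session.fail("model error".to_string(), &tx);
    assert!(session.conn.is_none());
    assert_eq!(rx.try_recv().unwrap(), Err("model error".to_string()));

    // The follow-up request reconnects instead of reusing the failed socket.
    assert_eq!(session.connection(2).id, 2);
}
```

Because nothing is awaited between the failure and the send, the caller cannot be left waiting on a stalled close handshake.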
Ahmed Ibrahim
567ad7fafd Show spawned agent model and effort in TUI (#14273)
- include the requested sub-agent model and reasoning effort in the
spawn begin event
- render that metadata next to the spawned agent name
and role in the TUI transcript

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 00:46:25 +00:00
pakrym-oai
37f51382fd Rename code mode tool to exec (#14254)
Summary
- update the code-mode handler, runner, instructions, and error text to
refer to the `exec` tool name everywhere that used to say `code_mode`
- ensure generated documentation strings and tool specs describe `exec`
and rely on the shared `PUBLIC_TOOL_NAME`
- refresh the suite tests so they invoke `exec` instead of the old name

Testing
- Not run (not requested)
2026-03-11 00:30:16 +00:00
maja-openai
16daab66d9 prompt changes to guardian (#14263)
## Summary
- update the guardian prompting
- clarify the guardian rejection message so an action may still proceed
if the user explicitly approves it after being informed of the risk

## Testing
- cargo run on selected examples
2026-03-10 17:05:43 -07:00
Ahmed Ibrahim
f6e966e64a Stabilize pipe process stdin round-trip test (#14013)
## What changed
- keep the explicit stdin-close behavior after writing so the child
still receives EOF deterministically
- on Windows, stop using `python -c` for the round-trip assertion and
instead run a native `cmd.exe` pipeline that reads one line from stdin
with `set /p` and echoes it back
- send `\r\n` on Windows so the stdin payload matches the platform-native line
ending the shell reader expects

## Why this fixes flakiness
The failing branch-local flake was not in `spawn_pipe_process` itself.
The child exited cleanly, but the Windows ARM runner sometimes produced
an empty stdout string when the test used Python as the stdin consumer.
That makes the test sensitive to Python startup and stdin-close timing
rather than the pipe primitive we actually want to validate. Switching
the Windows path to a native `cmd.exe` reader keeps the assertion
focused on our pipe behavior: bytes written to stdin should come back on
stdout before EOF closes the process. The explicit `\r\n` write removes
line-ending ambiguity on Windows.

## Scope
- test-only
- no production logic change
2026-03-10 17:00:49 -07:00
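
The round-trip that f6e966e64a stabilizes can be illustrated with the standard library alone. This sketch is not the repository's test and does not use `spawn_pipe_process`; `round_trip` is a hypothetical helper, and it assumes a Unix-like system where `cat` is available as the stdin consumer (the Windows `cmd.exe` reader from the commit is not reproduced here). It shows the property being asserted: bytes written to stdin come back on stdout, and explicitly closing stdin delivers EOF to the child.

```rust
use std::io::{self, Read, Write};
use std::process::{Command, Stdio};

/// Writes `payload` to a child's stdin, closes stdin, and returns its stdout.
/// Assumption: `cat` exists on PATH (Unix-like systems).
fn round_trip(payload: &str) -> io::Result<String> {
    let mut child = Command::new("cat")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // Take ownership of stdin so that dropping it closes the pipe.
    let mut stdin = child.stdin.take().expect("stdin was piped");
    stdin.write_all(payload.as_bytes())?;
    drop(stdin); // explicit close: the child now sees EOF

    let mut out = String::new();
    child
        .stdout
        .take()
        .expect("stdout was piped")
        .read_to_string(&mut out)?;
    child.wait()?;
    Ok(out)
}

fn main() -> io::Result<()> {
    let echoed = round_trip("hello\n")?;
    assert_eq!(echoed, "hello\n");
    Ok(())
}
```

Using a consumer that simply echoes its input keeps the assertion about the pipe itself rather than about interpreter startup timing.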
Celia Chen
295b56bece chore: add a separate reject-policy flag for skill approvals (#14271)
## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2
`AskForApproval::Reject` payload so skill-script prompts can be
configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual
decision source, keeping prefix rules tied to `rules`, unmatched command
fallbacks tied to `sandbox_approval`, and skill scripts tied to
`skill_approval`
- regenerate the affected protocol/config schemas and expand
unit/integration coverage for the new flag and skill approval behavior
2026-03-10 23:58:23 +00:00
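
The mapping from decision source to reject flag in 295b56bece might look roughly like this. The flag names `rules`, `sandbox_approval`, and `skill_approval` come from the commit message; the `DecisionSource` enum, the `rejects` method, and the reduced three-field struct are assumptions made for illustration and are not the actual `RejectConfig` definition.

```rust
/// Where an approval prompt originated (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DecisionSource {
    PrefixRule,
    UnmatchedCommand,
    SkillScript,
}

/// Simplified stand-in for `RejectConfig`; only the three flags named in the
/// commit message are modeled here.
#[derive(Debug, Clone, Copy, Default)]
struct RejectConfig {
    rules: bool,
    sandbox_approval: bool,
    skill_approval: bool,
}

impl RejectConfig {
    /// Returns true when a prompt from `source` should be auto-rejected.
    fn rejects(&self, source: DecisionSource) -> bool {
        match source {
            DecisionSource::PrefixRule => self.rules,
            DecisionSource::UnmatchedCommand => self.sandbox_approval,
            DecisionSource::SkillScript => self.skill_approval,
        }
    }
}

fn main() {
    // Reject skill-script prompts only; other prompts still reach the user.
    let cfg = RejectConfig { skill_approval: true, ..Default::default() };
    assert!(cfg.rejects(DecisionSource::SkillScript));
    assert!(!cfg.rejects(DecisionSource::PrefixRule));
    assert!(!cfg.rejects(DecisionSource::UnmatchedCommand));
}
```

Keying the check on the decision source is what lets each flag be configured independently of the others.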
pakrym-oai
18199d4e0e Add store/load support for code mode (#14259)
adds support for transferring state across code mode invocations.
2026-03-10 16:53:53 -07:00
Rasmus Rygaard
f8ef154a6b Pass more params to compaction (#14247)
Pass more params to /compact. This should give us parity with the
/responses endpoint to improve caching.

I'm torn about the MCP await. Blocking will give us parity, but it seems
like we explicitly don't block on MCPs. Happy either way.
Leo Shimonaka
de2a73cd91 feat: Add additional macOS Sandbox Permissions for Launch Services, Contacts, Reminders (#14155)
Add additional macOS Sandbox Permissions levers for the following:

- Launch Services
- Contacts
- Reminders
2026-03-10 23:34:47 +00:00
joeytrasatti-openai
e4bc352782 Add ephemeral flag support to thread fork (#14248)
### Summary
This PR adds first-class ephemeral support to thread/fork, bringing it
in line with thread/start. The goal is to support one-off completions on
full forked threads without persisting them as normal user-visible
threads.

### Testing
2026-03-10 16:34:27 -07:00
pakrym-oai
8b33485302 Add code_mode output helpers for text and images (#14244)
Summary
- document how code-mode can import `output_text`/`output_image` and
ensure `add_content` stays compatible
- add a synthetic `@openai/code_mode` module that appends content items
and validates inputs
- cover the new behavior with integration tests for structured text and
image outputs

Testing
- Not run (not requested)
2026-03-10 16:25:27 -07:00
Ahmed Ibrahim
bf936fa0c1 Clarify close_agent tool description (#14269)
- clarify the `close_agent` tool description so it nudges models to
close agents they no longer need
- keep the change scoped to the tool spec text only

Co-authored-by: Codex <noreply@openai.com>
2026-03-10 16:25:08 -07:00
gabec-openai
b73228722a Load agent metadata from role files (#14177)
2026-03-10 16:21:48 -07:00
pakrym-oai
e791559029 Add model-controlled truncation for code mode results (#14258)
Summary
- document that `@openai/code_mode` exposes
`set_max_output_tokens_per_exec_call` and that `code_mode` truncates the
final Rust-side output when the budget is exceeded
- enforce the configured budget in the Rust tool runner, reusing
truncation helpers so text-only outputs follow the unified-exec wrapper
and mixed outputs still fit within the limit
- ensure the new behavior is covered by a code-mode integration test and
string spec update

Testing
- Not run (not requested)
2026-03-10 15:57:14 -07:00
pakrym-oai
c7e28cffab Add output schema to MCP tools and expose MCP tool results in code mode (#14236)
Summary
- drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers
to keep MCP tooling focused on the final result shape
- wire the new schema definitions through code mode, context, handlers,
and spec modules so MCP tools serialize the exact output shape expected
by the model
- extend code mode tests to cover multiple MCP call scenarios and ensure
the serialized data matches the new schema
- refresh JS runner helpers and protocol models alongside the schema
changes

Testing
- Not run (not requested)
2026-03-10 15:25:19 -07:00
Dylan Hurd
15163050dc app-server: propagate nested experimental gating for AskForApproval::Reject (#14191)
## Summary
This change makes `AskForApproval::Reject` gate correctly anywhere it
appears inside otherwise-stable app-server protocol types.

Previously, experimental gating for `approval_policy: Reject` was
handled with request-specific logic in `ClientRequest` detection. That
covered a few request params types, but it did not generalize to other
nested uses such as `ProfileV2`, `Config`, `ConfigReadResponse`, or
`ConfigRequirements`.

This PR replaces that ad hoc handling with a generic nested experimental
propagation mechanism.

## Testing

Seeing this when running app-server-test-client without the experimental API
enabled:
```
 initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.3.1; arm64) vscode/2.4.36 (codex-toy-app-server; 0.0.0)" }
> {
>   "id": "50244f6a-270a-425d-ace0-e9e98205bde7",
>   "method": "thread/start",
>   "params": {
>     "approvalPolicy": {
>       "reject": {
>         "mcp_elicitations": false,
>         "request_permissions": true,
>         "rules": false,
>         "sandbox_approval": true
>       }
>     },
>     "baseInstructions": null,
>     "config": null,
>     "cwd": null,
>     "developerInstructions": null,
>     "dynamicTools": null,
>     "ephemeral": null,
>     "experimentalRawEvents": false,
>     "mockExperimentalField": null,
>     "model": null,
>     "modelProvider": null,
>     "persistExtendedHistory": false,
>     "personality": null,
>     "sandbox": null,
>     "serviceName": null
>   }
> }
< {
<   "error": {
<     "code": -32600,
<     "message": "askForApproval.reject requires experimentalApi capability"
<   },
<   "id": "50244f6a-270a-425d-ace0-e9e98205bde7"
< }
[verified] thread/start rejected approvalPolicy=Reject without experimentalApi
```

---------

Co-authored-by: celia-oai <celia@openai.com>
2026-03-10 22:21:52 +00:00
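
One way to picture the generic nested propagation described in 15163050dc is a trait that each protocol type implements by delegating to its fields. This is a sketch under stated assumptions, not the app-server implementation: the `ExperimentalReason` trait, the `check` function, and the simplified `Profile` and `Config` structs are hypothetical, and `Never` is only a placeholder for a stable variant. The error text matches the response shown in the commit's test output.

```rust
/// Hypothetical trait: reports why a value needs the experimental API, if it does.
trait ExperimentalReason {
    fn experimental_reason(&self) -> Option<&'static str>;
}

/// `Reject` is named in the commit; `Never` is a placeholder stable variant.
enum AskForApproval {
    Never,
    Reject,
}

impl ExperimentalReason for AskForApproval {
    fn experimental_reason(&self) -> Option<&'static str> {
        match self {
            AskForApproval::Reject => Some("askForApproval.reject"),
            AskForApproval::Never => None,
        }
    }
}

/// Optional fields propagate the reason of their inner value.
impl<T: ExperimentalReason> ExperimentalReason for Option<T> {
    fn experimental_reason(&self) -> Option<&'static str> {
        self.as_ref().and_then(|inner| inner.experimental_reason())
    }
}

/// Simplified stand-ins for nested protocol types.
struct Profile {
    approval_policy: Option<AskForApproval>,
}

struct Config {
    profile: Option<Profile>,
}

impl ExperimentalReason for Profile {
    fn experimental_reason(&self) -> Option<&'static str> {
        self.approval_policy.experimental_reason()
    }
}

impl ExperimentalReason for Config {
    fn experimental_reason(&self) -> Option<&'static str> {
        self.profile.experimental_reason()
    }
}

/// Rejects any request that carries an experimental value without the capability.
fn check(request: &impl ExperimentalReason, experimental_api: bool) -> Result<(), String> {
    match request.experimental_reason() {
        Some(reason) if !experimental_api => {
            Err(format!("{reason} requires experimentalApi capability"))
        }
        _ => Ok(()),
    }
}

fn main() {
    let gated = Config {
        profile: Some(Profile { approval_policy: Some(AskForApproval::Reject) }),
    };
    assert!(check(&gated, false).is_err());
    assert!(check(&gated, true).is_ok());

    let stable = Config {
        profile: Some(Profile { approval_policy: Some(AskForApproval::Never) }),
    };
    assert!(check(&stable, false).is_ok());
}
```

With this shape, a new container type only has to delegate to its fields; no request-specific detection logic is needed.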