44 Commits

Author SHA1 Message Date
Owen Lin
8dfd654196 feat(app-server-test-client): OTEL setup for tracing (#13493)
### Overview
This PR:
- Updates `app-server-test-client` to load OTEL settings from
`$CODEX_HOME/config.toml` and initializes its own OTEL provider.
- Add real client root spans to app-server test client traces.

This updates `codex-app-server-test-client` so its Datadog traces
reflect the full client-driven flow instead of a set of server spans
stitched together under a synthetic parent.

Before this change, the test client generated a fake `traceparent` once
and reused it for every JSON-RPC request. That kept the requests in one
trace, but there was no real client span at the top, so Datadog ended up
showing the sequence in a slightly misleading way, where all RPCs were
anchored under `initialize`.

Now the test client:
- loads OTEL settings from the normal Codex config path, including
`$CODEX_HOME/config.toml` and existing --config overrides
- initializes tracing the same way other Codex binaries do when trace
export is enabled
- creates a real client root span for each scripted command
- creates per-request client spans for JSON-RPC methods like
`initialize`, `thread/start`, and `turn/start`
- injects W3C trace context from the current client span into
request.trace instead of reusing a fabricated carrier

This gives us a cleaner trace shape in Datadog:
- one trace URL for the whole scripted flow
- a visible client root span
- proper client/server parent-child relationships for each app-server
request
2026-03-04 13:30:09 -08:00
Celia Chen
e6773f856c Feat: Preserve network access on read-only sandbox policies (#13409)
## Summary

`PermissionProfile.network` could not be preserved when additional or
compiled permissions resolved to
`SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access
field. This change makes read-only + network
enabled representable directly and threads that through the protocol,
app-server v2 mirror, and permission-
  merging logic.

## What changed

- Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core
protocol and app-server v2 protocol.
- Kept backward compatibility by defaulting the new field to false, so
legacy read-only payloads still
    deserialize unchanged.
- Updated `has_full_network_access()` and sandbox summaries to respect
read-only network access.
  - Preserved PermissionProfile.network when:
      - compiling skill permission profiles into sandbox policies
      - normalizing additional permissions
      - merging additional permissions into existing sandbox policies
- Updated the approval overlay to show network in the rendered
permission rule when requested.
  - Regenerated app-server schema fixtures for the new v2 wire shape.
2026-03-04 02:41:57 +00:00
Owen Lin
167158f93c chore(app-server): delete v1 RPC methods and notifications (#13375)
## Summary
This removes the old app-server v1 methods and notifications we no
longer need, while keeping the small set the main codex app client still
depends on for now.

The remaining legacy surface is:
- `initialize`
- `getConversationSummary`
- `getAuthStatus`
- `gitDiffToRemote`
- `fuzzyFileSearch`
- `fuzzyFileSearch/sessionStart`
- `fuzzyFileSearch/sessionUpdate`
- `fuzzyFileSearch/sessionStop`

And the raw `codex/event/*` notifications emitted from core. These
notifications will be removed in a followup PR.

## What changed
- removed deprecated v1 request variants from the protocol and
app-server dispatcher
- removed deprecated typed notifications: `authStatusChange`,
`loginChatGptComplete`, and `sessionConfigured`
- updated the app-server test client to use v2 flows instead of deleted
v1 flows
- deleted legacy-only app-server test suites and added focused coverage
for `getConversationSummary`
- regenerated app-server schema fixtures and updated the MCP interface
docs to match the remaining compatibility surface

## Testing
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
2026-03-03 13:18:25 -08:00
Owen Lin
9965bf31fa feat(app-server-test-client): support tracing (#13286) 2026-03-02 17:24:48 -08:00
Ruslan Nigmatullin
70ed6cbc71 app-server: Add an ability to watch events in the test client (#13080)
Add a `watch` subcommand to `codex-app-server-test-client` binary to
help in manual testing of events flow.
2026-02-27 17:19:53 -08:00
Michael Bolin
14116ade8d feat: include available decisions in command approval requests (#12758)
Command-approval clients currently infer which choices to show from
side-channel fields like `networkApprovalContext`,
`proposedExecpolicyAmendment`, and `additionalPermissions`. That makes
the request shape harder to evolve, and it forces each client to
replicate the server's heuristics instead of receiving the exact
decision list for the prompt.

This PR introduces a mapping between `CommandExecutionApprovalDecision`
and `codex_protocol::protocol::ReviewDecision`:

```rust
impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
    fn from(value: CoreReviewDecision) -> Self {
        match value {
            CoreReviewDecision::Approved => Self::Accept,
            CoreReviewDecision::ApprovedExecpolicyAmendment {
                proposed_execpolicy_amendment,
            } => Self::AcceptWithExecpolicyAmendment {
                execpolicy_amendment: proposed_execpolicy_amendment.into(),
            },
            CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
            CoreReviewDecision::NetworkPolicyAmendment {
                network_policy_amendment,
            } => Self::ApplyNetworkPolicyAmendment {
                network_policy_amendment: network_policy_amendment.into(),
            },
            CoreReviewDecision::Abort => Self::Cancel,
            CoreReviewDecision::Denied => Self::Decline,
        }
    }
}
```

And updates `CommandExecutionRequestApprovalParams` to have a new field:

```rust
available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
```

when, if specified, should make it easier for clients to display an
appropriate list of options in the UI.

This makes it possible for `CoreShellActionProvider::prompt()` in
`unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly,
adding support for `ApprovedForSession` when approving a skill script,
which was previously missing in the TUI.

Note this results in a significant change to `exec_options()` in
`approval_overlay.rs`, as the displayed options are now derived from
`available_decisions: &[ReviewDecision]`.

## What Changed

- Add `available_decisions` to
[`ExecApprovalRequestEvent`](de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)),
including helpers to derive the legacy default choices when older
senders omit the field.
- Map `codex_protocol::protocol::ReviewDecision` to app-server
`CommandExecutionApprovalDecision` and expose the ordered list as
experimental `availableDecisions` in
[`CommandExecutionRequestApprovalParams`](de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)).
- Thread optional `available_decisions` through the core approval path
so Unix shell escalation can explicitly request `ApprovedForSession` for
session-scoped approvals instead of relying on client heuristics.
[`unix_escalation.rs`](de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214))
- Update the TUI approval overlay to build its buttons from the ordered
decision list, while preserving the legacy fallback when
`available_decisions` is missing.
- Update the app-server README, test client output, and generated schema
artifacts to document and surface the new field.

## Testing

- Add `approval_overlay.rs` coverage for explicit decision lists,
including the generic `ApprovedForSession` path and network approval
options.
- Update `chatwidget/tests.rs` and app-server protocol tests to populate
the new optional field and keep older event shapes working.

## Developers Docs

- If we document `item/commandExecution/requestApproval` on
[developers.openai.com/codex](https://developers.openai.com/codex), add
experimental `availableDecisions` as the preferred source of approval
choices and note that older servers may omit it.
2026-02-26 01:10:46 +00:00
Celia Chen
4f45668106 Revert "Add skill approval event/response (#12633)" (#12811)
This reverts commit https://github.com/openai/codex/pull/12633. We no
longer need this PR, because we favor sending normal exec command
approval server request with `additional_permissions` of skill
permissions instead
2026-02-26 01:02:42 +00:00
jif-oai
f46b767b7e feat: add search term to thread list (#12578)
Add `searchTerm` to `thread/list` that will search for a match in the
titles (the condition being `searchTerm` $$\in$$ `title`)
2026-02-25 09:59:41 +00:00
viyatb-oai
c086b36b58 feat(ui): add network approval persistence plumbing (#12358)
## Summary
- add TUI approval options for persistent network host rules
- add app-server v2 approval payload plumbing for network approval
context + proposed network policy amendments
- add app-server handling to translate `applyNetworkPolicyAmendment`
decisions back into core review decisions
- update docs/test client output and generated app-server schemas/types
2026-02-25 07:06:19 +00:00
Celia Chen
1151972fb2 feat: add experimental additionalPermissions to v2 command execution approval requests (#12737)
This adds additionalPermissions to the app-server v2
item/commandExecution/requestApproval payload as an experimental field.

The field is now exposed on CommandExecutionRequestApprovalParams and is
populated from the existing core approval event when a command requests
additional sandbox permissions.

This PR also contains changes to make server requests to support
experiment API.

A real app server test client test:

sample payload with experimental flag off:
```
 {
<   "id": 0,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "command": "/bin/zsh -lc 'mkdir -p ~/some/test && touch ~/some/test/file'",
<     "commandActions": [
<       {
<         "command": "mkdir -p '~/some/test'",
<         "type": "unknown"
<       },
<       {
<         "command": "touch '~/some/test/file'",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Users/celia/code/codex/codex-rs",
<     "itemId": "call_QLp0LWkQ1XkU6VW9T2vUZFWB",
<     "proposedExecpolicyAmendment": [
<       "mkdir",
<       "-p",
<       "~/some/test"
<     ],
<     "reason": "Do you want to allow creating ~/some/test/file outside the workspace?",
<     "threadId": "019c9309-e209-7d82-a01b-dcf9556a354d",
<     "turnId": "019c9309-e27a-7f33-834f-6011e795c2d6"
<   }
< }
```
with experimental flag on: 
```
< {
<   "id": 0,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "additionalPermissions": {
<       "fileSystem": null,
<       "macos": null,
<       "network": true
<     },
<     "command": "/bin/zsh -lc 'install -D /dev/null ~/some/test/file'",
<     "commandActions": [
<       {
<         "command": "install -D /dev/null '~/some/test/file'",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Users/celia/code/codex/codex-rs",
<     "itemId": "call_K3U4b3dRbj3eMCqslmncbGsq",
<     "proposedExecpolicyAmendment": [
<       "install",
<       "-D"
<     ],
<     "reason": "Do you want to allow creating the file at ~/some/test/file outside the workspace sandbox?",
<     "threadId": "019c9303-3a8e-76e1-81bf-d67ac446d892",
<     "turnId": "019c9303-3af1-7143-88a1-73132f771234"
<   }
< }
```
2026-02-25 05:16:35 +00:00
pakrym-oai
58763afa0f Add skill approval event/response (#12633)
Set the stage for skill-level permission approval in addition to
command-level.

Behind a feature flag.
2026-02-23 22:28:58 -08:00
viyatb-oai
e8afaed502 Refactor network approvals to host/protocol/port scope (#12140)
## Summary
Simplify network approvals by removing per-attempt proxy correlation and
moving to session-level approval dedupe keyed by (host, protocol, port).
Instead of encoding attempt IDs into proxy credentials/URLs, we now
treat approvals as a destination policy decision.

- Concurrent calls to the same destination share one approval prompt.
- Different destinations (or same host on different ports) get separate
prompts.
- Allow once approves the current queued request group only.
- Allow for session caches that (host, protocol, port) and auto-allows
future matching requests.
- Never policy continues to deny without prompting.

Example:
- 3 calls: 
  - a.com (line 443)
  - b.com (line 443)
  - a.com (line 443)
=> 2 prompts total (a, b), second a waits on the first decision.
- a.com:80 is treated separately from a.com line 443

## Testing
- `just fmt` (in `codex-rs`)
- `cargo test -p codex-core tools::network_approval::tests`
- `cargo test -p codex-core` (unit tests pass; existing
integration-suite failures remain in this environment)
2026-02-20 10:39:55 -08:00
Owen Lin
edacbf7b6e feat(core): zsh exec bridge (#12052)
zsh fork PR stack:
- https://github.com/openai/codex/pull/12051 
- https://github.com/openai/codex/pull/12052 👈 

### Summary
This PR introduces a feature-gated native shell runtime path that routes
shell execution through a patched zsh exec bridge, removing MCP-specific
behavior from the shell hot path while preserving existing
CommandExecution lifecycle semantics.

When shell_zsh_fork is enabled, shell commands run via patched zsh with
per-`execve` interception through EXEC_WRAPPER. Core receives wrapper
IPC requests over a Unix socket, applies existing approval policy, and
returns allow/deny before the subcommand executes.

### What’s included
**1) New zsh exec bridge runtime in core**
- Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for
EXEC_WRAPPER invocations.
- Per-execution Unix-socket IPC handling for wrapper requests/responses.
- Approval callback integration using existing core approval
orchestration.
- Streaming stdout/stderr deltas to existing command output event
pipeline.
- Error handling for malformed IPC, denial/abort, and execution
failures.

**2) Session lifecycle integration**
SessionServices now owns a `ZshExecBridge`.
Session startup initializes bridge state; shutdown tears it down
cleanly.

**3) Shell runtime routing (feature-gated)**
When `shell_zsh_fork` is enabled:
- Build execution env/spec as usual.
- Add wrapper socket env wiring.
- Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of
the regular shell path.
- Non-zsh-fork behavior remains unchanged.

**4) Config + feature wiring**
- Added `Feature::ShellZshFork` (under development).
- Added config support for `zsh_path` (optional absolute path to patched
zsh):
- `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema.
- Session startup validates that `zsh_path` exists/usable when zsh-fork
is enabled.
- Added startup test for missing `zsh_path` failure mode.

**5) Seatbelt/sandbox updates for wrapper IPC**
- Extended seatbelt policy generation to optionally allow outbound
connection to explicitly permitted Unix sockets.
- Wired sandboxing path to pass wrapper socket path through to seatbelt
policy generation.
- Added/updated seatbelt tests for explicit socket allow rule and
argument emission.

**6) Runtime entrypoint hooks**
- This allows the same binary to act as the zsh wrapper subprocess when
invoked via `EXEC_WRAPPER`.

**7) Tool selection behavior**
- ToolsConfig now prefers ShellCommand type when shell_zsh_fork is
enabled.
- Added test coverage for precedence with unified-exec enabled.
2026-02-17 20:19:53 -08:00
Owen Lin
db4d2599b5 feat(core): plumb distinct approval ids for command approvals (#12051)
zsh fork PR stack:
- https://github.com/openai/codex/pull/12051 👈 
- https://github.com/openai/codex/pull/12052

With upcoming support for a fork of zsh that allows us to intercept
`execve` and run execpolicy checks for each subcommand as part of a
`CommandExecution`, it will be possible for there to be multiple
approval requests for a shell command like `/path/to/zsh -lc 'git status
&& rg \"TODO\" src && make test'`.

To support that, this PR introduces a new `approval_id` field across
core, protocol, and app-server so that we can associate approvals
properly for subcommands.
2026-02-18 01:55:57 +00:00
Max Johnson
fb0aaf94de codex-rs: fix thread resume rejoin semantics (#11756)
## Summary
- always rejoin an in-memory running thread on `thread/resume`, even
when overrides are present
- reject `thread/resume` when `history` is provided for a running thread
- reject `thread/resume` when `path` mismatches the running thread
rollout path
- warn (but do not fail) on override mismatches for running threads
- add more `thread_resume` integration tests and fixes; including
restart-based resume-with-overrides coverage

## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all thread_resume`
- manual test with app-server-test-client
https://github.com/openai/codex/pull/11755
- manual test both stdio and websocket in app
2026-02-13 23:09:58 +00:00
Max Johnson
f687b074ca app-server-test-client websocket client and thread tools (#11755)
- add websocket endpoint mode with default ws://127.0.0.1:4222 while
keeping stdio codex-bin path compatibility
- add thread-resume (follow stream) and thread-list commands for manual
thread lifecycle testing
- quickstart docs
2026-02-13 17:34:35 +00:00
Michael Bolin
abbd74e2be feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.

It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.

## What

- Added `ReadOnlyAccess` in protocol with:
  - `Restricted { include_platform_defaults, readable_roots }`
  - `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
  - `ReadOnly { access: ReadOnlyAccess }`
  - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.

## Compatibility / rollout

- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
Michael Bolin
ead38c3d1c fix: remove errant Cargo.lock files (#11526)
These leaked into the repo:

- #4905 `codex-rs/windows-sandbox-rs/Cargo.lock`
- #5391 `codex-rs/app-server-test-client/Cargo.lock`

Note that these affect cache keys such as:


9722567a80/.github/workflows/rust-release.yml (L154)

so it seems best to remove them.
2026-02-12 02:28:02 +00:00
Celia Chen
641d5268fa chore: persist turn_id in rollout session and make turn_id uuid based (#11246)
Problem:
1. turn id is constructed in-memory;
2. on resuming threads, turn_id might not be unique;
3. client cannot no the boundary of a turn from rollout files easily.

This PR does three things:
1. persist `task_started` and `task_complete` events;
1. persist `turn_id` in rollout turn events;
5. generate turn_id as unique uuids instead of incrementing it in
memory.

This helps us resolve the issue of clients wanting to have unique turn
ids for resuming a thread, and knowing the boundry of each turn in
rollout files.

example debug logs
```
2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
```

backward compatibility:
if you try to resume an old session without task_started and
task_complete event populated, the following happens:
- If you resume and do nothing: those reconstructed historical IDs can
differ next time you resume.
- If you resume and send a new turn: the new turn gets a fresh UUID from
live submission flow and is persisted, so that new turn’s ID is stable
on later resumes.
I think this behavior is fine, because we only care about deterministic
turn id once a turn is triggered.
2026-02-11 03:56:01 +00:00
jif-oai
a364dd8b56 feat: opt-out of events in the app-server (#11319)
Add `optOutNotificationMethods` in the app-server to opt-out events
based on exact method matching
2026-02-10 18:04:52 +00:00
Celia Chen
16647b188b chore: add codex debug app-server tooling (#10367)
codex debug app-server <user message> forwards the message through
codex-app-server-test-client’s send_message_v2 library entry point,
using std::env::current_exe() to resolve the codex binary.

for how it looks like, see:

```
celia@com-92114 codex-rs % cargo build -p codex-cli && target/debug/codex debug app-server --help                       
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.34s
Tooling: helps debug the app server

Usage: codex debug app-server [OPTIONS] <COMMAND>

Commands:
  send-message-v2  
  help             Print this message or the help of the given subcommand(s)
````
and
```
celia@com-92114 codex-rs % cargo build -p codex-cli && target/debug/codex debug app-server send-message-v2 "hello world"
   Compiling codex-cli v0.0.0 (/Users/celia/code/codex/codex-rs/cli)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.38s
> {
>   "method": "initialize",
>   "id": "f8ba9f60-3a49-4ea9-81d6-4ab6853e3954",
>   "params": {
>     "clientInfo": {
>       "name": "codex-toy-app-server",
>       "title": "Codex Toy App Server",
>       "version": "0.0.0"
>     },
>     "capabilities": {
>       "experimentalApi": true
>     }
>   }
> }
< {
<   "id": "f8ba9f60-3a49-4ea9-81d6-4ab6853e3954",
<   "result": {
<     "userAgent": "codex-toy-app-server/0.0.0 (Mac OS 26.2.0; arm64) vscode/2.4.27 (codex-toy-app-server; 0.0.0)"
<   }
< }
< initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.2.0; arm64) vscode/2.4.27 (codex-toy-app-server; 0.0.0)" }
> {
>   "method": "thread/start",
>   "id": "203f1630-beee-4e60-b17b-9eff16b1638b",
>   "params": {
>     "model": null,
>     "modelProvider": null,
>     "cwd": null,
>     "approvalPolicy": null,
>     "sandbox": null,
>     "config": null,
>     "baseInstructions": null,
>     "developerInstructions": null,
>     "personality": null,
>     "ephemeral": null,
>     "dynamicTools": null,
>     "mockExperimentalField": null,
>     "experimentalRawEvents": false
>   }
> }
...
```
2026-02-03 23:17:34 +00:00
Michael Bolin
66447d5d2c feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:


8b95d3e082/codex-rs/mcp-types/README.md

Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.

Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:

```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```

so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:

```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```

so the deletion of those individual variants should not be a cause of
great concern.

Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:

```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```

so:

- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`

and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
2026-02-02 17:41:55 -08:00
Celia Chen
fb2df99cf1 [feat] persist thread_dynamic_tools in db (#10252)
Persist thread_dynamic_tools in sqlite and read first from it. Fall back
to rollout files if it's not found. Persist dynamic tools to both sqlite
and rollout files.

Saw that new sessions get populated to db correctly & old sessions get
backfilled correctly at startup:
```
celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \      "select thread_id, position,name,description,input_schema from thread_dynamic_tools;"
019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
....
019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
```
2026-02-03 00:06:44 +00:00
jif-oai
3cc9122ee2 feat: experimental flags (#10231)
## Problem being solved
- We need a single, reliable way to mark app-server API surface as
experimental so that:
  1. the runtime can reject experimental usage unless the client opts in
2. generated TS/JSON schemas can exclude experimental methods/fields for
stable clients.

Right now that’s easy to drift or miss when done ad-hoc.

## How to declare experimental methods and fields
- **Experimental method**: add `#[experimental("method/name")]` to the
`ClientRequest` variant in `client_request_definitions!`.
- **Experimental field**: on the params struct, derive `ExperimentalApi`
and annotate the field with `#[experimental("method/name.field")]` + set
`inspect_params: true` for the method variant so
`ClientRequest::experimental_reason()` inspects params for experimental
fields.

## How the macro solves it
- The new derive macro lives in
`codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via
`#[derive(ExperimentalApi)]` plus `#[experimental("reason")]`
attributes.
- **Structs**:
- Generates `ExperimentalApi::experimental_reason(&self)` that checks
only annotated fields.
  - The “presence” check is type-aware:
    - `Option<T>`: `is_some_and(...)` recursively checks inner.
    - `Vec`/`HashMap`/`BTreeMap`: must be non-empty.
    - `bool`: must be `true`.
    - Other types: considered present (returns `true`).
- Registers each experimental field in an `inventory` with `(type_name,
serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for
that type. Field names are converted from `snake_case` to `camelCase`
for schema/TS filtering.
- **Enums**:
- Generates an exhaustive `match` returning `Some(reason)` for annotated
variants and `None` otherwise (no wildcard arm).
- **Wiring**:
- Runtime gating uses `ExperimentalApi::experimental_reason()` in
`codex-rs/app-server/src/message_processor.rs` to reject requests unless
`InitializeParams.capabilities.experimental_api == true`.
- Schema/TS export filters use the inventory list and
`EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to
strip experimental methods/fields when `experimental_api` is false.
2026-02-02 11:06:50 +00:00
Shijie Rao
a4cb97ba5a Chore: add cmd related info to exec approval request (#9659)
### Summary
We now rely purely on `item/commandExecution/requestApproval` item to
render pending approval in VSCE and app. With v2 approach, it does not
include the actual cmd that it is attempting and therefore we can only
use `proposedExecpolicyAmendment` to render which can be incomplete.

### Reproduce
* Add `prefix_rule(pattern=["echo"], decision="prompt")` to your
`~/.codex/rules.default.rules`.
* Ask to `Run  echo "approval-test" please` in VSCE or app. 
* The pending approval protal does show up but with no content

#### Example screenshot
<img width="3434" height="3648" alt="Screenshot 2026-01-21 at 8 23
25 PM"
src="https://github.com/user-attachments/assets/75644837-21f1-40f8-8b02-858d361ff817"
/>

#### Sample output
```
  {"method":"item/commandExecution/requestApproval","id":0,"params":{
    "threadId":"019be439-5a90-7600-a7ea-2d2dcc50302a",
    "turnId":"0",
    "itemId":"call_usgnQ4qEX5U9roNdjT7fPzhb",
    "reason":"`/bin/zsh -lc 'echo \"testing\"'` requires approval by policy",
    "proposedExecpolicyAmendment":null
  }}

```

### Fix
Inlude `command` string, `cwd` and `command_actions` in
`CommandExecutionRequestApprovalParams` so that consumers can display
the correct command instead of relying on exec policy output.
2026-01-21 23:58:53 -08:00
charley-oai
eb90e20c0b Persist text elements through TUI input and history (#9393)
Continuation of breaking up this PR
https://github.com/openai/codex/pull/9116

## Summary
- Thread user text element ranges through TUI/TUI2 input, submission,
queueing, and history so placeholders survive resume/edit flows.
- Preserve local image attachments alongside text elements and rehydrate
placeholders when restoring drafts.
- Keep model-facing content shapes clean by attaching UI metadata only
to user input/events (no API content changes).

## Key Changes
- TUI/TUI2 composer now captures text element ranges, trims them with
text edits, and restores them when submission is suppressed.
- User history cells render styled spans for text elements and keep
local image paths for future rehydration.
- Initial chat widget bootstraps accept empty `initial_text_elements` to
keep initialization uniform.
- Protocol/core helpers updated to tolerate the new InputText field
shape without changing payloads sent to the API.
2026-01-19 23:49:34 -08:00
charley-oai
1fa8350ae7 Add text element metadata to protocol, app server, and core (#9331)
The second part of breaking up PR
https://github.com/openai/codex/pull/9116

Summary:

- Add `TextElement` / `ByteRange` to protocol user inputs and user
message events with defaults.
- Thread `text_elements` through app-server v1/v2 request handling and
history rebuild.
- Preserve UI metadata only in user input/events (not `ContentItem`)
while keeping local image attachments in user events for rehydration.

Details:

- Protocol: `UserInput::Text` carries `text_elements`;
`UserMessageEvent` carries `text_elements` + `local_images`.
Serialization includes empty vectors for backward compatibility.
- app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
camelCase with conversions; v2 uses its own camelCase wrapper.
- app-server: v1/v2 input mapping includes `text_elements`; thread
history rebuilds include them.
- Core: user event emission preserves UI metadata while model history
stays clean; history replay round-trips the metadata.
2026-01-15 17:26:41 -08:00
charley-oai
4a9c2bcc5a Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
zbarsky-openai
2a06d64bc9 feat: add support for building with Bazel (#8875)
This PR configures Codex CLI so it can be built with
[Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
includes configuration so that remote builds can be done using
[BuildBuddy](https://www.buildbuddy.io).

If you are familiar with Bazel, things should work as you expect, e.g.,
run `bazel test //... --keep-going` to run all the tests in the repo,
but we have also added some new aliases in the `justfile` for
convenience:

- `just bazel-test` to run tests locally
- `just bazel-remote-test` to run tests remotely (currently, the remote
build is for x86_64 Linux regardless of your host platform). Note we are
currently seeing the following test failures in the remote build, so we
still need to figure out what is happening here:

```
failures:
    suite::compact::manual_compact_twice_preserves_latest_user_messages
    suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
    suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
```

- `just build-for-release` to build release binaries for all
platforms/architectures remotely

To setup remote execution:
- [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
employees should also request org access at
https://openai.buildbuddy.io/join/ with their `@openai.com` email
address.)
- [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
`~/.bazelrc` (add the line `build
--remote_header=x-buildbuddy-api-key=YOUR_KEY`)
- Use `--config=remote` in your `bazel` invocations (or add `common
--config=remote` to your `~/.bazelrc`, or use the `just` commands)

## CI

In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
(we are working on supporting Windows, but that is not ready yet). Note
that the failures we are seeing in `just bazel-remote-test` do not occur
on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
is green right now.

The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
that macOS CI jobs build _remotely_ on Linux hosts (using the
`docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
root `BUILD.bazel`) using cross-compilation to build the macOS
artifacts. Then these artifacts are downloaded locally to GitHub's macOS
runner so the tests can be executed natively. This is the relevant
config that enables this:

```
common:macos --config=remote
common:macos --strategy=remote
common:macos --strategy=TestRunner=darwin-sandbox,local
```

Because of the remote caching benefits we get from BuildBuddy, these new
CI jobs can be extremely fast! For example, consider these two jobs that
ran all the tests on Linux x86_64:

- Bazel 1m37s
https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
- Cargo 9m20s
https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875

For now, we will continue to run both the Bazel and Cargo jobs for PRs,
but once we add support for Windows and running Clippy, we should be
able to cutover to using Bazel exclusively for PRs, which should still
speed things up considerably. We will probably continue to run the Cargo
jobs post-merge for commits that land on `main` as a sanity check.

Release builds will also continue to be done by Cargo for now.

Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
Earlier attempt to add support for Buck2, now abandoned:
https://github.com/openai/codex/pull/8504

---------

Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-09 11:09:43 -08:00
jif-oai
1aed01e99f renaming: task to turn (#8963) 2026-01-09 17:31:17 +00:00
Owen Lin
66450f0445 fix: implement 'Allow this session' for apply_patch approvals (#8451)
**Summary**
This PR makes “ApprovalDecision::AcceptForSession / don’t ask again this
session” actually work for `apply_patch` approvals by caching approvals
based on absolute file paths in codex-core, properly wiring it through
app-server v2, and exposing the choice in both TUI and TUI2.
- This brings `apply_patch` calls to be at feature-parity with general
shell commands, which also have a "Yes, and don't ask again" option.
- This also fixes VSCE's "Allow this session" button to actually work.

While we're at it, also split the app-server v2 protocol's
`ApprovalDecision` enum so execpolicy amendments are only available for
command execution approvals.

**Key changes**
- Core: per-session patch approval allowlist keyed by absolute file
paths
- Handles multi-file patches and renames/moves by recording both source
and destination paths for `Update { move_path: Some(...) }`.
- Extend the `Approvable` trait and `ApplyPatchRuntime` to work with
multiple keys, because an `apply_patch` tool call can modify multiple
files. For a request to be auto-approved, we will need to check that all
file paths have been approved previously.
- App-server v2: honor AcceptForSession for file changes
- File-change approval responses now map AcceptForSession to
ReviewDecision::ApprovedForSession (no longer downgraded to plain
Approved).
- Replace `ApprovalDecision` with two enums:
`CommandExecutionApprovalDecision` and `FileChangeApprovalDecision`
- TUI / TUI2: expose “don’t ask again for these files this session”
- Patch approval overlays now include a third option (“Yes, and don’t
ask again for these files this session (s)”).
    - Snapshot updates for the approval modal.

**Tests added/updated**
- Core:
- Integration test that proves ApprovedForSession on a patch skips the
next patch prompt for the same file
- App-server:
- v2 integration test verifying
FileChangeApprovalDecision::AcceptForSession works properly

**User-visible behavior**
- When the user approves a patch “for session”, future patches touching
only those previously approved file(s) will no longer prompt gain during
that session (both via app-server v2 and TUI/TUI2).

**Manual testing**
Tested both TUI and TUI2 - see screenshots below.

TUI:
<img width="1082" height="355" alt="image"
src="https://github.com/user-attachments/assets/adcf45ad-d428-498d-92fc-1a0a420878d9"
/>


TUI2:
<img width="1089" height="438" alt="image"
src="https://github.com/user-attachments/assets/dd768b1a-2f5f-4bd6-98fd-e52c1d3abd9e"
/>
2026-01-07 20:11:12 +00:00
jif-oai
116059c3a0 chore: unify conversation with thread name (#8830)
Done and verified by Codex + refactor feature of RustRover
2026-01-07 17:04:53 +00:00
Celia Chen
11d4f3f45e [app-server] fix config loading for conversations (#8765)
Currently we don't load config properly for app server conversations.
see:
https://linear.app/openai/issue/CODEX-3956/config-flags-not-respected-in-codex-app-server.
This PR fixes that by respecting the config passed in.

Tested by running `cargo build -p codex-cli &&
RUST_LOG=codex_app_server=debug CODEX_BIN=target/debug/codex cargo run
-p codex-app-server-test-client -- \
--config
model_providers.mock_provider.base_url=\"http://localhost:4010/v2\" \
    --config model_provider=\"mock_provider\" \
    --config model_providers.mock_provider.name="hello" \
    send-message-v2 "hello"`
and verified that the mock_provider is called instead of default
provider.

#closes
https://linear.app/openai/issue/CODEX-3956/config-flags-not-respected-in-codex-app-server

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-06 22:02:17 +00:00
Owen Lin
cab7136fb3 chore: add model/list call to app-server-test-client (#8331)
Allows us to run `cargo run -p codex-app-server-test-client --
model-list` to return the list of models over app-server.
2026-01-06 17:50:17 +00:00
Eric Traut
c4af707e09 Removed experimental "command risk assessment" feature (#7799)
This experimental feature received lukewarm reception during internal
testing. Removing from the code base.
2025-12-10 09:48:11 -08:00
jif-oai
0ad54982ae chore: rework unified exec events (#7775) 2025-12-10 10:30:38 +00:00
zhao-oai
0a32acaa2d updating app server types to support execpoilcy amendment (#7747)
also includes minor refactor merging `ApprovalDecision` with
`CommandExecutionRequestAcceptSettings`
2025-12-08 13:56:22 -08:00
Owen Lin
77c457121e fix: remove serde(flatten) annotation for TurnError (#7499)
The problem with using `serde(flatten)` on Turn status is that it
conditionally serializes the `error` field, which is not the pattern we
want in API v2 where all fields on an object should always be returned.

```
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct Turn {
    pub id: String,
    /// Only populated on a `thread/resume` response.
    /// For all other responses and notifications returning a Turn,
    /// the items field will be an empty list.
    pub items: Vec<ThreadItem>,
    #[serde(flatten)]
    pub status: TurnStatus,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "status", rename_all = "camelCase")]
#[ts(tag = "status", export_to = "v2/")]
pub enum TurnStatus {
    Completed,
    Interrupted,
    Failed { error: TurnError },
    InProgress,
}
```

serializes to:
```
{
  "id": "turn-123",
  "items": [],
  "status": "completed"
}

{
  "id": "turn-123",
  "items": [],
  "status": "failed",
  "error": {
    "message": "Tool timeout",
    "codexErrorInfo": null
  }
}
```

Instead we want:
```
{
  "id": "turn-123",
  "items": [],
  "status": "completed",
  "error": null
}

{
  "id": "turn-123",
  "items": [],
  "status": "failed",
  "error": {
    "message": "Tool timeout",
    "codexErrorInfo": null
  }
}
```
2025-12-02 21:39:10 +00:00
Celia Chen
0dd822264a [app-server-test-client] add send-followup-v2 (#7271)
Add a new endpoint that allows us to test multi-turn behavior.

Tested with running:
```
RUST_LOG=codex_app_server=debug CODEX_BIN=target/debug/codex \
      cargo run -p codex-app-server-test-client -- \
      send-follow-up-v2 "hello" "and now a follow-up question"
```
2025-11-25 08:04:27 +00:00
Josh McKinney
ec49b56874 chore: add cargo-deny configuration (#7119)
- add GitHub workflow running cargo-deny on push/PR
- document cargo-deny allowlist with workspace-dep notes and advisory
ignores
- align workspace crates to inherit version/edition/license for
consistent checks
2025-11-24 12:22:18 -08:00
Owen Lin
d6c30ed25e [app-server] feat: v2 apply_patch approval flow (#6760)
This PR adds the API V2 version of the apply_patch approval flow, which
centers around `ThreadItem::FileChange`.

This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
and related events (`item/started`, `item/completed` for
`ThreadItem::FileChange`, which are emitted in both V1 and V2) through
the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.

Similar to https://github.com/openai/codex/pull/6758, the approach I
took was to make as few changes to the Codex core as possible,
leveraging existing `EventMsg` core events, and translating those in
app-server. I did have to add a few additional fields to
`EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
were fairly lightweight.

However, the `EventMsg`s emitted by core are the following:
```
1) Auto-approved (no request for approval)

- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

2) Approved by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

3) Declined by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
```

For a request triggering an approval, this would result in:
```
item/fileChange/requestApproval
item/started
item/completed
```

which is different from the `ThreadItem::CommandExecution` flow
introduced in https://github.com/openai/codex/pull/6758, which does the
below and is preferable:
```
item/started
item/commandExecution/requestApproval
item/completed
```

To fix this, we leverage `TurnSummaryStore` on codex_message_processor
to store a little bit of state, allowing us to fire `item/started` and
`item/fileChange/requestApproval` whenever we receive the underlying
`EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
`EventMsg::PatchApplyBegin` later.

This is much less invasive than modifying the order of EventMsg within
core (I tried).

The resulting payloads:
```
{
  "method": "item/started",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "inProgress",
      "type": "fileChange"
    }
  }
}
```

```
{
  "id": 0,
  "method": "item/fileChange/requestApproval",
  "params": {
    "grantRoot": null,
    "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
    "reason": null,
    "threadId": "019a9e11-8295-7883-a283-779e06502c6f",
    "turnId": "1"
  }
}
```

```
{
  "id": 0,
  "result": {
    "decision": "accept"
  }
}
```

```
{
  "method": "item/completed",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "completed",
      "type": "fileChange"
    }
  }
}
```
2025-11-19 20:13:31 -08:00
Celia Chen
b395dc1be6 [app-server] introduce turn/completed v2 event (#6800)
similar to logic in
`codex/codex-rs/exec/src/event_processor_with_jsonl_output.rs`.
translation of v1 -> v2 events:
`codex/event/task_complete` -> `turn/completed`
`codex/event/turn_aborted` -> `turn/completed` with `interrupted` status
`codex/event/error` -> `turn/completed` with `error` status

this PR also makes `items` field in `Turn` optional. For now, we only
populate it when we resume a thread, and leave it as None for all other
places until we properly rewrite core to keep track of items.

tested using the codex app server client. example new event:
```
< {
<   "method": "turn/completed",
<   "params": {
<     "turn": {
<       "id": "0",
<       "items": [],
<       "status": "interrupted"
<     }
<   }
< }
```
2025-11-19 01:55:24 +00:00
Owen Lin
b3a824ae3c [app-server-test-client] feat: auto approve command (#6852) 2025-11-18 15:25:02 -08:00
Owen Lin
c3951e505d feat: add app-server-test-client crate for internal use (#5391)
For app-server development it's been helpful to be able to trigger some
test flows end-to-end and print the JSON-RPC messages sent between
client and server.
2025-11-14 12:39:58 -08:00