Compare commits

3 Commits

Author SHA1 Message Date
Michael Bolin
a4ec8ce2e5 stabilize zsh-fork approvals and resume --last 2026-03-28 14:10:08 -07:00
Michael Bolin
ae8a3be958 bazel: refresh the expired macOS SDK pin (#16128)
## Why

macOS BuildBuddy started failing before target analysis because the
Apple CDN object pinned in
[`MODULE.bazel`](fce0f76d57/MODULE.bazel (L28-L36))
now returns `403 Forbidden`. The failure report that triggered this
change was this [BuildBuddy
invocation](https://app.buildbuddy.io/invocation/c57590e0-1bdb-4e19-a86f-74d4a7ded228).

This repo uses `@llvm//extensions:osx.bzl` via `osx.from_archive(...)`,
and that API does not discover a current SDK URL for us. It fetches
exactly the `urls`, `sha256`, and `strip_prefix` we pin. Once Apple
retires that `swcdn.apple.com` object, `@macos_sdk` stops resolving and
every downstream macOS build fails during external repository fetch.

This is the same basic failure mode we hit in
[b9fa08ec61](b9fa08ec61):
the pin itself aged out.

## How I tracked it down

1. I started from the BuildBuddy error and copied the exact
`swcdn.apple.com/.../CLTools_macOSNMOS_SDK.pkg` URL from the failure.
2. I reproduced the issue outside CI by opening that URL directly in a
browser and by running `curl -I` against it locally. Both returned `403
Forbidden`, which ruled out BuildBuddy as the root cause.
3. I searched the repo for that URL and found it hardcoded in
`MODULE.bazel`.
4. I inspected the `llvm` Bzlmod `osx` extension implementation to
confirm that `osx.from_archive(...)` is just a literal fetch of the
pinned archive metadata. There is no automatic fallback or catalog
lookup behind it.
5. I queried Apple's software update catalogs to find the current
Command Line Tools package for macOS 26.x. The useful catalog was:
   - `https://swscan.apple.com/content/catalogs/others/index-26-15-14-13-12-10.16-10.15-10.14-10.13-10.12-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz`

This is scriptable; it does not require opening a website in a browser.
The catalog is a gzip-compressed plist served over HTTP, so the workflow
is just:

   1. fetch the catalog,
   2. decompress it,
   3. search or parse the plist for `CLTools_macOSNMOS_SDK.pkg` entries,
   4. inspect the matching product metadata.

   The quick shell version I used was:

   ```shell
   curl -L <catalog-url> \
     | gzip -dc \
     | rg -n -C 6 'CLTools_macOSNMOS_SDK\.pkg|PostDate|English\.dist'
   ```

That is enough to surface the current product id, package URL, post
date, and the matching `.dist` file. If we want something less
grep-driven next time, the same catalog can be parsed structurally. For
example:

   ```python
   import gzip
   import plistlib
   import urllib.request

   url = "https://swscan.apple.com/content/catalogs/others/index-26-15-14-13-12-10.16-10.15-10.14-10.13-10.12-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz"
   with urllib.request.urlopen(url) as resp:
       catalog = plistlib.loads(gzip.decompress(resp.read()))

   for product_id, product in catalog["Products"].items():
       for package in product.get("Packages", []):
           package_url = package.get("URL", "")
           if package_url.endswith("CLTools_macOSNMOS_SDK.pkg"):
               print(product_id)
               print(product.get("PostDate"))
               print(package_url)
               print(product.get("Distributions", {}).get("English"))
   ```

In practice, `curl` was only the transport. The important part is that
the catalog itself is a machine-readable plist, so this can be
automated.
6. That catalog contains the newer `047-96692` Command Line Tools
release, and its distribution file identifies it as [Command Line Tools
for Xcode
26.4](https://swdist.apple.com/content/downloads/32/53/047-96692-A_OAHIHT53YB/ybtshxmrcju8m2qvw3w5elr4rajtg1x3y3/047-96692.English.dist).
7. I downloaded that package locally, computed its SHA-256, expanded it
with `pkgutil --expand-full`, and verified that it contains
`Payload/Library/Developer/CommandLineTools/SDKs/MacOSX26.4.sdk`, which
is the correct new `strip_prefix` for this pin.
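
Step 7 can also be scripted. The following is a minimal sketch, not the exact commands used for this change: `sha256_of_file` and `sdk_strip_prefixes` are hypothetical helper names, and the sketch assumes the package has already been expanded with `pkgutil --expand-full` into a local directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sdk_strip_prefixes(expanded_root):
    """List candidate `strip_prefix` values under an expanded package.

    Assumption: symlinked SDK directories are aliases rather than real
    payloads, so they are skipped.
    """
    sdks = Path(expanded_root) / "Payload/Library/Developer/CommandLineTools/SDKs"
    if not sdks.is_dir():
        return []
    return sorted(
        p.relative_to(expanded_root).as_posix()
        for p in sdks.iterdir()
        if p.is_dir() and not p.is_symlink() and p.name.endswith(".sdk")
    )
```

The digest goes into `sha256`, and the single path returned by `sdk_strip_prefixes("expanded")` is the value to compare against `strip_prefix`.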

The core debugging loop looked like this:

```shell
curl -I <stale swcdn URL>
rg 'swcdn\.apple\.com|osx\.from_archive' MODULE.bazel
curl -L <apple 26.x sucatalog> | gzip -dc | rg 'CLTools_macOSNMOS_SDK\.pkg'
pkgutil --expand-full CLTools_macOSNMOS_SDK.pkg expanded
find expanded/Payload/Library/Developer/CommandLineTools/SDKs -maxdepth 1 -mindepth 1
```

## What changed

- Updated `MODULE.bazel` to point `osx.from_archive(...)` at the
currently live `047-96692` `CLTools_macOSNMOS_SDK.pkg` object.
- Updated the pinned `sha256` to match that package.
- Updated the `strip_prefix` from `MacOSX26.2.sdk` to `MacOSX26.4.sdk`.
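
Since all three fields have to move together, it can help to read the current pin back out of `MODULE.bazel` before and after an update. The sketch below is a regex heuristic, not a Starlark parser: `extract_sdk_pin` is a hypothetical helper, and it assumes the file contains a single `osx.from_archive(...)` call whose arguments contain no parentheses.

```python
import re


def extract_sdk_pin(module_text):
    """Return the sha256, strip_prefix, and urls pinned in osx.from_archive(...)."""
    # Assumption: the call body has no nested parentheses, so the first ")"
    # after "osx.from_archive(" closes the call.
    call = re.search(r"osx\.from_archive\((.*?)\)", module_text, re.DOTALL)
    if call is None:
        raise ValueError("no osx.from_archive(...) call found")
    body = call.group(1)
    sha256 = re.search(r'sha256\s*=\s*"([0-9a-f]+)"', body)
    strip_prefix = re.search(r'strip_prefix\s*=\s*"([^"]+)"', body)
    if sha256 is None or strip_prefix is None:
        raise ValueError("pin is missing sha256 or strip_prefix")
    return {
        "sha256": sha256.group(1),
        "strip_prefix": strip_prefix.group(1),
        "urls": re.findall(r'"(https://[^"]+)"', body),
    }
```

Comparing the returned `sha256` with a freshly computed digest of the downloaded package is a quick way to catch a pin where only the URL was updated.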

## Verification

- `bazel --output_user_root="$(mktemp -d /tmp/codex-bazel-sdk-fetch.XXXXXX)" build @macos_sdk//sysroot`

## Notes for next time

As long as we pin raw `swcdn.apple.com` objects, this will likely happen
again. When it does, the expected recovery path is:

1. Reproduce the `403` against the exact URL from CI.
2. Find the stale pin in `MODULE.bazel`.
3. Look up the current CLTools package in the relevant Apple software
update catalog for that macOS major version.
4. Download the replacement package and refresh both `sha256` and
`strip_prefix`.
5. Validate the new pin with a fresh `@macos_sdk` fetch, not just an
incremental Bazel build.
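
Step 1 of this recovery path is the equivalent of `curl -I`, and it can be sketched with the standard library. `head_status` is a hypothetical helper; the `opener` parameter exists only so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def head_status(url, opener=urllib.request.urlopen, timeout=30):
    """Return the HTTP status of a HEAD request, including error statuses."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; the status code is what matters here.
        return err.code
```

A result of `403` for the exact URL pinned in `MODULE.bazel` points at an aged-out pin rather than a CI-side problem.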

The important detail is that the non-`26` catalog did not surface the
macOS 26.x SDK package here; the `index-26-15-14-...` catalog was the
one that exposed the currently live replacement.
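
When a catalog lists several Command Line Tools releases, one way to pick a candidate is to take the matching product with the latest `PostDate`. This is a sketch under that assumption: `newest_clt_sdk_package` is a hypothetical helper, it expects a catalog already parsed with `plistlib`, and the newest entry still has to be checked against the SDK version you actually want.

```python
def newest_clt_sdk_package(catalog, suffix="CLTools_macOSNMOS_SDK.pkg"):
    """Return the most recently posted product that ships the SDK package."""
    matches = []
    for product_id, product in catalog.get("Products", {}).items():
        post_date = product.get("PostDate")
        if post_date is None:
            continue
        for package in product.get("Packages", []):
            url = package.get("URL", "")
            if url.endswith(suffix):
                matches.append((post_date, product_id, url))
    if not matches:
        return None
    # Tuples sort by PostDate first; product id only breaks ties.
    post_date, product_id, url = max(matches)
    return {"product_id": product_id, "post_date": post_date, "url": url}
```

`plistlib` decodes plist `<date>` elements into `datetime` objects, so the `PostDate` values compare chronologically without extra parsing.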
2026-03-28 21:08:19 +00:00
Michael Bolin
bc53d42fd9 codex-tools: extract tool spec models (#16047)
## Why

This continues the `codex-tools` migration by moving another passive
tool-definition layer out of `codex-core`.

After `ResponsesApiTool` and the lower-level schema adapters moved into
`codex-tools`, `core/src/client_common.rs` was still owning `ToolSpec`
and the web-search request wire types even though they are serialized
data models rather than runtime orchestration. Keeping those types in
`codex-core` makes the crate boundary look smaller than it really is and
leaves non-runtime tool-shape code coupled to core.

## What changed

- moved `ToolSpec`, `ResponsesApiWebSearchFilters`, and
`ResponsesApiWebSearchUserLocation` into
`codex-rs/tools/src/tool_spec.rs`
- added focused unit tests in `codex-rs/tools/src/tool_spec_tests.rs`
for:
  - `ToolSpec::name()`
  - web-search config conversions
  - `ToolSpec` serialization for `web_search` and `tool_search`
- kept `codex-rs/tools/src/lib.rs` exports-only: it only declares the new
module and re-exports its types
- reduced `core/src/client_common.rs` to a compatibility shim that
re-exports the extracted tool-spec types for current core call sites
- updated `core/src/tools/spec_tests.rs` to consume the extracted
web-search types directly from `codex-tools`
- updated `codex-rs/tools/README.md` so the crate contract reflects that
`codex-tools` now owns the passive tool-spec request models in addition
to the lower-level Responses API structs

## Test plan

- `cargo test -p codex-tools`
- `cargo test -p codex-core --lib tools::spec::`
- `cargo test -p codex-core --lib client_common::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
2026-03-28 13:37:00 -07:00
12 changed files with 558 additions and 132 deletions

View File

@@ -27,11 +27,11 @@ register_toolchains("@llvm//toolchain:all")
osx = use_extension("@llvm//extensions:osx.bzl", "osx")
osx.from_archive(
-sha256 = "6a4922f89487a96d7054ec6ca5065bfddd9f1d017c74d82f1d79cecf7feb8228",
-strip_prefix = "Payload/Library/Developer/CommandLineTools/SDKs/MacOSX26.2.sdk",
+sha256 = "1bde70c0b1c2ab89ff454acbebf6741390d7b7eb149ca2a3ca24cc9203a408b7",
+strip_prefix = "Payload/Library/Developer/CommandLineTools/SDKs/MacOSX26.4.sdk",
type = "pkg",
urls = [
-"https://swcdn.apple.com/content/downloads/26/44/047-81934-A_28TPKM5SD1/ps6pk6dk4x02vgfa5qsctq6tgf23t5f0w2/CLTools_macOSNMOS_SDK.pkg",
+"https://swcdn.apple.com/content/downloads/32/53/047-96692-A_OAHIHT53YB/ybtshxmrcju8m2qvw3w5elr4rajtg1x3y3/CLTools_macOSNMOS_SDK.pkg",
],
)
osx.frameworks(names = [

View File

@@ -472,10 +472,15 @@ async fn turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2()
first_file.display(),
second_file.display()
);
+// Login shells can emit an extra approval for system startup helpers
+// (for example `/usr/libexec/path_helper -s` on macOS) before the target
+// `rm` subcommands. Give the command enough budget to exercise the full
+// approval sequence on slower CI shards.
+let tool_timeout_ms = 15_000;
let tool_call_arguments = serde_json::to_string(&serde_json::json!({
"command": shell_command,
"workdir": serde_json::Value::Null,
-"timeout_ms": 5000
+"timeout_ms": tool_timeout_ms
}))?;
let response = responses::sse(vec![
responses::ev_response_created("resp-1"),

View File

@@ -1,10 +1,10 @@
-use crate::client_common::tools::ToolSpec;
use crate::config::types::Personality;
use crate::error::Result;
pub use codex_api::common::ResponseEvent;
use codex_protocol::models::BaseInstructions;
use codex_protocol::models::FunctionCallOutputBody;
use codex_protocol::models::ResponseItem;
+use codex_tools::ToolSpec;
use futures::Stream;
use serde::Deserialize;
use serde_json::Value;
@@ -157,107 +157,11 @@ fn strip_total_output_header(output: &str) -> Option<(&str, u32)> {
}
pub(crate) mod tools {
-use codex_protocol::config_types::WebSearchContextSize;
-use codex_protocol::config_types::WebSearchFilters as ConfigWebSearchFilters;
-use codex_protocol::config_types::WebSearchUserLocation as ConfigWebSearchUserLocation;
-use codex_protocol::config_types::WebSearchUserLocationType;
pub(crate) use codex_tools::FreeformTool;
pub(crate) use codex_tools::FreeformToolFormat;
-use codex_tools::JsonSchema;
pub(crate) use codex_tools::ResponsesApiTool;
pub(crate) use codex_tools::ToolSearchOutputTool;
-use serde::Serialize;
-/// When serialized as JSON, this produces a valid "Tool" in the OpenAI
-/// Responses API.
-#[derive(Debug, Clone, Serialize, PartialEq)]
-#[serde(tag = "type")]
-pub(crate) enum ToolSpec {
-#[serde(rename = "function")]
-Function(ResponsesApiTool),
-#[serde(rename = "tool_search")]
-ToolSearch {
-execution: String,
-description: String,
-parameters: JsonSchema,
-},
-#[serde(rename = "local_shell")]
-LocalShell {},
-#[serde(rename = "image_generation")]
-ImageGeneration { output_format: String },
-// TODO: Understand why we get an error on web_search although the API docs say it's supported.
-// https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses#:~:text=%7B%20type%3A%20%22web_search%22%20%7D%2C
-// The `external_web_access` field determines whether the web search is over cached or live content.
-// https://platform.openai.com/docs/guides/tools-web-search#live-internet-access
-#[serde(rename = "web_search")]
-WebSearch {
-#[serde(skip_serializing_if = "Option::is_none")]
-external_web_access: Option<bool>,
-#[serde(skip_serializing_if = "Option::is_none")]
-filters: Option<ResponsesApiWebSearchFilters>,
-#[serde(skip_serializing_if = "Option::is_none")]
-user_location: Option<ResponsesApiWebSearchUserLocation>,
-#[serde(skip_serializing_if = "Option::is_none")]
-search_context_size: Option<WebSearchContextSize>,
-#[serde(skip_serializing_if = "Option::is_none")]
-search_content_types: Option<Vec<String>>,
-},
-#[serde(rename = "custom")]
-Freeform(FreeformTool),
-}
-impl ToolSpec {
-pub(crate) fn name(&self) -> &str {
-match self {
-ToolSpec::Function(tool) => tool.name.as_str(),
-ToolSpec::ToolSearch { .. } => "tool_search",
-ToolSpec::LocalShell {} => "local_shell",
-ToolSpec::ImageGeneration { .. } => "image_generation",
-ToolSpec::WebSearch { .. } => "web_search",
-ToolSpec::Freeform(tool) => tool.name.as_str(),
-}
-}
-}
-#[derive(Debug, Clone, Serialize, PartialEq)]
-pub(crate) struct ResponsesApiWebSearchFilters {
-#[serde(skip_serializing_if = "Option::is_none")]
-pub(crate) allowed_domains: Option<Vec<String>>,
-}
-impl From<ConfigWebSearchFilters> for ResponsesApiWebSearchFilters {
-fn from(filters: ConfigWebSearchFilters) -> Self {
-Self {
-allowed_domains: filters.allowed_domains,
-}
-}
-}
-#[derive(Debug, Clone, Serialize, PartialEq)]
-pub(crate) struct ResponsesApiWebSearchUserLocation {
-#[serde(rename = "type")]
-pub(crate) r#type: WebSearchUserLocationType,
-#[serde(skip_serializing_if = "Option::is_none")]
-pub(crate) country: Option<String>,
-#[serde(skip_serializing_if = "Option::is_none")]
-pub(crate) region: Option<String>,
-#[serde(skip_serializing_if = "Option::is_none")]
-pub(crate) city: Option<String>,
-#[serde(skip_serializing_if = "Option::is_none")]
-pub(crate) timezone: Option<String>,
-}
-impl From<ConfigWebSearchUserLocation> for ResponsesApiWebSearchUserLocation {
-fn from(user_location: ConfigWebSearchUserLocation) -> Self {
-Self {
-r#type: user_location.r#type,
-country: user_location.country,
-region: user_location.region,
-city: user_location.city,
-timezone: user_location.timezone,
-}
-}
-}
+pub(crate) use codex_tools::ToolSpec;
}
pub struct ResponseStream {

View File

@@ -1,4 +1,3 @@
-use crate::client_common::tools::FreeformTool;
use crate::config::test_config;
use crate::models_manager::manager::ModelsManager;
use crate::models_manager::model_info::with_config_overrides;
@@ -12,6 +11,9 @@ use codex_protocol::openai_models::InputModality;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelsResponse;
use codex_tools::AdditionalProperties;
+use codex_tools::FreeformTool;
+use codex_tools::ResponsesApiWebSearchFilters;
+use codex_tools::ResponsesApiWebSearchUserLocation;
use codex_tools::mcp_tool_to_deferred_responses_api_tool;
use codex_utils_absolute_path::AbsolutePathBuf;
use pretty_assertions::assert_eq;
@@ -106,14 +108,7 @@ fn deferred_responses_api_tool_serializes_with_defer_loading() {
}
fn tool_name(tool: &ToolSpec) -> &str {
-match tool {
-ToolSpec::Function(ResponsesApiTool { name, .. }) => name,
-ToolSpec::ToolSearch { .. } => "tool_search",
-ToolSpec::LocalShell {} => "local_shell",
-ToolSpec::ImageGeneration { .. } => "image_generation",
-ToolSpec::WebSearch { .. } => "web_search",
-ToolSpec::Freeform(FreeformTool { name, .. }) => name,
-}
+tool.name()
}
// Avoid order-based assertions; compare via set containment instead.
@@ -1205,10 +1200,10 @@ fn web_search_config_is_forwarded_to_tool_spec() {
external_web_access: Some(true),
filters: web_search_config
.filters
-.map(crate::client_common::tools::ResponsesApiWebSearchFilters::from),
+.map(ResponsesApiWebSearchFilters::from),
user_location: web_search_config
.user_location
-.map(crate::client_common::tools::ResponsesApiWebSearchUserLocation::from),
+.map(ResponsesApiWebSearchUserLocation::from),
search_context_size: web_search_config.search_context_size,
search_content_types: None,
}

View File

@@ -49,10 +49,7 @@ pub async fn apply_patch_harness() -> Result<TestCodexHarness> {
async fn apply_patch_harness_with(
configure: impl FnOnce(TestCodexBuilder) -> TestCodexBuilder,
) -> Result<TestCodexHarness> {
-// Keep Windows shell-command apply_patch tests on a deterministic outer
-// shell so heredoc interception does not depend on runner-local shell
-// auto-detection.
-let builder = configure(test_codex().with_windows_cmd_shell()).with_config(|config| {
+let builder = configure(test_codex()).with_config(|config| {
config.include_apply_patch_tool = true;
});
// Box harness construction so apply_patch_cli tests do not inline the

View File

@@ -1099,18 +1099,12 @@ fn turn_items_for_thread(
.map(|turn| turn.items.clone())
}
-fn all_thread_source_kinds() -> Vec<ThreadSourceKind> {
+fn resumable_thread_source_kinds() -> Vec<ThreadSourceKind> {
vec![
ThreadSourceKind::Cli,
ThreadSourceKind::VsCode,
ThreadSourceKind::Exec,
ThreadSourceKind::AppServer,
-ThreadSourceKind::SubAgent,
-ThreadSourceKind::SubAgentReview,
-ThreadSourceKind::SubAgentCompact,
-ThreadSourceKind::SubAgentThreadSpawn,
-ThreadSourceKind::SubAgentOther,
-ThreadSourceKind::Unknown,
]
}
@@ -1169,7 +1163,7 @@ async fn resolve_resume_thread_id(
limit: Some(100),
sort_key: Some(ThreadSortKey::UpdatedAt),
model_providers: model_providers.clone(),
-source_kinds: Some(all_thread_source_kinds()),
+source_kinds: Some(resumable_thread_source_kinds()),
archived: Some(false),
cwd: None,
search_term: None,
@@ -1209,7 +1203,7 @@ async fn resolve_resume_thread_id(
limit: Some(100),
sort_key: Some(ThreadSortKey::UpdatedAt),
model_providers: model_providers.clone(),
-source_kinds: Some(all_thread_source_kinds()),
+source_kinds: Some(resumable_thread_source_kinds()),
archived: Some(false),
cwd: None,
// Thread names are attached separately from rollout titles, so name
@@ -1898,6 +1892,19 @@ mod tests {
assert_eq!(resume_lookup_model_providers(&config, &named_args), None);
}
#[test]
fn resumable_thread_source_kinds_exclude_internal_threads() {
assert_eq!(
resumable_thread_source_kinds(),
vec![
ThreadSourceKind::Cli,
ThreadSourceKind::VsCode,
ThreadSourceKind::Exec,
ThreadSourceKind::AppServer,
]
);
}
#[test]
fn turn_items_for_thread_returns_matching_turn_items() {
let thread = AppServerThread {

View File

@@ -1,10 +1,17 @@
#![allow(clippy::unwrap_used, clippy::expect_used)]
use anyhow::Context;
use codex_protocol::ThreadId;
use codex_protocol::protocol::SessionMeta;
use codex_protocol::protocol::SessionMetaLine;
use codex_protocol::protocol::SessionSource;
use codex_protocol::protocol::SubAgentSource;
use codex_utils_cargo_bin::find_resource;
use core_test_support::test_codex_exec::test_codex_exec;
use pretty_assertions::assert_eq;
use serde_json::Value;
use serde_json::json;
use std::string::ToString;
use std::time::Duration;
use tempfile::TempDir;
use uuid::Uuid;
use walkdir::WalkDir;
@@ -220,6 +227,118 @@ fn exec_resume_last_accepts_prompt_after_flag_in_json_mode() -> anyhow::Result<(
Ok(())
}
#[test]
fn exec_resume_last_ignores_newer_internal_thread() -> anyhow::Result<()> {
let test = test_codex_exec();
let fixture = exec_fixture()?;
let repo_root = exec_repo_root()?;
let marker = format!("resume-last-visible-{}", Uuid::new_v4());
let prompt = format!("echo {marker}");
test.cmd()
.env("CODEX_RS_SSE_FIXTURE", &fixture)
.env("OPENAI_BASE_URL", "http://unused.local")
.arg("--skip-git-repo-check")
.arg("-C")
.arg(&repo_root)
.arg(&prompt)
.assert()
.success();
let sessions_dir = test.home_path().join("sessions");
let path = find_session_file_containing_marker(&sessions_dir, &marker)
.expect("no session file found after first run");
// `updated_at` is second-granularity, so make the injected internal thread
// deterministically newer than the visible exec session.
std::thread::sleep(Duration::from_millis(1100));
let internal_thread_id = Uuid::new_v4();
let internal_rollout_path = test.home_path().join("sessions/2026/03/27").join(format!(
"rollout-2026-03-27T00-00-00-{internal_thread_id}.jsonl"
));
std::fs::create_dir_all(
internal_rollout_path
.parent()
.expect("internal rollout parent directory"),
)?;
let internal_thread_id_str = internal_thread_id.to_string();
let internal_payload = serde_json::to_value(SessionMetaLine {
meta: SessionMeta {
id: ThreadId::from_string(&internal_thread_id_str)?,
forked_from_id: None,
timestamp: "2026-03-27T00:00:00.000Z".to_string(),
cwd: repo_root.clone(),
originator: "codex".to_string(),
cli_version: "0.0.0".to_string(),
source: SessionSource::SubAgent(SubAgentSource::MemoryConsolidation),
agent_path: None,
agent_nickname: None,
agent_role: None,
model_provider: None,
base_instructions: None,
dynamic_tools: None,
memory_mode: None,
},
git: None,
})?;
let internal_lines = [
json!({
"timestamp": "2026-03-27T00:00:00.000Z",
"type": "session_meta",
"payload": internal_payload,
})
.to_string(),
json!({
"timestamp": "2026-03-27T00:00:00.000Z",
"type": "response_item",
"payload": {
"type": "message",
"role": "user",
"content": [{"type": "input_text", "text": "internal memory sweep"}],
},
})
.to_string(),
json!({
"timestamp": "2026-03-27T00:00:00.000Z",
"type": "event_msg",
"payload": {
"type": "user_message",
"message": "internal memory sweep",
"kind": "plain",
},
})
.to_string(),
];
std::fs::write(&internal_rollout_path, internal_lines.join("\n") + "\n")?;
let marker2 = format!("resume-last-visible-2-{}", Uuid::new_v4());
let prompt2 = format!("echo {marker2}");
test.cmd()
.env("CODEX_RS_SSE_FIXTURE", &fixture)
.env("OPENAI_BASE_URL", "http://unused.local")
.arg("--skip-git-repo-check")
.arg("-C")
.arg(&repo_root)
.arg(&prompt2)
.arg("resume")
.arg("--last")
.assert()
.success();
let resumed_path = find_session_file_containing_marker(&sessions_dir, &marker2)
.expect("no resumed session file containing marker2");
assert_eq!(
resumed_path, path,
"resume --last should ignore newer internal threads"
);
Ok(())
}
#[test]
fn exec_resume_last_respects_cwd_filter_and_all_flag() -> anyhow::Result<()> {
let test = test_codex_exec();

View File

@@ -33,17 +33,31 @@ fn duplicate_fd_for_transfer(fd: impl AsFd, name: &str) -> anyhow::Result<OwnedF
.with_context(|| format!("failed to duplicate {name} for escalation transfer"))
}
async fn connect_escalation_stream(
handshake_client: AsyncDatagramSocket,
) -> anyhow::Result<(AsyncSocket, OwnedFd)> {
let (server, client) = AsyncSocket::pair()?;
let server_stream_guard: OwnedFd = server.into_inner().into();
let transferred_server_stream =
duplicate_fd_for_transfer(&server_stream_guard, "handshake stream")?;
const HANDSHAKE_MESSAGE: [u8; 1] = [0];
// Keep one local reference to the transferred stream alive until the server
// answers the first request. On macOS, dropping the sender's last local copy
// immediately after the datagram handshake can make the peer observe EOF
// before the received fd is fully servicing the stream.
handshake_client
.send_with_fds(&HANDSHAKE_MESSAGE, &[transferred_server_stream])
.await
.context("failed to send handshake datagram")?;
Ok((client, server_stream_guard))
}
pub async fn run_shell_escalation_execve_wrapper(
file: String,
argv: Vec<String>,
) -> anyhow::Result<i32> {
let handshake_client = get_escalate_client()?;
-let (server, client) = AsyncSocket::pair()?;
-const HANDSHAKE_MESSAGE: [u8; 1] = [0];
-handshake_client
-.send_with_fds(&HANDSHAKE_MESSAGE, &[server.into_inner().into()])
-.await
-.context("failed to send handshake datagram")?;
+let (client, server_stream_guard) = connect_escalation_stream(handshake_client).await?;
let env = std::env::vars()
.filter(|(k, _)| !matches!(k.as_str(), ESCALATE_SOCKET_ENV_VAR | EXEC_WRAPPER_ENV_VAR))
.collect();
@@ -56,6 +70,11 @@ pub async fn run_shell_escalation_execve_wrapper(
})
.await
.context("failed to send EscalateRequest")?;
// Once the first request has been written into the stream, the local guard
// is no longer needed to bridge the datagram handoff. Dropping it here
// lets client-side reads still observe EOF if the server exits before
// replying.
drop(server_stream_guard);
let message = client
.receive::<EscalateResponse>()
.await
@@ -128,6 +147,12 @@ mod tests {
use super::*;
use std::os::fd::AsRawFd;
use std::os::unix::net::UnixStream;
use std::path::PathBuf;
use std::time::Duration;
use pretty_assertions::assert_eq;
use tokio::time::sleep;
use tokio::time::timeout;
#[test]
fn duplicate_fd_for_transfer_does_not_close_original() {
@@ -141,4 +166,83 @@ mod tests {
assert_ne!(unsafe { libc::fcntl(original_fd, libc::F_GETFD) }, -1);
}
#[tokio::test]
async fn connect_escalation_stream_keeps_sender_alive_until_first_request_write()
-> anyhow::Result<()> {
let (server_datagram, client_datagram) = AsyncDatagramSocket::pair()?;
let client_task = tokio::spawn(async move {
let (client_stream, server_stream_guard) =
connect_escalation_stream(client_datagram).await?;
let guard_fd = server_stream_guard.as_raw_fd();
assert_ne!(unsafe { libc::fcntl(guard_fd, libc::F_GETFD) }, -1);
client_stream
.send(EscalateRequest {
file: PathBuf::from("/bin/echo"),
argv: vec!["echo".to_string(), "hello".to_string()],
workdir: AbsolutePathBuf::current_dir()?,
env: Default::default(),
})
.await?;
drop(server_stream_guard);
assert_eq!(-1, unsafe { libc::fcntl(guard_fd, libc::F_GETFD) });
let response = client_stream.receive::<EscalateResponse>().await?;
Ok::<EscalateResponse, anyhow::Error>(response)
});
let (_, mut fds) = server_datagram.receive_with_fds().await?;
assert_eq!(fds.len(), 1);
sleep(Duration::from_millis(20)).await;
let server_stream = AsyncSocket::from_fd(fds.remove(0))?;
let request = server_stream.receive::<EscalateRequest>().await?;
assert_eq!(request.file, PathBuf::from("/bin/echo"));
assert_eq!(request.argv, vec!["echo".to_string(), "hello".to_string()]);
let expected = EscalateResponse {
action: EscalateAction::Deny {
reason: Some("not now".to_string()),
},
};
server_stream.send(expected.clone()).await?;
let response = client_task.await??;
assert_eq!(response, expected);
Ok(())
}
#[tokio::test]
async fn dropping_guard_after_request_write_preserves_server_eof() -> anyhow::Result<()> {
let (server_datagram, client_datagram) = AsyncDatagramSocket::pair()?;
let client_task = tokio::spawn(async move {
let (client_stream, server_stream_guard) =
connect_escalation_stream(client_datagram).await?;
client_stream
.send(EscalateRequest {
file: PathBuf::from("/bin/echo"),
argv: vec!["echo".to_string()],
workdir: AbsolutePathBuf::current_dir()?,
env: Default::default(),
})
.await?;
drop(server_stream_guard);
let err = timeout(
Duration::from_millis(250),
client_stream.receive::<EscalateResponse>(),
)
.await
.expect("server close should not hang the client")
.expect_err("expected EOF after server closes without replying");
assert_eq!(err.kind(), std::io::ErrorKind::UnexpectedEof);
Ok::<(), anyhow::Error>(())
});
let (_, mut fds) = server_datagram.receive_with_fds().await?;
assert_eq!(fds.len(), 1);
let server_stream = AsyncSocket::from_fd(fds.remove(0))?;
let request = server_stream.receive::<EscalateRequest>().await?;
assert_eq!(request.file, PathBuf::from("/bin/echo"));
drop(server_stream);
client_task.await??;
Ok(())
}
}

View File

@@ -11,10 +11,13 @@ schema and Responses API tool primitives that no longer need to live in
- `JsonSchema`
- `AdditionalProperties`
- `ToolDefinition`
- `ToolSpec`
- `ResponsesApiTool`
- `FreeformTool`
- `FreeformToolFormat`
- `ToolSearchOutputTool`
- `ResponsesApiWebSearchFilters`
- `ResponsesApiWebSearchUserLocation`
- `ResponsesApiNamespace`
- `ResponsesApiNamespaceTool`
- `parse_tool_input_schema()`

View File

@@ -6,6 +6,7 @@ mod json_schema;
mod mcp_tool;
mod responses_api;
mod tool_definition;
mod tool_spec;
pub use dynamic_tool::parse_dynamic_tool;
pub use json_schema::AdditionalProperties;
@@ -24,3 +25,6 @@ pub use responses_api::mcp_tool_to_deferred_responses_api_tool;
pub use responses_api::mcp_tool_to_responses_api_tool;
pub use responses_api::tool_definition_to_responses_api_tool;
pub use tool_definition::ToolDefinition;
pub use tool_spec::ResponsesApiWebSearchFilters;
pub use tool_spec::ResponsesApiWebSearchUserLocation;
pub use tool_spec::ToolSpec;

View File

@@ -0,0 +1,105 @@
use crate::FreeformTool;
use crate::JsonSchema;
use crate::ResponsesApiTool;
use codex_protocol::config_types::WebSearchContextSize;
use codex_protocol::config_types::WebSearchFilters as ConfigWebSearchFilters;
use codex_protocol::config_types::WebSearchUserLocation as ConfigWebSearchUserLocation;
use codex_protocol::config_types::WebSearchUserLocationType;
use serde::Serialize;
/// When serialized as JSON, this produces a valid "Tool" in the OpenAI
/// Responses API.
#[derive(Debug, Clone, Serialize, PartialEq)]
#[serde(tag = "type")]
pub enum ToolSpec {
#[serde(rename = "function")]
Function(ResponsesApiTool),
#[serde(rename = "tool_search")]
ToolSearch {
execution: String,
description: String,
parameters: JsonSchema,
},
#[serde(rename = "local_shell")]
LocalShell {},
#[serde(rename = "image_generation")]
ImageGeneration { output_format: String },
// TODO: Understand why we get an error on web_search although the API docs
// say it's supported.
// https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses#:~:text=%7B%20type%3A%20%22web_search%22%20%7D%2C
// The `external_web_access` field determines whether the web search is over
// cached or live content.
// https://platform.openai.com/docs/guides/tools-web-search#live-internet-access
#[serde(rename = "web_search")]
WebSearch {
#[serde(skip_serializing_if = "Option::is_none")]
external_web_access: Option<bool>,
#[serde(skip_serializing_if = "Option::is_none")]
filters: Option<ResponsesApiWebSearchFilters>,
#[serde(skip_serializing_if = "Option::is_none")]
user_location: Option<ResponsesApiWebSearchUserLocation>,
#[serde(skip_serializing_if = "Option::is_none")]
search_context_size: Option<WebSearchContextSize>,
#[serde(skip_serializing_if = "Option::is_none")]
search_content_types: Option<Vec<String>>,
},
#[serde(rename = "custom")]
Freeform(FreeformTool),
}
impl ToolSpec {
pub fn name(&self) -> &str {
match self {
ToolSpec::Function(tool) => tool.name.as_str(),
ToolSpec::ToolSearch { .. } => "tool_search",
ToolSpec::LocalShell {} => "local_shell",
ToolSpec::ImageGeneration { .. } => "image_generation",
ToolSpec::WebSearch { .. } => "web_search",
ToolSpec::Freeform(tool) => tool.name.as_str(),
}
}
}
#[derive(Debug, Clone, Serialize, PartialEq)]
pub struct ResponsesApiWebSearchFilters {
#[serde(skip_serializing_if = "Option::is_none")]
pub allowed_domains: Option<Vec<String>>,
}
impl From<ConfigWebSearchFilters> for ResponsesApiWebSearchFilters {
fn from(filters: ConfigWebSearchFilters) -> Self {
Self {
allowed_domains: filters.allowed_domains,
}
}
}
#[derive(Debug, Clone, Serialize, PartialEq)]
pub struct ResponsesApiWebSearchUserLocation {
#[serde(rename = "type")]
pub r#type: WebSearchUserLocationType,
#[serde(skip_serializing_if = "Option::is_none")]
pub country: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub region: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub city: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub timezone: Option<String>,
}
impl From<ConfigWebSearchUserLocation> for ResponsesApiWebSearchUserLocation {
fn from(user_location: ConfigWebSearchUserLocation) -> Self {
Self {
r#type: user_location.r#type,
country: user_location.country,
region: user_location.region,
city: user_location.city,
timezone: user_location.timezone,
}
}
}
#[cfg(test)]
#[path = "tool_spec_tests.rs"]
mod tests;

View File

@@ -0,0 +1,183 @@
use super::ResponsesApiWebSearchFilters;
use super::ResponsesApiWebSearchUserLocation;
use super::ToolSpec;
use crate::AdditionalProperties;
use crate::FreeformTool;
use crate::FreeformToolFormat;
use crate::JsonSchema;
use crate::ResponsesApiTool;
use codex_protocol::config_types::WebSearchContextSize;
use codex_protocol::config_types::WebSearchFilters as ConfigWebSearchFilters;
use codex_protocol::config_types::WebSearchUserLocation as ConfigWebSearchUserLocation;
use codex_protocol::config_types::WebSearchUserLocationType;
use pretty_assertions::assert_eq;
use serde_json::json;
use std::collections::BTreeMap;
#[test]
fn tool_spec_name_covers_all_variants() {
assert_eq!(
ToolSpec::Function(ResponsesApiTool {
name: "lookup_order".to_string(),
description: "Look up an order".to_string(),
strict: false,
defer_loading: None,
parameters: JsonSchema::Object {
properties: BTreeMap::new(),
required: None,
additional_properties: None,
},
output_schema: None,
})
.name(),
"lookup_order"
);
assert_eq!(
ToolSpec::ToolSearch {
execution: "sync".to_string(),
description: "Search for tools".to_string(),
parameters: JsonSchema::Object {
properties: BTreeMap::new(),
required: None,
additional_properties: None,
},
}
.name(),
"tool_search"
);
assert_eq!(ToolSpec::LocalShell {}.name(), "local_shell");
assert_eq!(
ToolSpec::ImageGeneration {
output_format: "png".to_string(),
}
.name(),
"image_generation"
);
assert_eq!(
ToolSpec::WebSearch {
external_web_access: Some(true),
filters: None,
user_location: None,
search_context_size: None,
search_content_types: None,
}
.name(),
"web_search"
);
assert_eq!(
ToolSpec::Freeform(FreeformTool {
name: "exec".to_string(),
description: "Run a command".to_string(),
format: FreeformToolFormat {
r#type: "grammar".to_string(),
syntax: "lark".to_string(),
definition: "start: \"exec\"".to_string(),
},
})
.name(),
"exec"
);
}
#[test]
fn web_search_config_converts_to_responses_api_types() {
assert_eq!(
ResponsesApiWebSearchFilters::from(ConfigWebSearchFilters {
allowed_domains: Some(vec!["example.com".to_string()]),
}),
ResponsesApiWebSearchFilters {
allowed_domains: Some(vec!["example.com".to_string()]),
}
);
assert_eq!(
ResponsesApiWebSearchUserLocation::from(ConfigWebSearchUserLocation {
r#type: WebSearchUserLocationType::Approximate,
country: Some("US".to_string()),
region: Some("California".to_string()),
city: Some("San Francisco".to_string()),
timezone: Some("America/Los_Angeles".to_string()),
}),
ResponsesApiWebSearchUserLocation {
r#type: WebSearchUserLocationType::Approximate,
country: Some("US".to_string()),
region: Some("California".to_string()),
city: Some("San Francisco".to_string()),
timezone: Some("America/Los_Angeles".to_string()),
}
);
}
#[test]
fn web_search_tool_spec_serializes_expected_wire_shape() {
assert_eq!(
serde_json::to_value(ToolSpec::WebSearch {
external_web_access: Some(true),
filters: Some(ResponsesApiWebSearchFilters {
allowed_domains: Some(vec!["example.com".to_string()]),
}),
user_location: Some(ResponsesApiWebSearchUserLocation {
r#type: WebSearchUserLocationType::Approximate,
country: Some("US".to_string()),
region: Some("California".to_string()),
city: Some("San Francisco".to_string()),
timezone: Some("America/Los_Angeles".to_string()),
}),
search_context_size: Some(WebSearchContextSize::High),
search_content_types: Some(vec!["text".to_string(), "image".to_string()]),
})
.expect("serialize web_search"),
json!({
"type": "web_search",
"external_web_access": true,
"filters": {
"allowed_domains": ["example.com"],
},
"user_location": {
"type": "approximate",
"country": "US",
"region": "California",
"city": "San Francisco",
"timezone": "America/Los_Angeles",
},
"search_context_size": "high",
"search_content_types": ["text", "image"],
})
);
}
#[test]
fn tool_search_tool_spec_serializes_expected_wire_shape() {
assert_eq!(
serde_json::to_value(ToolSpec::ToolSearch {
execution: "sync".to_string(),
description: "Search app tools".to_string(),
parameters: JsonSchema::Object {
properties: BTreeMap::from([(
"query".to_string(),
JsonSchema::String {
description: Some("Tool search query".to_string()),
},
)]),
required: Some(vec!["query".to_string()]),
additional_properties: Some(AdditionalProperties::Boolean(false)),
},
})
.expect("serialize tool_search"),
json!({
"type": "tool_search",
"execution": "sync",
"description": "Search app tools",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Tool search query",
}
},
"required": ["query"],
"additionalProperties": false,
},
})
);
}