test: harden app-server integration tests (#19683)

## Why

Windows Bazel runs in the permissions stack exposed that app-server
integration tests were launching normal plugin startup warmups in every
subprocess. Those warmups can call
`https://chatgpt.com/backend-api/plugins/featured` even when a test is
not exercising plugin startup, which adds slow background work, noisy
stderr, and a dependence on external network state. The relevant
startup/featured-plugin behavior was introduced across #15042 and
#15264.

A few app-server tests also had long optional waits or unbounded cleanup
paths, making failures expensive to diagnose and contributing to slow
Windows shards. One external-agent config test from #18246 used a
GitHub-style marketplace source, which was enough to exercise the
pending remote-import path but also meant the background completion task
could attempt a real clone.

## What Changed

- Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
plumbing and a hidden debug-only
`--disable-plugin-startup-tasks-for-tests` app-server flag, so
integration tests can suppress startup plugin warmups without adding a
production env-var gate.
- Has the app-server test harness pass that hidden flag by default,
while opting plugin-startup coverage back in for tests that
intentionally exercise startup sync and featured-plugin warmup behavior.
- Lowers normal app-server subprocess logging from `info`/`debug` to
`warn` to avoid multi-megabyte stderr output in Bazel logs.
- Prevents the external-agent config test from attempting a real
marketplace clone by using an invalid non-local source while still
exercising the pending-import completion path.
- Bounds optional filesystem/realtime waits and fake WebSocket
test-server shutdown so failures produce targeted timeouts instead of
hanging a shard.
- Fixes the Unix script-resolution test in `rmcp-client` to exercise
PATH resolution directly and include the actual spawn error in failures.
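
A minimal sketch of how a hidden test-only flag could feed explicit runtime options, as the first two bullets describe. Only the flag string and the type names `AppServerRuntimeOptions` and `PluginStartupTasks` come from this change; the field layout, the enum variants, the `runtime_options_from_args` helper, and the choice to silently ignore the flag outside debug builds are assumptions made for illustration, not the actual implementation.

```rust
/// Hypothetical stand-in for the `PluginStartupTasks` setting named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PluginStartupTasks {
    Enabled,
    Disabled,
}

/// Hypothetical stand-in for `AppServerRuntimeOptions`; the real struct
/// may carry different or additional fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AppServerRuntimeOptions {
    plugin_startup_tasks: PluginStartupTasks,
}

const DISABLE_FLAG: &str = "--disable-plugin-startup-tasks-for-tests";

/// Builds runtime options from CLI arguments. `allow_test_flag` models the
/// "debug-only" gate (assumption): when it is false the flag has no effect.
fn runtime_options_from_args<I, S>(args: I, allow_test_flag: bool) -> AppServerRuntimeOptions
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let flag_present = args.into_iter().any(|arg| arg.as_ref() == DISABLE_FLAG);
    let plugin_startup_tasks = if flag_present && allow_test_flag {
        PluginStartupTasks::Disabled
    } else {
        PluginStartupTasks::Enabled
    };
    AppServerRuntimeOptions { plugin_startup_tasks }
}

fn main() {
    // A test harness would append the flag by default; tests that cover
    // plugin startup would simply leave it off.
    let args = ["app-server", DISABLE_FLAG];
    let options = runtime_options_from_args(args, cfg!(debug_assertions));
    println!("{options:?}");
}
```

Passing the decision in as a value, rather than reading an environment variable deep inside startup code, keeps the production path free of test-only gates, which matches the stated goal of the first bullet.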

## Verification

- `cargo check -p codex-app-server`
- `cargo clippy -p codex-app-server --tests -- -D warnings`
- `cargo test -p codex-rmcp-client program_resolver::tests::test_unix_executes_script_without_extension`
- `cargo test -p codex-app-server --test all external_agent_config_import_sends_completion_notification_after_pending_plugins_finish -- --nocapture`
- `cargo test -p codex-app-server --test all plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request -- --nocapture`
- Windows Local Bazel passed with this test-hardening bundle before it
was extracted from #19606.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19683
This commit is contained in:
Michael Bolin
2026-04-26 12:43:16 -07:00
committed by GitHub
parent 87bc72408c
commit ac2bffa443
14 changed files with 140 additions and 43 deletions


@@ -281,7 +281,7 @@ impl RealtimeE2eHarness {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
login_with_api_key(&mut mcp, "sk-test-key").await?;
let thread_start_request_id = mcp
@@ -345,10 +345,16 @@ impl RealtimeE2eHarness {
/// Returns the nth JSON message app-server wrote to the fake Realtime API
/// sideband websocket.
async fn sideband_outbound_request(&self, request_index: usize) -> Value {
-self.realtime_server
-    .wait_for_request(/*connection_index*/ 0, request_index)
-    .await
-    .body_json()
+timeout(
+    DEFAULT_TIMEOUT,
+    self.realtime_server
+        .wait_for_request(/*connection_index*/ 0, request_index),
+)
+.await
+.unwrap_or_else(|_| {
+    panic!("timed out waiting for realtime sideband request {request_index}")
+})
+.body_json()
}
async fn append_audio(&mut self, thread_id: String) -> Result<()> {
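
The hunk above wraps an unbounded wait in a timeout so that a missing request fails with a message naming what was awaited. The same idea can be sketched with only the standard library; this version swaps in `std::sync::mpsc::Receiver::recv_timeout` for the async `timeout` used in the diff, and the `recv_bounded` helper and its messages are hypothetical names chosen for illustration.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Waits for one message, but gives up after `limit` with an error that
/// names what the caller was waiting for.
fn recv_bounded<T>(rx: &Receiver<T>, limit: Duration, what: &str) -> Result<T, String> {
    rx.recv_timeout(limit).map_err(|err| match err {
        RecvTimeoutError::Timeout => format!("timed out after {limit:?} waiting for {what}"),
        RecvTimeoutError::Disconnected => format!("sender dropped while waiting for {what}"),
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        tx.send("request 0").expect("receiver is still alive");
        // Keep the sender alive briefly so the second wait times out
        // instead of observing a disconnect.
        thread::sleep(Duration::from_millis(200));
    });

    // The first wait succeeds; the second has nothing to receive and
    // returns a targeted timeout error instead of blocking forever.
    let first = recv_bounded(&rx, Duration::from_secs(5), "sideband request 0");
    let second = recv_bounded(&rx, Duration::from_millis(20), "sideband request 1");
    println!("{first:?}");
    println!("{second:?}");
    producer.join().expect("producer thread panicked");
}
```

Including the awaited item in the error text is what makes the failure cheap to diagnose: the log points at the specific wait that expired rather than at a shard-level hang.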
@@ -534,7 +540,7 @@ async fn realtime_conversation_streams_v2_notifications() -> Result<()> {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
login_with_api_key(&mut mcp, "sk-test-key").await?;
let thread_start_request_id = mcp
@@ -783,7 +789,7 @@ async fn realtime_text_output_modality_requests_text_output_and_final_transcript
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
login_with_api_key(&mut mcp, "sk-test-key").await?;
let thread_start_request_id = mcp
@@ -885,7 +891,7 @@ async fn realtime_list_voices_returns_supported_names() -> Result<()> {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
let request_id = mcp
.send_thread_realtime_list_voices_request(ThreadRealtimeListVoicesParams {})
@@ -957,7 +963,7 @@ async fn realtime_conversation_stop_emits_closed_notification() -> Result<()> {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
login_with_api_key(&mut mcp, "sk-test-key").await?;
let thread_start_request_id = mcp
@@ -1053,7 +1059,7 @@ async fn realtime_webrtc_start_emits_sdp_notification() -> Result<()> {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
login_with_api_key(&mut mcp, "sk-test-key").await?;
let thread_start_request_id = mcp
@@ -1968,7 +1974,7 @@ async fn realtime_webrtc_start_surfaces_backend_error() -> Result<()> {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
login_with_api_key(&mut mcp, "sk-test-key").await?;
// Phase 2: start a normal app-server thread and request realtime over WebRTC.
@@ -2029,7 +2035,7 @@ async fn realtime_conversation_requires_feature_flag() -> Result<()> {
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
-mcp.initialize().await?;
+timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
let thread_start_request_id = mcp
.send_thread_start_request(ThreadStartParams::default())