test: harden app-server integration tests (#19683)

## Why

Windows Bazel runs in the permissions stack exposed that app-server
integration tests were launching normal plugin startup warmups in every
subprocess. Those warmups can call
`https://chatgpt.com/backend-api/plugins/featured` when a test is not
specifically exercising plugin startup, which adds slow background work,
noisy stderr, and dependence on external network state. The relevant
startup/featured-plugin behavior was introduced across #15042 and
#15264.

A few app-server tests also had long optional waits or unbounded cleanup
paths, making failures expensive to diagnose and contributing to slow
Windows shards. One external-agent config test from #18246 used a
GitHub-style marketplace source, which was enough to exercise the
pending remote-import path but also meant the background completion task
could attempt a real clone.

## What Changed

- Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
plumbing and a hidden debug-only
`--disable-plugin-startup-tasks-for-tests` app-server flag, so
integration tests can suppress startup plugin warmups without adding a
production env-var gate.
- Has the app-server test harness pass that hidden flag by default,
while opting plugin-startup coverage back in for tests that
intentionally exercise startup sync and featured-plugin warmup behavior.
- Lowers normal app-server subprocess logging from `info`/`debug` to
`warn` to avoid multi-megabyte stderr output in Bazel logs.
- Prevents the external-agent config test from attempting a real
marketplace clone by using an invalid non-local source while still
exercising the pending-import completion path.
- Bounds optional filesystem/realtime waits and fake WebSocket
test-server shutdown so failures produce targeted timeouts instead of
hanging a shard.
- Fixes the Unix script-resolution test in `rmcp-client` to exercise
PATH resolution directly and include the actual spawn error in failures.

## Verification

- `cargo check -p codex-app-server`
- `cargo clippy -p codex-app-server --tests -- -D warnings`
- `cargo test -p codex-rmcp-client
program_resolver::tests::test_unix_executes_script_without_extension`
- `cargo test -p codex-app-server --test all
external_agent_config_import_sends_completion_notification_after_pending_plugins_finish
-- --nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request --
--nocapture`
- Windows Local Bazel passed with this test-hardening bundle before it
was extracted from #19606.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19683
This commit is contained in:
Michael Bolin
2026-04-26 12:43:16 -07:00
committed by GitHub
parent 87bc72408c
commit ac2bffa443
14 changed files with 140 additions and 43 deletions

View File

@@ -106,19 +106,26 @@ pub struct McpProcess {
}
pub const DEFAULT_CLIENT_NAME: &str = "codex-app-server-tests";
pub const DISABLE_PLUGIN_STARTUP_TASKS_ARG: &str = "--disable-plugin-startup-tasks-for-tests";
const DISABLE_MANAGED_CONFIG_ENV_VAR: &str = "CODEX_APP_SERVER_DISABLE_MANAGED_CONFIG";
impl McpProcess {
pub async fn new(codex_home: &Path) -> anyhow::Result<Self> {
Self::new_with_env_and_args(codex_home, &[], &[]).await
Self::new_with_env_and_args(codex_home, &[], &[DISABLE_PLUGIN_STARTUP_TASKS_ARG]).await
}
pub async fn new_without_managed_config(codex_home: &Path) -> anyhow::Result<Self> {
Self::new_with_env(codex_home, &[(DISABLE_MANAGED_CONFIG_ENV_VAR, Some("1"))]).await
}
pub async fn new_with_plugin_startup_tasks(codex_home: &Path) -> anyhow::Result<Self> {
Self::new_with_env_and_args(codex_home, &[], &[]).await
}
pub async fn new_with_args(codex_home: &Path, args: &[&str]) -> anyhow::Result<Self> {
Self::new_with_env_and_args(codex_home, &[], args).await
let mut all_args = vec![DISABLE_PLUGIN_STARTUP_TASKS_ARG];
all_args.extend_from_slice(args);
Self::new_with_env_and_args(codex_home, &[], &all_args).await
}
/// Creates a new MCP process, allowing tests to override or remove
@@ -130,7 +137,12 @@ impl McpProcess {
codex_home: &Path,
env_overrides: &[(&str, Option<&str>)],
) -> anyhow::Result<Self> {
Self::new_with_env_and_args(codex_home, env_overrides, &[]).await
Self::new_with_env_and_args(
codex_home,
env_overrides,
&[DISABLE_PLUGIN_STARTUP_TASKS_ARG],
)
.await
}
async fn new_with_env_and_args(
@@ -147,7 +159,7 @@ impl McpProcess {
cmd.stderr(Stdio::piped());
cmd.current_dir(codex_home);
cmd.env("CODEX_HOME", codex_home);
cmd.env("RUST_LOG", "info");
cmd.env("RUST_LOG", "warn");
// Keep integration tests isolated from host managed configuration.
cmd.env(
"CODEX_APP_SERVER_MANAGED_CONFIG_PATH",