codex/prs/bolinfest/study/PR-1823-study.md at a409c34c856e39ea6b2ac8eb1af7fdf8758c481e

mirror of https://github.com/openai/codex.git synced 2026-04-28 18:32:04 +03:00

Daniel Edrisian d2af202db7 study last 100 days PRs

2025-09-02 15:17:45 -07:00

DOs

Prefer behavior-based tests: Run a real workload under Seatbelt and assert on its outcome instead of checking the policy text.

#[cfg(target_os = "macos")]
#[tokio::test]
async fn python_lock_works_under_seatbelt() {
    use super::{spawn_command_under_seatbelt, SandboxPolicy};
    use crate::spawn::StdioPolicy;
    use std::collections::HashMap;

    let policy = SandboxPolicy::WorkspaceWrite {
        writable_roots: vec![],
        network_access: false,
        include_default_writable_roots: true,
    };

    let py = r#"import multiprocessing as mp
def f(l):
    with l: pass
if __name__ == "__main__":
    l = mp.Lock()
    p = mp.Process(target=f, args=(l,))
    p.start(); p.join()
"#;

    let mut child = spawn_command_under_seatbelt(
        vec!["python3".into(), "-c".into(), py.into()],
        &policy,
        std::env::current_dir().unwrap(),
        StdioPolicy::RedirectForShellTool,
        HashMap::new(),
    )
    .await
    .expect("spawn under seatbelt");

    let status = child.wait().await.expect("wait for child");
    assert!(status.success(), "python exited with {status:?}");
}

Gate macOS-only logic: Use platform guards so Seatbelt tests are not compiled or run on platforms where Seatbelt is unavailable.

#[cfg(target_os = "macos")]
#[tokio::test]
async fn seatbelt_specific_behavior() {
    // macOS-only assertions here
}
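
When several Seatbelt tests live together, one option is to gate the enclosing module instead of repeating the attribute on every function. The sketch below is illustrative only: `seatbelt_available` and `seatbelt_tests` are hypothetical names that do not come from the PR, and the test body is a placeholder.

```rust
/// Hypothetical helper: reports, at compile time, whether the current
/// target is macOS, the only platform where Seatbelt exists.
pub fn seatbelt_available() -> bool {
    cfg!(target_os = "macos")
}

// Gating the module removes every test inside it (and their imports)
// from non-macOS builds, so a single guard covers all of them.
#[cfg(all(test, target_os = "macos"))]
mod seatbelt_tests {
    #[test]
    fn runs_only_on_macos() {
        // Placeholder assertion; real tests would spawn a sandboxed process.
        assert!(super::seatbelt_available());
    }
}
```

The `#[cfg]` attribute removes the code from the build entirely, whereas `cfg!` only yields a boolean, so the attribute is the form to use when the gated code references macOS-only items.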

Use the Seatbelt helper: Rely on spawn_command_under_seatbelt rather than rolling your own sandboxing.

use super::{spawn_command_under_seatbelt, SandboxPolicy};
use crate::spawn::StdioPolicy;

let policy = SandboxPolicy::WorkspaceWrite {
    writable_roots: vec![],
    network_access: false,
    include_default_writable_roots: true,
};

let mut child = spawn_command_under_seatbelt(
    vec!["/usr/bin/true".into()],
    &policy,
    std::env::current_dir().unwrap(),
    StdioPolicy::RedirectForShellTool,
    std::collections::HashMap::new(),
)
.await?;

Assert with context: Include the failing value in the assertion message (for example via an inline format argument such as {status:?}) so a failure can be diagnosed from the log alone.

let status = child.wait().await?;
assert!(status.success(), "process failed: {status:?}");
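
If many tests need the same kind of message, the formatting can be factored into a small function. This is a minimal sketch under stated assumptions: `failure_context` is a hypothetical helper (not part of the codebase), and it assumes the test has captured the child's stderr as a string.

```rust
/// Hypothetical helper: builds a failure message that carries the command
/// line, the exit code, and the captured stderr of a child process.
fn failure_context(argv: &[&str], code: Option<i32>, stderr: &str) -> String {
    // `ExitStatus::code()` returns `None` on Unix when the process was
    // terminated by a signal, so that case is spelled out explicitly.
    let code = match code {
        Some(c) => c.to_string(),
        None => "terminated by signal".to_string(),
    };
    format!(
        "command {argv:?} failed (exit code: {code}); stderr: {}",
        stderr.trim()
    )
}
```

A call site might then read `assert!(status.success(), "{}", failure_context(&argv, status.code(), &stderr))`, where `argv` and `stderr` are whatever the test recorded.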

Keep tests hermetic: Avoid network access and mutable global state, and keep the sandbox policy as restrictive as the test allows.

let policy = SandboxPolicy::WorkspaceWrite {
    writable_roots: vec![],
    network_access: false, // no network: keeps the test hermetic
    include_default_writable_roots: true,
};
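
The process environment is another piece of global state that can leak into a test. The snippets in this document pass an empty `HashMap` as the child's environment; the sketch below shows the same idea using only the standard library's `Command`, as an analogy rather than the project's API. `hermetic_env` and `hermetic_command` are hypothetical names, and the `PATH` value is an assumption.

```rust
use std::collections::HashMap;
use std::process::Command;

/// Hypothetical helper: an explicit, minimal environment for a child process.
/// The PATH value is an assumption; adjust it to what the test needs.
fn hermetic_env() -> HashMap<String, String> {
    let mut env = HashMap::new();
    env.insert("PATH".to_string(), "/usr/bin:/bin".to_string());
    env
}

/// Builds a std `Command` that inherits nothing from the parent environment
/// and sees only the variables returned by `hermetic_env`.
fn hermetic_command(program: &str) -> Command {
    let mut cmd = Command::new(program);
    // `env_clear` drops inherited variables; `envs` then adds the explicit set.
    cmd.env_clear().envs(hermetic_env());
    cmd
}
```

With this shape, a variable set on the developer's machine or in CI cannot change the outcome of the test, because the child never sees it.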

DON’Ts

Don’t test policy strings directly: Such assertions are brittle and duplicate what behavior tests already cover.

#[test]
fn bad_string_based_policy_test() {
    // Fragile: asserts on implementation detail, not behavior
    assert!(MACOS_SEATBELT_BASE_POLICY.contains("(allow ipc-posix-sem)"));
}

Don’t run Seatbelt tests cross-platform: Without a platform guard, they fail spuriously on non-macOS platforms.

// Bad: no #[cfg(target_os = "macos")]
#[tokio::test]
async fn runs_everywhere_but_shouldnt() {
    // Seatbelt does not exist on Linux/Windows, so this may fail there
}

Don’t bypass the helper or shell out to sandbox tools directly: Centralize sandbox behavior via the core API.

// Bad: manual sandbox invocation
use std::process::Command;

#[test]
fn bad_manual_sandbox_exec() {
    let _ = Command::new("/usr/bin/sandbox-exec")
        .args(["-p", "…", "python3", "-c", "print('ok')"])
        .status()
        .unwrap();
}

Don’t use vague assertions: A bare assertion reports nothing about why the process failed, which slows triage.

// Bad: no context on failure
assert!(status.success());

Don’t depend on network access in sandboxed tests: Seatbelt and CI may block it, making tests flaky.

// Bad: external network dependency
#[tokio::test]
async fn bad_network_test() {
    let _ = reqwest::get("https://example.com").await.unwrap(); // may fail under sandbox
}
