mirror of
https://github.com/openai/codex.git
synced 2026-04-30 19:32:04 +03:00
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved
## Runtime model
This PR does not introduce a new `approval_policy`.
Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls where reviewable approval requests are
routed:
  - `user`
  - `guardian_subagent`
`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.
The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.
When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`
Users can still change `/approvals` afterward.
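As a sketch of the resulting mode (key spellings assumed to match the `config.toml` keys named in this PR), the switch corresponds to:

```toml
# Sketch: effective settings after enabling Smart Approvals from the TUI.
approval_policy = "on-request"
approvals_reviewer = "guardian_subagent"
sandbox_mode = "workspace-write"
```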
Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior
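For illustration (a hypothetical legacy config; the same migration applies within a profile scope):

```toml
# Deprecated alias found in an old config:
guardian_approval = true
# On load this is treated as if the same scope also set:
#   approvals_reviewer = "guardian_subagent"

# By contrast, the plain rollout/UI gate does NOT set the reviewer:
smart_approvals = true
```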
ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.
## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.
- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface
## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.
It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`
`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`
`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.
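For illustration only (IDs and review values below are invented; the field names follow the shape above), a `completed` notification payload might look like:

```json
{
  "threadId": "thr_123",
  "turnId": "turn_456",
  "targetItemId": "item_789",
  "review": {
    "status": "approved",
    "riskScore": 0.2,
    "riskLevel": "low",
    "rationale": "Read-only command inside the workspace."
  }
}
```

`action` is omitted here because it is optional and only present when core provides a guardian action summary.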
These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.
This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.
## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
- resolved approval/denial state in history
## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)
Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)
---------
Co-authored-by: Codex <noreply@openai.com>
904 lines
32 KiB
Rust
use codex_core::config::types::Personality;
use codex_core::features::Feature;
use codex_core::models_manager::manager::ModelsManager;
use codex_core::models_manager::manager::RefreshStrategy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::openai_models::ConfigShellToolType;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelInstructionsVariables;
use codex_protocol::openai_models::ModelMessages;
use codex_protocol::openai_models::ModelVisibility;
use codex_protocol::openai_models::ModelsResponse;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::openai_models::ReasoningEffortPreset;
use codex_protocol::openai_models::TruncationPolicyConfig;
use codex_protocol::openai_models::default_input_modalities;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::Op;
use codex_protocol::protocol::SandboxPolicy;
use codex_protocol::user_input::UserInput;
use core_test_support::load_default_config_for_test;
use core_test_support::responses::mount_models_once;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse_completed;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use pretty_assertions::assert_eq;
use std::sync::Arc;
use tempfile::TempDir;
use tokio::time::Duration;
use tokio::time::Instant;
use tokio::time::sleep;
use wiremock::BodyPrintLimit;
use wiremock::MockServer;

const LOCAL_FRIENDLY_TEMPLATE: &str =
    "You optimize for team morale and being a supportive teammate as much as code quality.";
const LOCAL_PRAGMATIC_TEMPLATE: &str = "You are a deeply pragmatic, effective software engineer.";

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn personality_does_not_mutate_base_instructions_without_template() {
    let codex_home = TempDir::new().expect("create temp dir");
    let mut config = load_default_config_for_test(&codex_home).await;
    config
        .features
        .enable(Feature::Personality)
        .expect("test config should allow feature update");
    config.personality = Some(Personality::Friendly);

    let model_info = codex_core::test_support::construct_model_info_offline("gpt-5.1", &config);
    assert_eq!(
        model_info.get_model_instructions(config.personality),
        model_info.base_instructions
    );
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn base_instructions_override_disables_personality_template() {
    let codex_home = TempDir::new().expect("create temp dir");
    let mut config = load_default_config_for_test(&codex_home).await;
    config
        .features
        .enable(Feature::Personality)
        .expect("test config should allow feature update");
    config.personality = Some(Personality::Friendly);
    config.base_instructions = Some("override instructions".to_string());

    let model_info =
        codex_core::test_support::construct_model_info_offline("gpt-5.2-codex", &config);

    assert_eq!(model_info.base_instructions, "override instructions");
    assert_eq!(
        model_info.get_model_instructions(config.personality),
        "override instructions"
    );
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_turn_personality_none_does_not_add_update_message() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_once(&server, sse_completed("resp-1")).await;
    let mut builder = test_codex()
        .with_model("gpt-5.2-codex")
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request = resp_mock.single_request();
    let developer_texts = request.message_input_texts("developer");
    assert!(
        !developer_texts
            .iter()
            .any(|text| text.contains("<personality_spec>")),
        "did not expect a personality update message when personality is None"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn config_personality_some_sets_instructions_template() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_once(&server, sse_completed("resp-1")).await;
    let mut builder = test_codex()
        .with_model("gpt-5.2-codex")
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
            config.personality = Some(Personality::Friendly);
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request = resp_mock.single_request();
    let instructions_text = request.instructions_text();

    assert!(
        instructions_text.contains(LOCAL_FRIENDLY_TEMPLATE),
        "expected personality update to include the local friendly template, got: {instructions_text:?}"
    );

    let developer_texts = request.message_input_texts("developer");
    for text in developer_texts {
        assert!(
            !text.contains("<personality_spec>"),
            "expected no personality update message in developer input"
        );
    }

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn config_personality_none_sends_no_personality() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_once(&server, sse_completed("resp-1")).await;
    let mut builder = test_codex()
        .with_model("gpt-5.2-codex")
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
            config.personality = Some(Personality::None);
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request = resp_mock.single_request();
    let instructions_text = request.instructions_text();
    assert!(
        !instructions_text.contains(LOCAL_FRIENDLY_TEMPLATE),
        "expected no friendly personality template, got: {instructions_text:?}"
    );
    assert!(
        !instructions_text.contains(LOCAL_PRAGMATIC_TEMPLATE),
        "expected no pragmatic personality template, got: {instructions_text:?}"
    );
    assert!(
        !instructions_text.contains("{{ personality }}"),
        "expected personality placeholder to be removed, got: {instructions_text:?}"
    );

    let developer_texts = request.message_input_texts("developer");
    assert!(
        !developer_texts
            .iter()
            .any(|text| text.contains("<personality_spec>")),
        "did not expect a personality update message when personality is None"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn default_personality_is_pragmatic_without_config_toml() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_once(&server, sse_completed("resp-1")).await;
    let mut builder = test_codex()
        .with_model("gpt-5.2-codex")
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request = resp_mock.single_request();
    let instructions_text = request.instructions_text();
    assert!(
        instructions_text.contains(LOCAL_PRAGMATIC_TEMPLATE),
        "expected the default pragmatic template, got: {instructions_text:?}"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_turn_personality_some_adds_update_message() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_sequence(
        &server,
        vec![sse_completed("resp-1"), sse_completed("resp-2")],
    )
    .await;
    let mut builder = test_codex()
        .with_model("exp-codex-personality")
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    test.codex
        .submit(Op::OverrideTurnContext {
            cwd: None,
            approval_policy: None,
            approvals_reviewer: None,
            sandbox_policy: None,
            windows_sandbox_level: None,
            model: None,
            effort: None,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: Some(Personality::Friendly),
        })
        .await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let requests = resp_mock.requests();
    assert_eq!(requests.len(), 2, "expected two requests");
    let request = requests
        .last()
        .expect("expected personality update request");

    let developer_texts = request.message_input_texts("developer");
    let personality_text = developer_texts
        .iter()
        .find(|text| text.contains("<personality_spec>"))
        .expect("expected personality update message in developer input");

    assert!(
        personality_text.contains("The user has requested a new communication style."),
        "expected personality update preamble, got {personality_text:?}"
    );
    assert!(
        personality_text.contains(LOCAL_FRIENDLY_TEMPLATE),
        "expected personality update to include the local friendly template, got: {personality_text:?}"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_turn_personality_same_value_does_not_add_update_message() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_sequence(
        &server,
        vec![sse_completed("resp-1"), sse_completed("resp-2")],
    )
    .await;
    let mut builder = test_codex()
        .with_model("exp-codex-personality")
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
            config.personality = Some(Personality::Pragmatic);
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    test.codex
        .submit(Op::OverrideTurnContext {
            cwd: None,
            approval_policy: None,
            approvals_reviewer: None,
            sandbox_policy: None,
            windows_sandbox_level: None,
            model: None,
            effort: None,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: Some(Personality::Pragmatic),
        })
        .await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let requests = resp_mock.requests();
    assert_eq!(requests.len(), 2, "expected two requests");
    let request = requests
        .last()
        .expect("expected second request after personality override");

    let developer_texts = request.message_input_texts("developer");
    let personality_text = developer_texts
        .iter()
        .find(|text| text.contains("<personality_spec>"));
    assert!(
        personality_text.is_none(),
        "expected no personality preamble for unchanged personality, got {personality_text:?}"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn instructions_uses_base_if_feature_disabled() -> anyhow::Result<()> {
    let codex_home = TempDir::new().expect("create temp dir");
    let mut config = load_default_config_for_test(&codex_home).await;
    config
        .features
        .disable(Feature::Personality)
        .expect("test config should allow feature update");
    config.personality = Some(Personality::Friendly);

    let model_info =
        codex_core::test_support::construct_model_info_offline("gpt-5.2-codex", &config);
    assert_eq!(
        model_info.get_model_instructions(config.personality),
        model_info.base_instructions
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_turn_personality_skips_if_feature_disabled() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let resp_mock = mount_sse_sequence(
        &server,
        vec![sse_completed("resp-1"), sse_completed("resp-2")],
    )
    .await;
    let mut builder = test_codex()
        .with_model("exp-codex-personality")
        .with_config(|config| {
            config
                .features
                .disable(Feature::Personality)
                .expect("test config should allow feature update");
        });
    let test = builder.build(&server).await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    test.codex
        .submit(Op::OverrideTurnContext {
            cwd: None,
            approval_policy: None,
            approvals_reviewer: None,
            sandbox_policy: None,
            windows_sandbox_level: None,
            model: None,
            effort: None,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: Some(Personality::Pragmatic),
        })
        .await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: test.config.permissions.approval_policy.value(),
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: test.session_configured.model.clone(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let requests = resp_mock.requests();
    assert_eq!(requests.len(), 2, "expected two requests");
    let request = requests
        .last()
        .expect("expected second request after personality override");

    let developer_texts = request.message_input_texts("developer");
    let personality_text = developer_texts
        .iter()
        .find(|text| text.contains("<personality_spec>"));
    assert!(
        personality_text.is_none(),
        "expected no personality preamble, got {personality_text:?}"
    );
    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn remote_model_friendly_personality_instructions_with_feature() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = MockServer::builder()
        .body_print_limit(BodyPrintLimit::Limited(80_000))
        .start()
        .await;

    let remote_slug = "codex-remote-default-personality";
    let default_personality_message = "Default from remote template";
    let friendly_personality_message = "Friendly variant";
    let remote_model = ModelInfo {
        slug: remote_slug.to_string(),
        display_name: "Remote default personality test".to_string(),
        description: Some("Remote model with default personality template".to_string()),
        default_reasoning_level: Some(ReasoningEffort::Medium),
        supported_reasoning_levels: vec![ReasoningEffortPreset {
            effort: ReasoningEffort::Medium,
            description: ReasoningEffort::Medium.to_string(),
        }],
        shell_type: ConfigShellToolType::UnifiedExec,
        visibility: ModelVisibility::List,
        supported_in_api: true,
        priority: 1,
        upgrade: None,
        base_instructions: "base instructions".to_string(),
        model_messages: Some(ModelMessages {
            instructions_template: Some("Base instructions\n{{ personality }}\n".to_string()),
            instructions_variables: Some(ModelInstructionsVariables {
                personality_default: Some(default_personality_message.to_string()),
                personality_friendly: Some(friendly_personality_message.to_string()),
                personality_pragmatic: Some("Pragmatic variant".to_string()),
            }),
        }),
        supports_reasoning_summaries: false,
        default_reasoning_summary: ReasoningSummary::Auto,
        support_verbosity: false,
        default_verbosity: None,
        availability_nux: None,
        apply_patch_tool_type: None,
        web_search_tool_type: Default::default(),
        truncation_policy: TruncationPolicyConfig::bytes(10_000),
        supports_parallel_tool_calls: false,
        supports_image_detail_original: false,
        context_window: Some(128_000),
        auto_compact_token_limit: None,
        effective_context_window_percent: 95,
        experimental_supported_tools: Vec::new(),
        input_modalities: default_input_modalities(),
        prefer_websockets: false,
        used_fallback_model_metadata: false,
        supports_search_tool: false,
    };

    let _models_mock = mount_models_once(
        &server,
        ModelsResponse {
            models: vec![remote_model],
        },
    )
    .await;

    let resp_mock = mount_sse_once(&server, sse_completed("resp-1")).await;

    let mut builder = test_codex()
        .with_auth(codex_core::CodexAuth::create_dummy_chatgpt_auth_for_testing())
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
            config.model = Some(remote_slug.to_string());
            config.personality = Some(Personality::Friendly);
        });
    let test = builder.build(&server).await?;

    wait_for_model_available(&test.thread_manager.get_models_manager(), remote_slug).await;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: AskForApproval::Never,
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: remote_slug.to_string(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: Some(Personality::Friendly),
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request = resp_mock.single_request();
    let instructions_text = request.instructions_text();

    assert!(
        instructions_text.contains(friendly_personality_message),
        "expected instructions to include the remote friendly personality template, got: {instructions_text:?}"
    );
    assert!(
        !instructions_text.contains(default_personality_message),
        "expected instructions to skip the remote default personality template, got: {instructions_text:?}"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_turn_personality_remote_model_template_includes_update_message() -> anyhow::Result<()>
{
    skip_if_no_network!(Ok(()));

    let server = MockServer::builder()
        .body_print_limit(BodyPrintLimit::Limited(80_000))
        .start()
        .await;

    let remote_slug = "codex-remote-personality";
    let remote_friendly_message = "Friendly from remote template";
    let remote_pragmatic_message = "Pragmatic from remote template";
    let remote_model = ModelInfo {
        slug: remote_slug.to_string(),
        display_name: "Remote personality test".to_string(),
        description: Some("Remote model with personality template".to_string()),
        default_reasoning_level: Some(ReasoningEffort::Medium),
        supported_reasoning_levels: vec![ReasoningEffortPreset {
            effort: ReasoningEffort::Medium,
            description: ReasoningEffort::Medium.to_string(),
        }],
        shell_type: ConfigShellToolType::UnifiedExec,
        visibility: ModelVisibility::List,
        supported_in_api: true,
        priority: 1,
        upgrade: None,
        base_instructions: "base instructions".to_string(),
        model_messages: Some(ModelMessages {
            instructions_template: Some("Base instructions\n{{ personality }}\n".to_string()),
            instructions_variables: Some(ModelInstructionsVariables {
                personality_default: None,
                personality_friendly: Some(remote_friendly_message.to_string()),
                personality_pragmatic: Some(remote_pragmatic_message.to_string()),
            }),
        }),
        supports_reasoning_summaries: false,
        default_reasoning_summary: ReasoningSummary::Auto,
        support_verbosity: false,
        default_verbosity: None,
        availability_nux: None,
        apply_patch_tool_type: None,
        web_search_tool_type: Default::default(),
        truncation_policy: TruncationPolicyConfig::bytes(10_000),
        supports_parallel_tool_calls: false,
        supports_image_detail_original: false,
        context_window: Some(128_000),
        auto_compact_token_limit: None,
        effective_context_window_percent: 95,
        experimental_supported_tools: Vec::new(),
        input_modalities: default_input_modalities(),
        prefer_websockets: false,
        used_fallback_model_metadata: false,
        supports_search_tool: false,
    };

    let _models_mock = mount_models_once(
        &server,
        ModelsResponse {
            models: vec![remote_model],
        },
    )
    .await;

    let resp_mock = mount_sse_sequence(
        &server,
        vec![sse_completed("resp-1"), sse_completed("resp-2")],
    )
    .await;

    let mut builder = test_codex()
        .with_auth(codex_core::CodexAuth::create_dummy_chatgpt_auth_for_testing())
        .with_config(|config| {
            config
                .features
                .enable(Feature::Personality)
                .expect("test config should allow feature update");
            config.model = Some("gpt-5.2-codex".to_string());
        });
    let test = builder.build(&server).await?;

    wait_for_model_available(&test.thread_manager.get_models_manager(), remote_slug).await;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: AskForApproval::Never,
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: remote_slug.to_string(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    test.codex
        .submit(Op::OverrideTurnContext {
            cwd: None,
            approval_policy: None,
            approvals_reviewer: None,
            sandbox_policy: None,
            windows_sandbox_level: None,
            model: None,
            effort: None,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: Some(Personality::Friendly),
        })
        .await?;

    test.codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: test.cwd_path().to_path_buf(),
            approval_policy: AskForApproval::Never,
            sandbox_policy: SandboxPolicy::new_read_only_policy(),
            model: remote_slug.to_string(),
            effort: test.config.model_reasoning_effort,
            summary: None,
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let requests = resp_mock.requests();
    assert_eq!(requests.len(), 2, "expected two requests");
    let request = requests
        .last()
        .expect("expected personality update request");
    let developer_texts = request.message_input_texts("developer");
    let personality_text = developer_texts
        .iter()
        .find(|text| text.contains(remote_friendly_message))
        .expect("expected personality update message in developer input");

    assert!(
        personality_text.contains("The user has requested a new communication style."),
        "expected personality update preamble, got {personality_text:?}"
    );
    assert!(
        personality_text.contains(remote_friendly_message),
        "expected personality update to include remote template, got: {personality_text:?}"
    );

    Ok(())
}

async fn wait_for_model_available(manager: &Arc<ModelsManager>, slug: &str) {
    let deadline = Instant::now() + Duration::from_secs(2);
    loop {
        let models = manager.list_models(RefreshStrategy::OnlineIfUncached).await;
        if models.iter().any(|model| model.model == slug) {
            return;
        }
        if Instant::now() >= deadline {
            panic!("timed out waiting for the remote model {slug} to appear");
        }
        sleep(Duration::from_millis(25)).await;
    }
}