guardian initial feedback / tweaks (#13897)

## Summary
- remove the remaining model-visible guardian-specific `on-request`
prompt additions so enabling the feature does not change the main
approval-policy instructions
- neutralize user-facing guardian wording to talk about automatic
approval review / approval requests rather than a second reviewer or
only sandbox escalations
- tighten guardian retry-context handling so agent-authored
`justification` stays in the structured action JSON and is not also
injected as raw retry context
- simplify guardian review plumbing in core by deleting dead
prompt-append paths and trimming some request/transcript setup code

## Notable Changes
- delete the dead `permissions/approval_policy/guardian.md` append path
and stop threading `guardian_approval_enabled` through model-facing
developer-instruction builders
- rename the experimental feature copy to `Automatic approval review`
and update the `/experimental` snapshot text accordingly
- make approval-review status strings generic across shell, patch,
network, and MCP review types
- forward real sandbox/network retry reasons for shell and unified-exec
guardian review, but do not pass agent-authored justification as raw
retry context
- simplify `guardian.rs` by removing the one-field request wrapper,
deduping reasoning-effort selection, and cleaning up transcript entry
collection
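
The retry-context change above can be illustrated with a minimal sketch. Every type and function name here (`RetryReason`, `ReviewAction`, `retry_context`, `action_json`) is hypothetical and does not come from `guardian.rs`. It only shows one way to keep agent-authored `justification` inside the structured action while building raw retry context solely from system-observed sandbox/network reasons.

```rust
// Hypothetical sketch: none of these names are taken from the actual guardian code.

/// Why an earlier attempt failed. Only system-observed causes are representable,
/// so agent-authored text has no variant to travel in.
#[derive(Debug, Clone, PartialEq)]
enum RetryReason {
    SandboxDenied(String),
    NetworkBlocked(String),
}

/// The action under review. `justification` is written by the agent.
#[derive(Debug, Clone)]
struct ReviewAction {
    command: Vec<String>,
    justification: Option<String>,
}

/// Build the raw retry context for the reviewer.
/// The function never sees `ReviewAction`, so the justification cannot leak into it.
fn retry_context(reason: Option<&RetryReason>) -> Option<String> {
    match reason? {
        RetryReason::SandboxDenied(detail) => Some(format!("sandbox denied: {detail}")),
        RetryReason::NetworkBlocked(host) => Some(format!("network blocked: {host}")),
    }
}

/// Render the structured action. The justification appears here and nowhere else.
/// Rust's `{:?}` string escaping stands in for real JSON escaping in this sketch.
fn action_json(action: &ReviewAction) -> String {
    let command = action
        .command
        .iter()
        .map(|arg| format!("{arg:?}"))
        .collect::<Vec<_>>()
        .join(",");
    let justification = match &action.justification {
        Some(text) => format!("{text:?}"),
        None => "null".to_string(),
    };
    format!("{{\"command\":[{command}],\"justification\":{justification}}}")
}
```

In this sketch a caller would pass `action_json(&action)` and `retry_context(reason)` to the reviewer as two separate inputs, which is roughly the separation the bullet describes.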

## Testing
- `just fmt`
- full validation left to CI

---------

Co-authored-by: Codex <noreply@openai.com>
This commit is contained in:
Charley Cunningham
2026-03-09 09:25:24 -07:00
committed by GitHub
parent 2bc3e52a91
commit f23fcd6ced
16 changed files with 421 additions and 352 deletions

@@ -149,7 +149,7 @@ pub enum Feature {
Steer,
/// Allow request_user_input in Default collaboration mode.
DefaultModeRequestUserInput,
-/// Enable guardian subagent approvals.
+/// Enable automatic review for approval prompts.
GuardianApproval,
/// Enable collaboration modes (Plan, Default).
/// Kept for config backward compatibility; behavior is always collaboration-modes-enabled.
@@ -710,8 +710,8 @@ pub const FEATURES: &[FeatureSpec] = &[
id: Feature::GuardianApproval,
key: "guardian_approval",
stage: Stage::Experimental {
-name: "Guardian approvals",
-menu_description: "Let a guardian subagent review `on-request` approval prompts instead of showing them to you, including sandbox escapes and blocked network access.",
+name: "Automatic approval review",
+menu_description: "Dispatch `on-request` approval prompts (for e.g. sandbox escapes or blocked network access) to a carefully-prompted security reviewer subagent rather than blocking the agent on your input.",
announcement: "",
},
default_enabled: false,
@@ -917,11 +917,14 @@ mod tests {
let stage = spec.stage;
assert!(matches!(stage, Stage::Experimental { .. }));
-assert_eq!(stage.experimental_menu_name(), Some("Guardian approvals"));
+assert_eq!(
+stage.experimental_menu_name(),
+Some("Automatic approval review")
+);
assert_eq!(
stage.experimental_menu_description().map(str::to_owned),
Some(
-"Let a guardian subagent review `on-request` approval prompts instead of showing them to you, including sandbox escapes and blocked network access.".to_string()
+"Dispatch `on-request` approval prompts (for e.g. sandbox escapes or blocked network access) to a carefully-prompted security reviewer subagent rather than blocking the agent on your input.".to_string()
)
);
assert_eq!(stage.experimental_announcement(), None);