## Summary
If we are in a mode that is already explicitly un-sandboxed, then
`ApprovalPolicy::Never` should not block dangerous commands.
## Testing
- [x] Existing unit test covers old behavior
- [x] Added a unit test for this new case
## Summary
If a subagent requests approval, and the user persists that approval to
the execpolicy, it should (by default) propagate. We'll need to rethink
this a bit in light of coming Permissions changes, though I think this
is closer to the end state that we'd want, which is that execpolicy
changes to one permissions profile should be synced across threads.
## Testing
- [x] Added integration test
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Once the repo-local lint exists, `codex-rs` needs to follow the
checked-in convention and CI needs to keep it from drifting. This commit
applies the fallback `/*param*/` style consistently across existing
positional literal call sites without changing those APIs.
The longer-term preference is still to avoid APIs that require comments
by choosing clearer parameter types and call shapes. This PR is
intentionally the mechanical follow-through for the places where the
existing signatures stay in place.
After rebasing onto newer `main`, the rollout also had to cover newly
introduced `tui_app_server` call sites. That made it clear the first cut
of the CI job was too expensive for the common path: it was spending
almost as much time installing `cargo-dylint` and re-testing the lint
crate as a representative test job spends running product tests. The CI
update keeps the full workspace enforcement but trims that extra
overhead from ordinary `codex-rs` PRs.
## What changed
- keep a dedicated `argument_comment_lint` job in `rust-ci`
- mechanically annotate remaining opaque positional literals across
`codex-rs` with exact `/*param*/` comments, including the rebased
`tui_app_server` call sites that now fall under the lint
- keep the checked-in style aligned with the lint policy by using
`/*param*/` and leaving string and char literals uncommented
- cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
registry/git metadata in the lint job
- split changed-path detection so the lint crate's own `cargo test` step
runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
- continue to run the repo wrapper over the `codex-rs` workspace, so
product-code enforcement is unchanged
Most of the code changes in this commit are intentionally mechanical
comment rewrites or insertions driven by the lint itself.
## Verification
- `./tools/argument-comment-lint/run.sh --workspace`
- `cargo test -p codex-tui-app-server -p codex-tui`
- parsed `.github/workflows/rust-ci.yml` locally with PyYAML
---
* -> #14652
* #14651
## Description
This PR expands tracing coverage across app-server thread startup, core
session initialization, and the Responses transport layer. It also gives
core dispatch spans stable operation-specific names so traces are easier
to follow than the old generic `submission_dispatch` spans.
Also use `fmt::Display` for types that we serialize in traces so we send
strings instead of rust types
## Why
PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This
applies the same extraction pattern across the rest of `codex-rs/core`
so the production modules stay focused on runtime code instead of large
inline test blocks.
Keeping the tests in sibling files also makes follow-up edits easier to
review because product changes no longer have to share a file with
hundreds or thousands of lines of test scaffolding.
## What changed
- replaced each inline `mod tests { ... }` in `codex-rs/core/src/**`
with a path-based module declaration
- moved each extracted unit test module into a sibling `*_tests.rs`
file, using `mod_tests.rs` for `mod.rs` modules
- preserved the existing `cfg(...)` guards and module-local structure so
the refactor remains structural rather than behavioral
## Testing
- `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`)
- `just fix -p codex-core`
- `cargo fmt --check`
- `cargo shear`
## Stack
fix: fail closed for unsupported split windows sandboxing #14172
fix: preserve split filesystem semantics in linux sandbox #14173
-> fix: align core approvals with split sandbox policies #14171
refactor: centralize filesystem permissions precedence #14174
## Why This PR Exists
This PR is intentionally narrower than the title may suggest.
Most of the original split-permissions migration already landed in the
earlier `#13434 -> #13453` stack. In particular:
- `#13439` already did the broad runtime plumbing for split filesystem
and network policies.
- `#13445` already moved `apply_patch` safety onto filesystem-policy
semantics.
- `#13448` already switched macOS Seatbelt generation to split policies.
- `#13449` and `#13453` already handled Linux helper and bubblewrap
enforcement.
- `#13440` already introduced the first protocol-side helpers for
deriving effective filesystem access.
The reason this PR still exists is that after the follow-on
`[permissions]` work and the new shared precedence helper in `#14174`, a
few core approval paths were still deciding behavior from the legacy
`SandboxPolicy` projection instead of the split filesystem policy that
actually carries the carveouts.
That means this PR is mostly a cleanup and alignment pass over the
remaining core consumers, not a fresh sandbox backend migration.
## What Is Actually New Here
- make unmatched-command fallback decisions consult
`FileSystemSandboxPolicy` instead of only legacy `DangerFullAccess` /
`ReadOnly` / `WorkspaceWrite` categories
- thread `file_system_sandbox_policy` into the shell, unified-exec, and
intercepted-exec approval paths so they all use the same split-policy
semantics
- keep `apply_patch` safety on the same effective-access rules as the
shared protocol helper, rather than letting it drift through
compatibility projections
- add loader-level regression coverage proving legacy `sandbox_mode`
config still builds split policies and round-trips back without semantic
drift
## What This PR Does Not Do
This PR does not introduce new platform backend enforcement on its own.
- Linux backend parity remains in `#14173`.
- Windows fail-closed handling remains in `#14172`.
- The shared precedence/model changes live in `#14174`.
## Files To Focus On
- `core/src/exec_policy.rs`: unmatched-command fallback and approval
rendering now read the split filesystem policy directly
- `core/src/tools/sandboxing.rs`: default exec-approval requirement keys
off `FileSystemSandboxPolicy.kind`
- `core/src/tools/handlers/shell.rs`: shell approval requests now carry
the split filesystem policy
- `core/src/unified_exec/process_manager.rs`: unified-exec approval
requests now carry the split filesystem policy
- `core/src/tools/runtimes/shell/unix_escalation.rs`: intercepted exec
fallback now uses the same split-policy approval semantics
- `core/src/safety.rs`: `apply_patch` safety keeps using effective
filesystem access rather than legacy sandbox categories
- `core/src/config/config_tests.rs`: new regression coverage for legacy
`sandbox_mode` no-drift behavior through the split-policy loader
## Notes
- `core/src/codex.rs` and `core/src/codex_tests.rs` are just small
fallout updates for `RequestPermissionsResponse.scope`; they are not the
point of the PR.
- If you reviewed the earlier `#13439` / `#13445` stack, the main review
question here is simply: “are there any remaining approval or
patch-safety paths that still reconstruct semantics from legacy
`SandboxPolicy` instead of consuming the split filesystem policy
directly?”
## Testing
- cargo test -p codex-core
legacy_sandbox_mode_config_builds_split_policies_without_drift
- cargo test -p codex-core request_permissions
- cargo test -p codex-core intercepted_exec_policy
- cargo test -p codex-core
restricted_sandbox_requires_exec_approval_on_request
- cargo test -p codex-core
unmatched_on_request_uses_split_filesystem_policy_for_escalation_prompts
- cargo test -p codex-core explicit_
- cargo clippy -p codex-core --tests -- -D warnings
## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2
`AskForApproval::Reject` payload so skill-script prompts can be
configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual
decision source, keeping prefix rules tied to `rules`, unmatched command
fallbacks tied to `sandbox_approval`, and skill scripts tied to
`skill_approval`
- regenerate the affected protocol/config schemas and expand
unit/integration coverage for the new flag and skill approval behavior
## Summary
We need to support allowing request_permissions calls when using
`Reject` policy
<img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06
40 PM"
src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5"
/>
Note that this is a backwards-incompatible change for Reject policy. I'm
not sure if we need to add a default based on our current use/setup
## Testing
- [x] Added tests
- [x] Tested locally
## Summary
- add the guardian reviewer flow for `on-request` approvals in command,
patch, sandbox-retry, and managed-network approval paths
- keep guardian behind `features.guardian_approval` instead of exposing
a public `approval_policy = guardian` mode
- route ordinary `OnRequest` approvals to the guardian subagent when the
feature is enabled, without changing the public approval-mode surface
## Public model
- public approval modes stay unchanged
- guardian is enabled via `features.guardian_approval`
- when that feature is on, `approval_policy = on-request` keeps the same
approval boundaries but sends those approval requests to the guardian
reviewer instead of the user
- `/experimental` only persists the feature flag; it does not rewrite
`approval_policy`
- CLI and app-server no longer expose a separate `guardian` approval
mode in this PR
## Guardian reviewer
- the reviewer runs as a normal subagent and reuses the existing
subagent/thread machinery
- it is locked to a read-only sandbox and `approval_policy = never`
- it does not inherit user/project exec-policy rules
- it prefers `gpt-5.4` when the current provider exposes it, otherwise
falls back to the parent turn's active model
- it fail-closes on timeout, startup failure, malformed output, or any
other review error
- it currently auto-approves only when `risk_score < 80`
## Review context and policy
- guardian mirrors `OnRequest` approval semantics rather than
introducing a separate approval policy
- explicit `require_escalated` requests follow the same approval surface
as `OnRequest`; the difference is only who reviews them
- managed-network allowlist misses that enter the approval flow are also
reviewed by guardian
- the review prompt includes bounded recent transcript history plus
recent tool call/result evidence
- transcript entries and planned-action strings are truncated with
explicit `<guardian_truncated ... />` markers so large payloads stay
bounded
- apply-patch reviews include the full patch content (without
duplicating the structured `changes` payload)
- the guardian request layout is snapshot-tested using the same
model-visible Responses request formatter used elsewhere in core
## Guardian network behavior
- the guardian subagent inherits the parent session's managed-network
allowlist when one exists, so it can use the same approved network
surface while reviewing
- exact session-scoped network approvals are copied into the guardian
session with protocol/port scope preserved
- those copied approvals are now seeded before the guardian's first turn
is submitted, so inherited approvals are available during any immediate
review-time checks
## Out of scope / follow-ups
- the sandbox-permission validation split was pulled into a separate PR
and is not part of this diff
- a future follow-up can enable `serde_json` preserve-order in
`codex-core` and then simplify the guardian action rendering further
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- distinguish reject-policy handling for prefix-rule approvals versus
sandbox approvals in Unix shell escalation
- keep prompting for skill-script execution when `rules=true` but
`sandbox_approval=false`, instead of denying the command up front
- add regression coverage for both skill-script reject-policy paths in
`codex-rs/core/tests/suite/skill_approval.rs`
## Summary
Today `SandboxPermissions::requires_additional_permissions()` does not
actually mean "is `WithAdditionalPermissions`". It returns `true` for
any non-default sandbox override, including `RequireEscalated`. That
broad behavior is relied on in multiple `main` callsites.
The naming is security-sensitive because `SandboxPermissions` is used on
shell-like tool calls to tell the executor how a single command should
relate to the turn sandbox:
- `UseDefault`: run with the turn sandbox unchanged
- `RequireEscalated`: request execution outside the sandbox
- `WithAdditionalPermissions`: stay sandboxed but widen permissions for
that command only
## Problem
The old helper name reads as if it only applies to the
`WithAdditionalPermissions` variant. In practice it means "this command
requested any explicit sandbox override."
That ambiguity made it easy to read production checks incorrectly and
made the guardian change look like a standalone `main` fix when it is
not.
On `main` today:
- `shell` and `unified_exec` intentionally reject any explicit
`sandbox_permissions` request unless approval policy is `OnRequest`
- `exec_policy` intentionally treats any explicit sandbox override as
prompt-worthy in restricted sandboxes
- tests intentionally serialize both `RequireEscalated` and
`WithAdditionalPermissions` as explicit sandbox override requests
So changing those callsites from the broad helper to a narrow
`WithAdditionalPermissions` check would be a behavior change, not a pure
cleanup.
## What This PR Does
- documents `SandboxPermissions` as a per-command sandbox override, not
a generic permissions bag
- adds `requests_sandbox_override()` for the broad meaning: anything
except `UseDefault`
- adds `uses_additional_permissions()` for the narrow meaning: only
`WithAdditionalPermissions`
- keeps `requires_additional_permissions()` as a compatibility alias to
the broad meaning for now
- updates the current broad callsites to use the accurately named broad
helper
- adds unit coverage that locks in the semantics of all three helpers
## What This PR Does Not Do
This PR does not change runtime behavior. That is intentional.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
[#12964](https://github.com/openai/codex/pull/12964) added
`host_executable()` support to `codex-execpolicy`, and
[#13046](https://github.com/openai/codex/pull/13046) adopted it in the
zsh-fork interception path.
The remaining gap was the preflight execpolicy check in
`core/src/exec_policy.rs`. That path derives approval requirements
before execution for `shell`, `shell_command`, and `unified_exec`, but
it was still using the default exact-token matcher.
As a result, a command that already included an absolute executable
path, such as `/usr/bin/git status`, could still miss a basename rule
like `prefix_rule(pattern = ["git"], ...)` during preflight even when
the policy also defined a matching `host_executable(name = "git", ...)`
entry.
This PR brings the same opt-in `host_executable()` resolution to the
preflight approval path when an absolute program path is already present
in the parsed command.
## What Changed
- updated
`ExecPolicyManager::create_exec_approval_requirement_for_command()` in
`core/src/exec_policy.rs` to use `check_multiple_with_options(...)` with
`MatchOptions { resolve_host_executables: true }`
- kept the existing shell parsing flow for approval derivation, but now
allow basename rules to match absolute executable paths during preflight
when `host_executable()` permits it
- updated requested-prefix amendment evaluation to use the same
host-executable-aware matching mode, so suggested `prefix_rule()`
amendments are checked consistently for absolute-path commands
- added preflight coverage for:
- absolute-path commands that should match basename rules through
`host_executable()`
- absolute-path commands whose paths are not in the allowed
`host_executable()` mapping
- requested prefix-rule amendments for absolute-path commands
## Verification
- `just fix -p codex-core`
- `cargo test -p codex-core --lib exec_policy::tests::`
## Why
`execpolicy` currently keys `prefix_rule()` matching off the literal
first token. That works for rules like `["/usr/bin/git"]`, but it means
shared basename rules such as `["git"]` do not help when a caller passes
an absolute executable path like `/usr/bin/git`.
This PR lays the groundwork for basename-aware matching without changing
existing callers yet. It adds typed host-executable metadata and an
opt-in resolution path in `codex-execpolicy`, so a follow-up PR can
adopt the new behavior in `unix_escalation.rs` and other call sites
without having to redesign the policy layer first.
## What Changed
- added `host_executable(name = ..., paths = [...])` to the execpolicy
parser and validated it with `AbsolutePathBuf`
- stored host executable mappings separately from prefix rules inside
`Policy`
- added `MatchOptions` and opt-in `*_with_options()` APIs that preserve
existing behavior by default
- implemented exact-first matching with optional basename fallback,
gated by `host_executable()` allowlists when present
- normalized executable names for cross-platform matching so Windows
paths like `git.exe` can satisfy `host_executable(name = "git", ...)`
- updated `match` / `not_match` example validation to exercise the
host-executable resolution path instead of only raw prefix-rule matching
- preserved source locations for deferred example-validation errors so
policy load failures still point at the right file and line
- surfaced `resolvedProgram` on `RuleMatch` so callers can tell when a
basename rule matched an absolute executable path
- preserved host executable metadata when requirements policies overlay
file-based policies in `core/src/exec_policy.rs`
- documented the new rule shape and CLI behavior in
`execpolicy/README.md`
## Verification
- `cargo test -p codex-execpolicy`
- added coverage in `execpolicy/tests/basic.rs` for parsing, precedence,
empty allowlists, basename fallback, exact-match precedence, and
host-executable-backed `match` / `not_match` examples
- added a regression test in `core/src/exec_policy.rs` to verify
requirements overlays preserve `host_executable()` metadata
- verified `cargo test -p codex-core --lib`, including source-rendering
coverage for deferred validation errors
## Summary
Introduces the initial implementation of Feature::RequestPermissions.
RequestPermissions allows the model to request that a command be run
inside the sandbox, with additional permissions, like writing to a
specific folder. Eventually this will include other rules as well, and
the ability to persist these permissions, but this PR is already quite
large - let's get the core flow working and go from there!
<img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM"
src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368"
/>
## Testing
- [x] Added tests
- [x] Tested locally
- [x] Feature
## Summary
Persist network approval allow/deny decisions as `network_rule(...)`
entries in execpolicy (not proxy config)
It adds `network_rule` parsing + append support in `codex-execpolicy`,
including `decision="prompt"` (parse-only; not compiled into proxy
allow/deny lists)
- compile execpolicy network rules into proxy allow/deny lists and
update the live proxy state on approval
- preserve requirements execpolicy `network_rule(...)` entries when
merging with file-based execpolicy
- reject broad wildcard hosts (for example `*`) for persisted
`network_rule(...)`
## Summary
`gpt-5.3-codex` really likes to write complicated shell scripts, and
suggest a partial prefix_rule that wouldn't actually approve the
command. We should only show the `prefix_rule` suggestion from the model
if it would actually fully approve the command the user is seeing.
This will technically cause more instances of overly-specific
suggestions when we fallback, but I think the UX is clearer,
particularly when the model doesn't necessarily understand the current
limitations of execpolicy parsing.
## Testing
- [x] Add unit tests
- [x] Add integration tests
## Summary
Fixes a few things in our exec_policy handling of prefix_rules:
1. Correctly match redirects specifically for exec_policy parsing. i.e.
if you have `prefix_rule(["echo"], decision="allow")` then `echo hello >
output.txt` should match - this should fix#10321
2. If there already exists any rule that would match our prefix rule
(not just a prompt), then drop it, since it won't do anything.
## Testing
- [x] Updated unit tests, added approvals ScenarioSpecs
## Summary
This feature is now reasonably stable, let's remove it so we can
simplify our upcoming iterations here.
## Testing
- [x] Existing tests pass
Currently, if there are syntax errors detected in the starlark rules
file, the entire policy is silently ignored by the CLI. The app server
correctly emits a message that can be displayed in a GUI.
This PR changes the CLI (both the TUI and non-interactive exec) to fail
when the rules file can't be parsed. It then prints out an error message
and exits with a non-zero exit code. This is consistent with the
handling of errors in the config file.
This addresses #11603
## Summary
If the model suggests a bad rule, don't show it to the user. This does
not impact the parsing of existing rules, just the ones we show.
## Testing
- [x] Added unit tests
- [x] Ran locally
### Motivation
- Git subcommand matching was being classified as "dangerous" and caused
benign developer workflows (for example `git push --force-with-lease`)
to be blocked by the preflight policy.
- The change aligns behavior with the intent to reserve the dangerous
checklist for truly destructive shell ops (e.g. `rm -rf`) and avoid
surprising developer-facing blocks.
### Description
- Remove git-specific subcommand checks from
`is_dangerous_to_call_with_exec` in
`codex-rs/shell-command/src/command_safety/is_dangerous_command.rs`,
leaving only explicit `rm` and `sudo` passthrough checks.
- Deleted the git-specific helper logic that classified `reset`,
`branch`-delete, `push` (force/delete/refspec) and `clean --force` as
dangerous.
- Updated unit tests in the same file to assert that various `git
reset`/`git branch`/`git push`/`git clean` variants are no longer
classified as dangerous.
- Kept `find_git_subcommand` (used by safe-command classification)
intact so safe/unsafe parsing elsewhere remains functional.
### Testing
- Ran formatter with `just fmt` successfully.
- Ran unit tests with `cargo test -p codex-shell-command` and all tests
passed (`144 passed; 0 failed`).
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_698d19dedb4883299c3ceb5bbc6a0dcf)
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.
It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.
## What
- Added `ReadOnlyAccess` in protocol with:
- `Restricted { include_platform_defaults, readable_roots }`
- `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
- `ReadOnly { access: ReadOnlyAccess }`
- `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.
## Compatibility / rollout
- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
## Summary
This should rarely, if ever, happen in practice. But regardless, we
should never provide an empty list of `commands` to ExecPolicy. This PR
is almost entirely adding test around these cases.
## Testing
- [x] Adds a bunch of unit tests for this
## Summary
- Reduced repeated approvals for equivalent wrapper commands and fixed
execpolicy matching for heredoc-style shell invocations, with minimal
behavior change and fail-closed defaults.
## Fixes
1. Canonicalized approval matching for wrappers so equivalent commands
map to the same approval intent.
2. Added heredoc-aware prefix extraction for execpolicy so commands like
`python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"],
...)`.
3. Kept fallback behavior conservative: if parsing is ambiguous,
existing prompt behavior is preserved.
## Edge Cases Covered
- Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs
`zsh`.
- Shell modes: `-c` and `-lc`.
- Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<<
PY`).
- Multi-command heredoc scripts are rejected by the fallback
- Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix
matches.
- Complex scripts still fall back to prior behavior rather than
expanding permissions.
---------
Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
fixes https://github.com/openai/codex/issues/10160 and some more.
## Description
Hardens Git command safety to prevent approval bypasses for destructive
or write-capable invocations (branch delete, risky push forms,
output/config-override flags), so these commands no longer auto-run as
“safe.”
- `git branch -d` variants (especially in worktrees / with global
options like -C / -c)
- `git show|diff|log --output` ... style file-write flags
- risky Git config override flags (-c, --config-env) that can trigger
external execution
- dangerous push forms that weren’t fully caught (`--force*`,
`--delete`, `+refspec`, `:refspec`)
- grouped short-flag delete forms (e.g. stacked branch flags containing
`d/D`)
will fast follow with a common git policy to bring windows to parity.
---------
Co-authored-by: Eric Traut <etraut@openai.com>
`requirements.toml` should be able to specify rules which always run.
My intention here was that these rules could only ever be restrictive,
which means the decision can be "prompt" or "forbidden" but never
"allow". A requirement of "you must always allow this command" didn't
make sense to me, but happy to be gaveled otherwise.
Rules already applies the most restrictive decision, so we can safely
merge these with rules found in other config folders.
When an invalid config.toml key or value is detected, the CLI currently
just quits. This leaves the VSCE in a dead state.
This PR changes the behavior to not quit and bubble up the config error
to users to make it actionable. It also surfaces errors related to
"rules" parsing.
This allows us to surface these errors to users in the VSCE, like this:
<img width="342" height="129" alt="Screenshot 2026-01-13 at 4 29 22 PM"
src="https://github.com/user-attachments/assets/a79ffbe7-7604-400c-a304-c5165b6eebc4"
/>
<img width="346" height="244" alt="Screenshot 2026-01-13 at 4 45 06 PM"
src="https://github.com/user-attachments/assets/de874f7c-16a2-4a95-8c6d-15f10482e67b"
/>
Adds an optional `justification` parameter to the `prefix_rule()`
execpolicy DSL so policy authors can attach human-readable rationale to
a rule. That justification is propagated through parsing/matching and
can be surfaced to the model (or approval UI) when a command is blocked
or requires approval.
When a command is rejected (or gated behind approval) due to policy, a
generic message makes it hard for the model/user to understand what went
wrong and what to do instead. Allowing policy authors to supply a short
justification improves debuggability and helps guide the model toward
compliant alternatives.
Example:
```python
prefix_rule(
pattern = ["git", "push"],
decision = "forbidden",
justification = "pushing is blocked in this repo",
)
```
If Codex tried to run `git push origin main`, now the failure would
include:
```
`git push origin main` rejected: pushing is blocked in this repo
```
whereas previously, all it was told was:
```
execpolicy forbids this command
```
https://github.com/openai/codex/pull/8354 added support for in-repo
`.config/` files, so this PR updates the logic for loading `*.rules`
files to load `*.rules` files from all relevant layers. The main change
to the business logic is `load_exec_policy()` in
`codex-rs/core/src/exec_policy.rs`.
Note this adds a `config_folder()` method to `ConfigLayerSource` that
returns `Option<AbsolutePathBuf>` so that it is straightforward to
iterate over the sources and get the associated config folder, if any.
We decided that `*.rules` is a more fitting (and concise) file extension
than `*.codexpolicy`, so we are changing the file extension for the
"execpolicy" effort. We are also changing the subfolder of `$CODEX_HOME`
from `policy` to `rules` to match.
This PR updates the in-repo docs and we will update the public docs once
the next CLI release goes out.
Locally, I created `~/.codex/rules/default.rules` with the following
contents:
```
prefix_rule(pattern=["gh", "pr", "view"])
```
And then I asked Codex to run:
```
gh pr view 7888 --json title,body,comments
```
and it was able to!
Currently, we only show the “don’t ask again for commands that start
with…” option when a command is immediately flagged as needing approval.
However, there is another case where we ask for approval: When a command
is initially auto-approved to run within sandbox, but it fails to run
inside sandbox, we would like to attempt to retry running outside of
sandbox. This will require a prompt to the user.
This PR addresses this latter case
The caller should decide whether wrapping the policy in `Arc<RwLock>` is
necessary. This should make https://github.com/openai/codex/pull/7609 a
bit smoother.
- `exec_policy_for()` -> `load_exec_policy_for_features()`
- introduce `load_exec_policy()` that does not take `Features` as an arg
- both return `Result<Policy, ExecPolicyError>` instead of
Result<Arc<RwLock<Policy>>, ExecPolicyError>`
This simplifies the tests as they have no need for `Arc<RwLock>`.
## Refactor of the `execpolicy` crate
To illustrate why we need this refactor, consider an agent attempting to
run `apple | rm -rf ./`. Suppose `apple` is allowed by `execpolicy`.
Before this PR, `execpolicy` would consider `apple` and `pear` and only
render one rule match: `Allow`. We would skip any heuristics checks on
`rm -rf ./` and immediately approve `apple | rm -rf ./` to run.
To fix this, we now thread a `fallback` evaluation function into
`execpolicy` that runs when no `execpolicy` rules match a given command.
In our example, we would run `fallback` on `rm -rf ./` and prevent
`apple | rm -rf ./` from being run without approval.
this PR enables TUI to approve commands and add their prefixes to an
allowlist:
<img width="708" height="605" alt="Screenshot 2025-11-21 at 4 18 07 PM"
src="https://github.com/user-attachments/assets/56a19893-4553-4770-a881-becf79eeda32"
/>
note: we only show the option to whitelist the command when
1) command is not multi-part (e.g `git add -A && git commit -m 'hello
world'`)
2) command is not already matched by an existing rule
I think this might help with https://github.com/openai/codex/pull/7033
because `create_approval_requirement_for_command()` will soon need
access to `Session.state`, which is a `tokio::sync::Mutex` that needs to
be accessed via `async`.
adding execpolicycheck tool onto codex cli
this is useful for validating policies (can be multiple) against
commands.
it will also surface errors in policy syntax:
<img width="1150" height="281" alt="Screenshot 2025-11-19 at 12 46
21 PM"
src="https://github.com/user-attachments/assets/8f99b403-564c-4172-acc9-6574a8d13dc3"
/>
this PR also changes output format when there's no match in the CLI.
instead of returning the raw string `noMatch`, we return
`{"noMatch":{}}`
this PR is a rewrite of: https://github.com/openai/codex/pull/6932 (due
to the numerous merge conflicts present in the original PR)
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
This PR threads execpolicy2 into codex-core.
activated via feature flag: exec_policy (on by default)
reads and parses all .codexpolicy files in `codex_home/codex`
refactored tool runtime API to integrate execpolicy logic
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>