rad/codex

mirror of https://github.com/openai/codex.git synced 2026-03-05 21:45:28 +03:00

Author	SHA1	Message	Date
Michael Bolin	264fc444b6	feat: discourage the use of the --all-features flag (#12429 ) ## Why Developers are frequently running low on disk space, and routine use of `--all-features` contributes to larger Cargo build caches in `target/` by compiling additional feature combinations. This change updates local workflow guidance to avoid `--all-features` by default and reserve it for cases where full feature coverage is specifically needed. ## What Changed - Updated `AGENTS.md` guidance for `codex-rs` to recommend `cargo test` / `just test` for full-suite local runs, and to call out the disk-usage cost of routine `--all-features` usage. - Updated the root `justfile` so `just fix` and `just clippy` no longer pass `--all-features` by default. - Updated `docs/install.md` to explicitly describe `cargo test --all-features` as an optional heavier-weight run (more build time and `target/` disk usage). ## Verification - Confirmed the `justfile` parses and the recipes list successfully with `just --list`.	2026-02-20 23:02:24 -08:00
Josh McKinney	de93cef5b7	bazel: enforce MODULE.bazel.lock sync with Cargo.lock (#11790 ) ## Why this change When Cargo dependencies change, it is easy to end up with an unexpected local diff in `MODULE.bazel.lock` after running Bazel. That creates noisy working copies and pushes lockfile fixes later in the cycle. This change addresses that pain point directly. ## What this change enforces The expected invariant is: after dependency updates, `MODULE.bazel.lock` is already in sync with Cargo resolution. In practice, running `bazel mod deps` should not mutate the lockfile in a clean state. If it does, the dependency update is incomplete. ## How this is enforced This change adds a single lockfile check script that snapshots `MODULE.bazel.lock`, runs `bazel mod deps`, and fails if the file changes. The same check is wired into local workflow commands (`just bazel-lock-update` and `just bazel-lock-check`) and into Bazel CI (Linux x86_64 job) so drift is caught early and consistently. The developer documentation is updated in `codex-rs/docs/bazel.md` and `AGENTS.md` to make the expected flow explicit. `MODULE.bazel.lock` is also refreshed in this PR to match the current Cargo dependency resolution. ## Expected developer workflow After changing `Cargo.toml` or `Cargo.lock`, run `just bazel-lock-update`, then run `just bazel-lock-check`, and include any resulting `MODULE.bazel.lock` update in the same change. ## Testing Ran `just bazel-lock-check` locally.	2026-02-14 02:11:19 +00:00
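The lockfile check described in the commit above (snapshot `MODULE.bazel.lock`, regenerate, fail if the file changed) can be sketched roughly as follows. This is a minimal illustration, not the script added in the PR: the function name `lockfile_drifted` and the `regenerate` hook are hypothetical, and the hook stands in for running `bazel mod deps` so the logic can be exercised without Bazel installed.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns `Ok(true)` when the bytes of `lockfile` differ after `regenerate` runs.
///
/// `regenerate` is a hypothetical hook standing in for `bazel mod deps`;
/// a real check would spawn that command instead.
pub fn lockfile_drifted<F>(lockfile: &Path, regenerate: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<()>,
{
    // Snapshot the lockfile before regenerating it.
    let before = fs::read(lockfile)?;
    regenerate()?;
    // Any byte-level difference means the dependency update was incomplete.
    let after = fs::read(lockfile)?;
    Ok(before != after)
}
```

A caller would treat `Ok(true)` as a failure and ask the developer to commit the refreshed lockfile in the same change.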
Josh McKinney	75e79cf09a	docs: require insta snapshot coverage for UI changes (#10669 ) Adds an explicit requirement in AGENTS.md that any user-visible UI change includes corresponding insta snapshot coverage and that snapshots are reviewed/accepted in the PR. Tests: N/A (docs only)	2026-02-12 22:47:09 +00:00
Owen Lin	efc8d45750	feat(app-server): experimental flag to persist extended history (#11227 ) This PR adds an experimental `persist_extended_history` bool flag to app-server thread APIs so rollout logs can retain a richer set of EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e. on `thread/resume`). ### Motivation Today, our rollout recorder only persists a small subset (e.g. user message, reasoning, assistant message) of `EventMsg` types, dropping a good number (like command exec, file change, etc.) that are important for reconstructing full item history for `thread/resume`, `thread/read`, and `thread/fork`. Some clients want to be able to resume a thread without lossiness. This lossiness is primarily a UI thing, since what the model sees are `ResponseItem` and not `EventMsg`. ### Approach This change introduces an opt-in `persist_extended_history` flag to preserve those events when you start/resume/fork a thread (defaults to `false`). This is done by adding an `EventPersistenceMode` to the rollout recorder: - `Limited` (existing behavior, default) - `Extended` (new opt-in behavior) In `Extended` mode, persist additional `EventMsg` variants needed for non-lossy app-server `ThreadItem` reconstruction. We now store the following ThreadItems that we didn't before: - web search - command execution - patch/file changes - MCP tool calls - image view calls - collab tool outcomes - context compaction - review mode enter/exit For command executions in particular, we truncate the output using the existing `truncate_text` from core to store an upper bound of 10,000 bytes, which is also the default value for truncating tool outputs shown to the model. This keeps the size of the rollout file and command execution items returned over the wire reasonable. And we also persist `EventMsg::Error`, which we can now map back to the Turn's status and use to populate the Turn's error metadata.
#### Updates to EventMsgs To truly make `thread/resume` non-lossy, we also needed to persist the `status` on `EventMsg::CommandExecutionEndEvent` and `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a command failed or was declined (similar for apply_patch). These EventMsgs were never persisted before so I made it a required field.	2026-02-12 19:34:22 +00:00
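The two persistence modes in the commit above can be pictured with a small sketch. Only `EventPersistenceMode`, `Limited`, `Extended`, and the 10,000-byte bound come from the commit message. `EventKind`, `should_persist`, and `truncate_to_bytes` are hypothetical simplifications (the last one stands in for `truncate_text`, whose real signature is not shown in the source).

```rust
/// Mode names taken from the commit message; `Limited` is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum EventPersistenceMode {
    #[default]
    Limited,
    Extended,
}

/// Hypothetical, heavily simplified stand-in for the real `EventMsg` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    UserMessage,
    Reasoning,
    AssistantMessage,
    WebSearch,
    CommandExecution,
    PatchApply,
    McpToolCall,
}

/// Decide whether an event is written to the rollout log under `mode`.
pub fn should_persist(mode: EventPersistenceMode, kind: EventKind) -> bool {
    use EventKind::*;
    match kind {
        // Persisted in both modes (the pre-existing subset).
        UserMessage | Reasoning | AssistantMessage => true,
        // Persisted only when the client opted in to extended history.
        WebSearch | CommandExecution | PatchApply | McpToolCall => {
            mode == EventPersistenceMode::Extended
        }
    }
}

/// Assumed cap on stored command output, per the commit message.
pub const MAX_OUTPUT_BYTES: usize = 10_000;

/// Cut `text` to at most `max_bytes` without splitting a UTF-8 character.
pub fn truncate_to_bytes(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

Keeping the mode as an enum rather than a bare bool leaves room for further levels later, which may be why the recorder takes a mode while the API exposes a flag.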
pakrym-oai	086d02fb14	Try to stop small helper methods (#11203 )	2026-02-09 20:01:30 +00:00
Owen Lin	731f0f384a	chore(app-server): update AGENTS.md for config + optional collection guidance (#10914 ) Based on recent app-server PRs	2026-02-06 12:45:27 -08:00
Owen Lin	efd96c46c7	fix(app-server): fix TS annotations for optional fields on requests (#10412 ) This updates our generated TypeScript types to be more correct with how the server actually behaves, specifically for JSON-RPC requests. Before this PR, we'd generate `field: T \| null`. After this PR, we will have `field?: T \| null`. The latter matches how the server actually works, in that if an optional field is omitted, the server will treat it as null. This also makes it less annoying in theory for clients to upgrade to newer versions of Codex, since adding a new optional field to a JSON-RPC request should not require a client change. NOTE: This only applies to JSON-RPC requests. All other payloads (i.e. responses, notifications) will return `field: T \| null` as usual.	2026-02-03 11:51:37 -08:00
Charley Cunningham	47aa1f3b6a	Reject request_user_input outside Plan/Pair (#9955 ) ## Context Previous work in https://github.com/openai/codex/pull/9560 only rejected `request_user_input` in Execute and Custom modes. Since then, additional modes (e.g., Code) were added, so the guard should be mode-agnostic. ## What changed - Switch the handler to an allowlist: only Plan and PairProgramming are allowed - Return the same error for any other mode (including Code) - Add a Code-mode rejection test alongside the existing Execute/Custom tests ## Why This prevents `request_user_input` from being used in modes where it is not intended, even as new modes are introduced.	2026-01-26 17:12:17 -08:00
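The switch from a denylist to an allowlist in the commit above is what makes the guard robust to new modes. A minimal sketch, assuming a hypothetical `Mode` enum and error string (neither is quoted from the PR):

```rust
/// Hypothetical mode enum; variant names follow the modes named in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Plan,
    PairProgramming,
    Execute,
    Custom,
    Code,
}

/// Allowlist guard: only Plan and PairProgramming may call `request_user_input`.
/// Any mode added to `Mode` later is rejected by the catch-all arm automatically.
pub fn check_request_user_input(mode: Mode) -> Result<(), String> {
    match mode {
        Mode::Plan | Mode::PairProgramming => Ok(()),
        // Illustrative error text, not the message used in the real handler.
        other => Err(format!("request_user_input is unavailable in {other:?} mode")),
    }
}
```

With a denylist, each new mode would need its own rejection arm. Here the catch-all arm covers it with no further change.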
Dylan Hurd	e520592bcf	chore: tweak AGENTS.md (#9650 ) ## Summary Update AGENTS.md to improve testing flow ## Testing - [x] Tested locally, much faster	2026-01-21 20:20:45 -08:00
Ahmed Ibrahim	ebc88f29f8	don't ask for approval for `just fix` (#9586 ) It blocks all my skills from executing because it asks to run just fmt. It's a quick command that doesn't need approval.	2026-01-21 04:56:11 +00:00
sayan-oai	40e2405998	add generated jsonschema for config.toml (#8956 ) ### What Add JSON Schema generation for `config.toml`, with checked‑in `docs/config.schema.json`. We can move the schema elsewhere if preferred (and host it if there's demand). Add fixture test to prevent drift and `just write-config-schema` to regenerate on schema changes. Generate MCP config schema from `RawMcpServerConfig` instead of `McpServerConfig` because that is the runtime type used for deserialization. Populate feature flag values into generated schema so they can be autocompleted. ### Tests Added tests + regenerate script to prevent drift. Tested autocompletions using generated jsonschema locally with Even Better TOML. https://github.com/user-attachments/assets/5aa7cd39-520c-4a63-96fb-63798183d0bc	2026-01-13 10:22:51 -08:00
Michael Bolin	f6b563ec64	feat: introduce find_resource! macro that works with Cargo or Bazel (#8879 ) To support Bazelification in https://github.com/openai/codex/pull/8875, this PR introduces a new `find_resource!` macro that we use in place of our existing logic in tests that looks for resources relative to the compile-time `CARGO_MANIFEST_DIR` env var. To make this work, we plan to add the following to all `rust_library()` and `rust_test()` Bazel rules in the project: ``` rustc_env = { "BAZEL_PACKAGE": native.package_name(), }, ``` Our new `find_resource!` macro reads this value via `option_env!("BAZEL_PACKAGE")` so that the Bazel package _of the code using `find_resource!`_ is injected into the code expanded from the macro. (If `find_resource()` were a function, then `option_env!("BAZEL_PACKAGE")` would always be `codex-rs/utils/cargo-bin`, which is not what we want.) Note we only consider the `BAZEL_PACKAGE` value when the `RUNFILES_DIR` environment variable is set at runtime, indicating that the test is being run by Bazel. In this case, we have to concatenate the runtime `RUNFILES_DIR` with the compile-time `BAZEL_PACKAGE` value to build the path to the resource. In testing this change, I discovered one funky edge case in `codex-rs/exec-server/tests/common/lib.rs` where we have to _normalize_ (but not canonicalize!) the result from `find_resource!` because the path contains a `common/..` component that does not exist on disk when the test is run under Bazel, so it must be semantically normalized using the [`path-absolutize`](https://crates.io/crates/path-absolutize) crate before it is passed to `dotslash fetch`. Because this new behavior may be non-obvious, this PR also updates `AGENTS.md` to make humans/Codex aware that this API is preferred.	2026-01-07 18:06:08 -08:00
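The reason `find_resource!` has to be a macro, as the commit above explains, is that `option_env!("BAZEL_PACKAGE")` is evaluated in whichever crate the macro expands in. The sketch below illustrates that idea only. The names `find_resource_sketch!` and `resolve_resource` are hypothetical, the Cargo fallback via `CARGO_MANIFEST_DIR` is inferred from the description, and the runfiles layout is simplified to a direct `RUNFILES_DIR` + `BAZEL_PACKAGE` join as the commit describes.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Pure path logic, kept as a function so it can be tested without Bazel.
/// `bazel_package` and `manifest_dir` are compile-time values supplied by the
/// macro below; `runfiles_dir` is read at runtime.
pub fn resolve_resource(
    bazel_package: Option<&str>,
    runfiles_dir: Option<OsString>,
    manifest_dir: Option<&str>,
    relative: &str,
) -> Option<PathBuf> {
    match (runfiles_dir, bazel_package) {
        // Running under Bazel: runtime RUNFILES_DIR + compile-time BAZEL_PACKAGE.
        (Some(runfiles), Some(package)) => {
            Some(PathBuf::from(runfiles).join(package).join(relative))
        }
        // Otherwise fall back to the Cargo manifest directory, if known.
        _ => manifest_dir.map(|dir| PathBuf::from(dir).join(relative)),
    }
}

/// Hypothetical stand-in for `find_resource!`. Because this is a macro, the
/// `option_env!` calls expand in the *calling* crate, so each caller gets its
/// own package path rather than the path of the crate defining the helper.
macro_rules! find_resource_sketch {
    ($relative:expr) => {
        resolve_resource(
            option_env!("BAZEL_PACKAGE"),
            std::env::var_os("RUNFILES_DIR"),
            option_env!("CARGO_MANIFEST_DIR"),
            $relative,
        )
    };
}
```

A test would then call `find_resource_sketch!("tests/fixtures/input.json")` (a made-up path) and get a location that works under either build system.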
Michael Bolin	e61bae12e3	feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496 ) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496	2025-12-23 19:29:32 -08:00
Eric Traut	42b8f28ee8	Fixed resume matching to respect case insensitivity when using WSL mount points (#8000 ) This fixes #7995	2025-12-16 16:27:38 -08:00
Josh McKinney	596fcd040f	docs: remove blanket ban on unsigned integers (#7957 ) Drop the AGENTS.md rule that forbids unsigned ints. The blanket guidance causes unnecessary complexity in cases where values are naturally unsigned, leading to extra clamping/conversion code instead of using checked or saturating arithmetic where needed.	2025-12-12 17:01:56 -08:00
Dylan Hurd	a8cbbdbc6e	feat(core) Add login to shell_command tool (#6846 ) ## Summary Adds the `login` parameter to the `shell_command` tool - optional, defaults to true. ## Testing - [x] Tested locally	2025-12-05 11:03:25 -08:00
pakrym-oai	6c384eb9c6	tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )	2025-11-13 18:04:05 -08:00
pakrym-oai	f8b30af6dc	Prefer `wait_for_event` over `wait_for_event_with_timeout`. (#6346 ) No need to specify the timeout in most cases.	2025-11-06 16:14:43 -08:00
Ahmed Ibrahim	049a61bcfc	Auto compact at ~90% (#5292 ) Users now hit a window exceeded limit and they usually don't know what to do. This starts auto compact at ~90% of the window.	2025-10-20 11:29:49 -07:00
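The ~90% trigger in the commit above amounts to a simple threshold check. A minimal sketch, assuming token counts are tracked as integers; the function name and parameters are hypothetical, and the real implementation may compute the limit differently:

```rust
/// Returns true once usage reaches roughly 90% of the context window.
/// Integer arithmetic avoids floating-point rounding: used/window >= 9/10.
pub fn should_auto_compact(tokens_used: u64, context_window: u64) -> bool {
    context_window > 0
        && tokens_used.saturating_mul(10) >= context_window.saturating_mul(9)
}
```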
Gabriel Peal	40fba1bb4c	[MCP] Add support for resources (#5239 ) This PR adds support for [MCP resources](https://modelcontextprotocol.io/specification/2025-06-18/server/resources) by adding three new tools for the model: 1. `list_resources` 2. `list_resource_templates` 3. `read_resource` These 3 tools correspond to the [three primary MCP resource protocol messages](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#protocol-messages). Example of listing and reading a GitHub resource template: [screenshot]. `/mcp` with Figma configured: [screenshot]. Fixes #4956	2025-10-17 01:05:15 -04:00
pakrym-oai	35a770e871	Simplify request body assertions (#4845 ) We'll have a lot more tests like these	2025-10-07 09:56:39 +01:00
Michael Bolin	c32e9cfe86	chore: subject docs/*.md to Prettier checks (#4645 ) Apparently we were not running our `pnpm run prettier` check in CI, so many files that were covered by the existing Prettier check were not well-formatted. This updates CI and formats the files.	2025-10-03 11:35:48 -07:00
Gabriel Peal	1d17ca1fa3	[MCP] Add support for MCP Oauth credentials (#4517 ) This PR adds OAuth login support to streamable HTTP servers when `experimental_use_rmcp_client` is enabled. This PR is large but represents the minimal amount of work required for this to work. To keep this PR smaller, login can only be done with `codex mcp login` and `codex mcp logout` but it doesn't appear in `/mcp` or `codex mcp list` yet. Fingers crossed that this is the last large MCP PR and that subsequent PRs can be smaller. Under the hood, credentials are stored using platform credential managers using the [keyring crate](https://crates.io/crates/keyring). When the keyring isn't available, it falls back to storing credentials in `CODEX_HOME/.credentials.json`, which is consistent with how other coding agents handle authentication. I tested this on macOS, Windows, WSL (Ubuntu), and Linux. I wasn't able to test the dbus store on Linux but did verify that the fallback works. One quirk is that if you have credentials, during development, every build will have its own ad-hoc binary, so the keyring won't recognize the reader as being the same as the writer and may ask for the user's password. I may add an override to disable this or allow users/enterprises to opt out of the keyring storage if it causes issues.	2025-10-03 13:43:12 -04:00
Abhishek Bhardwaj	208089e58e	AGENTS.md: Add instruction to install missing commands (#3807 ) This change instructs the model to install any missing command. Else tokens are wasted when it tries to run commands that aren't available multiple times before installing them.	2025-09-17 11:06:59 -07:00
Jeremy Rose	d6182becbe	syntax-highlight bash lines (#3142 ) i'm not yet convinced i have the best heuristics for what to highlight, but this feels like a useful step towards something a bit easier to read, esp. when the model is producing large commands. also a fairly significant refactor of our line wrapping logic.	2025-09-05 14:10:32 +00:00
Jeremy Rose	1c04e1314d	AGENTS.md: clarify test approvals for codex-rs (#3132 ) Clarifies codex-rs testing approvals in AGENTS.md: - Allow running project-specific or individual tests without asking. - Require asking before running the complete test suite. - Keep `just fmt` always allowed without approval.	2025-09-04 13:36:12 -07:00
Jeremy Rose	97000c6e6d	core: correct sandboxed shell tool description (reads allowed anywhere) (#3069 ) Correct the `shell` tool description for sandboxed runs and add targeted tests. - Fix the WorkspaceWrite description to clearly state that writes outside the writable roots require escalated permissions; reads are not restricted. The previous wording/formatting could be read as restricting reads outside the workspace. - Render the writable roots list on its own lines under a newline after "writable roots:" for clarity. - Show the "Commands that require network access" note only in WorkspaceWrite when network is disabled. - Add focused tests that call `create_shell_tool_for_sandbox` directly and assert the exact description text for WorkspaceWrite, ReadOnly, and DangerFullAccess. - Update AGENTS.md to note that `just fmt` can be run automatically without asking.	2025-09-03 10:02:34 -07:00
Jeremy Rose	578ff09e17	prefer ratatui Stylized for constructing lines/spans (#3068 ) no functional change, just simplifying ratatui styling and adding guidance in AGENTS.md for future.	2025-09-02 23:19:54 +00:00
Jeremy Rose	7d734bff65	suggest just fix -p in agents.md (#2881 )	2025-08-28 22:32:53 -07:00
ae	8192cf147e	[chore] Tweak AGENTS.md so agent doesn't always have to test (#2706 )	2025-08-26 00:27:19 -07:00
wkrettek	85099017fd	Fix typo in AGENTS.md (#2518 ) - Change `examole` to `example`	2025-08-22 16:05:39 -07:00
Jeremy Rose	8481eb4c6e	tui: tab-completing a command moves the cursor to the end (#2362 ) also tweak agents.md for faster `just fix`	2025-08-20 09:57:55 -07:00
ae	5bce369c4d	fix: clean up styles & colors and define in styles.md (#2401 ) New style guide: # Headers, primary, and secondary text - Headers: Use `bold`. For markdown with various header levels, leave in the `#` signs. - Primary text: Default. - Secondary text: Use `dim`. # Foreground colors - Default: Most of the time, just use the default foreground color. `reset` can help get it back. - Selection: Use ANSI `blue`. (Ed & AE want to make this cyan too, but we'll do that in a followup since it's riskier in different themes.) - User input tips and status indicators: Use ANSI `cyan`. - Success and additions: Use ANSI `green`. - Errors, failures and deletions: Use ANSI `red`. - Codex: Use ANSI `magenta`. # Avoid - Avoid custom colors because there's no guarantee that they'll contrast well or look good on various terminal color themes. - Avoid ANSI `black`, `white`, `yellow` as foreground colors because the terminal theme will do a better job. (Use `reset` if you need to in order to get those.) The exception is if you need contrast rendering over a manually colored background. (There are some rules to try to catch this in `clippy.toml`.) # Testing Tested in a variety of light and dark color themes in Terminal, iTerm2, and Ghostty.	2025-08-18 08:26:29 -07:00
Jeremy Rose	1ad8ae2579	color the status letter in apply patch summary (#2337 )	2025-08-15 20:25:48 +00:00
Jeremy Rose	45d6c74682	tui: align diff display by always showing sign char and keeping fixed gutter (#2353 ) diff lines without a sign char were misaligned.	2025-08-15 09:32:45 -07:00
Jeremy Rose	8bdb4521c9	AGENTS.md more strongly suggests running targeted tests first (#2306 )	2025-08-15 00:51:32 +00:00
Gabriel Peal	7f6408720b	[1/3] Parse exec commands and format them more nicely in the UI (#2095 ) # Note for reviewers The bulk of this PR is in the new file, `parse_command.rs`. This file is designed to be written TDD and implemented with Codex. Do not worry about reviewing the code, just review the unit tests (if you want). If any cases are missing, we'll add more tests and have Codex fix them. I think the best approach will be to land and iterate. I have some follow-ups I want to do after this lands. The next PR after this will let us merge (and dedupe) multiple sequential cells of the same kind, such as multiple read commands. The deduping will also be important because the model often reads the same file multiple times in a row in chunks. === This PR formats common commands like reading, formatting, testing, etc. more nicely: it tries to extract things like file names and tests, and falls back to the cmd if it can't. It also only shows stdout/err if the command failed. Part 2: https://github.com/openai/codex/pull/2097 Part 3: https://github.com/openai/codex/pull/2110	2025-08-11 14:26:15 -04:00
Michael Bolin	80555d4ff2	feat: make .git read-only within a writable root when using Seatbelt (#1765 ) To make `--full-auto` safer, this PR updates the Seatbelt policy so that a `SandboxPolicy` with a `writable_root` that contains a `.git/` _directory_ will make `.git/` _read-only_ (though as a follow-up, we should also consider the case where `.git` is a _file_ with a `gitdir: /path/to/actual/repo/.git` entry that should also be protected). The two major changes in this PR: - Updating `SandboxPolicy::get_writable_roots_with_cwd()` to return a `Vec<WritableRoot>` instead of a `Vec<PathBuf>` where a `WritableRoot` can specify a list of read-only subpaths. - Updating `create_seatbelt_command_args()` to honor the read-only subpaths in `WritableRoot`. The logic to update the policy is a fairly straightforward update to `create_seatbelt_command_args()`, but perhaps the more interesting part of this PR is the introduction of an integration test in `tests/sandbox.rs`. Leveraging the new API in #1785, we test `SandboxPolicy` under various conditions, including ones where `$TMPDIR` is not readable, which is critical for verifying the new behavior. To ensure that Codex can run its own tests, e.g.: ``` just codex debug seatbelt --full-auto -- cargo test if_git_repo_is_writable_root_then_dot_git_folder_is_read_only ``` I had to introduce the use of `CODEX_SANDBOX=sandbox`, which is comparable to how `CODEX_SANDBOX_NETWORK_DISABLED=1` was already being used. Adding a comparable change for Landlock will be done in a subsequent PR.	2025-08-01 16:11:24 -07:00
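The `WritableRoot` idea from the commit above (a writable directory that carves out read-only subpaths such as `.git/`) can be sketched as follows. Only the type name `WritableRoot` and the notion of read-only subpaths come from the commit message. The field names and the `is_path_writable` helper are assumptions for illustration, and the real enforcement happens in the Seatbelt policy rather than in a Rust predicate like this.

```rust
use std::path::{Path, PathBuf};

/// A writable directory plus subpaths inside it that must stay read-only.
/// Field names are assumed for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WritableRoot {
    pub root: PathBuf,
    pub read_only_subpaths: Vec<PathBuf>,
}

impl WritableRoot {
    /// A path is writable when it lies under `root` and under none of the
    /// read-only subpaths. `Path::starts_with` compares whole components,
    /// so `/repo/.github` is not treated as being inside `/repo/.git`.
    pub fn is_path_writable(&self, path: &Path) -> bool {
        path.starts_with(&self.root)
            && !self
                .read_only_subpaths
                .iter()
                .any(|sub| path.starts_with(sub))
    }
}
```

Returning a richer type instead of a bare `PathBuf` lets the policy builder express "everything here except these" without a second parallel list.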
pakrym-oai	327e2254f6	chore: rename toolchain file (#1604 ) Rename toolchain file so older versions of cargo can pick it up.	2025-07-17 15:36:15 -07:00
pakrym-oai	6949329a7f	chore: auto format code on save and add more details to AGENTS.md (#1582 ) Adds a default vscode config with generally applicable settings. Adds more entrypoints to justfile both for environment setup and to help agents better verify changes.	2025-07-17 11:40:00 -07:00
Michael Bolin	2b122da087	feat: add support for AGENTS.md in Rust CLI (#885 ) The TypeScript CLI already has support for including the contents of `AGENTS.md` in the instructions sent with the first turn of a conversation. This PR brings this functionality to the Rust CLI. To be considered, `AGENTS.md` must be in the `cwd` of the session, or in one of the parent folders up to a Git/filesystem root (whichever is encountered first). By default, a maximum of 32 KiB of `AGENTS.md` will be included, though this is configurable using the new-in-this-PR `project_doc_max_bytes` option in `config.toml`.	2025-05-10 17:52:59 -07:00
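The lookup rule in the commit above (search `cwd`, then parent folders, stopping at a Git or filesystem root, with a 32 KiB cap) can be sketched like this. The function names are hypothetical, and the sketch detects the Git root by looking for a `.git` entry, which is an assumption about how the root is identified.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Default cap from the commit message: 32 KiB.
pub const PROJECT_DOC_MAX_BYTES: usize = 32 * 1024;

/// Walk from `cwd` upward and return the first `AGENTS.md` found, giving up
/// once a directory containing `.git` (assumed Git root) has been checked.
/// `ancestors()` ends at the filesystem root, which covers the other stop case.
pub fn find_agents_md(cwd: &Path) -> Option<PathBuf> {
    for dir in cwd.ancestors() {
        let candidate = dir.join("AGENTS.md");
        if candidate.is_file() {
            return Some(candidate);
        }
        if dir.join(".git").exists() {
            // Reached the Git root without finding the file.
            return None;
        }
    }
    None
}

/// Read the file and keep at most `max_bytes` bytes of it.
pub fn read_capped(path: &Path, max_bytes: usize) -> io::Result<Vec<u8>> {
    let mut bytes = fs::read(path)?;
    bytes.truncate(max_bytes);
    Ok(bytes)
}
```

A caller would pass `PROJECT_DOC_MAX_BYTES`, or the configured `project_doc_max_bytes`, as the cap.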

41 Commits