codex

mirror of https://github.com/openai/codex.git synced 2026-05-02 04:11:39 +03:00

Author	SHA1	Message	Date
Charley Cunningham	e3cbf913e8	Fix wait_agent expectations in core tests (#14637 ) ## Summary - update stale core tool-spec expectations from `wait` to `wait_agent` - update the prompt-caching tool-name assertion to match the renamed tool - fix the Bazel regressions introduced after #14631 renamed the multi-agent wait tool ## Testing - cargo test -p codex-core tools::spec::tests - cargo test -p codex-core suite::prompt_caching::prompt_tools_are_consistent_across_requests Co-authored-by: Codex <noreply@openai.com>	2026-03-13 15:15:59 -07:00
pakrym-oai	cb7d8f45a1	Normalize MCP tool names to code-mode safe form (#14605 ) Code mode doesn't allow `-` in names and it's better if function names and code-mode names are the same.	2026-03-13 14:50:16 -07:00
Ahmed Ibrahim	36dfb84427	Stabilize multi-agent feature flag (#14622 ) - make multi_agent stable and enabled by default - update feature and tool-spec coverage to match the new default --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 14:38:15 -07:00
pakrym-oai	477a2dd345	Add code_mode_only feature (#14617 ) Summary - add the code_mode_only feature flag/config schema and wire its dependency on code_mode - update code mode tool descriptions to list nested tools with detailed headers - restrict available tools for prompt and exec descriptions when code_mode_only is enabled and test the behavior Testing - Not run (not requested)	2026-03-13 13:30:19 -07:00
sayan-oai	9f2da5a9ce	chore: clarify plugin + app copy in model instructions (#14541 ) - clarify app mentions are in user messages - clarify what it means for tools to be provided via `codex_apps` MCP - add plugin descriptions (with basic sanitization) to top-level `## Plugins` section alongside the corresponding plugin names - explain that skills from plugins are prefixed with `plugin_name:` in top-level `## Plugins` section. Changes to more logically organize `Apps`, `Skills`, and `Plugins` instructions will be in a separate PR, as that shuffles dev + user instructions in ways that change tests broadly. ### Tests confirmed in local rollout, some new tests.	2026-03-13 10:57:41 -07:00
Jack Mousseau	59b588b8ec	Improve granular approval policy prompt (#14553 )	2026-03-13 10:42:17 -07:00
Won Park	958f93f899	sending imagegencall response back to responseapi (#14558 ) Sending back the ResponseItem::ImageGenerationCall as is, because it is now supported from the API-side.	2026-03-13 17:29:19 +00:00
iceweasel-oai	6b3d82daca	Use a private desktop for Windows sandbox instead of Winsta0\Default (#14400 ) ## Summary - launch Windows sandboxed children on a private desktop instead of `Winsta0\Default` - make private desktop the default while keeping `windows.sandbox_private_desktop=false` as the escape hatch - centralize process launch through the shared `create_process_as_user(...)` path - scope the private desktop ACL to the launching logon SID ## Why Today sandboxed Windows commands run on the visible shared desktop. That leaves an avoidable same-desktop attack surface for window interaction, spoofing, and related UI/input issues. This change moves sandboxed commands onto a dedicated per-launch desktop by default so the sandbox no longer shares `Winsta0\Default` with the user session. The implementation stays conservative on security with no silent fallback to `Winsta0\Default`. If private-desktop setup fails on a machine, users can still opt out explicitly with `windows.sandbox_private_desktop=false`. ## Validation - `cargo build -p codex-cli` - elevated-path `codex exec` desktop-name probe returned `CodexSandboxDesktop-*` - elevated-path `codex exec` smoke sweep for shell commands, nested `pwsh`, jobs, and hidden `notepad` launch - unelevated-path full private-desktop compatibility sweep via `codex exec` with `-c windows.sandbox=unelevated`	2026-03-13 10:13:39 -07:00
pakrym-oai	9c9867c9fa	code mode: single line tool declarations (#14526 ) ## Summary - render code mode tool declarations as single-line TypeScript snippets - make the JSON schema renderer emit inline object shapes for these declarations - update code mode/spec expectations to match the new inline rendering ## Testing - `just fmt` - `cargo test -p codex-core render_json_schema_to_typescript` - `cargo test -p codex-core code_mode_augments_` - `cargo test -p codex-core --test all exports_all_tools_metadata -- --nocapture`	2026-03-13 10:08:34 -07:00
Ahmed Ibrahim	c7e847aaeb	Add diagnostics for read_only_unless_trusted timeout flake (#14518 ) ## Summary - add targeted diagnostic logging for the read_only_unless_trusted_requires_approval scenarios in approval_matrix_covers_all_modes - add a scoped timeout buffer only for ro_unless_trusted write-file scenarios: 1000ms -> 2000ms - keep all other write-file scenarios at 1000ms ## Why The last two main failures were both in codex-core::all suite::approvals::approval_matrix_covers_all_modes with exit_code=124 in the same scenario. This points to execution-time jitter in CI rather than a semantic approval-policy mismatch. ## Notes - This does not introduce any >5s timeout and does not disable/quarantine tests. - The timeout increase is tightly scoped to the single flaky path and keeps the matrix deterministic under CI scheduling variance.	2026-03-12 23:51:03 -07:00
Jack Mousseau	7c7e267501	Simplify permissions available in request permissions tool (#14529 )	2026-03-12 21:13:17 -07:00
Channing Conger	0daffe667a	code_mode: Move exec params from runtime declarations to @pragma (#14511 ) This change moves code_mode exec session settings out of the runtime API and into an optional first-line pragma, so instead of calling runtime helpers like set_yield_time() or set_max_output_tokens_per_exec_call(), the model can write // @exec: {"yield_time_ms": ..., "max_output_tokens": ...} at the top of the freeform exec source. Rust now parses that pragma before building the source, validates it, and passes the values directly in the exec start message to the code-mode broker, which applies them at session start without any worker-runtime mutation path. The @openai/code_mode module no longer exposes those setter functions, the docs and grammar were updated to describe the pragma form, and the existing code_mode tests were converted to use pragma-based configuration instead.	2026-03-13 03:27:42 +00:00
alexsong-oai	1a363d5fcf	Add plugin usage telemetry (#14531 ) adding metrics including: * plugin used * plugin installed/uninstalled * plugin enabled/disabled	2026-03-12 19:22:30 -07:00
Jack Mousseau	b7dba72dbd	Rename reject approval policy to granular (#14516 )	2026-03-12 16:38:04 -07:00
Eric Traut	d32820ab07	Fix `codex exec --profile` handling (#14524 ) PR #14005 introduced a regression whereby `codex exec --profile` overrides were dropped when starting or resuming a thread. That causes the thread to miss profile-scoped settings like `model_instructions_file`. This PR preserves the active profile in the thread start/resume config overrides so the app-server rebuild sees the same profile that exec resolved. Fixes #14515	2026-03-12 17:34:25 -06:00
Rasmus Rygaard	53d5972226	Reapply "Pass more params to compaction" (#14298 ) (#14521 ) This reverts commit `8af97ce4b0`. Confirmed that this runs locally without the previous issues with tool use	2026-03-12 23:27:21 +00:00
Anton Panasenko	651717323c	feat(search_tool): gate search_tool on model supports_search_tool field (#14502 )	2026-03-12 16:03:50 -07:00
pakrym-oai	a2546d5dff	Expose code-mode tools through globals (#14517 ) Summary - make all code-mode tools accessible as globals so callers only need `tools.<name>` - rename text/image helpers and key globals (store, load, ALL_TOOLS, etc.) to reflect the new shared namespace - update the JS bridge, runners, descriptions, router, and tests to follow the new API Testing - Not run (not requested)	2026-03-12 15:43:59 -07:00
Jack Mousseau	a314c7d3ae	Decouple request permissions feature and tool (#14426 )	2026-03-12 14:47:08 -07:00
pakrym-oai	04e14bdf23	Rename exec session IDs to cell IDs (#14510 ) - Update the code-mode executor, wait handler, and protocol plumbing to use cell IDs instead of session IDs for node communication - Switch tool metadata, wait description, and suite tests to refer to cell IDs so user-visible messages match the new terminology Testing - Not run (not requested)	2026-03-12 14:05:30 -07:00
pakrym-oai	dadffd27d4	Fix MCP tool calling (#14491 ) Properly escape mcp tool names and make tools only available via imports.	2026-03-12 13:38:52 -07:00
pakrym-oai	a5a4899d0c	Skip nested tool call parallel test on Windows (#14505 ) Summary - disable the `code_mode_nested_tool_calls_can_run_in_parallel` test on Windows where `exec_command` is unavailable Testing - Not run (not requested)	2026-03-12 13:32:11 -07:00
pakrym-oai	25e301ed98	Add parallel tool call test (#14494 ) Summary - pin tests to `test-gpt-5.1-codex` so code-mode suites exercise that model explicitly - add a regression test that ensures nested tool calls can execute in parallel and assert on timing - refresh `codex-rs/Cargo.lock` for the updated dependency tree (add `codex-utils-pty`, drop `codex-otel`) Testing - Not run (not requested)	2026-03-12 12:10:14 -07:00
pakrym-oai	d1b03f0d7f	Add default code-mode yield timeout (#14484 ) Summary - expose the default yield timeout through code mode runtime so the handler, wait tool, and protocol share the same 10s value that matches unified exec - document the timeout change in the tool descriptions and propagate the value all the way into the runner metadata - adjust Cargo.lock to keep the dependency tree in sync with the added code mode tool dependency Testing - Not run (not requested)	2026-03-12 12:06:23 -07:00
pakrym-oai	cfe3f6821a	Cleanup code_mode tool descriptions (#14480 ) Move to separate files and clarify a bit.	2026-03-12 11:13:35 -07:00
pakrym-oai	2f03b1a322	Dispatch tools when code mode is not awaited directly (#14437 ) ## Summary - start a code mode worker once per turn and let it pump nested tool calls through a dedicated queue - simplify code mode request/response dispatch around request ids and generic runner-unavailable errors - clean up the code mode process API and runner protocol plumbing ## Testing - not run yet	2026-03-12 09:00:20 -07:00
viyatb-oai	e99e8e4a6b	fix: follow up on linux sandbox review nits (#14440 ) ## Summary - address the follow-up review nits from #13996 in a separate PR - make the approvals test command a raw string and keep the managed-network path using env proxy routing - inline `--apply-seccomp-then-exec` in the Linux sandbox inner command builder - remove the bubblewrap-specific sandbox metric tag path and drop the `use_legacy_landlock` shim from `sandbox_tag`/`TurnMetadataState::new` - restore the `Feature` import that `origin/main` currently still needs in `connectors.rs` ## Testing - `cargo test -p codex-linux-sandbox` - focused `codex-core` tests were rerun/started, but the final verification pass was interrupted when I pushed at request	2026-03-11 23:59:50 -07:00
viyatb-oai	04892b4ceb	refactor: make bubblewrap the default Linux sandbox (#13996 ) ## Summary - make bubblewrap the default Linux sandbox and keep `use_legacy_landlock` as the only override - remove `use_linux_sandbox_bwrap` from feature, config, schema, and docs surfaces - update Linux sandbox selection, CLI/config plumbing, and related tests/docs to match the new default - fold in the follow-up CI fixes for request-permissions responses and Linux read-only sandbox error text	2026-03-11 23:31:18 -07:00
pakrym-oai	f6c6128fc7	Support waiting for code_mode sessions (#14295 ) ## Summary - persist the code mode runner process in the session-scoped code mode store - switch the runner protocol from `init` to `start` with explicit session ids - handle runner-side session processing without the init waiter queue ## Validation - just fmt - cargo check -p codex-core - node --check codex-rs/core/src/tools/code_mode_runner.cjs	2026-03-11 23:13:54 -07:00
Ahmed Ibrahim	367a8a2210	Clarify spawn agent authorization (#14432 ) - Clarify that spawn_agent requires explicit user permission for delegation or parallel agent work. - Add a regression test covering the new description text.	2026-03-11 23:03:07 -07:00
Matthew Zeng	ba5b94287e	[apps] Add tool_suggest tool. (#14287 ) - [x] Add tool_suggest tool. - [x] Move chatgpt/src/connectors.rs and core/src/connectors.rs into a dedicated mod so that we have all the logic and global cache in one place. - [x] Update TUI app link view to support rendering the installation view for mcp elicitation. --------- Co-authored-by: Shaqayeq <shaqayeq@openai.com> Co-authored-by: Eric Traut <etraut@openai.com> Co-authored-by: pakrym-oai <pakrym@openai.com> Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com> Co-authored-by: guinness-oai <guinness@openai.com> Co-authored-by: Eugene Brevdo <ebrevdo@users.noreply.github.com> Co-authored-by: Charlie Guo <cguo@openai.com> Co-authored-by: Fouad Matin <fouad@openai.com> Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com> Co-authored-by: xl-openai <xl@openai.com> Co-authored-by: alexsong-oai <alexsong@openai.com> Co-authored-by: Owen Lin <owenlin0@gmail.com> Co-authored-by: sdcoffey <stevendcoffey@gmail.com> Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Won Park <won@openai.com> Co-authored-by: Dylan Hurd <dylan.hurd@openai.com> Co-authored-by: celia-oai <celia@openai.com> Co-authored-by: gabec-openai <gabec@openai.com> Co-authored-by: joeytrasatti-openai <joey.trasatti@openai.com> Co-authored-by: Leo Shimonaka <leoshimo@openai.com> Co-authored-by: Rasmus Rygaard <rasmus@openai.com> Co-authored-by: maja-openai <163171781+maja-openai@users.noreply.github.com> Co-authored-by: pash-openai <pash@openai.com> Co-authored-by: Josh McKinney <joshka@openai.com>	2026-03-11 22:06:59 -07:00
Owen Lin	5bc82c5b93	feat(app-server): propagate traces across tasks and core ops (#14387 ) ## Summary This PR keeps app-server RPC request trace context alive for the full lifetime of the work that request kicks off (e.g. for `thread/start`, this is `app-server rpc handler -> tokio background task -> core op submissions`). Previously we lost trace lineage once the request handler returned or handed work off to background tasks. This approach is especially relevant for `thread/start` and other RPC handlers that run in a non-blocking way. In the near future we'll most likely want to make all app-server handlers run in a non-blocking way by default, and only queue operations that must operate in order (e.g. thread RPCs per thread?), so we want to make sure tracing in app-server just generally works. Depends on https://github.com/openai/codex/pull/14300 Before <img width="155" height="207" alt="image" src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af" /> After <img width="299" height="337" alt="image" src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea" /> ## What changed - Keep request-scoped trace context around until we send the final response or error, or the connection closes. - Thread that trace context through detached `thread/start` work so background startup stays attached to the originating request. - Pass request trace context through to downstream core operations, including: - thread creation - resume/fork flows - turn submission - review - interrupt - realtime conversation operations - Add tracing tests that verify: - remote W3C trace context is preserved for `thread/start` - remote W3C trace context is preserved for `turn/start` - downstream core spans stay under the originating request span - request-scoped tracing state is cleaned up correctly - Clean up shutdown behavior so detached background tasks and spawned threads are drained before process exit.	2026-03-11 20:18:31 -07:00
Anton Panasenko	77b0c75267	feat: search_tool migrate to bring your own tool of Responses API (#14274 ) ## Why To support the new bring-your-own search tool in the Responses API (https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search), we are migrating our bm25 search tool to the official way of executing search on the client and communicating additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch	2026-03-11 17:51:51 -07:00
Curtis 'Fjord' Hawthorne	8791f0ab9a	Let models opt into original image detail (#14175 ) ## Summary This PR narrows original image detail handling to a single opt-in feature: - `image_detail_original` lets the model request `detail: "original"` on supported models - Omitting `detail` preserves the default resized behavior The model only sees `detail: "original"` guidance when the active model supports it: - JS REPL instructions include the guidance and examples only on supported models - `view_image` only exposes a `detail` parameter when the feature and model can use it The image detail API is intentionally narrow and consistent across both paths: - `view_image.detail` supports only `"original"`; otherwise omit the field - `codex.emitImage(..., detail)` supports only `"original"`; otherwise omit the field - Unsupported explicit values fail clearly at the API boundary instead of being silently reinterpreted - Unsupported explicit `detail: "original"` requests fall back to normal behavior when the feature is disabled or the model does not support original detail	2026-03-11 15:25:07 -07:00
Curtis 'Fjord' Hawthorne	5a89660ae4	Add js_repl cwd and homeDir helpers (#14385 ) ## Summary This PR adds two read-only path helpers to `js_repl`: - `codex.cwd` - `codex.homeDir` They are exposed alongside the existing `codex.tmpDir` helper so the REPL can reference basic host path context without reopening direct `process` access. ## Implementation - expose `codex.cwd` and `codex.homeDir` from the js_repl kernel - make `codex.homeDir` come from the kernel process environment - pass session dependency env through js_repl kernel startup so `codex.homeDir` matches the env a shell-launched process would see - keep existing shell `HOME` population behavior unchanged - update js_repl prompt/docs and add runtime/integration coverage for the new helpers	2026-03-11 14:44:44 -07:00
Charley Cunningham	f5bb338fdb	Defer initial context insertion until the first turn (#14313 ) ## Summary - defer fresh-session `build_initial_context()` until the first real turn instead of seeding model-visible context during startup - rely on the existing `reference_context_item == None` turn-start path to inject full initial context on that first real turn (and again after baseline resets such as compaction) - add a regression test for `InitialHistory::New` and update affected deterministic tests / snapshots around developer-message layout, collaboration instructions, personality updates, and compact request shapes ## Notes - this PR does not add any special empty-thread `/compact` behavior - most of the snapshot churn is the direct result of moving the initial model-visible context from startup to the first real turn, so first-turn request layouts no longer contain a pre-user startup copy of permissions / environment / other developer-visible context - remote manual `/compact` with no prior user still skips the remote compact request; local first-turn `/compact` still issues a compact request, but that request now reflects the lack of startup-seeded context --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-11 12:33:10 -07:00
Ahmed Ibrahim	c32c445f1c	Clarify locked role settings in spawn prompt (#14283 ) - tell agents when a role pins model or reasoning effort so they know those settings are not changeable - add prompt-builder coverage for the locked-setting notes	2026-03-11 12:33:10 -07:00
Ahmed Ibrahim	8f8a0f55ce	spawn prompt (#14362 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-03-11 12:33:10 -07:00
pakrym-oai	65b325159d	Add ALL_TOOLS export to code mode (#14294 ) So code mode can search for tools.	2026-03-11 12:33:10 -07:00
Rasmus Rygaard	7f22329389	Revert "Pass more params to compaction" (#14298 )	2026-03-11 12:33:10 -07:00
Channing Conger	fd4a673525	Responses: set x-client-request-id as conversation_id when talking to responses (#14312 ) Right now we're sending the header session_id to responses which is ignored/dropped. This sets a useful x-client-request-id to the conversation_id.	2026-03-11 12:33:10 -07:00
Ahmed Ibrahim	a4d884c767	Split spawn_csv from multi_agent (#14282 ) - make `spawn_csv` a standalone feature for CSV agent jobs - keep `spawn_csv -> multi_agent` one-way and preserve restricted subagent disable paths	2026-03-11 12:33:09 -07:00
Ahmed Ibrahim	39c1bc1c68	Add realtime start instructions config override (#14270 ) - add `realtime_start_instructions` config support - thread it into realtime context updates, schema, docs, and tests	2026-03-11 12:33:09 -07:00
pakrym-oai	01792a4c61	Prefix code mode output with success or failure message and include error stack (#14272 )	2026-03-11 12:33:09 -07:00
Ahmed Ibrahim	c8446d7cf3	Stabilize websocket response.failed error delivery (#14017 ) ## What changed - Drop failed websocket connections immediately after a terminal stream error instead of awaiting a graceful close handshake before forwarding the error to the caller. - Keep the success path and the closed-connection guard behavior unchanged. ## Why this fixes the flake - The failing integration test waits for the second websocket stream to surface the model error before issuing a follow-up request. - On slower runners, the old error path awaited `ws_stream.close().await` before sending the error downstream. If that close handshake stalled, the test kept waiting for an error that had already happened server-side and nextest timed it out. - Dropping the failed websocket immediately makes the terminal error observable right away and marks the session closed so the next request reconnects cleanly instead of depending on a best-effort close handshake. ## Code or test? - This is a production logic fix in `codex-api`. The existing websocket integration test already exercises the regression path.	2026-03-11 12:33:09 -07:00
pakrym-oai	8a099b3dfb	Rename code mode tool to exec (#14254 ) Summary - update the code-mode handler, runner, instructions, and error text to refer to the `exec` tool name everywhere that used to say `code_mode` - ensure generated documentation strings and tool specs describe `exec` and rely on the shared `PUBLIC_TOOL_NAME` - refresh the suite tests so they invoke `exec` instead of the old name Testing - Not run (not requested)	2026-03-11 12:33:09 -07:00
Celia Chen	c1a424691f	chore: add a separate reject-policy flag for skill approvals (#14271 ) ## Summary - add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts - update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval` - regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior	2026-03-11 12:33:09 -07:00
pakrym-oai	83b22bb612	Add store/load support for code mode (#14259 ) adds support for transferring state across code mode invocations.	2026-03-11 12:33:09 -07:00
Rasmus Rygaard	2621ba17e3	Pass more params to compaction (#14247 ) Pass more params to /compact. This should give us parity with the /responses endpoint to improve caching. I'm torn about the MCP await. Blocking will give us parity but it seems like we explicitly don't block on MCPs. Happy either way	2026-03-11 12:33:09 -07:00
pakrym-oai	07c22d20f6	Add code_mode output helpers for text and images (#14244 ) Summary - document how code-mode can import `output_text`/`output_image` and ensure `add_content` stays compatible - add a synthetic `@openai/code_mode` module that appends content items and validates inputs - cover the new behavior with integration tests for structured text and image outputs Testing - Not run (not requested)	2026-03-11 12:33:08 -07:00
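
Commit 0daffe667a above describes moving code_mode exec settings into an optional first-line pragma of the form `// @exec: {"yield_time_ms": ..., "max_output_tokens": ...}`, which Rust parses and validates before building the source. The following is only a rough sketch of that idea, not the repository's implementation: the `ExecPragma` struct, the `parse_exec_pragma` function, and the error strings are hypothetical names, and the hand-rolled parser assumes a flat object of non-negative integers, where a real implementation would more likely use a proper JSON parser.

```rust
/// Settings carried by an optional first-line `// @exec: {...}` pragma.
/// The two field names come from the commit message; the struct itself is hypothetical.
#[derive(Debug, Default, PartialEq)]
struct ExecPragma {
    yield_time_ms: Option<u64>,
    max_output_tokens: Option<u64>,
}

/// Splits `source` into (pragma, remaining source).
/// Returns `Ok((None, source))` when the first line is not a pragma.
/// Assumption: the pragma body is a flat object with integer values only.
fn parse_exec_pragma(source: &str) -> Result<(Option<ExecPragma>, &str), String> {
    let (first, rest) = match source.split_once('\n') {
        Some((first, rest)) => (first, rest),
        None => (source, ""),
    };
    let Some(body) = first.trim().strip_prefix("// @exec:") else {
        return Ok((None, source));
    };
    let inner = body
        .trim()
        .strip_prefix('{')
        .and_then(|s| s.strip_suffix('}'))
        .ok_or_else(|| "pragma body must be a JSON object".to_string())?;

    let mut pragma = ExecPragma::default();
    for entry in inner.split(',').map(str::trim).filter(|e| !e.is_empty()) {
        let (key, value) = entry
            .split_once(':')
            .ok_or_else(|| format!("malformed entry: {entry}"))?;
        let key = key.trim().trim_matches('"');
        let value: u64 = value
            .trim()
            .parse()
            .map_err(|_| format!("{key} must be a non-negative integer"))?;
        // Reject unknown keys so a misspelled setting fails loudly.
        match key {
            "yield_time_ms" => pragma.yield_time_ms = Some(value),
            "max_output_tokens" => pragma.max_output_tokens = Some(value),
            other => return Err(format!("unknown pragma key: {other}")),
        }
    }
    Ok((Some(pragma), rest))
}
```

Under these assumptions, a source whose first line is a valid pragma yields the parsed settings plus the remaining source text, a source without a pragma is returned untouched, and an unknown key or non-integer value produces an error before any execution would start.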


1014 Commits