codex

mirror of https://github.com/openai/codex.git synced 2026-04-30 11:21:34 +03:00

Author	SHA1	Message	Date
jif-oai	5441130e0a	feat: adding stream parser (#12666 ) Add a stream parser to extract citations (and others) from a stream. This support cases where markers are split in differen tokens. Codex never manage to make this code work so everything was done manually. Please review correctly and do not touch this part of the code without a very clear understanding of it	2026-02-25 13:27:58 +00:00
alexsong-oai	6d6570d89d	Support external agent config detect and import (#12660 ) Migration Behavior * Config * Migrates settings.json into config.toml * Only adds fields when config.toml is missing, or when those fields are missing from the existing file * Supported mappings: env -> shell_environment_policy sandbox.enabled = true -> sandbox_mode = "workspace-write" * Skills * Copies home and repo .claude/skills into .agents/skills * Existing skill directories are not overwritten * SKILL.md content is rewritten from Claude-related terms to Codex * AgentsMd * Repo only * Migrates CLAUDE.md into AGENTS.md * Detect/import only proceed when AGENTS.md is missing or present but empty * Content is rewritten from Claude-related terms to Codex	2026-02-25 02:11:51 -08:00
Michael Bolin	3ca0e7673b	feat: run zsh fork shell tool via shell-escalation (#12649 ) ## Why This PR switches the `shell_command` zsh-fork path over to `codex-shell-escalation` so the new shell tool can use the shared exec-wrapper/escalation protocol instead of the `zsh_exec_bridge` implementation that was introduced in https://github.com/openai/codex/pull/12052. `zsh_exec_bridge` relied on UNIX domain sockets, which is not as tamper-proof as the FD-based approach in `codex-shell-escalation`. ## What Changed - Added a Unix zsh-fork runtime adapter in `core` (`core/src/tools/runtimes/shell/unix_escalation.rs`) that: - runs zsh-fork commands through `codex_shell_escalation::run_escalate_server` - bridges exec-policy / approval decisions into `ShellActionProvider` - executes escalated commands via a `ShellCommandExecutor` that calls `process_exec_tool_call` - Updated `ShellRuntime` / `ShellCommandHandler` / tool spec wiring to select a `shell_command` backend (`classic` vs `zsh-fork`) while leaving the generic `shell` tool path unchanged. - Removed the `zsh_exec_bridge`-based session service and deleted `core/src/zsh_exec_bridge/mod.rs`. - Moved exec-wrapper entrypoint dispatch to `arg0` by handling the `codex-execve-wrapper` arg0 alias there, and removed the old `codex_core::maybe_run_zsh_exec_wrapper_mode()` hooks from `cli` and `app-server` mains. - Added the needed `codex-shell-escalation` dependencies for `core` and `arg0`. ## Tests - `cargo test -p codex-core shell_zsh_fork_prefers_shell_command_over_unified_exec` - `cargo test -p codex-app-server turn_start_shell_zsh_fork -- --nocapture` - verifies zsh-fork command execution and approval flows through the new backend - includes subcommand approve/decline coverage using the shared zsh DotSlash fixture in `app-server/tests/suite/zsh` - To test manually, I added the following to `~/.codex/config.toml`: ```toml zsh_path = "/Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh" [features] shell_zsh_fork = true ``` Then I ran `just c` to run the dev build of Codex with these changes and sent it the message: ``` run `echo $0` ``` And it replied with: ``` echo $0 printed: /Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh In this tool context, $0 reflects the script path used to invoke the shell, not just zsh. ``` so the tool appears to be wired up correctly. ## Notes - The zsh subcommand-decline integration test now uses `rm` under a `WorkspaceWrite` sandbox. The previous `/usr/bin/true` scenario is auto-allowed by the new `shell-escalation` policy path, which no longer produces subcommand approval prompts.	2026-02-24 10:31:08 -08:00
pakrym-oai	b17148f13a	Prefer v2 websockets if available (#12428 ) And also cleanup settings flow to avoid reading many separate flags. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-21 20:08:04 +00:00
Michael Bolin	2fe4be1aa9	fix: codex-arg0 no longer depends on codex-core (#12434 ) ## Why `codex-rs/arg0` only needed two things from `codex-core`: - the `find_codex_home()` wrapper - the special argv flag used for the internal `apply_patch` self-invocation path That made `codex-arg0` depend on `codex-core` for a very small surface area. This change removes that dependency edge and moves the shared `apply_patch` invocation flag to a more natural boundary (`codex-apply-patch`) while keeping the contract explicitly documented. ## What Changed - Moved the internal `apply_patch` argv[1] flag constant out of `codex-core` and into `codex-apply-patch`. - Renamed the constant to `CODEX_CORE_APPLY_PATCH_ARG1` and documented that it is part of the Codex core process-invocation contract (even though it now lives in `codex-apply-patch`). - Updated `arg0`, the core apply-patch runtime, and the `codex-exec` apply-patch test to import the constant from `codex-apply-patch`. - Updated `codex-rs/arg0` to call `codex_utils_home_dir::find_codex_home()` directly instead of `codex_core::config::find_codex_home()`. - Removed the `codex-core` dependency from `codex-rs/arg0` and added the needed direct dependency on `codex-utils-home-dir`. - Added `codex-apply-patch` as a dev-dependency for `codex-rs/exec` tests (the apply-patch test now imports the moved constant directly). ## Verification - `cargo test -p codex-apply-patch` - `cargo test -p codex-arg0` - `cargo test -p codex-core --lib apply_patch` - `cargo test -p codex-exec test_standalone_exec_cli_can_use_apply_patch` - `cargo shear`	2026-02-21 00:20:42 -08:00
Michael Bolin	1af2a37ada	chore: remove codex-core public protocol/shell re-exports (#12432 ) ## Why `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules from `codex-protocol` and `codex-shell-command`. That made it easy for workspace crates to import those APIs through `codex-core`, which in turn hides dependency edges and makes it harder to reduce compile-time coupling over time. This change removes those public re-exports so call sites must import from the source crates directly. Even when a crate still depends on `codex-core` today, this makes dependency boundaries explicit and unblocks future work to drop `codex-core` dependencies where possible. ## What Changed - Removed public re-exports from `codex-rs/core/src/lib.rs` for: - `codex_protocol::protocol` and related protocol/model types (including `InitialHistory`) - `codex_protocol::config_types` (`protocol_config_types`) - `codex_shell_command::{bash, is_dangerous_command, is_safe_command, parse_command, powershell}` - Migrated workspace Rust call sites to import directly from: - `codex_protocol::protocol` - `codex_protocol::config_types` - `codex_protocol::models` - `codex_shell_command` - Added explicit `Cargo.toml` dependencies (`codex-protocol` / `codex-shell-command`) in crates that now import those crates directly. - Kept `codex-core` internal modules compiling by using `pub(crate)` aliases in `core/src/lib.rs` (internal-only, not part of the public API). - Updated the two utility crates that can already drop a `codex-core` dependency edge entirely: - `codex-utils-approval-presets` - `codex-utils-cli` ## Verification - `cargo test -p codex-utils-approval-presets` - `cargo test -p codex-utils-cli` - `cargo check --workspace --all-targets` - `just clippy`	2026-02-20 23:45:35 -08:00
Ahmed Ibrahim	6817f0be8a	Wire realtime api to core (#12268 ) - Introduce `RealtimeConversationManager` for realtime API management - Add `op::conversation` to start conversation, insert audio, insert text, and close conversation. - emit conversation lifecycle and realtime events. - Move shared realtime payload types into codex-protocol and add core e2e websocket tests for start/replace/transport-close paths. Things to consider: - Should we use the same `op::` and `Events` channel to carry audio? I think we should try this simple approach and later we can create separate one if the channels got congested. - Sending text updates to the client: we can start simple and later restrict that. - Provider auth isn't wired for now intentionally	2026-02-20 19:06:35 -08:00
Owen Lin	edacbf7b6e	feat(core): zsh exec bridge (#12052 ) zsh fork PR stack: - https://github.com/openai/codex/pull/12051 - https://github.com/openai/codex/pull/12052 👈 ### Summary This PR introduces a feature-gated native shell runtime path that routes shell execution through a patched zsh exec bridge, removing MCP-specific behavior from the shell hot path while preserving existing CommandExecution lifecycle semantics. When shell_zsh_fork is enabled, shell commands run via patched zsh with per-`execve` interception through EXEC_WRAPPER. Core receives wrapper IPC requests over a Unix socket, applies existing approval policy, and returns allow/deny before the subcommand executes. ### What’s included 1) New zsh exec bridge runtime in core - Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for EXEC_WRAPPER invocations. - Per-execution Unix-socket IPC handling for wrapper requests/responses. - Approval callback integration using existing core approval orchestration. - Streaming stdout/stderr deltas to existing command output event pipeline. - Error handling for malformed IPC, denial/abort, and execution failures. 2) Session lifecycle integration SessionServices now owns a `ZshExecBridge`. Session startup initializes bridge state; shutdown tears it down cleanly. 3) Shell runtime routing (feature-gated) When `shell_zsh_fork` is enabled: - Build execution env/spec as usual. - Add wrapper socket env wiring. - Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of the regular shell path. - Non-zsh-fork behavior remains unchanged. 4) Config + feature wiring - Added `Feature::ShellZshFork` (under development). - Added config support for `zsh_path` (optional absolute path to patched zsh): - `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema. - Session startup validates that `zsh_path` exists/usable when zsh-fork is enabled. - Added startup test for missing `zsh_path` failure mode. 5) Seatbelt/sandbox updates for wrapper IPC - Extended seatbelt policy generation to optionally allow outbound connection to explicitly permitted Unix sockets. - Wired sandboxing path to pass wrapper socket path through to seatbelt policy generation. - Added/updated seatbelt tests for explicit socket allow rule and argument emission. 6) Runtime entrypoint hooks - This allows the same binary to act as the zsh wrapper subprocess when invoked via `EXEC_WRAPPER`. 7) Tool selection behavior - ToolsConfig now prefers ShellCommand type when shell_zsh_fork is enabled. - Added test coverage for precedence with unified-exec enabled.	2026-02-17 20:19:53 -08:00
gabec-openai	5341ad08f8	Use prompt-based co-author attribution with config override (#11617 )	2026-02-17 20:15:54 +00:00
viyatb-oai	b527ee2890	feat(core): add structured network approval plumbing and policy decision model (#11672 ) ### Description #### Summary Introduces the core plumbing required for structured network approvals #### What changed - Added structured network policy decision modeling in core. - Added approval payload/context types needed for network approval semantics. - Wired shell/unified-exec runtime plumbing to consume structured decisions. - Updated related core error/event surfaces for structured handling. - Updated protocol plumbing used by core approval flow. - Included small CLI debug sandbox compatibility updates needed by this layer. #### Why establishes the minimal backend foundation for network approvals without yet changing high-level orchestration or TUI behavior. #### Notes - Behavior remains constrained by existing requirements/config gating. - Follow-up PRs in the stack handle orchestration, UX, and app-server integration. --------- Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>	2026-02-14 04:18:12 +00:00
Eric Traut	b98c810328	Report syntax errors in rules file (#11686 ) Currently, if there are syntax errors detected in the starlark rules file, the entire policy is silently ignored by the CLI. The app server correctly emits a message that can be displayed in a GUI. This PR changes the CLI (both the TUI and non-interactive exec) to fail when the rules file can't be parsed. It then prints out an error message and exits with a non-zero exit code. This is consistent with the handling of errors in the config file. This addresses #11603	2026-02-13 10:33:40 -08:00
Celia Chen	dfd1e199a0	[feat] add seatbelt permission files (#11639 ) Add seatbelt permission extension abstraction as permission files for seatbelt profiles. This should complement our current sandbox policy	2026-02-12 23:30:22 +00:00
iceweasel-oai	5c3ca73914	add a slash command to grant sandbox read access to inaccessible directories (#11512 ) There is an edge case where a directory is not readable by the sandbox. In practice, we've seen very little of it, but it can happen so this slash command unlocks users when it does. Future idea is to make this a tool that the agent knows about so it can be more integrated.	2026-02-12 12:48:36 -08:00
Owen Lin	efc8d45750	feat(app-server): experimental flag to persist extended history (#11227 ) This PR adds an experimental `persist_extended_history` bool flag to app-server thread APIs so rollout logs can retain a richer set of EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e. on `thread/resume`). ### Motivation Today, our rollout recorder only persists a small subset (e.g. user message, reasoning, assistant message) of `EventMsg` types, dropping a good number (like command exec, file change, etc.) that are important for reconstructing full item history for `thread/resume`, `thread/read`, and `thread/fork`. Some clients want to be able to resume a thread without lossiness. This lossiness is primarily a UI thing, since what the model sees are `ResponseItem` and not `EventMsg`. ### Approach This change introduces an opt-in `persist_full_history` flag to preserve those events when you start/resume/fork a thread (defaults to `false`). This is done by adding an `EventPersistenceMode` to the rollout recorder: - `Limited` (existing behavior, default) - `Extended` (new opt-in behavior) In `Extended` mode, persist additional `EventMsg` variants needed for non-lossy app-server `ThreadItem` reconstruction. We now store the following ThreadItems that we didn't before: - web search - command execution - patch/file changes - MCP tool calls - image view calls - collab tool outcomes - context compaction - review mode enter/exit For command executions in particular, we truncate the output using the existing `truncate_text` from core to store an upper bound of 10,000 bytes, which is also the default value for truncating tool outputs shown to the model. This keeps the size of the rollout file and command execution items returned over the wire reasonable. And we also persist `EventMsg::Error` which we can now map back to the Turn's status and populates the Turn's error metadata. #### Updates to EventMsgs To truly make `thread/resume` non-lossy, we also needed to persist the `status` on `EventMsg::CommandExecutionEndEvent` and `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a command failed or was declined (similar for apply_patch). These EventMsgs were never persisted before so I made it a required field.	2026-02-12 19:34:22 +00:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
Michael Bolin	b68a84ee8e	Remove `deterministic_process_ids` feature to avoid duplicate `codex-core` builds (#11393 ) ## Why `codex-core` enabled `deterministic_process_ids` through a self dev-dependency. That forced a second feature-resolved build of the same crate, which increased compile time and test latency. ## What Changed - Removed the `deterministic_process_ids` feature from `codex-rs/core/Cargo.toml`. - Removed the self dev-dependency on `codex-core` that enabled that feature. - Removed the Bazel `deterministic_process_ids` crate feature for `codex-core`. - Added a test-only `AtomicBool` override in unified exec process-id allocation. - Added a test-support setter for that override and re-exported it from `codex-core`. - Enabled deterministic process IDs in integration tests via `core_test_support` ctor. ## Behavior - Production behavior remains random process IDs. - Unit tests remain deterministic via `cfg(test)`. - Integration tests remain deterministic via explicit test-support initialization. ## Validation - `just fmt` - `cargo test -p codex-core unified_exec::` - `cargo test -p codex-core --test all unified_exec -- --test-threads=1` - `cargo tree -p codex-core -e features` (verified the removed feature path)	2026-02-10 19:07:01 -08:00
Michael Bolin	d44f4205fb	chore: rename codex-command to codex-shell-command (#11378 ) This addresses some post-merge feedback on https://github.com/openai/codex/pull/11361: - crate rename - reuse `detect_shell_type()` utility	2026-02-10 17:03:46 -08:00
Michael Bolin	d8f9bb65e2	# Split command parsing/safety out of `codex-core` into new `codex-command` (#11361 ) `codex-core` had accumulated command parsing and command safety logic (`bash`, `powershell`, `parse_command`, and `command_safety`) that is logically cohesive but orthogonal to most core session/runtime logic. Keeping this code in `codex-core` made the crate increasingly monolithic and raised iteration cost for unrelated core changes. This change extracts that surface into a dedicated crate, `codex-command`, while preserving existing `codex_core::...` call sites via re-exports. ## Why this refactor During analysis, command parsing/safety stood out as a good first split because it has: - a clear domain boundary (shell parsing + safety classification) - relatively self-contained dependencies (notably `tree-sitter` / `tree-sitter-bash`) - a meaningful standalone test surface (`134` tests moved with the crate) - many downstream uses that benefit from independent compilation and caching The practical problem was build latency from a large `codex-core` compile/test graph. Clean-build timings before and after this split showed measurable wins: - `cargo check -p codex-core`: `57.08s` -> `53.54s` (~`6.2%` faster) - `cargo test -p codex-core --no-run`: `2m39.9s` -> `2m20s` (~`12.4%` faster) - `codex-core lib` compile unit: `57.18s` -> `49.67s` (~`13.1%` faster) - `codex-core lib(test)` compile unit: `60.87s` -> `53.21s` (~`12.6%` faster) This gives a concrete reduction in core build overhead without changing behavior. ## What changed ### New crate - Added `codex-rs/command` as workspace crate `codex-command`. - Added: - `command/src/lib.rs` - `command/src/bash.rs` - `command/src/powershell.rs` - `command/src/parse_command.rs` - `command/src/command_safety/` - `command/src/shell_detect.rs` - `command/BUILD.bazel` ### Code moved out of `codex-core` - Moved modules from `core/src` into `command/src`: - `bash.rs` - `powershell.rs` - `parse_command.rs` - `command_safety/` ### Dependency graph updates - Added workspace member/dependency entries for `codex-command` in `codex-rs/Cargo.toml`. - Added `codex-command` dependency to `codex-rs/core/Cargo.toml`. - Removed `tree-sitter` and `tree-sitter-bash` from `codex-core` direct deps (now owned by `codex-command`). ### API compatibility for callers To avoid immediate downstream churn, `codex-core` now re-exports the moved modules/functions: - `codex_command::bash` - `codex_command::powershell` - `codex_command::parse_command` - `codex_command::is_safe_command` - `codex_command::is_dangerous_command` This keeps existing `codex_core::...` paths working while enabling gradual migration to direct `codex-command` usage. ### Internal decoupling detail - Added `command::shell_detect` so moved `bash`/`powershell` logic no longer depends on core shell internals. - Adjusted PowerShell helper visibility in `codex-command` for existing core test usage (`UTF8` prefix helper + executable discovery functions). ## Validation - `just fmt` - `just fix -p codex-command -p codex-core` - `cargo test -p codex-command` (`134` passed) - `cargo test -p codex-core --no-run` - `cargo test -p codex-core shell_command_handler` ## Notes / follow-up This commit intentionally prioritizes boundary extraction and compatibility. A follow-up can migrate downstream crates to depend directly on `codex-command` (instead of through `codex-core` re-exports) to realize additional incremental build wins.	2026-02-10 14:43:16 -08:00
iceweasel-oai	82f93a13b2	include sandbox (seatbelt, elevated, etc.) as in turn metadata header (#10946 ) This will help us understand retention/usage for folks who use the Windows (or any other) sandboxes	2026-02-10 19:50:07 +00:00
viyatb-oai	62d0f302fd	fix(core): canonicalize wrapper approvals and support heredoc prefix … (#10941 ) ## Summary - Reduced repeated approvals for equivalent wrapper commands and fixed execpolicy matching for heredoc-style shell invocations, with minimal behavior change and fail-closed defaults. ## Fixes 1. Canonicalized approval matching for wrappers so equivalent commands map to the same approval intent. 2. Added heredoc-aware prefix extraction for execpolicy so commands like `python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"], ...)`. 3. Kept fallback behavior conservative: if parsing is ambiguous, existing prompt behavior is preserved. ## Edge Cases Covered - Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs `zsh`. - Shell modes: `-c` and `-lc`. - Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<< PY`). - Multi-command heredoc scripts are rejected by the fallback - Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix matches. - Complex scripts still fall back to prior behavior rather than expanding permissions. --------- Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>	2026-02-10 11:46:40 -08:00
jif-oai	d735df1f50	Extract hooks into dedicated crate (#11311 ) Summary - move `core/src/hooks` implementation into a new `codex-hooks` crate with its own manifest - update `codex-rs` workspace and `codex-core` crate to depend on the extracted `hooks` crate and wire up the shared APIs - ensure references, modules, and lockfile reflect the new crate layout Testing - Not run (not requested)	2026-02-10 13:42:17 +00:00
jif-oai	6049ff02a0	memories: add extraction and prompt module foundation (#11200 ) ## Summary - add the new `core/src/memories` module (phase-one parsing, rollout filtering, storage, selection, prompts) - add Askama-backed memory templates for stage-one input/system and consolidation prompts - add module tests for parsing, filtering, path bucketing, and summary maintenance ## Testing - just fmt - cargo test -p codex-core --lib memories::	2026-02-10 10:10:24 +00:00
Matthew Zeng	d90df4761b	[apps] Add gated instructions for Apps. (#10924 ) - [x] Add gated instructions for Apps.	2026-02-09 14:48:09 -08:00
Michael Bolin	ff74aaae21	chore: reverse the codex-network-proxy -> codex-core dependency (#11121 )	2026-02-08 17:03:24 -08:00
Owen Lin	0d8b2b74c4	feat(app-server): turn/steer API (#10821 ) This PR adds a dedicated `turn/steer` API for appending user input to an in-flight turn. ## Motivation Currently, steering in the app is implemented by just calling `turn/start` while a turn is running. This has some really weird quirks: - Client gets back a new `turn.id`, even though streamed events/approvals remained tied to the original active turn ID. - All the various turn-level override params on `turn/start` do not apply to the "steer", and would only apply to the next real turn. - There can also be a race condition where the client thinks the turn is active but the server has already completed it, so there might be bugs if the client has baked in some client-specific behavior thinking it's a steer when in fact the server kicked off a new turn. This is particularly possible when running a client against a remote app-server. Having a dedicated `turn/steer` API eliminates all those quirks. `turn/steer` behavior: - Requires an active turn on threadId. Returns a JSON-RPC error if there is no active turn. - If expectedTurnId is provided, it must match the active turn (more useful when connecting to a remote app-server). - Does not emit `turn/started`. - Does not accept turn overrides (`cwd`, `model`, `sandbox`, etc.) or `outputSchema` to accurately reflect that these are not applied when steering.	2026-02-06 00:35:04 +00:00
sayan-oai	5fdf6f5efa	chore: rm web-search-eligible header (#10660 ) default-enablement of web_search is now client-side, no need to send eligibility headers to backend. Tested locally, headers no longer sent. will wait for corresponding backend change to deploy before merging	2026-02-05 11:48:34 -08:00
gt-oai	3b54fd7336	Add hooks implementation and wire up to `notify` (#9691 ) This introduces a `Hooks` service. It registers hooks from config and dispatches hook events at runtime. N.B. The hook config is not wired up to this yet. But for legacy reasons, we wire up `notify` from config and power it using hooks now. Nothing about the `notify` interface has changed. I'd start by reviewing `hooks/types.rs` Some things to note: - hook names subject to change - no hook result yet - stopping semantics yet to be introduced - additional hooks yet to be introduced	2026-02-05 16:49:35 +00:00
pap-openai	b2424cb635	adding fork information (UI) when forking (#10246 ) - shows `/fork` command that ran in prev session - shows `session forked from name (uuid) \|\| uuid (if name is not set)` as an event in new session	2026-02-05 13:24:55 +00:00
pakrym-oai	0e8d359da9	Session-level model client (#10664 ) Make ModelClient a session-scoped object. Move state that is session level onto the client, and make state that is per-turn explicit on corresponding methods. Stop taking a huge Config object, instead only pass in values that are actually needed. --------- Co-authored-by: Josh McKinney <joshka@openai.com>	2026-02-04 16:58:48 -08:00
Eric Traut	7bcc552325	Added support for live updates to skills (#10478 ) Add a centralized FileWatcher in codex-core (using notify) that watches skill roots from the config layer stack (recursive) Send `SkillsChanged` events when relevant file system changes are detected On `SkillsChanged`: * Invalidate the skills cache immediately in ThreadManager * Emit EventMsg::SkillsUpdateAvailable to active sessions ~~* Broadcast a new app-server notification: SkillsListUpdatedNotification~~ This change does not inject new items into the event stream. That means the agent will not know about new skills, so it won't be able to implicitly invoke new skills. It also won't know about changes to existing skills, so if it has already read the contents of a modified skill, it will not honor the new behavior. This change also does not detect modifications to AGENTS.md. I plan to address these limitations in a follow-on PR modeled after #9985. Injection of new skills and AGENTS was deemed to risky, hence the need to split the feature into two stages. The changes in this PR were designed to easily accommodate the second stage once we have some other foundational changes in place. Testing: In addition to automated tests, I did manual testing to confirm that newly-created skills, deleted skills, and renamed skills are reflected in the TUI skill picker menu. Also confirmed that modifications to behaviors for explicitly-invoked skills are honored. --------- Co-authored-by: Xin Lin <xl@openai.com>	2026-02-04 15:25:03 -08:00
jif-oai	e9335374b9	feat: add phase 1 mem client (#10629 ) Adding a client on top of https://github.com/openai/openai/pull/672176	2026-02-04 17:59:36 +00:00
pakrym-oai	56ebfff1a8	Move metadata calculation out of client (#10589 ) Model client shouldn't be responsible for this.	2026-02-03 21:59:13 -08:00
Anton Panasenko	fcaed4cb88	feat: log webscocket timing into runtime metrics (#10577 )	2026-02-03 18:04:07 -08:00
jif-oai	d2394a2494	chore: nuke chat/completions API (#10157 )	2026-02-03 11:31:57 +00:00
pash-openai	019d89ff86	make codex better at git (#10145 ) adds basic git context to the session prefix so the model can anchor git actions and be a bit more version-aware. structured it in a multiroot-friendly shape even though we only have one root today	2026-02-02 16:57:29 -08:00
pap-openai	1644cbfc6d	Session picker shows thread_name if set (#10340 ) - shows names of threads in the ResumePicker used by `/resume` and `codex resume` if set, default to preview (previous behaviour) if none - adds a `find_thread_names_by_ids` that maps names to IDs in `codex-rs/core/src/rollout/session_index.rs`. It reads sequentially in normal (instead of reverse order in `codex resume <name>`) the index mapping file. This function is called from a list of session (default page is 25, pages loaded depends of height of terminal), for which most of them will always have at least one session unnamed and require the whole file to be read therefore. Could be better and sqlite integration will make this better - those reads won't be needed when leveraging sqlite Opened questions: - We could rename the TUI "Conversation" column to "Name" or "Thread" that would feel more accurate. Could be a fast-follow if we implement auto-naming as it'll always be a name instead?	2026-02-02 08:13:17 +00:00
alexsong-oai	b164ac6d1e	feat: fire tracking events for skill invocation (#10120 )	2026-01-31 18:06:26 -08:00
Dylan Hurd	0f9858394b	feat(core,tui,app-server) personality migration (#10307 ) ## Summary Keep existing users on Pragmatic, to preserve behavior while new users default to Friendly ## Testing - [x] Tested locally - [x] add integration tests	2026-01-31 17:25:14 -07:00
Charley Cunningham	ec4a2d07e4	Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786 ) ## Summary - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed in core, emitting plan deltas plus a plan `ThreadItem`, while stripping tags from normal assistant output. - Persist plan items and rebuild them on resume so proposed plans show in thread history. - Wire plan items/deltas through app-server protocol v2 and render a dedicated proposed-plan view in the TUI, including the “Implement this plan?” prompt only when a plan item is present. ## Changes ### Core (`codex-rs/core`) - Added a generic, line-based tag parser that buffers each line until it can disprove a tag prefix; implements auto-close on `finish()` for unterminated tags. `codex-rs/core/src/tagged_block_parser.rs` - Refactored proposed plan parsing to wrap the generic parser. `codex-rs/core/src/proposed_plan_parser.rs` - In plan mode, stream assistant deltas as: - Normal text → `AgentMessageContentDelta` - Plan text → `PlanDelta` + `TurnItem::Plan` start/completion (`codex-rs/core/src/codex.rs`) - Final plan item content is derived from the completed assistant message (authoritative), not necessarily the concatenated deltas. - Strips `<proposed_plan>` blocks from assistant text in plan mode so tags don’t appear in normal messages. (`codex-rs/core/src/stream_events_utils.rs`) - Persist `ItemCompleted` events only for plan items for rollout replay. (`codex-rs/core/src/rollout/policy.rs`) - Guard `update_plan` tool in Plan Mode with a clear error message. (`codex-rs/core/src/tools/handlers/plan.rs`) - Updated Plan Mode prompt to: - keep `<proposed_plan>` out of non-final reasoning/preambles - require exact tag formatting - allow only one `<proposed_plan>` block per turn (`codex-rs/core/templates/collaboration_mode/plan.md`) ### Protocol / App-server protocol - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items. (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`) - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with EXPERIMENTAL markers and note that deltas may not match the final plan item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`) - Added plan delta route in app-server protocol common mapping. (`codex-rs/app-server-protocol/src/protocol/common.rs`) - Rebuild plan items from persisted `ItemCompleted` events on resume. (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`) ### App-server - Forward plan deltas to v2 clients and map core plan items to v2 plan items. (`codex-rs/app-server/src/bespoke_event_handling.rs`, `codex-rs/app-server/src/codex_message_processor.rs`) - Added v2 plan item tests. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`) ### TUI - Added a dedicated proposed plan history cell with special background and padding, and moved “• Proposed Plan” outside the highlighted block. (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`) - Only show “Implement this plan?” when a plan item exists. (`codex-rs/tui/src/chatwidget.rs`, `codex-rs/tui/src/chatwidget/tests.rs`) <img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM" src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286" /> ### Docs / Misc - Updated protocol docs to mention plan deltas. (`codex-rs/docs/protocol_v1.md`) - Minor plumbing updates in exec/debug clients to tolerate plan deltas. (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`) ## Tests - Added core integration tests: - Plan mode strips plan from agent messages. - Missing `</proposed_plan>` closes at end-of-message. (`codex-rs/core/tests/suite/items.rs`) - Added unit tests for generic tag parser (prefix buffering, non-tag lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`) - Existing app-server plan item tests in v2. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`) ## Notes / Behavior - Plan output no longer appears in standard assistant text in Plan Mode; it streams via `PlanDelta` and completes as a `TurnItem::Plan`. - The final plan item content is authoritative and may diverge from streamed deltas (documented as experimental). - Reasoning summaries are not filtered; prompt instructs the model not to include `<proposed_plan>` outside the final plan message. ## Codex Author `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`	2026-01-30 18:59:30 +00:00
pap-openai	1ef5455eb6	Conversation naming (#8991 ) Session renaming: - `/rename my_session` - `/rename` without arg and passing an argument in `customViewPrompt` - AppExitInfo shows resume hint using the session name if set instead of uuid, defaults to uuid if not set - Names are stored in `CODEX_HOME/sessions.jsonl` Session resuming: - codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry matching the name and resumes the session --------- Co-authored-by: jif-oai <jif@openai.com>	2026-01-30 10:40:09 +00:00
pakrym-oai	3b1cddf001	Fall back to http when websockets fail (#10139 ) I expect not all proxies work with websockets, fall back to http if websockets fail.	2026-01-29 10:36:21 -08:00
Matthew Zeng	b9cd089d1f	[connectors] Support connectors part 2 - slash command and tui (#9728 ) - [x] Support `/apps` slash command to browse the apps in tui. - [x] Support inserting apps to prompt using `$`. - [x] Lots of simplification/renaming from connectors to apps.	2026-01-28 19:51:58 -08:00
jif-oai	3878c3dc7c	feat: sqlite 1 (#10004 ) Add a `.sqlite` database to be used to store rollout metatdata (and later logs) This PR is phase 1: * Add the database and the required infrastructure * Add a backfill of the database * Persist the newly created rollout both in files and in the DB * When we need to get metadata or a rollout, consider the `JSONL` as the source of truth but compare the results with the DB and show any errors	2026-01-28 15:29:14 +01:00
iceweasel-oai	c40ad65bd8	remove sandbox globals. (#9797 ) Threads sandbox updates through OverrideTurnContext for active turn Passes computed sandbox type into safety/exec	2026-01-27 11:04:23 -08:00
sayan-oai	86adf53235	fix: handle all web_search actions and in progress invocations (#9960 ) ### Summary - Parse all `web_search` tool actions (`search`, `find_in_page`, `open_page`). - Previously we only parsed + displayed `search`, which made the TUI appear to pause when the other actions were being used. - Show in progress `web_search` calls as `Searching the web` - Previously we only showed completed tool calls <img width="308" height="149" alt="image" src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845" /> ### Tests Added + updated tests, tested locally ### Follow ups Update VSCode extension to display these as well	2026-01-27 03:33:48 +00:00
Charley Cunningham	62266b13f8	Add thread/unarchive to restore archived rollouts (#9843 ) ## Summary - Adds a new `thread/unarchive` RPC to move archived thread rollouts back into the active `sessions/` tree. ## What changed - Protocol - Adds `thread/unarchive` request/response types and wiring. - Server - Implements `thread_unarchive` in the app server. - Validates the archived rollout path and thread ID. - Restores the rollout to `sessions/YYYY/MM/DD/...` based on the rollout filename timestamp. - Core - Adds `find_archived_thread_path_by_id_str` helper for archived rollouts. - Docs - Documents the new RPC and usage example. - Tests - Adds an end-to-end server test that: 1) starts a thread, 2) archives it, 3) unarchives it, 4) asserts the file is restored to `sessions/`. ## How to use ```json { "method": "thread/unarchive", "id": 24, "params": { "threadId": "<thread-id>" } } ``` ## Author Codex Session `codex resume 019bf158-54b6-7960-a696-9d85df7e1bc1` (soon I'll make this kind of session UUID forkable by anyone with the right `session_object_storage_url` line in their config, but for now just pasting it here for my reference)	2026-01-26 11:24:36 -08:00
jif-oai	d594693d1a	feat: dynamic tools injection (#9539 ) ## Summary Add dynamic tool injection to thread startup in API v2, wire dynamic tool calls through the app server to clients, and plumb responses back into the model tool pipeline. ### Flow (high level) - Thread start injects `dynamic_tools` into the model tool list for that thread (validation is done here). - When the model emits a tool call for one of those names, core raises a `DynamicToolCallRequest` event. - The app server forwards it to the client as `item/tool/call`, waits for the client’s response, then submits a `DynamicToolResponse` back to core. - Core turns that into a `function_call_output` in the next model request so the model can continue. ### What changed - Added dynamic tool specs to v2 thread start params and protocol types; introduced `item/tool/call` (request/response) for dynamic tool execution. - Core now registers dynamic tool specs at request time and routes those calls via a new dynamic tool handler. - App server validates tool names/schemas, forwards dynamic tool call requests to clients, and publishes tool outputs back into the session. - Integration tests	2026-01-26 10:06:44 +00:00
jif-oai	83775f4df1	feat: ephemeral threads (#9765 ) Add ephemeral threads capabilities. Only exposed through the `app-server` v2 The idea is to disable the rollout recorder for those threads.	2026-01-24 14:57:40 +00:00
Matthew Zeng	a2c829a808	[connectors] Support connectors part 1 - App server & MCP (#9667 ) In order to make Codex work with connectors, we add a built-in gateway MCP that acts as a transparent proxy between the client and the connectors. The gateway MCP collects actions that are accessible to the user and sends them down to the user, when a connector action is chosen to be called, the client invokes the action through the gateway MCP as well. - [x] Add the system built-in gateway MCP to list and run connectors. - [x] Add the app server methods and protocol	2026-01-22 16:48:43 -08:00
Skylar Graika	b236f1c95d	fix: prevent repeating interrupted turns (#9043 ) ## What Record a model-visible `<turn_aborted>` marker in history when a turn is interrupted, and treat it as a session prefix. ## Why When a turn is interrupted, Codex emits `TurnAborted` but previously did not persist anything model-visible in the conversation history. On the next user turn, the model can’t tell the previous work was aborted and may resume/repeat earlier actions (including duplicated side effects like re-opening PRs). Fixes: https://github.com/openai/codex/issues/9042 ## How On `TurnAbortReason::Interrupted`, append a hidden user message containing a `<turn_aborted>…</turn_aborted>` marker and flush. Treat `<turn_aborted>` like `<environment_context>` for session-prefix filtering. Add a regression test to ensure follow-up turns don’t repeat side effects from an aborted turn. ## Testing `just fmt` `just fix -p codex-core` `cargo test -p codex-core -- --test-threads=1` `cargo test --all-features -- --test-threads=1` --------- Co-authored-by: Skylar Graika <sgraika127@gmail.com> Co-authored-by: jif-oai <jif@openai.com> Co-authored-by: Eric Traut <etraut@openai.com>	2026-01-20 13:07:28 -08:00

1 2 3 4

175 Commits