codex

mirror of https://github.com/openai/codex.git synced 2026-05-02 20:32:04 +03:00

Author	SHA1	Message	Date
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
Curtis 'Fjord' Hawthorne	7e980d7db6	Support multimodal custom tool outputs (#12948 ) ## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948	2026-02-26 18:17:46 -08:00
Charley Cunningham	07aefffb1f	core: bundle settings diff updates into one dev/user envelope (#12417 ) ## Summary - bundle contextual prompt injection into at most one developer message plus one contextual user message in both: - per-turn settings updates - initial context insertion - preserve `<model_switch>` across compaction by rebuilding it through canonical initial-context injection, instead of relying on strip/reattach hacks - centralize contextual user fragment detection in one shared definition table and reuse it for parsing/compaction logic - keep `AGENTS.md` in its natural serialized format: - `# AGENTS.md instructions for {dirname}` - `<INSTRUCTIONS>...</INSTRUCTIONS>` - simplify related tests/helpers and accept the expected snapshot/layout updates from bundled multi-part messages ## Why The goal is to converge toward a simpler, more intentional prompt shape where contextual updates are consistently represented as one developer envelope plus one contextual user envelope, while keeping parsing and compaction behavior aligned with that representation. ## Notable details - the temporary `SettingsUpdateEnvelope` wrapper was removed; these paths now return `Vec<ResponseItem>` directly - local/remote compaction no longer rely on model-switch strip/restore helpers - contextual user detection is now driven by shared fragment definitions instead of ad hoc matcher assembly - AGENTS/user instructions are still the same logical context; only the synthetic `<user_instructions>` wrapper was replaced by the natural AGENTS text format ## Testing - `just fmt` - `cargo test -p codex-app-server codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages -- --exact` - `cargo test -p codex-core compact::tests::collect_user_messages_filters_session_prefix_entries --lib -- --exact` - `cargo test -p codex-core --test all 'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_apps_guidance_as_developer_message_when_enabled' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_developer_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_user_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::resume_includes_initial_messages_and_sends_prior_items' -- --exact` - `cargo test -p codex-core --test all 'suite::review::review_input_isolated_from_parent_history' -- --exact` - `cargo test -p codex-exec --test all 'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' -- --exact` - `cargo test -p core_test_support context_snapshot::tests::full_text_mode_preserves_unredacted_text -- --exact` ## Notes - I also ran several targeted `compact`, `compact_remote`, `prompt_caching`, `model_visible_layout`, and `event_mapping` tests while iterating on prompt-shape changes. - I have not claimed a clean full-workspace `cargo test` from this environment because local sandbox/resource conditions have previously produced unrelated failures in large workspace runs.	2026-02-26 00:12:08 -08:00
Eric Traut	8da40c9251	Raise image byte estimate for compaction token accounting (#12717 ) Increase `IMAGE_BYTES_ESTIMATE` from 340 bytes to 7,373 bytes so the existing 4-bytes/token heuristic yields an image estimate of ~1,844 tokens instead of ~85. This makes auto-compaction more conservative for image-heavy transcripts and avoids underestimating context usage, which can otherwise cause compaction to fail when there is not enough free context remaining. The new value was chosen because that's the image resolution cap used for our latest models. Follow-up to [#12419](https://github.com/openai/codex/pull/12419). Refs [#11845](https://github.com/openai/codex/issues/11845).	2026-02-24 16:11:38 -08:00
Dylan Hurd	f6053fdfb3	feat(core) Introduce Feature::RequestPermissions (#11871 ) ## Summary Introduces the initial implementation of Feature::RequestPermissions. RequestPermissions allows the model to request that a command be run inside the sandbox, with additional permissions, like writing to a specific folder. Eventually this will include other rules as well, and the ability to persist these permissions, but this PR is already quite large - let's get the core flow working and go from there! <img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM" src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368" /> ## Testing - [x] Added tests - [x] Tested locally - [x] Feature	2026-02-24 09:48:57 -08:00
Michael Bolin	66d5d34e6e	core: preserve constrained approval/sandbox policies in TurnContext (#12473 )	2026-02-21 14:40:24 -08:00
Eric Traut	3586fcb802	Improve token usage estimate for images (#12419 ) Fixes #11845. Adjust context/token estimation for inline image `data:*;base64,...` URLs so we do not count the raw base64 payload as model-visible text. What changed: - keep the existing JSON-length estimator as the baseline - detect only inline base64 `data:` image URLs in message and function-call output content items - subtract only the base64 payload bytes (preserving data URL prefix + JSON overhead) - add a fixed per-image estimate of 340 bytes (~85 tokens at the repo’s 4-bytes/token heuristic) This avoids large overestimates from MCP image tool outputs while leaving normal image URLs (`https://`, `file://`, non-base64 `data:` URLs) unchanged. Tests: - message image data URL estimate regression - function-call output image data URL estimate regression - non-base64 image URLs unchanged - non-base64 `data:` URLs unchanged - `data:application/octet-stream;base64,...` adjusted - multiple inline images apply multiple fixed costs - text-only items unchanged	2026-02-21 14:25:36 -08:00
Charley Cunningham	bb0ac5be70	Fix compaction context reinjection and model baselines (#12252 ) ## Summary - move regular-turn context diff/full-context persistence into `run_turn` so pre-turn compaction runs before incoming context updates are recorded - after successful pre-turn compaction, rely on a cleared `reference_context_item` to trigger full context reinjection on the follow-up regular turn (manual `/compact` keeps replacement history summary-only and also clears the baseline) - preserve `<model_switch>` when full context is reinjected, and inject it before the rest of the full-context items - scope `reference_context_item` and `previous_model` to regular user turns only so standalone tasks (`/compact`, shell, review, undo) cannot suppress future reinjection or `<model_switch>` behavior - make context-diff persistence + `reference_context_item` updates explicit in the regular-turn path, with clearer docs/comments around the invariant - stop persisting local `/compact` `RolloutItem::TurnContext` snapshots (only regular turns persist `TurnContextItem` now) - simplify resume/fork previous-model/reference-baseline hydration by looking up the last surviving turn context from rollout lifecycle events, including rollback and compaction-crossing handling - remove the legacy fallback that guessed from bare `TurnContext` rollouts without lifecycle events - update compaction/remote-compaction/model-visible snapshots and compact test assertions (including remote compaction mock response shape) ## Why We were persisting incoming context items before spawning the regular turn task, which let pre-turn compaction requests accidentally include incoming context diffs without the new user message. Fixing that exposed follow-on baseline issues around `/compact`, resume/fork, and standalone tasks that could cause duplicate context injection or suppress `<model_switch>` instructions. This PR re-centers the invariants around regular turns: - regular turns persist model-visible context diffs/full reinjection and update the `reference_context_item` - standalone tasks do not advance those regular-turn baselines - compaction clears the baseline when replacement history may have stripped the referenced context diffs ## Follow-ups (TODOs left in code) - `TODO(ccunningham)`: fix rollback/backtracking baseline handling more comprehensively - `TODO(ccunningham)`: include pending incoming context items in pre-turn compaction threshold estimation - `TODO(ccunningham)`: inject updated personality spec alongside `<model_switch>` so some model-switch paths can avoid forced full reinjection - `TODO(ccunningham)`: review task turn lifecycle (`TurnStarted`/`TurnComplete`) behavior and emit task-start context diffs for task types that should have them (excluding `/compact`) ## Validation - `just fmt` - CI should cover the updated compaction/resume/model-visible snapshot expectations and rollout-hydration behavior - I did not rerun the full local test suite after the latest resume-lookup / rollout-persistence simplifications	2026-02-20 23:13:08 -08:00
Charley Cunningham	f2d5842ed1	Move previous turn context tracking into ContextManager history (#12179 ) ## Summary - add `previous_context_item: Option<TurnContextItem>` to `ContextManager` - expose session/state accessors for reading and updating the stored previous context item - switch settings diffing to use `TurnContextItem` instead of `TurnContext` - remove submission-loop local `previous_context` and persist the previous context item in history ## Testing - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-core --test all model_switching::` - `cargo test -p codex-core --test all collaboration_instructions::` - `cargo test -p codex-core --test all personality::` - `cargo test -p codex-core --test all permissions_messages::permissions_message_not_added_when_no_change`	2026-02-19 09:56:20 -08:00
jif-oai	2daa3fd44f	feat: sub-agent injection (#12152 ) This PR adds parent-thread sub-agent completion notifications and change the prompt of the model to prevent if from being confused	2026-02-19 11:32:10 +00:00
Charley Cunningham	cab607befb	Centralize context update diffing logic (#11807 ) ## Summary This PR centralizes model-visible state diffing for turn context updates into one module, while keeping existing behavior and call sites stable. ### What changed - Added `core/src/context_updates.rs` with the consolidated diffing logic for: - environment context updates - permissions/policy updates - collaboration mode updates - model-instruction switch updates - personality updates - Added `BuildSettingsUpdateItemsParams` so required dependencies are passed explicitly. - Updated `Session::build_settings_update_items` in `core/src/codex.rs` to delegate to the centralized module. - Reused the same centralized `personality_message_for` helper from initial-context assembly to avoid duplicated logic. - Registered the new module in `core/src/lib.rs`. ## Why This is a minimal, shippable step toward the model-visible-state design: all state diff decisions for turn-context update items now live in one place, improving reviewability and reducing drift risk without expanding scope. ## Behavior - Intended to be behavior-preserving. - No protocol/schema changes. - No call-site behavior changes beyond routing through the new centralized logic. ## Testing Ran targeted tests in this worktree: - `cargo test -p codex-core build_settings_update_items_emits_environment_item_for_network_changes` - `cargo test -p codex-core collaboration_instructions --test all` Both passed. ## Codex author `codex resume 019c540f-3951-7352-a3fa-6f07b834d4ce`	2026-02-17 09:21:44 -08:00
Ahmed Ibrahim	5e01450963	Strip unsupported images from prompt history to guard against model switch (#11349 ) - Make `ContextManager::for_prompt` modality-aware and strip input_image content when the active model is text-only. - Added a test for multi-model -> text-only model switch	2026-02-10 11:58:00 -08:00
Charley Cunningham	9450cd9ce5	core: add focused diagnostics for remote compaction overflows (#11133 ) ## Summary - add targeted remote-compaction failure diagnostics in compact_remote logging - log the specific values needed to explain overflow timing: - last_api_response_total_tokens - estimated_tokens_of_items_added_since_last_successful_api_response - estimated_bytes_of_items_added_since_last_successful_api_response - failing_compaction_request_body_bytes - simplify breakdown naming and remove last_api_response_total_bytes_estimate (it was an approximation and not useful for debugging) ## Why When compaction fails with context_length_exceeded, we need concrete, low-ambiguity numbers that map directly to: 1) what the API most recently reported, and 2) what local history added since then. This keeps the failure logs actionable without adding broad, noisy metrics. ## Testing - just fmt - cargo test -p codex-core	2026-02-09 12:42:20 -08:00
Charley Cunningham	0883e5d3e5	core: account for all post-response items in auto-compact token checks (#11132 ) ## Summary - change compaction pre-check accounting to include all items added after the last model-generated item, not only trailing codex-generated outputs - use that boundary consistently in get_total_token_usage() and get_total_token_usage_breakdown() - update history tests to cover user/tool-output items after the last model item ## Why last_token_usage.total_tokens is API-reported for the last successful model response. After that point, local history may gain additional items (user messages, injected context, tool outputs). Compaction triggering must account for all of those items to avoid late compaction attempts that can overflow context. ## Testing - just fmt - cargo test -p codex-core	2026-02-09 08:34:38 -08:00
Charley Cunningham	dc7007beaa	Fix remote compaction estimator/payload instruction small mismatch (#10692 ) ## Summary This PR fixes a deterministic mismatch in remote compaction where pre-trim estimation and the `/v1/responses/compact` payload could use different base instructions. Before this change: - pre-trim estimation used model-derived instructions (`model_info.get_model_instructions(...)`) - compact payload used session base instructions (`sess.get_base_instructions()`) After this change: - remote pre-trim estimation and compact payload both use the same `BaseInstructions` instance from session state. ## Changes - Added a shared estimator entry point in `ContextManager`: - `estimate_token_count_with_base_instructions(&self, base_instructions: &BaseInstructions) -> Option<i64>` - Kept `estimate_token_count(&TurnContext)` as a thin wrapper that resolves model/personality instructions and delegates to the new helper. - Updated remote compaction flow to fetch base instructions once and reuse it for both: - trim preflight estimation - compact request payload construction - Added regression coverage for parity and behavior: - unit test verifying explicit-base estimator behavior - integration test proving remote compaction uses session override instructions and trims accordingly ## Why this matters This removes a deterministic divergence source where pre-trim could think the request fits while the actual compact request exceeded context because its instructions were longer/different. ## Scope In scope: - estimator/payload base-instructions parity in remote compaction Out of scope: - retry-on-`context_length_exceeded` - compaction threshold/headroom policy changes - broader trimming policy changes ## Codex author: `codex fork 019c2b24-c2df-7b31-a482-fb8cf7a28559`	2026-02-04 23:24:06 -08:00
Owen Lin	5ea107a088	feat(app-server, core): allow text + image content items for dynamic tool outputs (#10567 ) Took over the work that @aaronl-openai started here: https://github.com/openai/codex/pull/10397 Now that app-server clients are able to set up custom tools (called `dynamic_tools` in app-server), we should expose a way for clients to pass in not just text, but also image outputs. This is something the Responses API already supports for function call outputs, where you can pass in either a string or an array of content outputs (text, image, file): https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image So let's just plumb it through in Codex (with the caveat that we only support text and image for now). This is implemented end-to-end across app-server v2 protocol types and core tool handling. ## Breaking API change NOTE: This introduces a breaking change with dynamic tools, but I think it's ok since this concept was only recently introduced (https://github.com/openai/codex/pull/9539) and it's better to get the API contract correct. I don't think there are any real consumers of this yet (not even the Codex App). Old shape: `{ "output": "dynamic-ok", "success": true }` New shape: ``` { "contentItems": [ { "type": "inputText", "text": "dynamic-ok" }, { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" } ] "success": true } ```	2026-02-04 16:12:47 -08:00
pakrym-oai	7f20357611	Stop client from being state carrier (#10595 ) I'd like to make client session wide. This requires shedding all random state it has to carry.	2026-02-04 09:05:37 -08:00
pakrym-oai	cbfd2a37cc	Trim compaction input (#10374 ) Two fixes: 1. Include trailing tool output in the total context size calculation. Otherwise when checking whether compaction should run we ignore newly added outputs. 2. Trim trailing tool output/tool calls until we can fit the request into the model context size. Otherwise the compaction endpoint will fail to compact. We only trim items that can be reproduced again by the model (tool calls, tool call outputs).	2026-02-02 19:03:11 -08:00
sayan-oai	fc05374344	chore: add phase to message responseitem (#10455 ) ### What add wiring for `phase` field on `ResponseItem::Message` to lay groundwork for differentiating model preambles and final messages. currently optional. follows pattern in #9698. updated schemas with `just write-app-server-schema` so we can see type changes. ### Tests Updated existing tests for SSE parsing and hydrating from history	2026-02-03 02:52:26 +00:00
Dylan Hurd	a33fa4bfe5	chore(config) Rename config setting to personality (#10314 ) ## Summary Let's make the setting name consistent with the SlashCommand! ## Testing - [x] Updated tests	2026-01-31 19:38:06 -08:00
Michael Bolin	10ea117ee1	chore: implement Mul for TruncationPolicy (#10272 ) Codex thought this was a good idea while working on https://github.com/openai/codex/pull/10192.	2026-01-30 15:50:20 -08:00
Dylan Hurd	8b3521ee77	feat(core) update Personality on turn (#9644 ) ## Summary Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This is explicitly unused by the clients so far, to simplify PRs - app-server and tui implementations will be follow-ups. ## Testing - [x] added integration tests	2026-01-22 12:04:23 -08:00
pakrym-oai	b511c38ddb	Support end_turn flag (#9698 ) Experimental flag that signals the end of the turn.	2026-01-22 17:27:48 +00:00
Dylan Hurd	96a72828be	feat(core) ModelInfo.model_instructions_template (#9597 ) ## Summary #9555 is the start of a rename, so I'm starting to standardize here. Sets up `model_instructions` templating with a strongly-typed object for injecting a personality block into the model instructions. ## Testing - [x] Added tests - [x] Ran locally	2026-01-21 18:11:18 -08:00
Skylar Graika	b236f1c95d	fix: prevent repeating interrupted turns (#9043 ) ## What Record a model-visible `<turn_aborted>` marker in history when a turn is interrupted, and treat it as a session prefix. ## Why When a turn is interrupted, Codex emits `TurnAborted` but previously did not persist anything model-visible in the conversation history. On the next user turn, the model can’t tell the previous work was aborted and may resume/repeat earlier actions (including duplicated side effects like re-opening PRs). Fixes: https://github.com/openai/codex/issues/9042 ## How On `TurnAbortReason::Interrupted`, append a hidden user message containing a `<turn_aborted>…</turn_aborted>` marker and flush. Treat `<turn_aborted>` like `<environment_context>` for session-prefix filtering. Add a regression test to ensure follow-up turns don’t repeat side effects from an aborted turn. ## Testing `just fmt` `just fix -p codex-core` `cargo test -p codex-core -- --test-threads=1` `cargo test --all-features -- --test-threads=1` --------- Co-authored-by: Skylar Graika <sgraika127@gmail.com> Co-authored-by: jif-oai <jif@openai.com> Co-authored-by: Eric Traut <etraut@openai.com>	2026-01-20 13:07:28 -08:00
Ahmed Ibrahim	b11e96fb04	Act on reasoning-included per turn (#9402 ) - Reset reasoning-included flag each turn and update compaction test	2026-01-19 11:23:25 -08:00
Dylan Hurd	bffe9b33e9	chore(core) Create instructions module (#9422 ) ## Summary We have a variety of things we refer to as instructions in the code base: our current canonical terms are: - base instructions (raw string) - developer instructions (has a type in protocol) - user instructions We also have `instructions` floating around in various places. We should standardize on the above, and start using types to prevent them from ending up in the wrong place. There will be additional PRs, but I'm going to keep these small so we can easily follow them! ## Testing - [x] Tests pass, this is purely a file move	2026-01-17 16:01:26 -08:00
Eric Traut	1fc72c647f	Fix token estimate during compaction (#9337 ) This addresses #9287	2026-01-15 19:48:11 -08:00
jif-oai	dc3deaa3e7	feat: return an error if the image sent by the user is a bad image (#9146 ) ## Before When we detect an `InvalidImageRequest`, we replace the image by a placeholder and keep going ## Now In such `InvalidImageRequest`, we check if the image is due to a user message or a tool call output. For tool call output we still replace it with a placeholder to avoid breaking the agentic loop bu tif this is because of a user message, we send an error to the user	2026-01-14 09:07:45 +00:00
jif-oai	69898e3dba	clean: all history cloning (#8916 )	2026-01-08 18:17:18 +00:00
jif-oai	5522663f92	feat: add a few metrics (#8910 )	2026-01-08 15:39:57 +00:00
Ahmed Ibrahim	c31960b13a	remove unnecessary todos (#8842 ) > // todo(aibrahim): why are we passing model here while it can change? we update it on each turn with `.with_model` > //TODO(aibrahim): run CI in release mode. although it's good to have, release builds take double the time tests take. > // todo(aibrahim): make this async function we figured out another way of doing this sync	2026-01-07 10:43:10 -08:00
Ahmed Ibrahim	9179c9deac	Merge Modelfamily into modelinfo (#8763 ) - Merge ModelFamily into ModelInfo - Remove logic for adding instructions to apply patch - Add compaction limit and visible context window to `ModelInfo`	2026-01-07 10:35:09 -08:00
jif-oai	116059c3a0	chore: unify conversation with thread name (#8830 ) Done and verified by Codex + refactor feature of RustRover	2026-01-07 17:04:53 +00:00
Owen Lin	8b7ec31ba7	feat(app-server): thread/rollback API (#8454 ) Add `thread/rollback` to app-server to support IDEs undo-ing the last N turns of a thread. For context, an IDE partner will be supporting an "undo" capability where the IDE (the app-server client) will be responsible for reverting the local changes made during the last turn. To support this well, we also need a way to drop the last turn (or more generally, the last N turns) from the agent's context. This is what `thread/rollback` does. Core idea: A Thread rollback is represented as a persisted event message (EventMsg::ThreadRollback) in the rollout JSONL file, not by rewriting history. On resume, both the model's context (core replay) and the UI turn list (app-server v2's thread history builder) apply these markers so the pruned history is consistent across live conversations and `thread/resume`. Implementation notes: - Rollback only affects agent context and appends to the rollout file; clients are responsible for reverting files on disk. - If a thread rollback is currently in progress, subsequent `thread/rollback` calls are rejected. - Because we use `CodexConversation::submit` and codex core tracks active turns, returning an error on concurrent rollbacks is communicated via an `EventMsg::Error` with a new variant `CodexErrorInfo::ThreadRollbackFailed`. app-server watches for that and sends the BAD_REQUEST RPC response. Tests cover thread rollbacks in both core and app-server, including when `num_turns` > existing turns (which clears all turns). Note: this explicitly does not behave like `/undo` which we just removed from the CLI, which does the opposite of what `thread/rollback` does. `/undo` reverts local changes via ghost commits/snapshots and does not modify the agent's context / conversation history.	2026-01-06 21:23:48 +00:00
Eric Traut	1e3cad95c0	Do not panic when session contains a tool call without an output (#8048 ) Normally, all tool calls within a saved session should have a response, but there are legitimate reasons for the response to be missing. This can occur if the user canceled the call or there was an error of some sort during the rollout. We shouldn't panic in this case. This is a partial fix for #7990	2025-12-14 22:16:49 -08:00
pakrym-oai	b3ddd50eee	Remote compact for API-key users (#7835 )	2025-12-12 10:05:02 -08:00
jif-oai	e91bb6b947	fix: ignore ghost snapshots in token consumption (#7638 )	2025-12-05 13:57:24 +00:00
jif-oai	51307eaf07	feat: retroactive image placeholder to prevent poisoning (#6774 ) If an image can't be read by the API, it will poison the entire history, preventing any new turn on the conversation. This detect such cases and replace the image by a placeholder	2025-12-03 11:35:56 +00:00
jif-oai	a421eba31f	fix: disable review rollout filtering (#7371 )	2025-12-01 09:04:13 +00:00
jif-oai	aaec8abf58	feat: detached review (#7292 )	2025-11-28 11:34:57 +00:00
jif-oai	99bcb90353	chore: use proxy for encrypted summary (#7252 )	2025-11-24 16:51:47 +00:00
Ahmed Ibrahim	b519267d05	Account for encrypted reasoning for auto compaction (#7113 ) - The total token used returned from the api doesn't account for the reasoning items before the assistant message - Account for those for auto compaction - Add the encrypted reasoning effort in the common tests utils - Add a test to make sure it works as expected	2025-11-22 03:06:45 +00:00
pakrym-oai	52d0ec4cd8	Delete tiktoken-rs (#7018 )	2025-11-20 11:15:04 -08:00
Ahmed Ibrahim	efebc62fb7	Move shell to use `truncate_text` (#6842 ) Move shell to use the configurable `truncate_text` --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-19 01:56:08 -08:00
Ahmed Ibrahim	3de8790714	Add the utility to truncate by tokens (#6746 ) - This PR is to make it on path for truncating by tokens. This path will be initially used by unified exec and context manager (responsible for MCP calls mainly). - We are exposing new config `calls_output_max_tokens` - Use `tokens` as the main budget unit but truncate based on the model family by Introducing `TruncationPolicy`. - Introduce `truncate_text` as a router for truncation based on the mode. In next PRs: - remove truncate_with_line_bytes_budget - Add the ability to the model to override the token budget.	2025-11-18 11:36:23 -08:00
jif-oai	838531d3e4	feat: remote compaction (#6795 ) Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-18 16:51:16 +00:00
Ahmed Ibrahim	94dfb211af	Refactor truncation helpers into its own file (#6683 ) That's to centralize the truncation in one place. Next step would be to make only two methods public: one with bytes/lines and one with tokens.	2025-11-15 06:44:23 +00:00
Ahmed Ibrahim	9890ceb939	Avoid double truncation (#6631 ) 1. Avoid double truncation by giving 10% above the tool default constant 2. Add tests that fails when const = 1	2025-11-13 16:59:31 -08:00
jif-oai	2a417c47ac	feat: proxy context left after compaction (#6597 )	2025-11-13 16:54:03 +01:00

1 2

51 Commits