Add the utility to truncate by tokens (#6746)

- This PR puts us on the path toward truncating by tokens. This path will
initially be used by unified exec and the context manager (responsible
mainly for MCP calls).
- Expose a new config option, `calls_output_max_tokens`.
- Use `tokens` as the main budget unit, but truncate according to the model
family by introducing `TruncationPolicy`.
- Introduce `truncate_text` as a router that dispatches truncation based on
the mode.

In next PRs:
- Remove `truncate_with_line_bytes_budget`.
- Add the ability for the model to override the token budget.
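The router described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual implementation: the variant names of `TruncationPolicy`, the signature of `truncate_text`, and the whitespace-based token counter are all assumptions (a real implementation would budget with the model's tokenizer).

```rust
/// How a model family prefers its output trimmed (assumed variants,
/// not necessarily those in the PR).
#[derive(Clone, Copy)]
enum TruncationPolicy {
    /// Keep the first `max_tokens` worth of text.
    Head,
    /// Keep half the budget from the start and half from the end.
    HeadAndTail,
}

/// Crude stand-in tokenizer: one token per whitespace-separated word.
/// A real implementation would use the model's tokenizer.
fn token_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Route truncation by policy, with the budget expressed in tokens.
fn truncate_text(text: &str, max_tokens: usize, policy: TruncationPolicy) -> String {
    if token_count(text) <= max_tokens {
        return text.to_string();
    }
    let words: Vec<&str> = text.split_whitespace().collect();
    match policy {
        TruncationPolicy::Head => words[..max_tokens].join(" "),
        TruncationPolicy::HeadAndTail => {
            let head = max_tokens / 2;
            let tail = max_tokens - head;
            let mut out = words[..head].join(" ");
            out.push_str(" […] ");
            out.push_str(&words[words.len() - tail..].join(" "));
            out
        }
    }
}

fn main() {
    let text = "a b c d e f g h";
    assert_eq!(truncate_text(text, 4, TruncationPolicy::Head), "a b c d");
    assert_eq!(
        truncate_text(text, 4, TruncationPolicy::HeadAndTail),
        "a b […] g h"
    );
    println!("ok");
}
```

The point of routing is that a single call site (e.g. the context manager) can hold one token budget while each model family picks the trimming shape it tolerates best.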
Ahmed Ibrahim
2025-11-18 11:36:23 -08:00
committed by GitHub
parent b035c604b0
commit 3de8790714
21 changed files with 770 additions and 549 deletions


@@ -33,6 +33,7 @@ model_provider = "openai"
# model_context_window = 128000 # tokens; default: auto for model
# model_max_output_tokens = 8192 # tokens; default: auto for model
# model_auto_compact_token_limit = 0 # disable/override auto; default: model family specific
# tool_output_token_limit = 10000 # tokens stored per tool output; default: 10000 for gpt-5.1-codex
################################################################################
# Reasoning & Verbosity (Responses API capable models)