Add the utility to truncate by tokens (#6746)

- This PR puts us on the path toward truncating by tokens. This path will
initially be used by unified exec and the context manager (responsible
mainly for MCP calls).
- Expose a new config option, `calls_output_max_tokens`.
- Use `tokens` as the main budget unit, but truncate according to the model
family by introducing `TruncationPolicy`.
- Introduce `truncate_text` as a router that dispatches truncation based on
the mode.

In next PRs:
- Remove `truncate_with_line_bytes_budget`.
- Add the ability for the model to override the token budget.
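The router described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual implementation: the variant names of `TruncationPolicy`, the signature of `truncate_text`, and the whitespace-based token counter are all assumptions (a real implementation would budget with the model's tokenizer).

```rust
/// How a model family prefers its output trimmed (assumed variants,
/// not necessarily those in the PR).
#[derive(Clone, Copy)]
enum TruncationPolicy {
    /// Keep the first `max_tokens` worth of text.
    Head,
    /// Keep half the budget from the start and half from the end.
    HeadAndTail,
}

/// Crude stand-in tokenizer: one token per whitespace-separated word.
/// A real implementation would use the model's tokenizer.
fn token_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Route truncation by policy, with the budget expressed in tokens.
fn truncate_text(text: &str, max_tokens: usize, policy: TruncationPolicy) -> String {
    if token_count(text) <= max_tokens {
        return text.to_string();
    }
    let words: Vec<&str> = text.split_whitespace().collect();
    match policy {
        TruncationPolicy::Head => words[..max_tokens].join(" "),
        TruncationPolicy::HeadAndTail => {
            let head = max_tokens / 2;
            let tail = max_tokens - head;
            let mut out = words[..head].join(" ");
            out.push_str(" […] ");
            out.push_str(&words[words.len() - tail..].join(" "));
            out
        }
    }
}

fn main() {
    let text = "a b c d e f g h";
    assert_eq!(truncate_text(text, 4, TruncationPolicy::Head), "a b c d");
    assert_eq!(
        truncate_text(text, 4, TruncationPolicy::HeadAndTail),
        "a b […] g h"
    );
    println!("ok");
}
```

The point of routing is that a single call site (e.g. the context manager) can hold one token budget while each model family picks the trimming shape it tolerates best.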
Ahmed Ibrahim
2025-11-18 11:36:23 -08:00
committed by GitHub
parent b035c604b0
commit 3de8790714
21 changed files with 770 additions and 549 deletions


@@ -33,6 +33,7 @@ model_provider = "openai"
# model_context_window = 128000 # tokens; default: auto for model
# model_max_output_tokens = 8192 # tokens; default: auto for model
# model_auto_compact_token_limit = 0 # disable/override auto; default: model family specific
# tool_output_token_limit = 10000 # tokens stored per tool output; default: 10000 for gpt-5.1-codex
################################################################################
# Reasoning & Verbosity (Responses API capable models)