add(core): safety check downgrade warning (#11964)

Add per-turn notice when a request is downgraded to a fallback model due
to cyber safety checks.

**Changes**

- codex-api: Emit a ServerModel event based on the openai-model response
header and/or response payload (SSE + WebSocket), including when the
model changes mid-stream.
- core: When the server-reported model differs from the requested model,
emit a single per-turn warning explaining the reroute to gpt-5.2 and
directing users to Trusted
    Access verification and the cyber safety explainer.
- app-server (v2): Surface these cyber model-routing warnings as
synthetic userMessage items with text prefixed by Warning: (and document
this behavior).
This commit is contained in:
Fouad Matin
2026-02-16 22:13:36 -08:00
committed by GitHub
parent 08f689843f
commit 02e9006547
12 changed files with 843 additions and 4 deletions

View File

@@ -557,7 +557,7 @@ Today both notifications carry an empty `items` array even when item events were
`ThreadItem` is the tagged union carried in turn responses and `item/*` notifications. Currently we support events for the following items:
- `userMessage``{id, content}` where `content` is a list of user inputs (`text`, `image`, or `localImage`).
- `userMessage``{id, content}` where `content` is a list of user inputs (`text`, `image`, or `localImage`). Cyber model-routing warnings are surfaced as synthetic `userMessage` items with `text` prefixed by `Warning:`.
- `agentMessage``{id, text}` containing the accumulated agent reply.
- `plan``{id, text}` emitted for plan-mode turns; plan text can stream via `item/plan/delta` (experimental).
- `reasoning``{id, summary, content}` where `summary` holds streamed reasoning summaries (applicable for most OpenAI models) and `content` holds raw reasoning blocks (applicable for e.g. open source models).