### Description
- Remove the now-unused `instructions` field from the session metadata
to simplify SessionMeta and stop propagating transient instruction text
through the rollout recorder API. This was only saving
user_instructions, and was never being read.
- Stop passing user instructions into the rollout writer at session
creation so the rollout header only contains canonical session metadata.
### Testing
- Ran `just fmt` which completed successfully.
- Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix
-p codex-app-server`, `just fix -p codex-tui`, and `just fix -p
codex-tui2` which completed (Clippy fixes applied) as part of
verification.
- Ran `cargo test -p codex-protocol` which passed (28 tests).
- Ran `cargo test -p codex-core` which showed failures in a small set of
tests (not caused by the protocol type change directly):
`default_client::tests::test_create_client_sets_default_headers`,
several `models_manager::manager::tests::refresh_available_models_*`,
and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these
tests failed in this CI run).
- Ran `cargo test -p codex-app-server` which reported several failing
integration tests (including
`suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`,
`suite::output_schema::send_user_turn_*`, and
`suite::user_agent::get_user_agent_returns_current_codex_user_agent`).
- `cargo test -p codex-tui` and `cargo test -p codex-tui2` were
attempted but aborted due to disk space exhaustion (`No space left on
device`).
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)
Add support for returning threads by either `created_at` OR `updated_at`
descending. Previously core always returned threads ordered by
`created_at`.
This PR:
- updates core to be able to list threads by `updated_at` OR
`created_at` descending based on what the caller wants
- also update `thread/list` in app-server to expose this (default to
`created_at` if not specified)
All existing codepaths (app-server, TUI) still default to `created_at`,
so no behavior change is expected with this PR.
**Implementation**
To sort by `updated_at` is a bit nontrivial (whereas `created_at` is
easy due to the way we structure the folders and filenames on disk,
which are all based on `created_at`).
The most naive way to do this without introducing a cache file or sqlite
DB (which we have to implement/maintain) is to scan files in reverse
`created_at` order on disk, and look at the file's mtime (last modified
timestamp according to the filesystem) until we reach `MAX_SCAN_FILES`
(currently set to 10,000). Then, we can return the most recent N
threads.
Based on some quick and dirty benchmarking on my machine with ~1000
rollout files, calling `thread/list` with limit 50, the `updated_at`
path is slower as expected due to all the I/O:
- updated-at: average 103.10 ms
- created-at: average 41.10 ms
Those absolute numbers aren't a big deal IMO, but we can certainly
optimize this in a followup if needed by introducing more state stored
on disk.
**Caveat**
There's also a limitation in that any files older than `MAX_SCAN_FILES`
will be excluded, which means if a user continues a REALLY old thread,
it's possible to not be included. In practice that should not be too big
of an issue.
If a user makes...
- 1000 rollouts/day → threads older than 10 days won't show up
- 100 rollouts/day → ~100 days
If this becomes a problem for some reason, even more motivation to
implement an updated_at cache.
https://github.com/openai/codex/pull/8235 introduced `ConfigBuilder` and
this PR updates all call non-test call sites to use it instead of
`Config::load_from_base_config_with_overrides()`.
This is important because `load_from_base_config_with_overrides()` uses
an empty `ConfigRequirements`, which is a reasonable default for testing
so the tests are not influenced by the settings on the host. This method
is now guarded by `#[cfg(test)]` so it cannot be used by business logic.
Because `ConfigBuilder::build()` is `async`, many of the test methods
had to be migrated to be `async`, as well. On the bright side, this made
it possible to eliminate a bunch of `block_on_future()` stuff.
By default, show only sessions that shared a cwd with the current cwd.
`--all` shows all sessions in all cwds. Also, show the branch name from
the rollout metadata.
<img width="1091" height="638" alt="Screenshot 2025-11-04 at 3 30 47 PM"
src="https://github.com/user-attachments/assets/aae90308-6115-455f-aff7-22da5f1d9681"
/>
Because conversations that use the Responses API can have encrypted
reasoning messages, trying to resume a conversation with a different
provider could lead to confusing "failed to decrypt" errors. (This is
reproducible by starting a conversation using ChatGPT login and resuming
it as a conversation that uses OpenAI models via Azure.)
This changes `ListConversationsParams` to take a `model_providers:
Option<Vec<String>>` and adds `model_provider` on each
`ConversationSummary` it returns so these cases can be disambiguated.
Note this ended up making changes to
`codex-rs/core/src/rollout/tests.rs` because it had a number of cases
where it expected `Some` for the value of `next_cursor`, but the list of
rollouts was complete, so according to this docstring:
bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)
If there are no more items to return, then `next_cursor` should be
`None`. This PR updates that logic.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5658).
* #5803
* #5793
* __->__ #5658
1. Adds AgentMessage, Reasoning, WebSearch items.
2. Switches the ResponseItem parsing to use new items and then also emit
3. Removes user-item kind and filters out "special" (environment) user
items when returning to clients.
This PR does multiple things that are necessary for conversation resume
to work from the extension. I wanted to make sure everything worked so
these changes wound up in one PR:
1. Generate more ts types
2. Resume rollout history files rather than create a new one every time
it is resumed so you don't see a duplicate conversation in history for
every resume. Chatted with @aibrahim-oai to verify this
3. Return conversation_id in conversation summaries
4. [Cleanup] Use serde and strong types for a lot of the rollout file
parsing
Adds a TUI resume flow with an interactive picker and quick resume.
- CLI:
- --resume / -r: open picker to resume a prior session
- --continue / -l: resume the most recent session (no picker)
- Behavior on resume: initial history is replayed, welcome banner
hidden, and the first redraw is suppressed to avoid flicker.
- Implementation:
- New tui/src/resume_picker.rs (paginated listing via
RolloutRecorder::list_conversations)
- App::run accepts ResumeSelection; resumes from disk when requested
- ChatWidget refactor with ChatWidgetInit and new_from_existing; replays
initial messages
- Tests: cover picker sorting/preview extraction and resumed-history
rendering.
- Docs: getting-started updated with flags and picker usage.
https://github.com/user-attachments/assets/1bb6469b-e5d1-42f6-bec6-b1ae6debda3b