Commit Graph

20 Commits

Author SHA1 Message Date
Michael Bolin
7a3eec6fdb core: cut codex-core compile time 48% with native async SessionTask (#16631)
## Why

This continues the compile-time cleanup from #16630. `SessionTask`
implementations are monomorphized, but `Session` stores the task behind
a `dyn` boundary so it can drive and abort heterogenous turn tasks
uniformly. That means we can move the `#[async_trait]` expansion off the
implementation trait, keep a small boxed adapter only at the storage
boundary, and preserve the existing task lifecycle semantics while
reducing the amount of generated async-trait glue in `codex-core`.

One measurement caveat showed up while exploring this: a warm
incremental benchmark based on `touch core/src/tasks/mod.rs && cargo
check -p codex-core --lib` was basically flat, but that was the wrong
benchmark for this change. Using package-clean `codex-core` rebuilds,
like #16630, shows the real win.

Relevant pre-change code:

- [`SessionTask` with
`#[async_trait]`](3c7f013f97/codex-rs/core/src/tasks/mod.rs (L129-L182))
- [`RunningTask` storing `Arc<dyn
SessionTask>`](3c7f013f97/codex-rs/core/src/state/turn.rs (L69-L77))

## What changed

- Switched `SessionTask::{run, abort}` to native RPITIT futures with
explicit `Send` bounds.
- Added a private `AnySessionTask` adapter that boxes those futures only
at the `Arc<dyn ...>` storage boundary.
- Updated `RunningTask` to store `Arc<dyn AnySessionTask>` and removed
`#[async_trait]` from the concrete task impls plus test-only
`SessionTask` impls.

## Timing

Benchmarked package-clean `codex-core` rebuilds with dependencies left
warm:

```shell
cargo check -p codex-core --lib >/dev/null
cargo clean -p codex-core >/dev/null
/usr/bin/time -p cargo +nightly rustc -p codex-core --lib -- \
  -Z time-passes \
  -Z time-passes-format=json >/dev/null
```

| revision | rustc `total` | process `real` | `generate_crate_metadata`
| `MIR_borrow_checking` | `monomorphization_collector_graph_walk` |
| --- | ---: | ---: | ---: | ---: | ---: |
| parent `3c7f013f9735` | 67.21s | 67.71s | 24.61s | 23.43s | 22.43s |
| this PR `2cafd783ac22` | 35.08s | 35.60s | 8.01s | 7.25s | 7.15s |
| delta | -47.8% | -47.4% | -67.5% | -69.1% | -68.1% |

For completeness, the warm touched-file benchmark stayed flat (`1.96s`
parent vs `1.97s` this PR), which is why that benchmark should not be
used to evaluate this refactor.

## Verification

- Ran `cargo test -p codex-core`; this change compiled and task-related
tests passed before hitting the same unrelated 5
`config::tests::*guardian*` failures already present on the parent
stack.
2026-04-02 23:39:56 +00:00
Michael Bolin
b77fe8fefe Apply argument comment lint across codex-rs (#14652)
## Why

Once the repo-local lint exists, `codex-rs` needs to follow the
checked-in convention and CI needs to keep it from drifting. This commit
applies the fallback `/*param*/` style consistently across existing
positional literal call sites without changing those APIs.

The longer-term preference is still to avoid APIs that require comments
by choosing clearer parameter types and call shapes. This PR is
intentionally the mechanical follow-through for the places where the
existing signatures stay in place.

After rebasing onto newer `main`, the rollout also had to cover newly
introduced `tui_app_server` call sites. That made it clear the first cut
of the CI job was too expensive for the common path: it was spending
almost as much time installing `cargo-dylint` and re-testing the lint
crate as a representative test job spends running product tests. The CI
update keeps the full workspace enforcement but trims that extra
overhead from ordinary `codex-rs` PRs.

## What changed

- keep a dedicated `argument_comment_lint` job in `rust-ci`
- mechanically annotate remaining opaque positional literals across
`codex-rs` with exact `/*param*/` comments, including the rebased
`tui_app_server` call sites that now fall under the lint
- keep the checked-in style aligned with the lint policy by using
`/*param*/` and leaving string and char literals uncommented
- cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
registry/git metadata in the lint job
- split changed-path detection so the lint crate's own `cargo test` step
runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
- continue to run the repo wrapper over the `codex-rs` workspace, so
product-code enforcement is unchanged

Most of the code changes in this commit are intentionally mechanical
comment rewrites or insertions driven by the lint itself.

## Verification

- `./tools/argument-comment-lint/run.sh --workspace`
- `cargo test -p codex-tui-app-server -p codex-tui`
- parsed `.github/workflows/rust-ci.yml` locally with PyYAML

---

* -> #14652
* #14651
2026-03-16 16:48:15 -07:00
Owen Lin
289ed549cf chore(otel): rename OtelManager to SessionTelemetry (#13808)
## Summary
This is a purely mechanical refactor of `OtelManager` ->
`SessionTelemetry` to better convey what the struct is doing. No
behavior change.

## Why

`OtelManager` ended up sounding much broader than what this type
actually does. It doesn't manage OTEL globally; it's the session-scoped
telemetry surface for emitting log/trace events and recording metrics
with consistent session metadata (`app_version`, `model`, `slug`,
`originator`, etc.).

`SessionTelemetry` is a more accurate name, and updating the call sites
makes that boundary a lot easier to follow.

## Validation

- `just fmt`
- `cargo test -p codex-otel`
- `cargo test -p codex-core`
2026-03-06 16:23:30 -08:00
Owen Lin
27724f6ead feat(core, tracing): add a span representing a turn (#13424)
This is PR 3 of the app-server tracing rollout.

PRs https://github.com/openai/codex/pull/13285 and
https://github.com/openai/codex/pull/13368 gave us inbound request spans
in app-server and propagated trace context through Submission. This
change finishes the next piece in core: when a request actually starts a
turn, we now create a core-owned long-lived span that stays open for the
real lifetime of the turn.

What changed:
- `Session::spawn_task` can now optionally create a long-lived turn span
and run the spawned task inside it
- `turn/start` uses that path, so normal turn execution stays under a
single core-owned span after the async handoff
- `review/start` uses the same pattern
- added a unit test that verifies the spawned turn task inherits the
submission dispatch trace ancestry

**Why**
The app-server request span is intentionally short-lived. Once work
crosses into core, we still want one span that covers the actual
execution window until completion or interruption. This keeps that
ownership where it belongs: in the layer that owns the runtime
lifecycle.
2026-03-04 11:09:17 -08:00
Charley Cunningham
bb0ac5be70 Fix compaction context reinjection and model baselines (#12252)
## Summary
- move regular-turn context diff/full-context persistence into
`run_turn` so pre-turn compaction runs before incoming context updates
are recorded
- after successful pre-turn compaction, rely on a cleared
`reference_context_item` to trigger full context reinjection on the
follow-up regular turn (manual `/compact` keeps replacement history
summary-only and also clears the baseline)
- preserve `<model_switch>` when full context is reinjected, and inject
it *before* the rest of the full-context items
- scope `reference_context_item` and `previous_model` to regular user
turns only so standalone tasks (`/compact`, shell, review, undo) cannot
suppress future reinjection or `<model_switch>` behavior
- make context-diff persistence + `reference_context_item` updates
explicit in the regular-turn path, with clearer docs/comments around the
invariant
- stop persisting local `/compact` `RolloutItem::TurnContext` snapshots
(only regular turns persist `TurnContextItem` now)
- simplify resume/fork previous-model/reference-baseline hydration by
looking up the last surviving turn context from rollout lifecycle
events, including rollback and compaction-crossing handling
- remove the legacy fallback that guessed from bare `TurnContext`
rollouts without lifecycle events
- update compaction/remote-compaction/model-visible snapshots and
compact test assertions (including remote compaction mock response
shape)

## Why
We were persisting incoming context items before spawning the regular
turn task, which let pre-turn compaction requests accidentally include
incoming context diffs without the new user message. Fixing that exposed
follow-on baseline issues around `/compact`, resume/fork, and standalone
tasks that could cause duplicate context injection or suppress
`<model_switch>` instructions.

This PR re-centers the invariants around regular turns:
- regular turns persist model-visible context diffs/full reinjection and
update the `reference_context_item`
- standalone tasks do not advance those regular-turn baselines
- compaction clears the baseline when replacement history may have
stripped the referenced context diffs

## Follow-ups (TODOs left in code)
- `TODO(ccunningham)`: fix rollback/backtracking baseline handling more
comprehensively
- `TODO(ccunningham)`: include pending incoming context items in
pre-turn compaction threshold estimation
- `TODO(ccunningham)`: inject updated personality spec alongside
`<model_switch>` so some model-switch paths can avoid forced full
reinjection
- `TODO(ccunningham)`: review task turn lifecycle
(`TurnStarted`/`TurnComplete`) behavior and emit task-start context
diffs for task types that should have them (excluding `/compact`)

## Validation
- `just fmt`
- CI should cover the updated compaction/resume/model-visible snapshot
expectations and rollout-hydration behavior
- I did **not** rerun the full local test suite after the latest
resume-lookup / rollout-persistence simplifications
2026-02-20 23:13:08 -08:00
Ahmed Ibrahim
ba8b5d9018 Treat compaction failure as failure state (#10927)
- Return compaction errors from local and remote compaction flows.\n-
Stop turns/tasks when auto-compaction fails instead of continuing
execution.
2026-02-06 13:51:46 -08:00
Eric Traut
dd80e332c4 Removed the "remote_compaction" feature flag (#10840)
This feature is always on now
2026-02-05 23:54:57 -08:00
pakrym-oai
7f20357611 Stop client from being state carrier (#10595)
I'd like to make client session wide. This requires shedding all random
state it has to carry.
2026-02-04 09:05:37 -08:00
jif-oai
5522663f92 feat: add a few metrics (#8910) 2026-01-08 15:39:57 +00:00
jif-oai
634650dd25 feat: metrics capabilities (#8318)
Add metrics capabilities to Codex. The `README.md` is up to date.

This will not be merged with the metrics before this PR of course:
https://github.com/openai/codex/pull/8350
2026-01-08 11:47:36 +00:00
pakrym-oai
b3ddd50eee Remote compact for API-key users (#7835) 2025-12-12 10:05:02 -08:00
Michael Bolin
1cfc967eb8 fix: Features should be immutable over the lifetime of a session/thread (#7540)
I noticed that `features: Features` was defined on `struct
SessionConfiguration`, which is commonly owned by `SessionState`, which
is in turn owned by `Session`.

Though I do not believe that `Features` should be allowed to be modified
over the course of a session (if the feature state is not invariant, it
makes it harder to reason about), which argues that it should live on
`Session` rather than `SessionState` or `SessionConfiguration`.

This PR moves `Features` to `Session` and updates all call sites. It
appears the only place we were mutating `Features` was:

- in tests
- the sub-agent config for a review task:


3ef76ff29d/codex-rs/core/src/tasks/review.rs (L86-L89)

Note this change also means it is no longer an `async` call to check the
state of a feature, eliminating the possibility of a
[TOCTTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
error between checking the state of a feature and acting on it:


3ef76ff29d/codex-rs/core/src/codex.rs (L1069-L1076)
2025-12-03 16:12:31 -08:00
pakrym-oai
75f38f16dd Run remote auto compaction (#6879) 2025-11-19 00:43:58 -08:00
jif-oai
c56d0c159b fix: local compaction (#6844) 2025-11-18 22:18:10 +00:00
jif-oai
838531d3e4 feat: remote compaction (#6795)
Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-11-18 16:51:16 +00:00
jif-oai
50a77dc138 Move compact (#6454) 2025-11-10 11:59:48 +00:00
pakrym-oai
789e65b9d2 Pass TurnContext around instead of sub_id (#5421)
Today `sub_id` is an ID of a single incoming Codex Op submition. We then
associate all events triggered by this operation using the same
`sub_id`.

At the same time we are also creating a TurnContext per submission and
we'd like to start associating some events (item added/item completed)
with an entire turn instead of just the operation that started it.

Using turn context when sending events give us flexibility to change
notification scheme.
2025-10-21 08:04:16 -07:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
pakrym-oai
c03e31ecf5 Support graceful agent interruption (#5287) 2025-10-17 18:52:57 +00:00
jif-oai
1fc3413a46 ref: state - 2 (#4229)
Extracting tasks in a module and start abstraction behind a Trait (more
to come on this but each task will be tackled in a dedicated PR)
The goal was to drop the ActiveTask and to have a (potentially) set of
tasks during each turn
2025-09-26 13:49:08 +00:00