Commit Graph

1721 Commits

Author SHA1 Message Date
Michael Bolin
cccf9b5eb4 fix: make project_doc skill-render tests deterministic (#11545)
## Why
`project_doc::tests::skills_are_appended_to_project_doc` and
`project_doc::tests::skills_render_without_project_doc` were assuming a
single synthetic skill in test setup, but they called
`load_skills(&cfg)`, which loads from repo/user/system roots.

That made the assertions environment-dependent. After
[#11531](https://github.com/openai/codex/pull/11531) added
`.codex/skills/test-tui/SKILL.md`, the repo-scoped `test-tui` skill
began appearing in these test outputs and exposed the flake.

## What Changed
- Added a test-only helper in `codex-rs/core/src/project_doc.rs` that
loads skills from an explicit root via `load_skills_from_roots`.
- Scoped that root to `codex_home/skills` with `SkillScope::User`.
- Updated both affected tests to use this helper instead of
`load_skills(&cfg)`:
  - `skills_are_appended_to_project_doc`
  - `skills_render_without_project_doc`

This keeps the tests focused on the fixture skills they create,
independent of ambient repo/home skills.

## Verification
- `cargo test -p codex-core
project_doc::tests::skills_render_without_project_doc -- --exact`
- `cargo test -p codex-core
project_doc::tests::skills_are_appended_to_project_doc -- --exact`
2026-02-12 05:38:33 +00:00
sayan-oai
d1a97ed852 fix compilation (#11532)
fix broken main
2026-02-11 19:31:13 -08:00
Michael Bolin
abbd74e2be feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.

It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.

## What

- Added `ReadOnlyAccess` in protocol with:
  - `Restricted { include_platform_defaults, readable_roots }`
  - `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
  - `ReadOnly { access: ReadOnlyAccess }`
  - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.

## Compatibility / rollout

- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
Ahmed Ibrahim
95fb86810f Update context window after model switch (#11520)
- Update token usage aggregation to refresh model context window after a
model change.
- Add protocol/core tests, including an e2e model-switch test that
validates switching to a smaller model updates telemetry.
2026-02-11 17:41:23 -08:00
Ahmed Ibrahim
40de788c4d Clamp auto-compact limit to context window (#11516)
- Clamp auto-compaction to the minimum of configured limit and 90% of
context window
- Add an e2e compact test for clamped behavior
- Update remote compact tests to account for earlier auto-compaction in
setup turns
2026-02-11 17:41:08 -08:00
Ahmed Ibrahim
6938150c5e Pre-sampling compact with previous model context (#11504)
- Run pre-sampling compact through a single helper that builds
previous-model turn context and compacts before the follow-up request
when switching to a smaller context window.
- Keep compaction events on the parent turn id and add compact suite
coverage for switch-in-session and resume+switch flows.
2026-02-11 17:24:06 -08:00
willwang-openai
3f1b41689a change model cap to server overload (#11388)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-11 17:16:27 -08:00
Anton Panasenko
d3b078c282 Consolidate search_tool feature into apps (#11509)
## Summary
- Remove `Feature::SearchTool` and the `search_tool` config key from the
feature registry/schema.
- Gate `search_tool_bm25` exposure via `Feature::Apps` in
`core/src/tools/spec.rs`.
- Update MCP selection logic in `core/src/codex.rs` to use
`Feature::Apps` for search-tool behavior.
- Update `core/tests/suite/search_tool.rs` to enable `Feature::Apps`.
- Regenerate `core/config.schema.json` via `just write-config-schema`.

## Testing
- `just fmt`
- `cargo test -p codex-core --test all suite::search_tool::`

## Tickets
- None
2026-02-11 16:52:42 -08:00
Ahmed Ibrahim
bb5dfd037a Hydrate previous model across resume/fork/rollback/task start (#11497)
- Replace pending resume model state with persistent previous_model and
hydrate it on resume, fork, rollback, and task end in spawn_task
2026-02-11 16:45:18 -08:00
Anton Panasenko
23444a063b chore: inject originator/residency headers to ws client (#11506) 2026-02-11 16:43:36 -08:00
Eric Traut
fa767871cb Added seatbelt policy rule to allow os.cpus (#11277)
I don't think this policy change increases the risk, other than
potentially exposing the caller to bugs in these kernel calls, which are
unlikely.

Without this change, some tools are silently failing or making incorrect
decisions about the processor type (e.g. installing x86 binaries rather
than Apple silicon binaries).

This addresses #11210

---------

Co-authored-by: viyatb-oai <viyatb@openai.com>
2026-02-11 16:42:14 -08:00
gt-oai
7112e16809 Add AfterToolUse hook (#11335)
Not wired up to config yet. (So we can change the name if we want)

An example payload:

```
{
  "session_id": "019c48b7-7098-7b61-bc48-32e82585d451",
  "cwd": "/Users/gt/code/codex/codex-rs",
  "triggered_at": "2026-02-10T18:02:31Z",
  "hook_event": {
    "event_type": "after_tool_use",
    "turn_id": "4",
    "call_id": "call_iuo4DqWgjE7OxQywnL2UzJUE",
    "tool_name": "apply_patch",
    "tool_kind": "custom",
    "tool_input": {
      "input_type": "custom",
      "input": "*** Begin Patch\n*** Update File: README.md\n@@\n-# Codex CLI hello (Rust Implementation)\n+# Codex CLI (Rust Implementation)\n*** End Patch\n"
    },
    "executed": true,
    "success": true,
    "duration_ms": 37,
    "mutating": true,
    "sandbox": "none",
    "sandbox_policy": "danger-full-access",
    "output_preview": "{\"output\":\"Success. Updated the following files:\\nM README.md\\n\",\"metadata\":{\"exit_code\":0,\"duration_seconds\":0.0}}"
  }
}
```
2026-02-11 22:25:04 +00:00
Eric Traut
81c534102e Increased file watcher debounce duration from 1s to 10s (#11494)
Users were reporting that when they were actively editing a skill file,
they would see frequent errors (one per second) across all of their
active session until they fixed all frontmatter parse errors. This
change will reduce the chatter at the expense of a slightly longer delay
before skills are updated in the UI.

This addresses #11385
2026-02-11 14:08:03 -08:00
jif-oai
de6f2ef746 nit: memory truncation (#11479)
Use existing truncation for memories
2026-02-11 21:11:57 +00:00
Curtis 'Fjord' Hawthorne
42e22f3bde Add feature-gated freeform js_repl core runtime (#10674)
## Summary

This PR adds an **experimental, feature-gated `js_repl` core runtime**
so models can execute JavaScript in a persistent REPL context across
tool calls.

The implementation integrates with existing feature gating, tool
registration, prompt composition, config/schema docs, and tests.

## What changed

- Added new experimental feature flag: `features.js_repl`.
- Added freeform `js_repl` tool and companion `js_repl_reset` tool.
- Gated tool availability behind `Feature::JsRepl`.
- Added conditional prompt-section injection for JS REPL instructions
via marker-based prompt processing.
- Implemented JS REPL handlers, including freeform parsing and pragma
support (timeout/reset controls).
- Added runtime resolution order for Node:
  1. `CODEX_JS_REPL_NODE_PATH`
  2. `js_repl_node_path` in config
  3. `PATH`
- Added JS runtime assets/version files and updated docs/schema.

## Why

This enables richer agent workflows that require incremental JavaScript
execution with preserved state, while keeping rollout safe behind an
explicit feature flag.

## Testing

Coverage includes:

- Feature-flag gating behavior for tool exposure.
- Freeform parser/pragma handling edge cases.
- Runtime behavior (state persistence across calls and top-level `await`
support).

## Usage

```toml
[features]
js_repl = true
```

Optional runtime override:

- `CODEX_JS_REPL_NODE_PATH`, or
- `js_repl_node_path` in config.

#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/10674
-  `2` https://github.com/openai/codex/pull/10672
-  `3` https://github.com/openai/codex/pull/10671
-  `4` https://github.com/openai/codex/pull/10673
-  `5` https://github.com/openai/codex/pull/10670
2026-02-11 12:05:02 -08:00
iceweasel-oai
87279de434 Promote Windows Sandbox (#11341)
1. Move Windows Sandbox NUX to right after trust directory screen
2. Don't offer read-only as an option in Sandbox NUX.
Elevated/Legacy/Quit
3. Don't allow new untrusted directories. It's trust or quit
4. move experimental sandbox features to `[windows]
sandbox="elevated|unelevatd"`
5. Copy tweaks = elevated -> default, non-elevated -> non-admin
2026-02-11 11:48:33 -08:00
Owen Lin
24e6adbda5 fix: Constrained import (#11485)
main seems broken
2026-02-11 11:44:20 -08:00
jif-oai
53c1818d29 chore: update mem prompt (#11480) 2026-02-11 19:29:39 +00:00
jif-oai
2fac9cc8cd chore: sub-agent never ask for approval (#11464) 2026-02-11 19:19:37 +00:00
jif-oai
1170ffeeae chore: clean rollout extraction in memories (#11471) 2026-02-11 18:25:45 +00:00
jif-oai
d4b2c230f1 feat: memory read path (#11459) 2026-02-11 18:22:45 +00:00
Michael Bolin
3a9324707d feat: panic if Constrained<WebSearchMode> does not support Disabled (#11470)
If this happens, this is a logical error on our part and we should fix
it.
2026-02-11 10:18:58 -08:00
Max Johnson
7053aa5457 Reapply "Add app-server transport layer with websocket support" (#11370)
Reapply "Add app-server transport layer with websocket support" with
additional fixes from https://github.com/openai/codex/pull/11313/changes
to avoid deadlocking.

This reverts commit 47356ff83c.

## Summary

To avoid deadlocking when queues are full, we maintain separate tokio
tasks dedicated to incoming vs outgoing event handling
- split the app-server main loop into two tasks in
`run_main_with_transport`
   - inbound handling (`transport_event_rx`)
   - outbound handling (`outgoing_rx` + `thread_created_rx`)
- separate incoming and outgoing websocket tasks

## Validation

Integration tests, testing thoroughly e2e in codex app w/ >10 concurrent
requests

<img width="1365" height="979" alt="Screenshot 2026-02-10 at 2 54 22 PM"
src="https://github.com/user-attachments/assets/47ca2c13-f322-4e5c-bedd-25859cbdc45f"
/>

---------

Co-authored-by: jif-oai <jif@openai.com>
2026-02-11 18:13:39 +00:00
Michael Bolin
577a416f9a Extract codex-config from codex-core (#11389)
`codex-core` had accumulated config loading, requirements parsing,
constraint logic, and config-layer state handling in a single crate.
This change extracts that subsystem into `codex-config` to reduce
`codex-core` rebuild/test surface area and isolate future config work.

## What Changed

### Added `codex-config`

- Added new workspace crate `codex-rs/config` (`codex-config`).
- Added workspace/build wiring in:
  - `codex-rs/Cargo.toml`
  - `codex-rs/config/Cargo.toml`
  - `codex-rs/config/BUILD.bazel`
- Updated lockfiles (`codex-rs/Cargo.lock`, `MODULE.bazel.lock`).
- Added `codex-core` -> `codex-config` dependency in
`codex-rs/core/Cargo.toml`.

### Moved config internals from `core` into `config`

Moved modules to `codex-rs/config/src/`:

- `core/src/config/constraint.rs` -> `config/src/constraint.rs`
- `core/src/config_loader/cloud_requirements.rs` ->
`config/src/cloud_requirements.rs`
- `core/src/config_loader/config_requirements.rs` ->
`config/src/config_requirements.rs`
- `core/src/config_loader/fingerprint.rs` -> `config/src/fingerprint.rs`
- `core/src/config_loader/merge.rs` -> `config/src/merge.rs`
- `core/src/config_loader/overrides.rs` -> `config/src/overrides.rs`
- `core/src/config_loader/requirements_exec_policy.rs` ->
`config/src/requirements_exec_policy.rs`
- `core/src/config_loader/state.rs` -> `config/src/state.rs`

`codex-config` now re-exports this surface from `config/src/lib.rs` at
the crate top level.

### Updated `core` to consume/re-export `codex-config`

- `core/src/config_loader/mod.rs` now imports/re-exports config-loader
types/functions from top-level `codex_config::*`.
- Local moved modules were removed from `core/src/config_loader/`.
- `core/src/config/mod.rs` now re-exports constraint types from
`codex_config`.
2026-02-11 10:02:49 -08:00
viyatb-oai
7e0178597e feat(core): promote Linux bubblewrap sandbox to Experimental (#11381)
## Summary
- Promote `use_linux_sandbox_bwrap` to `Stage::Experimental` on Linux so
users see it in `/experimental` and get a startup nudge.
2026-02-11 09:49:24 -08:00
jif-oai
9efb7f4a15 clean: memory rollout recorder (#11462) 2026-02-11 15:46:10 +00:00
pakrym-oai
eac5473114 Do not attempt to append after response.completed (#11402)
Completed responses are fully done, and new response must be created.
2026-02-11 07:45:17 -08:00
sayan-oai
83a54766b7 chore: rename disable_websockets -> websockets_disabled (#11420)
`disable_websockets()` is confusing because its a getter. rename for
clarity
2026-02-11 07:44:05 -08:00
jif-oai
b58afbfd0a feat: set policy for phase 2 memory (#11449)
Set the policy of the memory phase 2 worker such that it never ask for
approval
2026-02-11 15:39:22 +00:00
jif-oai
bd3bf6eda1 fix: optional schema of memories (#11454) 2026-02-11 15:05:36 +00:00
jif-oai
156f47edd0 feat: close mem agent after consolidation (#11455)
Close the phase-2 agent of memory when it's done

Fire and forget (i.e. best effort)
2026-02-11 14:34:11 +00:00
jif-oai
f19452e475 nit: increase max raw memories (#11452) 2026-02-11 14:17:34 +00:00
jif-oai
f5d4a21098 feat: new memory prompts (#11439)
* Update prompt
* Wire CWD in the prompt
* Handle the no-output case
2026-02-11 13:57:52 +00:00
jif-oai
3d0ead8db8 feat: improve thread listing (#11429)
Improve listing by doing:
1. List using the rollout file system
2. Upsert the result in the DB (if present)
3. Return the result of a DB listing
4. Fallback on the result of 1 

+ some metrics on top of this
2026-02-11 11:22:05 +00:00
Michael Bolin
476c1a7160 Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
## Why

`codex-core` was being built in multiple feature-resolved permutations
because test-only behavior was modeled as crate features. For a large
crate, those permutations increase compile cost and reduce cache reuse.

## Net Change

- Removed the `test-support` crate feature and related feature wiring so
`codex-core` no longer needs separate feature shapes for test consumers.
- Standardized cross-crate test-only access behind
`codex_core::test_support`.
- External test code now imports helpers from
`codex_core::test_support`.
- Underlying implementation hooks are kept internal (`pub(crate)`)
instead of broadly public.

## Outcome

- Fewer `codex-core` build permutations.
- Better incremental cache reuse across test targets.
- No intended production behavior change.
2026-02-10 22:44:02 -08:00
Michael Bolin
f6dd9e37e7 tui: show non-file layer content in /debug-config (#11412)
The debug output listed non-file-backed layers such as session flags and
MDM managed config, but it did not show their values. That made it
difficult to explain unexpected effective settings because users could
not inspect those layers on disk.

Now `/debug-config` might include output like this:

```
Config layer stack (lowest precedence first):
  1. system (/etc/codex/config.toml) (enabled)
  2. user (/Users/mbolin/.codex/config.toml) (enabled)
  3. legacy managed_config.toml (mdm) (enabled)
     MDM value:
       # Production Codex configuration file.

       [otel]
       log_user_prompt = true
       environment = "prod"
       exporter = { otlp-http = {
         endpoint = "https://example.com/otel",
         protocol = "binary"
       }}
```
2026-02-11 06:23:08 +00:00
xl-openai
fdd0cd1de9 feat: support multiple rate limits (#11260)
Added multi-limit support end-to-end by carrying limit_name in
rate-limit snapshots and handling multiple buckets instead of only
codex.
Extended /usage client parsing to consume additional_rate_limits
Updated TUI /status and in-memory state to store/render per-limit
snapshots
Extended app-server rate-limit read response: kept rate_limits and added
rate_limits_by_name.
Adjusted usage-limit error messaging for non-default codex limit buckets
2026-02-10 20:09:31 -08:00
Celia Chen
641d5268fa chore: persist turn_id in rollout session and make turn_id uuid based (#11246)
Problem:
1. turn id is constructed in-memory;
2. on resuming threads, turn_id might not be unique;
3. client cannot no the boundary of a turn from rollout files easily.

This PR does three things:
1. persist `task_started` and `task_complete` events;
1. persist `turn_id` in rollout turn events;
5. generate turn_id as unique uuids instead of incrementing it in
memory.

This helps us resolve the issue of clients wanting to have unique turn
ids for resuming a thread, and knowing the boundry of each turn in
rollout files.

example debug logs
```
2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
```

backward compatibility:
if you try to resume an old session without task_started and
task_complete event populated, the following happens:
- If you resume and do nothing: those reconstructed historical IDs can
differ next time you resume.
- If you resume and send a new turn: the new turn gets a fresh UUID from
live submission flow and is persisted, so that new turn’s ID is stable
on later resumes.
I think this behavior is fine, because we only care about deterministic
turn id once a turn is triggered.
2026-02-11 03:56:01 +00:00
pakrym-oai
4473147985 Do not resend output items in incremental websockets connections (#11383)
In the incremental websocket output items are already part of the
context, no need to send them again and duplicate.
2026-02-10 19:38:08 -08:00
Dylan Hurd
cc8c293378 fix(exec-policy) No empty command lists (#11397)
## Summary
This should rarely, if ever, happen in practice. But regardless, we
should never provide an empty list of `commands` to ExecPolicy. This PR
is almost entirely adding test around these cases.

## Testing
- [x] Adds a bunch of unit tests for this
2026-02-10 19:22:23 -08:00
Michael Bolin
b68a84ee8e Remove deterministic_process_ids feature to avoid duplicate codex-core builds (#11393)
## Why

`codex-core` enabled `deterministic_process_ids` through a self
dev-dependency.
That forced a second feature-resolved build of the same crate, which
increased
compile time and test latency.

## What Changed

- Removed the `deterministic_process_ids` feature from
`codex-rs/core/Cargo.toml`.
- Removed the self dev-dependency on `codex-core` that enabled that
feature.
- Removed the Bazel `deterministic_process_ids` crate feature for
`codex-core`.
- Added a test-only `AtomicBool` override in unified exec process-id
allocation.
- Added a test-support setter for that override and re-exported it from
`codex-core`.
- Enabled deterministic process IDs in integration tests via
`core_test_support` ctor.

## Behavior

- Production behavior remains random process IDs.
- Unit tests remain deterministic via `cfg(test)`.
- Integration tests remain deterministic via explicit test-support
initialization.

## Validation

- `just fmt`
- `cargo test -p codex-core unified_exec::`
- `cargo test -p codex-core --test all unified_exec -- --test-threads=1`
- `cargo tree -p codex-core -e features` (verified the removed feature
path)
2026-02-10 19:07:01 -08:00
pakrym-oai
c68999ee6d Prefer websocket transport when model opts in (#11386)
Summary
- add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
in all fixtures and constructors
- wire the new flag into websocket selection so models that opt in
always use websocket transport even when the feature gate is off

Testing
- Not run (not requested)
2026-02-10 18:50:48 -08:00
pakrym-oai
bfd4e2112c Disable very flaky tests (#11394)
Collected from last 20 builds of main in
https://github.com/openai/codex/commits/main/.
2026-02-10 18:50:11 -08:00
github-actions[bot]
f101300dba Update models.json (#11376)
Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: sayan-oai <sayan@openai.com>
2026-02-10 17:25:35 -08:00
Michael Bolin
d44f4205fb chore: rename codex-command to codex-shell-command (#11378)
This addresses some post-merge feedback on
https://github.com/openai/codex/pull/11361:

- crate rename
- reuse `detect_shell_type()` utility
2026-02-10 17:03:46 -08:00
jif-oai
87bbfc50a1 feat: prevent double backfill (#11377)
## Summary

Add a DB-backed lease to prevent duplicate `.sqlite` backfill workers
from running concurrently.

### What changed
- Added StateRuntime::try_claim_backfill(lease_seconds) that atomically
claims backfill only when:
  - backfill is not complete, and
  - no fresh running worker currently owns it.
- Updated backfill_sessions to use the claim API and exit early when
another worker already holds the lease.
- Added runtime tests covering:
  - singleton claim behavior,
  - stale lease takeover,
  - claim blocked after complete.
- Set backfill lease to 900s in production and 1s in tests.

### Why

This avoids duplicate backfill work and reduces backfill status churn
under concurrent startup, while preserving
current best-effort fallback behavior.
2026-02-11 00:24:20 +00:00
jif-oai
674799d356 feat: mem v2 - PR6 (consolidation) (#11374) 2026-02-11 00:02:57 +00:00
jif-oai
2c9be54c9a feat: mem v2 - PR5 (#11372) 2026-02-10 23:22:55 +00:00
viyatb-oai
1d47927aa0 Enable SOCKS defaults for common local network proxy use cases (#11362)
## Summary
- enable local-use defaults in network proxy settings: SOCKS5 on, SOCKS5
UDP on, upstream proxying on, and local binding on
- add a regression test that asserts the full
`NetworkProxySettings::default()` baseline
- Fixed managed listener reservation behavior.
Before: we always reserved a loopback SOCKS listener, even when
enable_socks5 = false.
Now: SOCKS listener is only reserved when SOCKS is enabled.
- Fixed /debug-config env output for SOCKS-disabled sessions.
ALL_PROXY now shows the HTTP proxy URL when SOCKS is disabled (instead
of incorrectly showing socks5h://...).


## Validation
- just fmt
- cargo test -p codex-network-proxy
- cargo clippy -p codex-network-proxy --all-targets
2026-02-10 15:13:52 -08:00
jif-oai
623d3f4071 feat: mem v2 - PR4 (#11369)
# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.
2026-02-10 23:10:35 +00:00