Files
codex/codex-rs/file-search
Felipe Coury 745c48b088 fix(core): scope file search gitignore to repository context (#13250)
Closes #3493

## Problem

When a user's home directory (or any ancestor) contains a broad
`.gitignore` (e.g. `*` + `!.gitignore`), the `@` file mention picker in
Codex silently hides valid repository files like `package.json`. The
picker returns `no matches` for searches that should succeed. This is
surprising because manually typed paths still work, making the failure
hard to diagnose.

## Mental model

Git itself never walks above the repository root to assemble its ignore
list. Its `.gitignore` resolution is strictly scoped: it reads
`.gitignore` files from the repo root downward, the per-repo
`.git/info/exclude`, and the user's global excludes file (via
`core.excludesFile`). A `.gitignore` sitting in a parent directory above
the repo root has no effect on `git status`, `git ls-files`, or any
other git operation. Our file search should replicate this contract
exactly.

The `ignore` crate's `WalkBuilder` has a `require_git` flag that
controls whether it follows this contract:

- `require_git(false)` (the previous setting): the walker reads
`.gitignore` files from _all_ ancestor directories, even those above or
outside the repository root. This is a deliberate divergence from git's
behavior in the `ignore` crate, intended for non-git use cases. It means
a `~/.gitignore` with `*` will suppress every file in the walk—something
git itself would never do.

- `require_git(true)` (this fix): the walker only applies `.gitignore`
semantics when it detects a `.git` directory, scoping ignore resolution
to the repository boundary. This matches git's own behavior: parent
`.gitignore` files above the repo root have no effect.

The fix is a one-line change: `require_git(false)` becomes
`require_git(true)`.

## How `require_git(false)` got here

The setting was introduced in af338cc (#2981, "Improve @ file search:
include specific hidden dirs such as .github, .gitlab"). That PR's goal
was to make hidden directories like `.github` and `.vscode` discoverable
by setting `.hidden(false)` on the walker. The `require_git(false)` was
added alongside it with the comment _"Don't require git to be present to
apply git-related ignore rules"_—the author likely intended gitignore
rules to still filter results even when no `.git` directory exists (e.g.
searching an extracted tarball that has a `.gitignore` but no `.git`).

The unintended consequence: with `require_git(false)`, the `ignore`
crate walks _above_ the search root to find `.gitignore` files in
ancestor directories. This is a side effect the original author almost
certainly didn't anticipate. The PR message says "Preserve `.gitignore`
semantics," but `require_git(false)` actually _breaks_ git's semantics
by applying ancestor ignore files that git itself would never read.

In short: the intent was "apply gitignore even without `.git`" but the
effect was "apply gitignore from every ancestor directory." This fix
restores git-correct scoping.

## Non-goals

- This PR does not change behavior when `respect_gitignore` is `false`
(that path already disables all git-related ignore rules).
- The first test
(`parent_gitignore_outside_repo_does_not_hide_repo_files`) intentionally
omits `git init`. The `ignore` crate's `require_git(true)` causes it to
skip gitignore processing entirely when no `.git` exists, which is the
desired behavior for that scenario. A second test
(`git_repo_still_respects_local_gitignore_when_enabled`) covers the
complementary case with a real git repo.

## Tradeoffs

**Behavioral shift**: With `require_git(true)`, directories that contain
`.gitignore` files but are _not_ inside a git repository will no longer
have those ignore rules applied during `@` search. This is a correctness
improvement for the primary use case (searching inside repos), but
changes behavior for the edge case of searching non-repo directories
that happen to have `.gitignore` files. In practice, Codex is
overwhelmingly used inside git repositories, so this tradeoff strongly
favors the fix.

**Two test strategies**: The first test omits `git init` to verify
parent ignore leakage is blocked; the second runs `git init` to verify
the repo's own `.gitignore` is still honored. Together they cover both
sides of the `require_git(true)` contract.

## Architecture

The change is in `walker_worker()` within
`codex-rs/file-search/src/lib.rs`, which configures the
`ignore::WalkBuilder` used by the file search walker thread. The walker
feeds discovered file paths into `nucleo` for fuzzy matching. The
`require_git` flag controls whether the walker consults `.gitignore`
files at all—it sits upstream of all ignore processing.

```
walker_worker
  └─ WalkBuilder::new(root)
       ├─ .hidden(false)         — include dotfiles
       ├─ .follow_links(true)    — follow symlinks
       ├─ .require_git(true)     — ← THE FIX: only apply gitignore in git repos
       └─ (conditional) git_ignore(false), git_global(false), etc.
            └─ applied when respect_gitignore == false
```

## Tests

- `parent_gitignore_outside_repo_does_not_hide_repo_files`: creates a
temp directory tree with a parent `.gitignore` containing `*`, a child
"repo" directory with `package.json` and `.vscode/settings.json`, and
asserts that both files are discoverable via `run()` with
`respect_gitignore: true`.
- `git_repo_still_respects_local_gitignore_when_enabled`: the
complementary test—runs `git init` inside the child directory and
verifies that the repo's own `.gitignore` exclusions still work (e.g.
`.vscode/extensions.json` is excluded while `.vscode/settings.json` is
whitelisted). Confirms that `require_git(true)` does not disable
gitignore processing inside actual git repositories.
2026-03-02 21:52:20 -07:00
..

codex_file_search

Fast fuzzy file search tool for Codex.

Uses https://crates.io/crates/ignore under the hood (which is what ripgrep uses) to traverse a directory (while honoring .gitignore, etc.) to produce the list of files to search and then uses https://crates.io/crates/nucleo-matcher to fuzzy-match the user supplied PATTERN against the corpus.