fix(core): scope file search gitignore to repository context (#13250)

Closes #3493

## Problem

When a user's home directory (or any ancestor) contains a broad
`.gitignore` (e.g. `*` + `!.gitignore`), the `@` file mention picker in
Codex silently hides valid repository files like `package.json`. The
picker returns `no matches` for searches that should succeed. This is
surprising because manually typed paths still work, making the failure
hard to diagnose.

## Mental model

Git itself never walks above the repository root to assemble its ignore
list. Its `.gitignore` resolution is strictly scoped: it reads
`.gitignore` files from the repo root downward, the per-repo
`.git/info/exclude`, and the user's global excludes file (via
`core.excludesFile`). A `.gitignore` sitting in a parent directory above
the repo root has no effect on `git status`, `git ls-files`, or any
other git operation. Our file search should replicate this contract
exactly.

The `ignore` crate's `WalkBuilder` has a `require_git` flag that
controls whether it follows this contract:

- `require_git(false)` (the previous setting): the walker reads
`.gitignore` files from _all_ ancestor directories, even those above or
outside the repository root. This is a deliberate divergence from git's
behavior in the `ignore` crate, intended for non-git use cases. It means
a `~/.gitignore` with `*` will suppress every file in the walk—something
git itself would never do.

- `require_git(true)` (this fix): the walker only applies `.gitignore`
semantics when it detects a `.git` directory, scoping ignore resolution
to the repository boundary. This matches git's own behavior: parent
`.gitignore` files above the repo root have no effect.

The fix is a one-line change: `require_git(false)` becomes
`require_git(true)`.

## How `require_git(false)` got here

The setting was introduced in af338cc (#2981, "Improve @ file search:
include specific hidden dirs such as .github, .gitlab"). That PR's goal
was to make hidden directories like `.github` and `.vscode` discoverable
by setting `.hidden(false)` on the walker. The `require_git(false)` was
added alongside it with the comment _"Don't require git to be present to
apply git-related ignore rules"_—the author likely intended gitignore
rules to still filter results even when no `.git` directory exists (e.g.
searching an extracted tarball that has a `.gitignore` but no `.git`).

The unintended consequence: with `require_git(false)`, the `ignore`
crate walks _above_ the search root to find `.gitignore` files in
ancestor directories. This is a side effect the original author almost
certainly didn't anticipate. The PR message says "Preserve `.gitignore`
semantics," but `require_git(false)` actually _breaks_ git's semantics
by applying ancestor ignore files that git itself would never read.

In short: the intent was "apply gitignore even without `.git`" but the
effect was "apply gitignore from every ancestor directory." This fix
restores git-correct scoping.

## Non-goals

- This PR does not change behavior when `respect_gitignore` is `false`
(that path already disables all git-related ignore rules).
- The first test
(`parent_gitignore_outside_repo_does_not_hide_repo_files`) intentionally
omits `git init`. The `ignore` crate's `require_git(true)` causes it to
skip gitignore processing entirely when no `.git` exists, which is the
desired behavior for that scenario. A second test
(`git_repo_still_respects_local_gitignore_when_enabled`) covers the
complementary case with a real git repo.

## Tradeoffs

**Behavioral shift**: With `require_git(true)`, directories that contain
`.gitignore` files but are _not_ inside a git repository will no longer
have those ignore rules applied during `@` search. This is a correctness
improvement for the primary use case (searching inside repos), but
changes behavior for the edge case of searching non-repo directories
that happen to have `.gitignore` files. In practice, Codex is
overwhelmingly used inside git repositories, so this tradeoff strongly
favors the fix.

**Two test strategies**: The first test omits `git init` to verify
parent ignore leakage is blocked; the second runs `git init` to verify
the repo's own `.gitignore` is still honored. Together they cover both
sides of the `require_git(true)` contract.

## Architecture

The change is in `walker_worker()` within
`codex-rs/file-search/src/lib.rs`, which configures the
`ignore::WalkBuilder` used by the file search walker thread. The walker
feeds discovered file paths into `nucleo` for fuzzy matching. The
`require_git` flag controls whether the walker consults `.gitignore`
files at all—it sits upstream of all ignore processing.

```
walker_worker
  └─ WalkBuilder::new(root)
       ├─ .hidden(false)         — include dotfiles
       ├─ .follow_links(true)    — follow symlinks
       ├─ .require_git(true)     — ← THE FIX: only apply gitignore in git repos
       └─ (conditional) git_ignore(false), git_global(false), etc.
            └─ applied when respect_gitignore == false
```

## Tests

- `parent_gitignore_outside_repo_does_not_hide_repo_files`: creates a
temp directory tree with a parent `.gitignore` containing `*`, a child
"repo" directory with `package.json` and `.vscode/settings.json`, and
asserts that both files are discoverable via `run()` with
`respect_gitignore: true`.
- `git_repo_still_respects_local_gitignore_when_enabled`: the
complementary test—runs `git init` inside the child directory and
verifies that the repo's own `.gitignore` exclusions still work (e.g.
`.vscode/extensions.json` is excluded while `.vscode/settings.json` is
whitelisted). Confirms that `require_git(true)` does not disable
gitignore processing inside actual git repositories.
This commit is contained in:
Felipe Coury
2026-03-03 01:52:20 -03:00
committed by GitHub
parent 2f5b01abd6
commit 745c48b088

View File

@@ -92,6 +92,13 @@ pub struct FileSearchOptions {
pub exclude: Vec<String>,
pub threads: NonZero<usize>,
pub compute_indices: bool,
/// Toggle ignore-file processing in the walker.
///
/// When enabled, `.gitignore` files are scoped by
/// `WalkBuilder::require_git(true)`, so they are honored only when the
/// traversed path is inside a git repository. When disabled, the walker
/// turns off `.gitignore`, git-global/exclude rules, `.ignore`, and
/// parent-directory ignore scanning.
pub respect_gitignore: bool,
}
@@ -379,6 +386,18 @@ fn get_file_path<'a>(path: &'a Path, search_directories: &[PathBuf]) -> Option<(
rel_path.to_str().map(|p| (root_idx, p))
}
/// Walks the search directories and feeds discovered file paths into `nucleo`
/// via the injector.
///
/// The walker uses `require_git(true)` to match git's own ignore semantics:
/// git never reads `.gitignore` files from directories above the repository
/// root. Without this flag, the `ignore` crate reads `.gitignore` files from
/// *all* ancestor directories—a deliberate divergence from git intended for
/// non-git use cases—allowing a broad parent ignore (e.g. `~/.gitignore`
/// containing `*`) to silently suppress every file in the walk.
///
/// When `respect_gitignore` is `false`, all git-related ignore processing is
/// disabled regardless of this flag.
fn walker_worker(
inner: Arc<SessionInner>,
override_matcher: Option<ignore::overrides::Override>,
@@ -399,8 +418,9 @@ fn walker_worker(
.hidden(false)
// Follow symlinks to search their contents.
.follow_links(true)
// Don't require git to be present to apply to apply git-related ignore rules.
.require_git(false);
// Keep ignore behavior aligned with git repositories: only apply
// gitignore rules when a git context exists.
.require_git(true);
if !inner.respect_gitignore {
walk_builder
.git_ignore(false)
@@ -966,4 +986,151 @@ mod tests {
assert_eq!(results.matches, Vec::new());
assert_eq!(results.total_match_count, 0);
}
/// Regression test for #3493: a parent directory's `.gitignore` with `*`
/// must not suppress files discovered inside a child "repo" directory.
///
/// The fixture intentionally omits `git init` so that no `.git` directory
/// exists. With `require_git(true)`, the walker skips all gitignore
/// processing, making the parent's broad ignore harmless.
#[test]
fn parent_gitignore_outside_repo_does_not_hide_repo_files() {
let temp = tempfile::tempdir().unwrap();
let parent = temp.path().join("home");
let repo = parent.join("repo");
fs::create_dir_all(repo.join(".vscode")).unwrap();
fs::write(parent.join(".gitignore"), "*\n!.gitignore\n").unwrap();
fs::write(
repo.join(".gitignore"),
".vscode/*\n!.vscode/\n!.vscode/settings.json\n!package.json\n",
)
.unwrap();
fs::write(repo.join("package.json"), "{ \"name\": \"demo\" }\n").unwrap();
fs::write(repo.join(".vscode/settings.json"), "{ \"editor\": true }\n").unwrap();
let respect_results = run(
"package",
vec![repo.clone()],
FileSearchOptions {
limit: NonZero::new(20).unwrap(),
exclude: Vec::new(),
threads: NonZero::new(2).unwrap(),
compute_indices: false,
respect_gitignore: true,
},
None,
)
.expect("run ok");
assert!(
respect_results
.matches
.iter()
.any(|m| m.path.as_path() == Path::new("package.json"))
);
let nested_file_results = run(
"settings",
vec![repo],
FileSearchOptions {
limit: NonZero::new(20).unwrap(),
exclude: Vec::new(),
threads: NonZero::new(2).unwrap(),
compute_indices: false,
respect_gitignore: true,
},
None,
)
.expect("run ok");
assert!(
nested_file_results
.matches
.iter()
.any(|m| m.path.as_path() == Path::new(".vscode/settings.json"))
);
}
#[test]
fn git_repo_still_respects_local_gitignore_when_enabled() {
let temp = tempfile::tempdir().unwrap();
let parent = temp.path().join("home");
let repo = parent.join("repo");
fs::create_dir_all(repo.join(".vscode")).unwrap();
fs::write(parent.join(".gitignore"), "*\n!.gitignore\n").unwrap();
fs::write(
repo.join(".gitignore"),
".vscode/*\n!.vscode/\n!.vscode/settings.json\n!package.json\n",
)
.unwrap();
fs::write(repo.join("package.json"), "{ \"name\": \"demo\" }\n").unwrap();
fs::write(repo.join(".vscode/settings.json"), "{ \"editor\": true }\n").unwrap();
fs::write(
repo.join(".vscode/extensions.json"),
"{ \"extensions\": [] }\n",
)
.unwrap();
fs::create_dir_all(repo.join(".git")).unwrap();
let package_results = run(
"package",
vec![repo.clone()],
FileSearchOptions {
limit: NonZero::new(20).unwrap(),
exclude: Vec::new(),
threads: NonZero::new(2).unwrap(),
compute_indices: false,
respect_gitignore: true,
},
None,
)
.expect("run ok");
assert!(
package_results
.matches
.iter()
.any(|m| m.path.as_path() == Path::new("package.json"))
);
let ignored_results = run(
"extensions.json",
vec![repo.clone()],
FileSearchOptions {
limit: NonZero::new(20).unwrap(),
exclude: Vec::new(),
threads: NonZero::new(2).unwrap(),
compute_indices: false,
respect_gitignore: true,
},
None,
)
.expect("run ok");
assert!(
!ignored_results
.matches
.iter()
.any(|m| m.path.as_path() == Path::new(".vscode/extensions.json"))
);
let whitelisted_results = run(
"settings.json",
vec![repo],
FileSearchOptions {
limit: NonZero::new(20).unwrap(),
exclude: Vec::new(),
threads: NonZero::new(2).unwrap(),
compute_indices: false,
respect_gitignore: true,
},
None,
)
.expect("run ok");
assert!(
whitelisted_results
.matches
.iter()
.any(|m| m.path.as_path() == Path::new(".vscode/settings.json"))
);
}
}