mirror of
https://github.com/openai/codex.git
synced 2026-03-05 21:45:28 +03:00
fix(core): scope file search gitignore to repository context (#13250)
Closes #3493
## Problem
When a user's home directory (or any ancestor) contains a broad
`.gitignore` (e.g. `*` + `!.gitignore`), the `@` file mention picker in
Codex silently hides valid repository files like `package.json`. The
picker returns `no matches` for searches that should succeed. This is
surprising because manually typed paths still work, making the failure
hard to diagnose.
## Mental model
Git itself never walks above the repository root to assemble its ignore
list. Its `.gitignore` resolution is strictly scoped: it reads
`.gitignore` files from the repo root downward, the per-repo
`.git/info/exclude`, and the user's global excludes file (via
`core.excludesFile`). A `.gitignore` sitting in a parent directory above
the repo root has no effect on `git status`, `git ls-files`, or any
other git operation. Our file search should replicate this contract
exactly.
The `ignore` crate's `WalkBuilder` has a `require_git` flag that
controls whether it follows this contract:
- `require_git(false)` (the previous setting): the walker reads
`.gitignore` files from _all_ ancestor directories, even those above or
outside the repository root. This is a deliberate divergence from git's
behavior in the `ignore` crate, intended for non-git use cases. It means
a `~/.gitignore` with `*` will suppress every file in the walk—something
git itself would never do.
- `require_git(true)` (this fix): the walker only applies `.gitignore`
semantics when it detects a `.git` directory, scoping ignore resolution
to the repository boundary. This matches git's own behavior: parent
`.gitignore` files above the repo root have no effect.
The fix is a one-line change: `require_git(false)` becomes
`require_git(true)`.
## How `require_git(false)` got here
The setting was introduced in af338cc (#2981, "Improve @ file search:
include specific hidden dirs such as .github, .gitlab"). That PR's goal
was to make hidden directories like `.github` and `.vscode` discoverable
by setting `.hidden(false)` on the walker. The `require_git(false)` was
added alongside it with the comment _"Don't require git to be present to
apply git-related ignore rules"_—the author likely intended gitignore
rules to still filter results even when no `.git` directory exists (e.g.
searching an extracted tarball that has a `.gitignore` but no `.git`).
The unintended consequence: with `require_git(false)`, the `ignore`
crate walks _above_ the search root to find `.gitignore` files in
ancestor directories. This is a side effect the original author almost
certainly didn't anticipate. The PR message says "Preserve `.gitignore`
semantics," but `require_git(false)` actually _breaks_ git's semantics
by applying ancestor ignore files that git itself would never read.
In short: the intent was "apply gitignore even without `.git`" but the
effect was "apply gitignore from every ancestor directory." This fix
restores git-correct scoping.
## Non-goals
- This PR does not change behavior when `respect_gitignore` is `false`
(that path already disables all git-related ignore rules).
- The first test
(`parent_gitignore_outside_repo_does_not_hide_repo_files`) intentionally
omits `git init`. The `ignore` crate's `require_git(true)` causes it to
skip gitignore processing entirely when no `.git` exists, which is the
desired behavior for that scenario. A second test
(`git_repo_still_respects_local_gitignore_when_enabled`) covers the
complementary case with a real git repo.
## Tradeoffs
**Behavioral shift**: With `require_git(true)`, directories that contain
`.gitignore` files but are _not_ inside a git repository will no longer
have those ignore rules applied during `@` search. This is a correctness
improvement for the primary use case (searching inside repos), but
changes behavior for the edge case of searching non-repo directories
that happen to have `.gitignore` files. In practice, Codex is
overwhelmingly used inside git repositories, so this tradeoff strongly
favors the fix.
**Two test strategies**: The first test omits `git init` to verify
parent ignore leakage is blocked; the second runs `git init` to verify
the repo's own `.gitignore` is still honored. Together they cover both
sides of the `require_git(true)` contract.
## Architecture
The change is in `walker_worker()` within
`codex-rs/file-search/src/lib.rs`, which configures the
`ignore::WalkBuilder` used by the file search walker thread. The walker
feeds discovered file paths into `nucleo` for fuzzy matching. The
`require_git` flag controls whether the walker consults `.gitignore`
files at all—it sits upstream of all ignore processing.
```
walker_worker
└─ WalkBuilder::new(root)
├─ .hidden(false) — include dotfiles
├─ .follow_links(true) — follow symlinks
├─ .require_git(true) — ← THE FIX: only apply gitignore in git repos
└─ (conditional) git_ignore(false), git_global(false), etc.
└─ applied when respect_gitignore == false
```
## Tests
- `parent_gitignore_outside_repo_does_not_hide_repo_files`: creates a
temp directory tree with a parent `.gitignore` containing `*`, a child
"repo" directory with `package.json` and `.vscode/settings.json`, and
asserts that both files are discoverable via `run()` with
`respect_gitignore: true`.
- `git_repo_still_respects_local_gitignore_when_enabled`: the
complementary test—runs `git init` inside the child directory and
verifies that the repo's own `.gitignore` exclusions still work (e.g.
`.vscode/extensions.json` is excluded while `.vscode/settings.json` is
whitelisted). Confirms that `require_git(true)` does not disable
gitignore processing inside actual git repositories.
This commit is contained in:
@@ -92,6 +92,13 @@ pub struct FileSearchOptions {
|
||||
pub exclude: Vec<String>,
|
||||
pub threads: NonZero<usize>,
|
||||
pub compute_indices: bool,
|
||||
/// Toggle ignore-file processing in the walker.
|
||||
///
|
||||
/// When enabled, `.gitignore` files are scoped by
|
||||
/// `WalkBuilder::require_git(true)`, so they are honored only when the
|
||||
/// traversed path is inside a git repository. When disabled, the walker
|
||||
/// turns off `.gitignore`, git-global/exclude rules, `.ignore`, and
|
||||
/// parent-directory ignore scanning.
|
||||
pub respect_gitignore: bool,
|
||||
}
|
||||
|
||||
@@ -379,6 +386,18 @@ fn get_file_path<'a>(path: &'a Path, search_directories: &[PathBuf]) -> Option<(
|
||||
rel_path.to_str().map(|p| (root_idx, p))
|
||||
}
|
||||
|
||||
/// Walks the search directories and feeds discovered file paths into `nucleo`
|
||||
/// via the injector.
|
||||
///
|
||||
/// The walker uses `require_git(true)` to match git's own ignore semantics:
|
||||
/// git never reads `.gitignore` files from directories above the repository
|
||||
/// root. Without this flag, the `ignore` crate reads `.gitignore` files from
|
||||
/// *all* ancestor directories—a deliberate divergence from git intended for
|
||||
/// non-git use cases—allowing a broad parent ignore (e.g. `~/.gitignore`
|
||||
/// containing `*`) to silently suppress every file in the walk.
|
||||
///
|
||||
/// When `respect_gitignore` is `false`, all git-related ignore processing is
|
||||
/// disabled regardless of this flag.
|
||||
fn walker_worker(
|
||||
inner: Arc<SessionInner>,
|
||||
override_matcher: Option<ignore::overrides::Override>,
|
||||
@@ -399,8 +418,9 @@ fn walker_worker(
|
||||
.hidden(false)
|
||||
// Follow symlinks to search their contents.
|
||||
.follow_links(true)
|
||||
// Don't require git to be present to apply to apply git-related ignore rules.
|
||||
.require_git(false);
|
||||
// Keep ignore behavior aligned with git repositories: only apply
|
||||
// gitignore rules when a git context exists.
|
||||
.require_git(true);
|
||||
if !inner.respect_gitignore {
|
||||
walk_builder
|
||||
.git_ignore(false)
|
||||
@@ -966,4 +986,151 @@ mod tests {
|
||||
assert_eq!(results.matches, Vec::new());
|
||||
assert_eq!(results.total_match_count, 0);
|
||||
}
|
||||
|
||||
/// Regression test for #3493: a parent directory's `.gitignore` with `*`
|
||||
/// must not suppress files discovered inside a child "repo" directory.
|
||||
///
|
||||
/// The fixture intentionally omits `git init` so that no `.git` directory
|
||||
/// exists. With `require_git(true)`, the walker skips all gitignore
|
||||
/// processing, making the parent's broad ignore harmless.
|
||||
#[test]
|
||||
fn parent_gitignore_outside_repo_does_not_hide_repo_files() {
|
||||
let temp = tempfile::tempdir().unwrap();
|
||||
let parent = temp.path().join("home");
|
||||
let repo = parent.join("repo");
|
||||
fs::create_dir_all(repo.join(".vscode")).unwrap();
|
||||
|
||||
fs::write(parent.join(".gitignore"), "*\n!.gitignore\n").unwrap();
|
||||
fs::write(
|
||||
repo.join(".gitignore"),
|
||||
".vscode/*\n!.vscode/\n!.vscode/settings.json\n!package.json\n",
|
||||
)
|
||||
.unwrap();
|
||||
fs::write(repo.join("package.json"), "{ \"name\": \"demo\" }\n").unwrap();
|
||||
fs::write(repo.join(".vscode/settings.json"), "{ \"editor\": true }\n").unwrap();
|
||||
|
||||
let respect_results = run(
|
||||
"package",
|
||||
vec![repo.clone()],
|
||||
FileSearchOptions {
|
||||
limit: NonZero::new(20).unwrap(),
|
||||
exclude: Vec::new(),
|
||||
threads: NonZero::new(2).unwrap(),
|
||||
compute_indices: false,
|
||||
respect_gitignore: true,
|
||||
},
|
||||
None,
|
||||
)
|
||||
.expect("run ok");
|
||||
assert!(
|
||||
respect_results
|
||||
.matches
|
||||
.iter()
|
||||
.any(|m| m.path.as_path() == Path::new("package.json"))
|
||||
);
|
||||
|
||||
let nested_file_results = run(
|
||||
"settings",
|
||||
vec![repo],
|
||||
FileSearchOptions {
|
||||
limit: NonZero::new(20).unwrap(),
|
||||
exclude: Vec::new(),
|
||||
threads: NonZero::new(2).unwrap(),
|
||||
compute_indices: false,
|
||||
respect_gitignore: true,
|
||||
},
|
||||
None,
|
||||
)
|
||||
.expect("run ok");
|
||||
assert!(
|
||||
nested_file_results
|
||||
.matches
|
||||
.iter()
|
||||
.any(|m| m.path.as_path() == Path::new(".vscode/settings.json"))
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn git_repo_still_respects_local_gitignore_when_enabled() {
|
||||
let temp = tempfile::tempdir().unwrap();
|
||||
let parent = temp.path().join("home");
|
||||
let repo = parent.join("repo");
|
||||
fs::create_dir_all(repo.join(".vscode")).unwrap();
|
||||
|
||||
fs::write(parent.join(".gitignore"), "*\n!.gitignore\n").unwrap();
|
||||
fs::write(
|
||||
repo.join(".gitignore"),
|
||||
".vscode/*\n!.vscode/\n!.vscode/settings.json\n!package.json\n",
|
||||
)
|
||||
.unwrap();
|
||||
fs::write(repo.join("package.json"), "{ \"name\": \"demo\" }\n").unwrap();
|
||||
fs::write(repo.join(".vscode/settings.json"), "{ \"editor\": true }\n").unwrap();
|
||||
fs::write(
|
||||
repo.join(".vscode/extensions.json"),
|
||||
"{ \"extensions\": [] }\n",
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
fs::create_dir_all(repo.join(".git")).unwrap();
|
||||
|
||||
let package_results = run(
|
||||
"package",
|
||||
vec![repo.clone()],
|
||||
FileSearchOptions {
|
||||
limit: NonZero::new(20).unwrap(),
|
||||
exclude: Vec::new(),
|
||||
threads: NonZero::new(2).unwrap(),
|
||||
compute_indices: false,
|
||||
respect_gitignore: true,
|
||||
},
|
||||
None,
|
||||
)
|
||||
.expect("run ok");
|
||||
assert!(
|
||||
package_results
|
||||
.matches
|
||||
.iter()
|
||||
.any(|m| m.path.as_path() == Path::new("package.json"))
|
||||
);
|
||||
|
||||
let ignored_results = run(
|
||||
"extensions.json",
|
||||
vec![repo.clone()],
|
||||
FileSearchOptions {
|
||||
limit: NonZero::new(20).unwrap(),
|
||||
exclude: Vec::new(),
|
||||
threads: NonZero::new(2).unwrap(),
|
||||
compute_indices: false,
|
||||
respect_gitignore: true,
|
||||
},
|
||||
None,
|
||||
)
|
||||
.expect("run ok");
|
||||
assert!(
|
||||
!ignored_results
|
||||
.matches
|
||||
.iter()
|
||||
.any(|m| m.path.as_path() == Path::new(".vscode/extensions.json"))
|
||||
);
|
||||
|
||||
let whitelisted_results = run(
|
||||
"settings.json",
|
||||
vec![repo],
|
||||
FileSearchOptions {
|
||||
limit: NonZero::new(20).unwrap(),
|
||||
exclude: Vec::new(),
|
||||
threads: NonZero::new(2).unwrap(),
|
||||
compute_indices: false,
|
||||
respect_gitignore: true,
|
||||
},
|
||||
None,
|
||||
)
|
||||
.expect("run ok");
|
||||
assert!(
|
||||
whitelisted_results
|
||||
.matches
|
||||
.iter()
|
||||
.any(|m| m.path.as_path() == Path::new(".vscode/settings.json"))
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user