Previously, `Quiet` mode in the summary printer always acted like
"print matching paths," except without the printing. This happened even
if we wanted to "print non-matching paths." Since this only afflicted
quiet mode, this had the effect of flipping the exit status when
`--files-without-match --quiet` was used.
Fixes#3108, Ref #3118
I'm not sure why I did this, but I think I was trying to imitate the
contract of [`std::path::Path::file_name`]:
> Returns None if the path terminates in `..`.
But the status quo clearly did not implement this. And as a result, if
you have a glob that ends in a `.`, it was instead treated as the empty
string (which only matches the empty string).
We fix this by implementing the semantic from the standard library
correctly.
Fixes#2990
[`std::path::Path::file_name`]: https://doc.rust-lang.org/std/path/struct.Path.html#method.file_name
Specifically, if the search was instructed to quit early, we might not
have correctly marked the number of bytes consumed.
I don't think this bug occurs when memory maps are used to read the
haystack.
Closes#2944
For example, `**/{node_modules/**/*/{ts,js},crates/**/*.{rs,toml}`.
I originally didn't add this I think for implementation simplicity, but
it turns out that it really isn't much work to do. There might have also
been some odd behavior in the regex engine for dealing with empty
alternates, but that has all been long fixed.
Closes#3048, Closes#3112
Previously, when `file` is empty (literally empty, as in, zero byte),
`rg -f file` and `rg -vf file` would behave identically. This is odd
and also doesn't match how GNU grep behaves. It's also not logically
correct. An empty file means _zero_ patterns which is an empty set. An
empty set matches nothing. Inverting the empty set should result in
matching everything.
This was because of an errant optimization that lets ripgrep quit early
if it can statically detect that no matches are possible.
Moreover, there was *also* a bug in how we constructed the PCRE2 pattern
when there are zero patterns. PCRE2 doesn't have a concept of sets of
patterns (unlike the `regex` crate), so we need to fake it with an empty
character class.
Fixes#1332, Fixes#3001, Closes#3041
This adds a `replacement` field to each submatch object in the JSON
output. In effect, this extends the `-r/--replace` flag so that it works
with `--json`.
This adds a new field instead of replacing the match text (which is how
the standard printer works) for maximum flexibility. This way, consumers
of the JSON output can access the original match text (and always rely
on it corresponding to the original match text) while also getting the
replacement text without needing to do the replacement themselves.
Closes#1872, Closes#2883
When searching in parallel with many more arguments than threads, the
first arguments are searched last -- unlike in the -j1 case.
This is unexpected for users who know about the parallel nature of rg
and think they can give the scheduler a hint by positioning larger
input files (L1, L2, ..) before smaller ones (█, ██). Instead, this can
result in sub-optimal thread usage and thus longer runtime (simplified
example with 2 threads):
T1: █ ██ █ █ █ █ ██ █ █ █ █ █ ██ ╠═════════════L1════════════╣
T2: █ █ ██ █ █ ██ █ █ █ ██ █ █ ╠═════L2════╣
┏━━━━┳━━━━┳━━━━┳━━━━┓
This is caused by assigning work to ┃ T1 ┃ T2 ┃ T3 ┃ T4 ┃
per-thread stacks in a round-robin ┡━━━━╇━━━━╇━━━━╇━━━━┩
manner, starting here → │ L1 │ L2 │ L3 │ L4 │ ↵
├────├────┼────┼────┤
│ s5 │ s6 │ s7 │ s8 │ ↵
├────┼────┼────┼────┤
╷ .. ╷ .. ╷ .. ╷ .. ╷
├────┼────┼────┼────┤
│ st │ su │ sv │ sw │ ↵
├────┼────┼────┼────┘
│ sx │ sy │ sz │
└────┴────┴────┘
and then processing them bottom-up: ↥ ↥ ↥ ↥
╷ .. ╷ .. ╷ .. ╷ .. ╷
This patch reverses the input order ├────┼────┼────┼────┤
so the two reversals cancel each other │ s7 │ s6 │ s5 │ L4 │ ↵
out. Now at least the first N ├────┼────┼────┼────┘
arguments, N=number-of-threads, are │ L3 │ L2 │ L1 │
processed before any others (then └────┴────┴────┘
work-stealing may happen):
T1: ╠═════════════L1════════════╣ █ ██ █ █ █ █ █ █ ██
T2: ╠═════L2════╣ █ █ ██ █ █ ██ █ █ █ ██ █ █ ██ █ █ █
(With some more shuffling T1 could always be assigned L1 etc., but
that would mostly be for optics).
Closes#2849
The *BSD build systems make use of "Makefile.inc" a lot. Make the
"make" type recognize this file by default. And more generally,
`Makefile.*` seems to be a convention, so just generalize it.
Closes#2846
This makes it so the presence of `.jj` will cause ripgrep to treat it
as a VCS directory, just as if `.git` were present. This is useful for
ripgrep's default behavior when working with jj repositories that don't
have a `.git` but do have `.gitignore`. Namely, ripgrep requires the
presence of a VCS repository in order to respect `.gitignore`.
We don't handle clone-specific exclude rules for jj repositories without
`.git` though. It seems it isn't 100% set yet where we can find
those[1].
Closes#2842
[1]: https://github.com/BurntSushi/ripgrep/pull/2842#discussion_r2020076722
The previous code deleted too many parts of the path when constructing
the absolute path, resulting in a shortened final path. This patch
creates the correct absolute path by only removing the necessary parts.
Fixes#829, Fixes#2731, Fixes#2747, Fixes#2778, Fixes#2836, Fixes#2933, Fixes#3144Closes#2933
The fish completions now also pay attention to the configuration file
to determine whether to suggest negation options and not just to the
current command line.
This doesn't cover all edge cases. For example the config file is
cached, and so changes may not take effect until the next shell
session. But the cases it doesn't cover are hopefully very rare.
Closes#2708
This feature causes nothing but problems and is frequently broken. The
only optimization it was enabling were SIMD optimizations for
transcoding. In particular, for UTF-16 transcoding. This is performed by
the [`encoding_rs`](https://github.com/hsivonen/encoding_rs) crate,
which specifically uses unstable portable SIMD APIs instead of the
stable non-portable SIMD APIs.
SIMD optimizations that apply to search have long been making use of
stable APIs, and are automatically enabled when your target supports
them. This is, IMO, the correct user experience and one that
`encoding_rs` refuses to support. I'm done dealing with it, so
transcoding will only use scalar code until the SIMD optimizations in
`encoding_rs` work on stable. (This doesn't mean that `encoding_rs` has
to change. This could also be fixed by stabilizing `std::simd`.)
Fixes#2748
It looks like there is a reference cycle caused by the compiled
matchers (compiled HashMap holds ref to Ignore and Ignore holds ref
to HashMap). Using weak refs fixes issue #2690 in my test project.
Also confirmed via before and after when profiling the code, see the
attached screenshots in #2692.
Fixes#2690
- Stop using `-n __fish_use_subcommand`. This had the effect of
ignoring options if a positional argument has already been given, but
that's not how ripgrep works.
- Only suggest negation options if the option they're negating is
passed (e.g., only complete `--no-pcre2` if `--pcre2` is present). The
zsh completions already do this.
- Take into account whether an option takes an argument. If an option
is not a switch then it won't suggest further options until the
argument is given, e.g. `-C<tab>` won't suggest options but `-i<tab>`
will.
- Suggest correct arguments for options. We already completed a fixed
set of choices where available, but now we go further:
- Filenames are only suggested for options that take filenames.
- `--pre` and `--hostname-bin` suggest binaries from `$PATH`.
- `-t`/`--type`/&c use `--type-list` for suggestions, like in zsh,
with a preview of the glob patterns.
- `--encoding` uses a hardcoded list extracted from the zsh
completions. This has been refactored into a separate file, and the
range globs (`{1..5}`) replaced by comma globs (`{1,2,3,4,5}`) since
those work in both shells. I verified that this produces the same
list as before in zsh, and the same list in fish (albeit in a
different order).
PR #2684
This is an embarrassing oversight. A `todo!()` actually made its way
into a release! Oof.
This was working in ripgrep 13, but I had redone some aspects of sorting
and this just got left undone.
Fixes#2664
As the FIXME comment says, ripgrep is not yet using the new line
terminator option in regex-automata exposed for exactly this purpose.
Because of that, line anchors like `(?m:^)` and `(?m:$)` will only match
`\n` as a line terminator. This means that when --null-data is used in
combination with --line-regexp, the anchors inserted by --line-regexp
will not match correctly. This is only a big deal in the "fast" path,
which requires the regex engine to deal with line terminators itself
correctly. The slow path strips line terminators regardless of what they
are, and so the line anchors can match (begin/end of haystack).
Fixes#2658
And also, negated options don't take arguments.
Specifically, the fish completion generator currently forgets to add
`-l` to negation options, leading to a list of these errors:
complete: too many arguments
~/.config/fish/completions/rg.fish (line 146):
complete -c rg -n '__fish_use_subcommand' no-sort-files -d '(DEPRECATED) Sort results by file path.'
^
from sourcing file ~/.config/fish/completions/rg.fish
(Type 'help complete' for related documentation)
To reproduce, run `fish -c 'rg --generate=complete-fish | source'`.
It also potentially suggests a list of choices for negation options,
even though those never take arguments. That case doesn't occur with
any of the current options but it's an easy fix.
Fixes#2659, Closes#2655
Basically, unless the -a/--text flag is given, it is generally always an
error to search for an explicit NUL byte because the binary detection
will prevent it from matching.
Fixes#1838
The --vimgrep flag has some severe footguns when using a pattern that
matches very frequently. We had already written some docs to warn about
that, but now we also include a suggestion to avoid exorbitant heap
usage.
Closes#2505
This adds info about whether PCRE2 is available or not to the output of
--version. Essentially, --version now subsumes --pcre2-version, although
we do retain the former because it (usefully) emits an exit code based
on whether PCRE2 is available or not.
Closes#2645