Commit Graph

18 Commits

Author SHA1 Message Date
Andrew Gallant
79d40d0e20 Tweak how binary files are handled internally.
This commit fixes two issues. The first issue is that if a file contained
many NUL bytes without any LF bytes, then the InputBuffer would read the
entire file into memory. This is not typically a problem, but if you run
rg on /proc, then bad things can happen when reading virtual memory mapping
files. Arguably, such files should be ignored, but we should also try to
avoid exhausting memory too. We fix this by pushing the `-a/--text` flag
option down into InputBuffer, so that it knows to stop immediately if it
finds a NUL byte.

The other issue this fixes is that binary detection is now applied to every
buffer instead of just the first one. This helps avoid detecting too many
files as plain text if the first parts of a binary file happen to contain
no NUL bytes. This issue still persists somewhat in the memory map
searcher, since we probably don't want to search the entire file upfront
for NUL bytes before actually performing our search. Instead, we search the
first 10KB for now.

Fixes #52, Fixes #311
2017-02-18 16:20:21 -05:00
Daniel Luz
c4633ff187 Remove trivial condition. 2017-01-08 17:02:57 -05:00
Leonardo Yvens
dd5ded2f78 fix some clippy lints (#288) 2016-12-23 14:53:35 -05:00
Andrew Gallant
e8a30cb893 Completely re-work colored output and tty handling.
This commit completely guts all of the color handling code and replaces
most of it with two new crates: wincolor and termcolor. wincolor
provides a simple API to coloring using the Windows console and
termcolor provides a platform independent coloring API tuned for
multithreaded command line programs. This required a lot more
flexibility than what the `term` crate provided, so it was dropped.
We instead switch to writing ANSI escape sequences directly and ignore
the TERMINFO database.

In addition to fixing several bugs, this commit also permits end users
to customize colors to a certain extent. For example, this command will
set the match color to magenta and the line number background to yellow:

    rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo

For tty handling, we've adopted a hack from `git` to do tty detection in
MSYS/mintty terminals. As a result, ripgrep should get both color
detection and piping correct on Windows regardless of which terminal you
use.

Finally, switch to line buffering. Performance doesn't seem to be
impacted and it's an otherwise more user friendly option.

Fixes #37, Fixes #51, Fixes #94, Fixes #117, Fixes #182, Fixes #231
2016-11-20 11:14:52 -05:00
Andrew Gallant
03f7605322 Rename --files-without-matches to --files-without-match.
This is to be consistent with grep.
2016-11-19 20:15:41 -05:00
Daniel Luz
bd3e7eedb1 Add --files-without-matches flag.
Performs the opposite of --files-with-matches: only shows paths of
files that contain zero matches.

Closes #138
2016-11-19 21:48:59 -02:00
Andrew Gallant
92dc402f7f Switch from Docopt to Clap.
There were two important reasons for the switch:

1. Performance. Docopt does poorly when the argv becomes large, which is
   a reasonable common use case for search tools. (e.g., use with xargs)
2. Better failure modes. Clap knows a lot more about how a particular
   argv might be invalid, and can therefore provide much clearer error
   messages.

While both were important, (1) made it urgent.

Note that since Clap requires at least Rust 1.11, this will in turn
increase the minimum Rust version supported by ripgrep from Rust 1.9 to
Rust 1.11. It is therefore a breaking change, so the soonest release of
ripgrep with Clap will have to be 0.3.

There is also at least one subtle breaking change in real usage.
Previous to this commit, this used to work:

    rg -e -foo

Where this would cause ripgrep to search for the string `-foo`. Clap
currently has problems supporting this use case
(see: https://github.com/kbknapp/clap-rs/issues/742),
but it can be worked around by using this instead:

    rg -e [-]foo

or even

    rg [-]foo

and this still works:

    rg -- -foo

This commit also adds Bash, Fish and PowerShell completion files to the
release, fixes a bug that prevented ripgrep from working on file
paths containing invalid UTF-8 and shows short descriptions in the
output of `-h` but longer descriptions in the output of `--help`.

Fixes #136, Fixes #189, Fixes #210, Fixes #230
2016-11-17 19:53:41 -05:00
Andrew Gallant
2e5c3c05e8 reword 2016-11-06 19:48:49 -05:00
Andrew Gallant
6884eea2f5 reword 2016-11-06 19:48:17 -05:00
Andrew Gallant
58aca2efb2 Add -m/--max-count flag.
This flag limits the number of matches printed *per file*.

Closes #159
2016-11-06 13:09:53 -05:00
Andre Bogus
02de97b8ce Use the bytecount crate for fast line counting.
Fixes #128
2016-11-05 22:29:26 -04:00
Andrew Gallant
46dff8f4be Be better with short circuiting with --quiet.
It didn't make sense for --quiet to be part of the printer, because --quiet
doesn't just mean "don't print," it also means, "stop after the first
match is found." This needs to be wired all the way up through directory
traversal, and it also needs to cause all of the search workers to quit
as well. We do it with an atomic that is only checked with --quiet is
given.

Fixes #116.
2016-09-28 20:50:50 -04:00
Andrew Gallant
b034b77798 Don't replace NUL bytes when searching binary files as text.
This was a result of misinterpreting a feature in grep where NUL bytes
are replaced with \n. The primary reason for doing this is to avoid
excessive memory usage on truly binary data. However, grep only does this
when searching binary files as if they were binary, and which only reports
whether the file matched or not. When grep is told to search binary data
as text (the -a/--text flag), then it doesn't do any replacement so we
shouldn't either.

In general, this makes sense, because the user is essentially asserting
that a particular file that looks like binary is actually text. In that
case, we shouldn't try to replace any NUL bytes.

ripgrep doesn't actually support searching binary data for whether it
matches or not, so we don't actually need the replace_buf function.
However, it does seem like a potentially useful feature.
2016-09-25 21:26:49 -04:00
Andrew Gallant
9dc5464c84 Stop after first match is found with --quiet.
Fixes #77.
2016-09-25 15:01:29 -04:00
Andrew Schwartzmeyer
a8f3d9e87e Add --files-with-matches flag.
Closes #26.

Acts like --count but emits only the paths of files with matches,
suitable for piping to xargs. Both mmap and no-mmap searches terminate
after the first match is found. Documentation updated and tests added.
2016-09-24 21:40:17 -07:00
Andrew Gallant
fdca74148d Stream results when feasible.
For example, when only a single file (or stdin) is being searched, then we
should be able to print directly to the terminal instead of intermediate
buffers. (The buffers are only necessary for parallelism.)

Closes #4.
2016-09-13 21:11:46 -04:00
Andrew Gallant
19615245cd Make line counting much faster. 2016-09-10 01:35:44 -04:00
Andrew Gallant
e3da726836 Rename search module to search_stream.
The name better reflects the difference between it and the search_buffer
module.
2016-09-10 00:08:42 -04:00