From `man 7 glob`:
One can remove the special meaning of '?', '*' and '[' by preceding
them by a backslash, or, in case this is part of a shell command
line, enclosing them in quotes.
Conform to glob / fnmatch / git implementations by making `\` escape the
following character - for example `\?` will match a literal `?`.
However, only enable this by default on Unix platforms. Windows builds
will continue to use `\` as a path separator, but can still get the new
behavior by calling `globset.backslash_escape(true)`.
Adding tests for the `Globset::backslash_escape` option was a bit
involved, since the default value of this option is platform-dependent.
Extend the options framework to hold an `Option<T>` for each
knob, where `None` means "default" and `Some(v)` means "override with
`v`". This way we only have to specify the default values once in
`GlobOptions::default()` rather than replicated in both code and tests.
Finally write a few behavioral tests, and some tests to confirm it
varies by platform.
This commit removes, in retrospect, a silly use of `unsafe`. In particular,
to extract a file name extension (distinct from how `std` implements it),
we were transmuting an OsStr to its underlying WTF-8 byte representation
and then searching that. This required `unsafe` and relied on an
undocumented std API, so it was a bad choice to make, but everything gets
sacrificed at the Alter of Performance.
The thing I didn't seem to realize at the time was that:
1. On Unix, you can already get the raw byte representation in a manner
that has zero cost.
2. On Windows, paths are already being encoded and copied every which
way. So doing a UTF-8 check and, in rare cases (for invalid UTF-8),
an extra copy, doesn't seem like that much more of an added expense.
Thus, rewrite the extension extraction using safe APIs. On Unix, this
should have identical performance characteristics as the previous
implementation. On Windows, we do pay a higher cost in the UTF-8
check, but Windows is already paying a similar cost a few times over
anyway.
This adds a few tests that check for bugs reported here:
https://github.com/rust-lang/cargo/issues/4268
The bugs reported in the aforementioned issue are probably caused by not
enabling the `literal_separator` option in `GlobBuilder`. Enabling that
in the tests under question fixes the issue.
Closes#773
This threads the original glob given by end users through all of the
glob parsing errors. This was slightly trickier than it might appear
because the gitignore implementation actually modifies the glob before
compiling it. So in order to get better glob error messages everywhere,
we need to track the original glob both in the glob parser and in the
higher-level abstractions in the `ignore` crate.
Fixes#444
This commit completes the initial move of glob matching to an external
crate, including fixing up cross platform support, polishing the
external crate for others to use and fixing a number of bugs in the
process.
Fixes#87, #127, #131