grepp is a relatively complete implementation of the grep tool in Perl, using Perl regular expressions, which attempts to be command-line and output-compatible with GNU grep. It supports almost all the features of GNU grep 2.5.1, minus some obvious omissions (basic/extended regexp switches, etc). Why did I do this? Well, I wanted a version of grep which supported Perl regular expressions. Now, GNU grep 2.5.1 does this (with the -P switch), but on Solaris systems that have a Perl installation and only a limited grep binary, I’ve found this thing comes in pretty darn handy. In addition, I’ve found that multi-line matching and --not-regexp are particularly useful for searching through source code… handy, when you’re a software developer by trade. :)
Note, grepp does differ from GNU grep in one notable way: its behaviour when reading from stdin with the -m switch. GNU grep ensures that stdin is left pointing at the beginning of the line following the last match, whereas grepp currently does not. This will be in version 1.0. :) Oh, and grepp also does not traverse the filesystem in the same order as GNU grep.
Of course, as hinted above, grepp supports a bunch of features that I’ve found useful:
- Multi-line matching using a match window.
- Negative expressions (so you can combine positive and negative match rules).
- Reverse mode (grep from the bottom up, like piping tac to grep).
- Regex-based file inclusion/exclusion (lifted from CVS grep… I think).
Now, grepp is not completely tested yet, so if anyone finds any combination of switches which results in behaviour which deviates from that of GNU grep, let me know.
Despite the seemingly major version number change, the only thing this update incorporates is a new, hackish --reverse flag. For files, this causes grepp to process the file in reverse (last lines first), much as if you piped tac into grepp, but with the advantage of working over multiple files, etc. ‘Course, this doesn’t work for stdin (no seek) or line-buffered grepping (which doesn’t make sense anyway). And for binary files, it only causes the file to be processed in reverse a block at a time (i.e., the bytes within each block are not reversed).
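To give a feel for why reverse mode needs a seekable file, here is a minimal Python sketch of reading lines last-to-first by seeking backwards a block at a time. This is not grepp's actual Perl code; the function name `reversed_lines` and the `block_size` parameter are made up for illustration, and it assumes "\n" line endings.

```python
import io
import os


def reversed_lines(fh, block_size=4096):
    """Yield the lines of a seekable binary file handle, last line first.

    Hypothetical helper, not taken from grepp. Blocks are read from the
    end of the file towards the start; a partial line at the front of a
    block is carried over and completed by the next (earlier) block.
    """
    fh.seek(0, os.SEEK_END)
    pos = fh.tell()
    tail = b""
    first = True
    while pos > 0:
        size = min(block_size, pos)
        pos -= size
        fh.seek(pos)
        chunk = fh.read(size) + tail
        if first:
            first = False
            # Drop the newline that terminates the final line, if any.
            if chunk.endswith(b"\n"):
                chunk = chunk[:-1]
        lines = chunk.split(b"\n")
        tail = lines.pop(0)  # possibly incomplete; finished by the next block
        for line in reversed(lines):
            yield line.decode()
    if not first:  # the file was non-empty, so the very first line remains
        yield tail.decode()


demo = io.BytesIO(b"one\ntwo\nthree\n")
print(list(reversed_lines(demo, block_size=4)))  # ['three', 'two', 'one']
```

Since every step starts with a seek, a pipe such as stdin can't be handled this way, which matches the restriction mentioned above.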
So why did I bother? Because it’s useful to me, that’s why. :)
Oh, this version also includes a fix for --max-count (it had an off-by-one bug… whoops).
Fixed --files-without-match. It simply didn’t work right before, thanks to typos. Whoops!
Another update, same day. Go figure. This fix makes it possible to specify multiple --include, --exclude, --include-regexp, and --exclude-regexp flags. In all cases, it’s disjunctive semantics, meaning only one of the expressions needs to match for the filter to trigger.
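As a rough illustration of the disjunctive semantics, here is a small Python sketch (not grepp's Perl code). The function names are hypothetical, and two things are assumptions on my part rather than documented grepp behaviour: that --include/--exclude take shell-style globs while the -regexp variants take regular expressions, and that a file is searched when the include filter triggers (or no include flags were given) and the exclude filter does not.

```python
import fnmatch
import re


def filter_triggers(name, globs=(), regexps=()):
    """True if ANY glob or ANY regexp matches the file name (disjunctive)."""
    return (any(fnmatch.fnmatchcase(name, g) for g in globs)
            or any(re.search(r, name) is not None for r in regexps))


def should_search(name, include_globs=(), include_regexps=(),
                  exclude_globs=(), exclude_regexps=()):
    """Decide whether a file name passes the include/exclude filters.

    Assumption: with no include flags at all, every file is included.
    """
    has_includes = bool(include_globs or include_regexps)
    if has_includes and not filter_triggers(name, include_globs, include_regexps):
        return False
    return not filter_triggers(name, exclude_globs, exclude_regexps)
```

For example, `should_search("main.c", include_globs=["*.c", "*.h"])` passes because one of the two globs matches, while adding `exclude_regexps=[r"^main"]` rejects the same file.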
Turns out the inclusion and exclusion stuff erroneously applied to directories, as well as files, which kinda limits its usefulness. This update fixes this, so those checks are only applied to the files themselves.
Now, there’s still an oddity in that --include and --exclude aren’t applied to stuff in the current directory if you just do a ‘grepp args *’. So, for now, use ‘grepp -r args pattern .’. Yes, it’s a bit of a hack, but it works well enough for now.
Changed the way --not-regexp behaves, to make it more sane when --not-regexp and --regexp are combined. Now, in the normal case, the matcher requires at least one regexp to match AND at least one not-regexp to not match. In the match-all condition, it requires all the regexps to match, and all the not-regexps to not match.
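The rule above boils down to "any" versus "all" over two lists. Here is a minimal Python sketch of that logic (grepp itself is Perl, and this is not its code). The function name `line_matches` and the `match_all` parameter are hypothetical, and treating an empty list as "no constraint" is my assumption.

```python
import re


def line_matches(line, regexps, not_regexps, match_all=False):
    """Combine positive and negative patterns as described above.

    Normal case: at least one regexp matches AND at least one
    not-regexp fails to match. Match-all case: every regexp matches
    AND every not-regexp fails to match.
    """
    hits = [re.search(p, line) is not None for p in regexps]
    misses = [re.search(p, line) is None for p in not_regexps]
    combine = all if match_all else any
    # Assumption: an empty pattern list places no constraint on the line.
    positive_ok = combine(hits) if hits else True
    negative_ok = combine(misses) if misses else True
    return positive_ok and negative_ok
```

So with regexps `foo` and `baz` and not-regexps `qux` and `bar`, the line "foo bar" is a match in the normal case (`foo` matches, `qux` does not) but not in the match-all case (`baz` is missing and `bar` is present).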