- Noteworthy changes in release 2.19 (2014-05-22) [stable]
Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
grep no longer mishandles patterns like [a-[.z.]]. It also no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences, where [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
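As a rough illustration of the entry above (a sketch that assumes GNU grep 2.19 or later is the grep on PATH; the sample input is made up), an empty pattern given last in a pattern list should match every line:

```shell
# Two patterns: "zzz" matches nothing in this input, and the trailing
# empty pattern matches every line, so both input lines are counted.
count=$(printf 'foo\nbar\n' | grep -c -e 'zzz' -e '')
echo "$count"
```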
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
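The separator behavior can be checked with a small made-up input, again assuming GNU grep 2.19 or later. This is only a sketch of the case where NUM is zero and the matching lines are not adjacent:

```shell
# Lines 1 and 3 match; line 2 does not, so the two matches are not adjacent.
# With zero lines of context, a "--" separator is still expected between them.
out=$(printf 'a\nb\na\n' | grep -C 0 'a')
printf '%s\n' "$out"
```

Adjacent matching lines form a single group, so no separator is expected between them.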
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
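A minimal sketch of the fixed case, assuming GNU grep 2.19 or later. Because -P support is optional at build time, the script probes for it first. The plain grep line uses the equivalent basic-regex backreference as a baseline; the variable names are illustrative only.

```shell
# Baseline: a basic-regex backreference combined with -w.
if printf 'aa\n' | grep -w '\(.\)\1' >/dev/null 2>&1; then
  plain=match
else
  plain=nomatch
fi

# Same test with -P, only if this grep was built with Perl-regex support.
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  if printf 'aa\n' | grep -Pw '(.)\1' >/dev/null 2>&1; then
    perl=match
  else
    perl=nomatch
  fi
else
  perl=skipped
fi

echo "grep -w:  $plain"
echo "grep -Pw: $perl"
```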
grep -Pw now works like grep -w: the matched string has to be preceded and followed by non-word components or by the beginning and end of the line. Previously, grep -Pw required only word boundaries around the match. Before, echo a@@a | grep -Pw @@ would match, yet echo a@@a | grep -w @@ would not. Now they both fail to match, per the documentation on how grep's -w works.
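The -w rule can be sketched as follows, assuming GNU grep 2.19 or later. The helper w_match is a hypothetical name introduced for this example, and the second input line is made up to show a case that does match.

```shell
# Hypothetical helper: report whether grep with the given options
# finds PATTERN in a one-line INPUT.
# usage: w_match OPTIONS PATTERN INPUT
w_match() {
  if printf '%s\n' "$3" | grep "$1" -e "$2" >/dev/null 2>&1; then
    echo match
  else
    echo nomatch
  fi
}

inside=$(w_match -w '@@' 'a@@a')     # '@@' touches word characters on both sides
spaced=$(w_match -w '@@' 'a @@ a')   # '@@' is surrounded by non-word characters
echo "grep -w,  a@@a:   $inside"
echo "grep -w,  a @@ a: $spaced"

# -P is optional at build time, so probe for it before comparing.
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  perl_inside=$(w_match -Pw '@@' 'a@@a')
else
  perl_inside=skipped
fi
echo "grep -Pw, a@@a:   $perl_inside"
```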
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'ǈ' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i ǈ' now matches both 'Ǉ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'ǉ' (U+01C9 LATIN SMALL LETTER LJ).