Regex: Fix a few mistakes in the documentation

Frank LENORMAND 2017-10-13 09:42:58 +03:00 committed by Maxime Coste
parent 8c529d3cff
commit 3acb75c5c2


@ -11,7 +11,7 @@ Regex Syntax
 Kakoune regex syntax is based on the ECMAScript syntax, as defined by the
 ECMA-262 standard.
-Kakoune's regex always run on unicode codepoint sequences, not on bytes.
+Kakoune's regex always run on Unicode codepoint sequences, not on bytes.
 Literals
 --------
@ -26,7 +26,7 @@ Some additional literals are available as escape sequences:
 * `\n` matches the line feed character.
 * `\r` matches the carriage return character.
 * `\t` matches the tabulation character.
-* `\v` matches the the vertical tabulation character.
+* `\v` matches the vertical tabulation character.
 Character classes
 -----------------
@ -58,18 +58,18 @@ The `-` characters in a character class that are not specifying a
 range are treated as literal `-`, so `[A-Z-+]` matches all upper case
 characters, the `-` character, and the `+` character.
-supported character class escapes are:
+Supported character class escapes are:
 * `\d` which matches all digits.
 * `\w` which matches all word characters.
 * `\s` which matches all whitespace characters.
 * `\h` which matches all horizontal whitespace characters.
-Using a upper case letter instead of a lower case one will negate
+Using an upper case letter instead of a lower case one will negate
 the character class, meaning for example that `\D` will match every
 non-digit character.
-character class escapes can be used outside of a character class, `\d`
+Character class escapes can be used outside of a character class, `\d`
 is equivalent to `[\d]`.
 Any character
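As an illustration of the character class rules in the hunk above, here is a minimal sketch using Python's `re` module. Python's engine is only a stand-in here, not Kakoune's implementation, and it has no `\h` escape, so only the literal `-` rule, `\d`, and `\D` are shown.

```python
import re

# A `-` that does not form a range is a literal: upper case letters, `-`, `+`.
dash_plus = re.findall(r"[A-Z-+]", "aB-+c")

# `\d` outside a character class behaves like `[\d]` inside one.
digits_bare = re.findall(r"\d", "a1b22")
digits_class = re.findall(r"[\d]", "a1b22")

# The upper case escape negates the class: `\D` matches every non-digit.
non_digits = re.findall(r"\D", "a1b22")

print(dash_plus, digits_bare, digits_class, non_digits)
```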
@ -81,7 +81,7 @@ Groups
 ------
 Regex atoms can be grouped using `(` and `)` or `(?:` and `)`. If `(` is
-used, the group will be a capturing group. which means the positions from
+used, the group will be a capturing group, which means the positions from
 the subject strings that matched between `(` and `)` will be recorded.
 Capture groups are numbered starting at 1 (0 is a special capture group
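The difference between `(` and `(?:` described in the hunk above can be sketched with Python's `re` module, used here only as a familiar stand-in for Kakoune's engine. The sample address is made up for the example.

```python
import re

# Two capturing groups around one non-capturing group.
m = re.match(r"(\w+)@(?:\w+)\.(\w+)", "user@example.org")

whole = m.group(0)       # in Python's re, group 0 is the whole match
first = m.group(1)       # first capturing group
second = m.group(2)      # `(?:...)` is skipped when numbering
first_span = m.span(1)   # recorded start/end positions of group 1

print(whole, first, second, first_span)
```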
@ -94,8 +94,8 @@ matches positions.
 Alternations
 ------------
-`|` introduces an alternation, which will either match its left hand side,
-or its right hand side (preferring the left hand side)
+`|` introduces an alternation, which will either match its left-hand side,
+or its right-hand side (preferring the left-hand side)
 For example, `foo|bar` matches either `foo` or `bar`, `foo(bar|baz|qux)`
 matches `foo` followed by either `bar`, `baz` or `qux`.
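The left-hand preference mentioned in the hunk above is easiest to see when both sides could match at the same position. A short sketch with Python's `re` module follows. Python is an assumption made for illustration, although its alternation has the same leftmost-first behaviour.

```python
import re

# `foo|bar` accepts either word and nothing else.
either = [bool(re.fullmatch(r"foo|bar", s)) for s in ("foo", "bar", "baz")]

# Alternation inside a group: `foo` followed by one of three suffixes.
suffix = re.fullmatch(r"foo(bar|baz|qux)", "foobaz").group(1)

# Both sides can match at position 0; the left-hand side wins,
# even though the right-hand side would match more text.
preferred = re.match(r"foo|foobar", "foobar").group(0)

print(either, suffix, preferred)
```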
@ -116,7 +116,7 @@ by a quantifier, which specifies the number of times they can match.
 By default, quantifiers are *greedy*, which means they will prefer to
 match more characters if possible. Suffixing a quantifier with `?` will
-make it non-greedy, meaning it will prefer to match less characters.
+make it non-greedy, meaning it will prefer to match fewer characters.
 Zero width assertions
 ---------------------
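To illustrate the greedy and non-greedy quantifiers described in the hunk above, here is a small sketch with Python's `re` module. It is used as a stand-in, and the `?` suffix has the same meaning there.

```python
import re

text = "<a><b>"

# Greedy: `.+` takes as much as it can, so the match runs to the last `>`.
greedy = re.match(r"<.+>", text).group(0)

# Non-greedy: `.+?` takes as little as it can, stopping at the first `>`.
lazy = re.match(r"<.+?>", text).group(0)

print(greedy, lazy)
```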
@ -128,7 +128,7 @@ from matching if they are not fulfilled.
 character, or at the subject begin (except if specified that the
 subject begin is not a start of line).
 * `$` matches at the end of a line, that is just before a new line, or
-at the subject end (except if specified that the subject end
+at the subject end (except if specified that the subject's end
 is not an end of line).
 * `\b` matches at a word boundary, when one of the previous character
 and current character is a word character, and the other is not.
@ -144,11 +144,11 @@ More complex assertions can be expressed with lookarounds:
 * `(?=...)` is a lookahead, it will match if its content matches the text
 following the current position
 * `(?!...)` is a negative lookahead, it will match if its content does
-not matches the text following the current position
+not match the text following the current position
 * `(?<=...)` is a lookbehind, it will match if its content matches
 the text preceding the current position
 * `(?<!...)` is a negative lookbehind, it will match if its content does
-not matches the text preceding the current position
+not match the text preceding the current position
 For performance reasons lookaround contents cannot be an arbitrary
 regular expression, it must be sequence of literals, character classes
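The four lookarounds listed in the hunk above can be compared on a single string. The sketch below uses Python's `re` module purely as an illustration. Python has its own restrictions on lookbehind contents, which differ from the Kakoune restrictions the documentation describes.

```python
import re

text = "foobar foobaz barfoo"

def starts(pattern):
    """Return the start offset of every match of `pattern` in `text`."""
    return [m.start() for m in re.finditer(pattern, text)]

followed_by_bar = starts(r"foo(?=bar)")       # lookahead
not_followed_by_bar = starts(r"foo(?!bar)")   # negative lookahead
preceded_by_bar = starts(r"(?<=bar)foo")      # lookbehind
not_preceded_by_bar = starts(r"(?<!bar)foo")  # negative lookbehind

print(followed_by_bar, not_followed_by_bar,
      preceded_by_bar, not_preceded_by_bar)
```

The lookarounds consume no text, so each match is just the three characters of `foo`.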
@ -161,11 +161,11 @@ preceded by `bar` and where `foo` matches from the current position
 Modifiers
 ---------
-Some modifiers can control the matching behaviour of the atoms following
+Some modifiers can control the matching behavior of the atoms following
 them:
-* `(?i)` will enable case insensitive matching.
-* `(?I)` will disable case insensitive matching.
+* `(?i)` enables case-insensitive matching
+* `(?I)` disables case-insensitive matching
 Quoting
 -------
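The `(?i)` modifier from the Modifiers hunk above can be tried in Python's `re` module, with one caveat: Python has no `(?I)`. The scoped form `(?i:...)` is a Python-specific substitute, assumed here to approximate switching case-insensitivity on for `foo` and off again for `bar`.

```python
import re

# `(?i)` at the start makes the whole pattern case-insensitive.
whole_insensitive = bool(re.fullmatch(r"(?i)foo", "FOO"))

# Python-only approximation of `(?i)foo(?I)bar`:
# only `foo` is case-insensitive, `bar` must match exactly.
mixed_ok = bool(re.fullmatch(r"(?i:foo)bar", "FOObar"))
mixed_rejected = bool(re.fullmatch(r"(?i:foo)bar", "FOOBAR"))

print(whole_insensitive, mixed_ok, mixed_rejected)
```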