Maxime Coste
3c2159f06c
options.asciidoc: Document other option commands, remove tabs
2017-11-02 17:36:10 +08:00
Maxime Coste
730e5725e9
Wrap: change indent atom to be a replaced empty buffer range
...
Avoid confusing the column highlighters.
2017-11-02 11:08:03 +08:00
Maxime Coste
fd95af0e3e
Add information on -indent in wrap highlighter docstring
2017-11-02 11:04:15 +08:00
Maxime Coste
4fabba3d12
doc.kak: Render documentation internally instead of relying on man
...
doc.kak now behaves as a basic asciidoc renderer. Asciidoc is unfortunately
still a dependency to generate the manpage of the `kak` command.
2017-11-02 10:03:24 +08:00
Maxime Coste
90865b65cd
asciidoc.kak: Highlight ^=+ syntax sections
2017-11-02 09:52:18 +08:00
Maxime Coste
53069bcb2d
Ensure line-specs and range-specs options are sorted internally
2017-11-02 09:51:15 +08:00
Maxime Coste
424b2389cb
kakrc.kak: Fix highlighting of key words at start of buffer
2017-11-02 01:28:37 +08:00
Maxime Coste
329f5fca0e
Fix trailing spaces in highlighters.cc
2017-11-02 01:28:28 +08:00
Maxime Coste
6f2088cbc4
Wrap: Add -indent switch support that wraps preserving line indent
2017-11-02 01:28:28 +08:00
Maxime Coste
6bc408e9b9
Remove duplicated documentation from the README
...
Just point towards the relevant doc page.
2017-11-01 19:49:13 +08:00
Maxime Coste
ed65d86c72
Rename doc/manpages to doc/pages
...
The fact that we use man for these is an implementation detail.
2017-11-01 19:05:37 +08:00
Maxime Coste
412c21bf70
Update highlighters documentation
...
Remove documentation from the README and point to the highlighters
doc.
2017-11-01 19:00:44 +08:00
Maxime Coste
25dac6b24e
Document the regex impl switch in the startup message
2017-11-01 14:18:13 +08:00
Maxime Coste
09de0686ef
Remove remaining references to boost from documentation/contrib files
2017-11-01 14:15:11 +08:00
Maxime Coste
51de90f366
Regex: Remove boost related code
2017-11-01 14:09:39 +08:00
Maxime Coste
2b295a265e
Regex: Add a Compatibility section to the regex documentation
...
Refer more explicitly to ECMAScript and document the
incompatibilities with it.
2017-11-01 14:05:15 +08:00
Maxime Coste
c74becc6af
Regex: fix RegexCompileFlags not being an enum class
2017-11-01 14:05:15 +08:00
Maxime Coste
2d901dc76f
Regex: slight readability improvement and work around a potential gcc bug
2017-11-01 14:05:15 +08:00
Maxime Coste
f07375fb27
Regex: remove dead code
2017-11-01 14:05:15 +08:00
Maxime Coste
2c2073b417
Regex: Tweak struct layouts of ParsedRegex data
2017-11-01 14:05:15 +08:00
Maxime Coste
bbd7e604dc
Regex: Remove "Ast" from names in the ParsedRegex
...
It does not add much value, and makes names longer.
2017-11-01 14:05:15 +08:00
Maxime Coste
18a02ccacd
Regex: Optimize parsing and compilation
...
AstNodes are now POD, stored in a single vector, accessed through
their index. The children list is implicit, with nodes storing only
the node index at which their child graph ends.
That makes reverse iteration slower, but that is only used for reverse
matching regex, which are uncommon. In the general case compilation
is now faster.
2017-11-01 14:05:15 +08:00
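
The flat node layout described in the commit above can be sketched roughly as follows. This is a minimal illustration and not the actual Kakoune code: the names `Node`, `NodeIndex`, `children_end`, `children_of` and `children_of_reversed` are hypothetical, and the payload is reduced to a single character. Each node stores only the index at which its subtree ends, so the first child sits right after its parent and the next sibling starts where the previous child's subtree ends.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using NodeIndex = std::uint16_t;

// Hypothetical plain-old-data node, stored with all other nodes in one vector.
struct Node
{
    char value;             // illustrative payload
    NodeIndex children_end; // index one past the last node of this node's subtree
};

// Collect the direct children of `parent` in forward order.
// The first child is at parent + 1; each following sibling starts where
// the previous child's subtree ends.
std::vector<NodeIndex> children_of(const std::vector<Node>& nodes, NodeIndex parent)
{
    std::vector<NodeIndex> res;
    const NodeIndex end = nodes[parent].children_end;
    for (NodeIndex child = static_cast<NodeIndex>(parent + 1); child != end;
         child = nodes[child].children_end)
        res.push_back(child);
    return res;
}

// There are no back links, so reverse iteration first walks forward and
// then reverses the result. This is the slower path the commit mentions.
std::vector<NodeIndex> children_of_reversed(const std::vector<Node>& nodes, NodeIndex parent)
{
    auto res = children_of(nodes, parent);
    std::reverse(res.begin(), res.end());
    return res;
}
```

In this sketch forward traversal needs no extra allocation per node, which is the presumed reason the general case compiles faster, while reverse traversal pays for the missing back links.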
Maxime Coste
aea2de885d
Regex: minor cleanup of the regex parsing code
2017-11-01 14:05:15 +08:00
Maxime Coste
6e0275e550
Regex: small code cleanup in the Save compilation code
2017-11-01 14:05:15 +08:00
Maxime Coste
9e15207d2a
Regex: put the other char boolean inside the general start char map
2017-11-01 14:05:15 +08:00
Maxime Coste
7c3bc48627
Fix ConstexprVector::resize
2017-11-01 14:05:15 +08:00
Maxime Coste
60e32d73ff
Regex: Fix handling of all Unicode codepoints as start chars
2017-11-01 14:05:15 +08:00
Maxime Coste
df2bf9601c
Regex: fix wrong fallthrough in dump_regex
2017-11-01 14:05:15 +08:00
Maxime Coste
e9e9a08e7b
Regex: refactor handling of Saves slightly, do not create them until really needed
2017-11-01 14:05:15 +08:00
Maxime Coste
d9b4076e3c
Regex: Go back to instruction based search of next start
...
The previous method, which was a bit faster in the general use case,
can hit some cases where we get quadratic behaviour and very slow
matching.
By using an instruction, we can guarantee our complexity of O(N*M)
as we will never have more than N threads (N being the instruction
count) and we run the threads once per codepoint in the subject
string.
That slows down the general case slightly, but ensures we don't have
pathological cases.
This new version is much faster than the previous instruction based
search because it does not use a plain `.*` searcher, but a specific,
smarter instruction specialized for finding the next start if we are
in the correct conditions.
2017-11-01 14:05:15 +08:00
Maxime Coste
3f627058b0
Regex: add support for \0, \cX, \xXX and \uXXXX escapes
2017-11-01 14:05:15 +08:00
Maxime Coste
c423b47109
Regex: compute if codepoints outside of the start chars map can start
2017-11-01 14:05:15 +08:00
Maxime Coste
2c6c0be0c1
Regex: abort compilation as soon as we hit the instruction count limit
2017-11-01 14:05:15 +08:00
Maxime Coste
b59ad2f09d
Regex: change description of lookarounds limitations
2017-11-01 14:05:15 +08:00
Maxime Coste
3d0a0f1369
Regex: apply danr's suggested changes to the regex syntax documentation
2017-11-01 14:05:15 +08:00
Maxime Coste
d44e160aa7
Regex: add a unit test for why lookaheads don't count for start chars anymore
2017-11-01 14:05:15 +08:00
Maxime Coste
87eec79d07
Regex: comment the mutables in CompiledRegex::Instruction and fix their init
2017-11-01 14:05:14 +08:00
Maxime Coste
8b2297f5ca
Regex: Introduce a Regex memory domain to track usage separately
2017-11-01 14:05:14 +08:00
Maxime Coste
9ec175f2f8
Regex: use binary search for character class range checks
2017-11-01 14:05:14 +08:00
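
A binary search over character class ranges, as named in the commit above, might look like the sketch below. It assumes the ranges are inclusive, sorted by lower bound and non-overlapping. The names `Range` and `in_ranges` are hypothetical and are not taken from the Kakoune sources.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical inclusive codepoint range, e.g. {'a', 'z'}.
struct Range
{
    char32_t min;
    char32_t max;
};

// Returns true if `cp` falls inside one of `ranges`.
// Assumes `ranges` is sorted by `min` and that ranges do not overlap.
bool in_ranges(const std::vector<Range>& ranges, char32_t cp)
{
    // Find the first range whose lower bound is strictly greater than cp;
    // the only candidate that can contain cp is the range just before it.
    auto it = std::upper_bound(ranges.begin(), ranges.end(), cp,
                               [](char32_t c, const Range& r) { return c < r.min; });
    return it != ranges.begin() && cp <= (it - 1)->max;
}
```

With this layout a lookup costs O(log R) comparisons for R ranges, instead of the O(R) of a linear scan.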
Maxime Coste
6e65589a34
Regex: compute start chars from matchers, do not compute it from lookarounds
...
Computing potential start characters from lookarounds is more complex
than expected, and not worth the complexity.
2017-11-01 14:05:14 +08:00
Maxime Coste
621b0d3ab8
Regex: remove the need for a processed inst vector
...
Identify each step with a counter, and check if the instruction
was already processed this step. This makes the matching faster,
by removing the need to maintain a vector of instructions executed
this step.
2017-11-01 14:05:14 +08:00
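
The step counter technique described in the commit above can be illustrated with the following sketch. It is an assumption-laden simplification and not the actual matcher: `Inst`, `StepTracker`, `begin_step` and `mark` are hypothetical names, and counter wrap-around is ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical instruction record: alongside its opcode, it remembers the
// last step in which it was processed.
struct Inst
{
    int op;
    std::uint32_t last_step; // 0 means "never processed"
};

struct StepTracker
{
    std::vector<Inst> program;
    std::uint32_t current_step; // start at 0; the first begin_step() makes it 1

    // Start a new step. All instructions become "unprocessed" in O(1),
    // with no vector of processed instructions to clear.
    // (Wrap-around of the counter is not handled in this sketch.)
    void begin_step() { ++current_step; }

    // Returns true the first time it is called for `index` during the
    // current step, and false on every later call in the same step.
    bool mark(std::size_t index)
    {
        Inst& inst = program[index];
        if (inst.last_step == current_step)
            return false;
        inst.last_step = current_step;
        return true;
    }
};
```

A matcher built this way would call `begin_step()` once per codepoint of the subject string and skip any instruction for which `mark()` returns false.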
Maxime Coste
cfc52d7e6a
Regex: use intrusive linked list for the free saves instead of a Vector
2017-11-01 14:05:14 +08:00
Frank LENORMAND
3acb75c5c2
Regex: Fix a few mistakes in the documentation
2017-11-01 14:05:14 +08:00
Maxime Coste
8c529d3cff
Regex: add a regex.asciidoc documentation page describing the syntax
2017-11-01 14:05:14 +08:00
Maxime Coste
df16fea82d
Regex: rename "flags" with the more common "modifiers"
2017-11-01 14:05:14 +08:00
Maxime Coste
52d443f764
Regex: Correctly handle ignore case mode for start chars computation
2017-11-01 14:05:14 +08:00
Maxime Coste
b8495f0953
Regex: Rework parsing, treat lookarounds as assertions, and flags separately
2017-11-01 14:05:14 +08:00
Maxime Coste
b0233262b8
Regex: Limit programs to std::numeric_limits<uint16_t>::max() instructions
2017-11-01 14:05:14 +08:00
Maxime Coste
8c8dcb3a84
Regex: Fix reverse searching behaviour, again
2017-11-01 14:05:14 +08:00
Maxime Coste
9753bcd0ad
Regex: limit explicit quantifiers value (to 1000 for now)
...
Fixes #1628
2017-11-01 14:05:14 +08:00