4154 Commits

Author SHA1 Message Date
Maxime Coste
0a081b9f31 Do not allow rename-session to introduce '/' in session names 2017-11-06 11:55:56 +08:00
Maxime Coste
52f4af6a83 Merge remote-tracking branch 'lenormf/fix-private-commands-in-register' 2017-11-05 12:22:28 +08:00
Maxime Coste
6bac767124 CommandManager: tweak naming 2017-11-04 16:02:21 +08:00
Maxime Coste
7f51e51fcb Introduce matching_pairs option that controls the pairs used by m 2017-11-04 15:53:53 +08:00
Frank LENORMAND
8900690288 src: Don't save whitespace-led commands in the : register 2017-11-04 09:18:26 +03:00
Maxime Coste
aa82a90c39 Remote: stricter validation of the session names
Creating a session will not accept any slashes in the session path,
connecting to an existing session will accept at most one slash to
allow for specifying the session of a different user.

Fixes #1635
2017-11-04 12:01:25 +08:00
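
The validation rule described in the commit above (no slash when creating a session, at most one slash when connecting) could be sketched roughly as follows. The function names are hypothetical and are not taken from the Kakoune sources; rejecting empty names is an extra assumption made here for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper: number of '/' characters in a session name.
inline std::size_t slash_count(std::string_view name)
{
    return static_cast<std::size_t>(std::count(name.begin(), name.end(), '/'));
}

// Creating a session: no slash is accepted at all.
// (Rejecting the empty name is an assumption of this sketch.)
inline bool is_valid_new_session_name(std::string_view name)
{
    return !name.empty() && slash_count(name) == 0;
}

// Connecting to a session: at most one slash, so that a path such as
// "otheruser/session" can name another user's session.
inline bool is_valid_session_path(std::string_view name)
{
    return !name.empty() && slash_count(name) <= 1;
}
```

With these helpers, `is_valid_new_session_name("a/b")` is rejected while `is_valid_session_path("otheruser/work")` is accepted.
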
Maxime Coste
aa9bcf08fc Code style tweak 2017-11-04 12:01:23 +08:00
Maxime Coste
9b216e0e79 Merge remote-tracking branch 'lenormf/fix-rc-aliases' 2017-11-03 19:32:30 +08:00
Maxime Coste
400ef6d48c Wrap: rework logic to avoid infinite loop with multiple wrap highlighters
The display is still going to be wrong, as wrapping is going to take place
multiple times, but Kakoune should not freeze anymore.
2017-11-03 19:30:31 +08:00
Maxime Coste
9d6420caae Remove unneeded forward declaration 2017-11-03 19:24:58 +08:00
Frank LENORMAND
9127ed0d55 src rc: Rename exec/eval into execute-keys/evaluate-commands 2017-11-03 11:09:45 +03:00
Maxime Coste
39e63cf518 Append '/' to highlighter group completion candidates 2017-11-02 18:05:18 +08:00
Maxime Coste
730e5725e9 Wrap: change indent atom to be a replaced empty buffer range
Avoid confusing the column highlighters.
2017-11-02 11:08:03 +08:00
Maxime Coste
fd95af0e3e Add information on -indent in wrap highlighter docstring 2017-11-02 11:04:15 +08:00
Maxime Coste
4fabba3d12 doc.kak: Render documentation internally instead of relying on man
doc.kak now behaves as a basic asciidoc renderer. Asciidoc is unfortunately
still a dependency to generate the manpage of the `kak` command.
2017-11-02 10:03:24 +08:00
Maxime Coste
53069bcb2d Ensure line-specs and range-specs options are sorted internally 2017-11-02 09:51:15 +08:00
Maxime Coste
329f5fca0e Fix trailing spaces in highlighters.cc 2017-11-02 01:28:28 +08:00
Maxime Coste
6f2088cbc4 Wrap: Add -indent switch support that wraps preserving line indent 2017-11-02 01:28:28 +08:00
Maxime Coste
ed65d86c72 Rename doc/manpages to doc/pages
The fact that we use man for these is an implementation detail.
2017-11-01 19:05:37 +08:00
Maxime Coste
25dac6b24e Document the regex impl switch in the startup message 2017-11-01 14:18:13 +08:00
Maxime Coste
51de90f366 Regex: Remove boost related code 2017-11-01 14:09:39 +08:00
Maxime Coste
c74becc6af Regex: fix RegexCompileFlags not being an enum class 2017-11-01 14:05:15 +08:00
Maxime Coste
2d901dc76f Regex: slight readability improvement and workaround a potential gcc bug 2017-11-01 14:05:15 +08:00
Maxime Coste
f07375fb27 Regex: remove dead code 2017-11-01 14:05:15 +08:00
Maxime Coste
2c2073b417 Regex: Tweak struct layouts of ParsedRegex data 2017-11-01 14:05:15 +08:00
Maxime Coste
bbd7e604dc Regex: Remove "Ast" from names in the ParsedRegex
It does not add much value, and makes names longer.
2017-11-01 14:05:15 +08:00
Maxime Coste
18a02ccacd Regex: Optimize parsing and compilation
AstNodes are now POD, stored in a single vector, accessed through
their index. The children list is implicit, with nodes storing only
the node index at which their child graph ends.

That makes reverse iteration slower, but that is only used for reverse
matching regex, which are uncommon. In the general case compilation
is now faster.
2017-11-01 14:05:15 +08:00
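
The layout described in the commit above (plain nodes in a single vector, children found implicitly through an end index) might look roughly like the sketch below. The `Node` fields, the `op` letters, and the helper names are assumptions made for illustration, not the actual ParsedRegex definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using NodeIndex = uint16_t;

// Hypothetical node: trivially copyable, no pointers, no child list.
struct Node
{
    char      op;            // 'S' sequence, 'A' alternation, 'L' literal
    char      value;         // literal character, unused for other ops
    NodeIndex children_end;  // index one past this node's last descendant
};

// Nodes are stored in pre-order, so the first child of a node sits right
// after it, and the next sibling starts where the current child's
// descendants end.
template<typename Func>
void for_each_child(const std::vector<Node>& nodes, NodeIndex index, Func&& func)
{
    assert(index < nodes.size());
    const NodeIndex end = nodes[index].children_end;
    for (NodeIndex child = index + 1; child != end; child = nodes[child].children_end)
        func(child);
}

inline int child_count(const std::vector<Node>& nodes, NodeIndex index)
{
    int count = 0;
    for_each_child(nodes, index, [&](NodeIndex) { ++count; });
    return count;
}

// The tree for a(b|c), laid out by hand.
inline std::vector<Node> make_example()
{
    return {
        {'S', 0,   5}, // 0: sequence, descendants are nodes 1..4
        {'L', 'a', 2}, // 1: literal a (leaf)
        {'A', 0,   5}, // 2: alternation, descendants are nodes 3..4
        {'L', 'b', 4}, // 3: literal b (leaf)
        {'L', 'c', 5}, // 4: literal c (leaf)
    };
}
```

Forward iteration only needs one index jump per child. Visiting children in reverse has no such shortcut, because a node does not record where its previous sibling starts, which is consistent with the trade-off the commit message mentions.
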
Maxime Coste
aea2de885d Regex: minor cleanup of the regex parsing code 2017-11-01 14:05:15 +08:00
Maxime Coste
6e0275e550 Regex: small code cleanup in the Save compilation code 2017-11-01 14:05:15 +08:00
Maxime Coste
9e15207d2a Regex: put the other char boolean inside the general start char map 2017-11-01 14:05:15 +08:00
Maxime Coste
7c3bc48627 Fix ConstexprVector::resize 2017-11-01 14:05:15 +08:00
Maxime Coste
60e32d73ff Regex: Fix handling of all unicode codepoint as start chars 2017-11-01 14:05:15 +08:00
Maxime Coste
df2bf9601c Regex: fix wrong fallthrough in dump_regex 2017-11-01 14:05:15 +08:00
Maxime Coste
e9e9a08e7b Regex: refactor handling of Saves slightly, do not create them until really needed 2017-11-01 14:05:15 +08:00
Maxime Coste
d9b4076e3c Regex: Go back to instruction based search of next start
The previous method, which was a bit faster in the general use case,
can hit some cases where we get quadratic behaviour and very slow
matching.

By using an instruction, we can guarantee our complexity of O(N*M)
as we will never have more than N threads (N being the instruction
count) and we run the threads once per codepoint in the subject
string.

That slows down the general case slightly, but ensures we don't have
pathological cases.

This new version is much faster than the previous instruction based
search because it does not use a plain `.*` searcher, but a specific,
smarter instruction specialized for finding the next start if we are
in the correct conditions.
2017-11-01 14:05:15 +08:00
Maxime Coste
3f627058b0 Regex: add support for \0, \cX, \xXX and \uXXXX escapes 2017-11-01 14:05:15 +08:00
Maxime Coste
c423b47109 Regex: compute if codepoints outside of the start chars map can start 2017-11-01 14:05:15 +08:00
Maxime Coste
2c6c0be0c1 Regex: abort compilation as soon as we hit the instruction count limit 2017-11-01 14:05:15 +08:00
Maxime Coste
d44e160aa7 Regex: add a unit test for why lookaheads don't count for start chars anymore 2017-11-01 14:05:15 +08:00
Maxime Coste
87eec79d07 Regex: comment the mutables in CompiledRegex::Instruction and fix their init 2017-11-01 14:05:14 +08:00
Maxime Coste
8b2297f5ca Regex: Introduce a Regex memory domain to track usage separately 2017-11-01 14:05:14 +08:00
Maxime Coste
9ec175f2f8 Regex: use binary search for character class ranges check 2017-11-01 14:05:14 +08:00
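
A binary search over character class ranges, as named in the commit above, could be sketched as below. The `Range` struct and the `in_ranges` function are hypothetical, and the sketch assumes the ranges are sorted by lower bound and do not overlap.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical inclusive codepoint range, e.g. {U'a', U'z'}.
struct Range
{
    char32_t min;
    char32_t max;
};

// Assumes `ranges` is sorted by `min` and the ranges do not overlap.
inline bool in_ranges(const std::vector<Range>& ranges, char32_t cp)
{
    // Find the first range whose upper bound is not below cp.
    auto it = std::lower_bound(ranges.begin(), ranges.end(), cp,
                               [](const Range& range, char32_t c) { return range.max < c; });
    // cp belongs to the class only if that range also starts at or before cp.
    return it != ranges.end() && it->min <= cp;
}
```

For a class such as `[0-9A-Za-z]` this takes a logarithmic number of comparisons in the number of ranges, instead of testing every range in turn.
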
Maxime Coste
6e65589a34 Regex: compute start chars from matchers, do not compute it from lookarounds
Computing potential start characters from lookarounds is more complex
than expected, and not worth the complexity.
2017-11-01 14:05:14 +08:00
Maxime Coste
621b0d3ab8 Regex: remove the need for a processed inst vector
Identify each step with a counter, and check if the instruction
was already processed this step. This makes the matching faster,
by removing the need to maintain a vector of instructions executed
this step.
2017-11-01 14:05:14 +08:00
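
The step counter idea from the commit above can be illustrated with a small sketch. `Inst`, `StepTracker`, and their members are hypothetical names, not the actual CompiledRegex or ThreadedRegexVM code, and counter wrap-around is ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical instruction record: it remembers the last step that ran it.
struct Inst
{
    uint32_t last_step = 0; // 0 means "never processed"
};

class StepTracker
{
public:
    explicit StepTracker(std::size_t inst_count) : m_insts(inst_count) {}

    // Called once per codepoint of the subject string.
    void next_step() { ++m_step; }

    // Returns true the first time an instruction is seen in the current
    // step, false if it was already processed during this step.
    bool mark_processed(std::size_t index)
    {
        assert(index < m_insts.size());
        Inst& inst = m_insts[index];
        if (inst.last_step == m_step)
            return false;
        inst.last_step = m_step;
        return true;
    }

private:
    std::vector<Inst> m_insts;
    uint32_t m_step = 1; // starts at 1 so that 0 never matches a live step
};
```

Moving to the next step is a single increment. Nothing has to be cleared and no list of already executed instructions has to be maintained.
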
Maxime Coste
cfc52d7e6a Regex: use intrusive linked list for the free saves instead of a Vector 2017-11-01 14:05:14 +08:00
Maxime Coste
df16fea82d Regex: rename "flags" with the more common "modifiers" 2017-11-01 14:05:14 +08:00
Maxime Coste
52d443f764 Regex: Correctly handle ignore case mode for start chars computation 2017-11-01 14:05:14 +08:00
Maxime Coste
b8495f0953 Regex: Rework parsing, treat lookarounds as assertions, and flags separately 2017-11-01 14:05:14 +08:00
Maxime Coste
b0233262b8 Regex: Limit programs to std::numeric_limits<uint16_t>::max() instructions 2017-11-01 14:05:14 +08:00
Maxime Coste
8c8dcb3a84 Regex: Fix reverse searching behaviour, again 2017-11-01 14:05:14 +08:00
Maxime Coste
9753bcd0ad Regex: limit explicit quantifiers value (to 1000 for now)
Fixes #1628
2017-11-01 14:05:14 +08:00
Maxime Coste
2b97e4e124 Regex: Fix handling of ^ and $ in backward matching mode 2017-11-01 14:05:14 +08:00
Maxime Coste
3c999aba37 Regex: Only reset processed and scheduled flags on relevant instructions
On big regexes, resetting all those flags on all instructions for each
character can become the dominant operation. Track the actual
instruction indices processed (the scheduled ones are already tracked in
the next_threads vector), and only reset these.
2017-11-01 14:05:14 +08:00
Maxime Coste
5bf4be645a Regex: Fix support for ignore case in lookarounds 2017-11-01 14:05:14 +08:00
Maxime Coste
80f6caee81 Regex: move try/catch blocks inside boost specific code 2017-11-01 14:05:14 +08:00
Maxime Coste
dd9e43e6f9 Regex: small code cleanup 2017-11-01 14:05:14 +08:00
Maxime Coste
23b3a221eb Regex: support more than two children in alternations
Avoid deep nested alternations, parse them flattened.
2017-11-01 14:05:14 +08:00
Maxime Coste
fb5243f710 Regex: print instruction index in dump_regex 2017-11-01 14:05:14 +08:00
Maxime Coste
c8966ca701 Regex: Assert that the regex direction matches the vm direction 2017-11-01 14:05:14 +08:00
Maxime Coste
74ed102cab Regex: Tweak definition of character class and control escape tables 2017-11-01 14:05:14 +08:00
Maxime Coste
df73b71dfc Regex: fix lookarounds handling when computing starting chars 2017-11-01 14:05:14 +08:00
Maxime Coste
1c95074657 Make use of custom regex backward searching support for reverse search 2017-11-01 14:05:14 +08:00
Maxime Coste
785cd34b4b Regex: Make boost checking disableable at compile time 2017-11-01 14:05:14 +08:00
Maxime Coste
065bbc8f59 Regex: switch to custom impl, use boost for checking 2017-11-01 14:05:14 +08:00
Maxime Coste
9305fa1369 Regex: Fix lookaround use in moon.kak
(?=[A-Z]\w*) is strictly the same as (?=[A-Z]) as \w* will always
at least match an empty string.
2017-11-01 14:05:14 +08:00
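
The equivalence claimed in the commit above can be checked informally with the standard library. This sketch uses `std::regex` (ECMAScript grammar), not Kakoune's own engine, and `same_first_match` is a hypothetical helper written for this illustration.

```cpp
#include <regex>
#include <string>

// Returns true when both patterns agree on the first match in `subject`:
// either both fail, or both match at the same position with the same length.
inline bool same_first_match(const std::string& subject,
                             const std::string& lhs, const std::string& rhs)
{
    std::smatch a, b;
    const bool found_a = std::regex_search(subject, a, std::regex{lhs});
    const bool found_b = std::regex_search(subject, b, std::regex{rhs});
    if (found_a != found_b)
        return false;
    return !found_a ||
           (a.position(0) == b.position(0) && a.length(0) == b.length(0));
}
```

Calling it with the patterns `(?=[A-Z]\w*)\w+` and `(?=[A-Z])\w+` on a subject such as `foo Bar baz` should report agreement: once `[A-Z]` has matched inside the lookahead, `\w*` can always succeed by matching nothing, so it never changes the outcome.
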
Maxime Coste
cca730193c Regex: Support any char and character classes in lookarounds
Lookarounds still need to be fixed size, but accept character classes
as well as plain literals.
2017-11-01 14:05:14 +08:00
Maxime Coste
b8cb65160a Regex: use std::conditional instead of custom template class to choose Utf8It 2017-11-01 14:05:14 +08:00
Maxime Coste
db06acdfab Regex: Fix computation of potential starts for lookaheads 2017-11-01 14:05:14 +08:00
Maxime Coste
34b1f1ccb6 Regex: detect when all characters can start and avoid allocating 2017-11-01 14:05:14 +08:00
Maxime Coste
ea85f79384 Regex: add elided braces to fix compilation on older gcc 2017-11-01 14:05:14 +08:00
Maxime Coste
bf3b50a543 Regex: Fix wrong size of character_class_escapes array 2017-11-01 14:05:14 +08:00
Maxime Coste
08ea68dc1f Regex: Fix handling of match_prev_avail for boost regex
We were passing around iterators that were not allowed to
go before the begin iterator.
2017-11-01 14:05:14 +08:00
Maxime Coste
9ec376135b Regex: Introduce RegexExecFlags::PrevAvailable
Rework assertion code as well.
2017-11-01 14:05:14 +08:00
Maxime Coste
73e177ec59 Regex: Do not use sized deallocation to support more compilers 2017-11-01 14:05:14 +08:00
Maxime Coste
30dacdade2 Regex: deallocate Saves memory on ThreadedRegexVM destruction 2017-11-01 14:05:14 +08:00
Maxime Coste
578640c8a4 Regex: Fix handling of control escapes inside character classes 2017-11-01 14:05:14 +08:00
Maxime Coste
f3736a4b48 Regex: tag instructions as scheduled as well instead of searching
And a few more code cleanups in the ThreadedRegexVM
2017-11-01 14:05:14 +08:00
Maxime Coste
6bc5823745 Regex: refactor ThreadedRegexVM::exec_from code 2017-11-01 14:05:14 +08:00
Maxime Coste
4ff655cc09 Regex: store the processed flag directly in CompiledRegex instructions 2017-11-01 14:05:14 +08:00
Maxime Coste
732b8bc2a4 Regex: abandon bytecode and just use a simple list of instructions
Makes the code simpler.
2017-11-01 14:05:14 +08:00
Maxime Coste
6434bca325 Regex: Add some comments, remove spurious semicolons 2017-11-01 14:05:14 +08:00
Maxime Coste
911a893225 Regex: fix get_base(std::reverse_iterator<...>) returning a ref to temporary 2017-11-01 14:05:14 +08:00
Maxime Coste
11abd544c6 Regex: avoid infinite loops 2017-11-01 14:05:14 +08:00
Maxime Coste
c47cdc06a7 Regex: Add support for backward matching
Regex can be compiled for backward matching instead of forward matching
and the ThreadedRegexVM is able to iterate in reverse on the subject
string to find the last match instead of the first.
2017-11-01 14:05:14 +08:00
Maxime Coste
071b897e00 Regex: Remove static RegexCompiler::compile 2017-11-01 14:05:14 +08:00
Maxime Coste
52ee62172a Regex: remove use of buffer_utils.hh from regex_impl.cc 2017-11-01 14:05:14 +08:00
Maxime Coste
c375268c2d Regex: Use memcpy to write/read offsets from bytecode
reinterpret_cast was undefined behaviour as we do not guarantee
that offsets are going to be stored properly aligned.
2017-11-01 14:05:14 +08:00
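
The technique named in the commit above, copying offsets in and out of a byte buffer with `memcpy` instead of casting the buffer pointer, could look like this sketch. The `Offset` type and the function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Offset = uint32_t; // hypothetical offset type

// Append an offset to the bytecode. memcpy has no alignment requirement,
// so the write is well defined whatever the current buffer size is.
inline void write_offset(std::vector<char>& bytecode, Offset offset)
{
    const std::size_t pos = bytecode.size();
    bytecode.resize(pos + sizeof(Offset));
    std::memcpy(bytecode.data() + pos, &offset, sizeof(Offset));
}

// Read an offset back from any byte position, aligned or not.
inline Offset read_offset(const std::vector<char>& bytecode, std::size_t pos)
{
    assert(pos + sizeof(Offset) <= bytecode.size());
    Offset offset;
    std::memcpy(&offset, bytecode.data() + pos, sizeof(Offset));
    return offset;
}
```

Dereferencing `reinterpret_cast<const Offset*>(bytecode.data() + pos)` would be undefined behaviour when `pos` is not suitably aligned. Compilers typically turn a fixed-size `memcpy` like this into a plain load or store, so the safe version usually costs nothing extra.
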
Maxime Coste
b53227d62c Regex: slight cleanup of the unit tests 2017-11-01 14:05:14 +08:00
Maxime Coste
337e58d4f9 Regex: Cleanup character class parsing a bit 2017-11-01 14:05:14 +08:00
Maxime Coste
236751cb84 Regex: Make ThreadedRegexVM a proper class, define a proper interface 2017-11-01 14:05:14 +08:00
Maxime Coste
3b69dda04e Regex: Find potential start position using a map of valid start chars
With this optimization we get close to performance parity with boost
regex on the common use cases in Kakoune.
2017-11-01 14:05:14 +08:00
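
A map of valid start characters, as described in the commit above, lets the matcher skip positions that cannot begin a match before running the full regex program. The sketch below works on bytes for simplicity, whereas the commit log speaks of codepoints; `StartChars` and both function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical map: true for every byte value that can begin a match.
using StartChars = std::array<bool, 256>;

// Build the map for a pattern whose matches can only begin with one of `chars`.
inline StartChars make_start_chars(std::string_view chars)
{
    StartChars map{};
    for (unsigned char c : chars)
        map[c] = true;
    return map;
}

// Skip every position that cannot start a match. Returns the first
// candidate position at or after `pos`, or npos if there is none.
inline std::size_t find_next_start(std::string_view subject, std::size_t pos,
                                   const StartChars& map)
{
    assert(pos <= subject.size());
    while (pos < subject.size() && !map[static_cast<unsigned char>(subject[pos])])
        ++pos;
    return pos < subject.size() ? pos : std::string_view::npos;
}
```

For a pattern like `hello|world` the map would hold only `h` and `w`, so the expensive matching code runs only at positions holding one of those two characters.
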
Maxime Coste
741772aef9 Regex: Optimize single char character classes as literals 2017-11-01 14:05:14 +08:00
Maxime Coste
fabeab1ee1 Regex: reorder lookaround ops, group by direction 2017-11-01 14:05:14 +08:00
Maxime Coste
854144c535 Regex: Fix handling of Save instruction in ThreadedRegexVM
When not saving, we were not fully reading the instruction stream,
leading to an out of sync instruction pointer.
2017-11-01 14:05:14 +08:00
Maxime Coste
f1b4931824 Regex: Fix handling of non capturing groups (?:...)
We were wrongly keeping the `:` as a literal content of the group
2017-11-01 14:05:14 +08:00
Maxime Coste
5f6e71c4dc Regex: More code tweaks and cleanups in ThreadedRegexVM 2017-11-01 14:05:14 +08:00
Maxime Coste
5f54e0de0e Regex: Code cleanup and refactor for Saves handling 2017-11-01 14:05:14 +08:00
Maxime Coste
dbb175841b Regex: do not write the search prefix inside the program bytecode
It's faster to have specialized code in the VM directly
2017-11-01 14:05:14 +08:00
Maxime Coste
cf5055f68b Regex: small code tweak 2017-11-01 14:05:14 +08:00
Maxime Coste
e0fac20f6c Regex: Use a custom allocated buffer for Saves instead of a Vector 2017-11-01 14:05:14 +08:00