kakoune

Author	SHA1	Message	Date
Maxime Coste	2b74cd4b59	Remove instructions from ExecConfig We can just compute whenever we reset last_step, which does not happen often and we know `forward` at compile time anyway	2023-02-19 11:46:17 +11:00
Maxime Coste	f115af7a57	Optimize Regex CharacterClass matching Take advantage of ranges sorting to early out, make the logic inline.	2023-02-19 11:16:14 +11:00
Maxime Coste	85ceef29bd	Fix broken corner cases in DualThreadStack::grow_ifn We only grow when the ring buffer is full, which allows for a nice simplification of the code. Tell grow_ifn if we pushed in current or next so that we can distinguish between filled by next or filled by current when m_current == m_next_begin	2023-02-14 17:13:31 +11:00
Maxime Coste	d708b77186	Refactor DualThreadStack as a RingBuffer Instead of two stacks growing from the two ends of a buffer, use a ring buffer growing from the same mid spot. This avoids the costly memory copy every step when we set next threads as the current ones.	2023-02-14 07:04:54 +11:00
Maxime Coste	762064dc68	Remove scheduled optimization from ThreadedRegexVM This does not seem to actually speed up execution as threads will be dropped on next step anyway	2023-02-13 21:15:55 +11:00
Maxime Coste	f5d5274c5f	Fix incorrect use of subject end/begin in regex execution This could lead to reading past subject string end in certain conditions Fixes #4794	2023-01-23 17:38:02 +11:00
Maxime Coste	0c1d4808fa	Slight code style tweak	2022-08-20 11:03:03 +02:00
Maxime Coste	21047db4a0	Remove unnecessary utf8 decoding when looking for EOL in regex	2022-08-20 11:03:03 +02:00
Maxime Coste	c8c8051bd0	Refactor RegionsHighlighter to share regexes Instead of storing regexes in each region, move them to the core highlighter in a hash map so that regexes shared between different regions are only applied once per update instead of once per region. Also change iteration logic to apply all regexes together to each changed line to improve memory locality on big buffers. For the big_markdown.md file described in #4685 this reduces initial display time from 3.55s to 2.41s on my machine.	2022-08-20 11:02:59 +02:00
Maxime Coste	ca71d8997d	Reuse existing character classes when possible in regex	2022-08-05 20:31:39 +10:00
Maxime Coste	33e81af0f3	Fix regex alternation execution priority The ThreadedRegexVM implementation does not execute split opcodes as expected: on split the pending thread is pushed on top of the thread stack, which means that when multiple splits are executed in a row (such as with a disjunction with 3 or more branches) the last split target gets on top of the thread stack and gets executed next (when the thread from the first split target would be the expected one) Fixing this in the ThreadedRegexVM would have a performance impact as we would not be able to use a plain stack for current threads, so the best solution at the moment is to reverse the order of splits generated by a disjunction. Fixes #4519	2022-02-02 14:51:17 +11:00
Maxime Coste	ba379cba52	Micro-optimize regex character class/type matching Also force-inline step_thread as function call overhead has a measurable impact.	2021-11-21 09:44:22 +11:00
Maxime Coste	8566ae14a0	Reduce the amount of Regex VM Instruction code Merge all lookarounds into the same instruction, merge splits, merge literal ignore case with literal... Besides reducing the amount of almost duplicated code, this improves performance by reducing pressure on the (often failing) branch target prediction for instruction dispatching by moving branches into the instruction code themselves where they are more likely to be well predicted.	2021-11-21 09:44:18 +11:00
Maxime Coste	da80a8cf6a	Raise ThreadedVM initial thread capacity to 16 Threads are 4 bytes, an initial capacity of 4 led to allocating 16 bytes, raising that to 64 bytes seems quite reasonable.	2021-03-03 20:51:24 +11:00
Maxime Coste	d539e8fb89	Do not decode utf-8 when looking for regex next start There is no need to decode as we know any non-ascii characters will be treated as Other in the StartDesc.	2019-12-04 22:33:11 +11:00
Jason Felice	d26bb0ce2b	Add static or const where useful	2019-11-09 12:53:45 -05:00
Maxime Coste	d9d2140ea2	Fix regex not always selecting the leftmost longest match (Actually the rightmost longest match when searching backwards) Fixes #2710	2019-02-04 17:33:29 +11:00
Maxime Coste	77b1216ace	Add a peephole optimization pass to the regex compiler	2019-01-20 22:59:28 +11:00
Maxime Coste	0364a99827	Refactor regex find next start not to be an instruction anymore The same logic can be hard coded, avoiding one thread and 3 instructions, improving the regex matching speed.	2019-01-20 22:59:28 +11:00
Maxime Coste	fd043435e5	Split compile time regex flags from runtime ones	2019-01-20 22:59:28 +11:00
Maxime Coste	328c497be2	Add support for named captures to the regex impl and regex highlighter ECMAScript is adding support for it, and it is a pretty isolated change to do. Fixes #2293	2019-01-03 22:55:50 +11:00
Maxime Coste	ef3419edbf	Do not pass thread to failed/consumed, capture it implicitly	2018-12-19 19:16:14 +11:00
Maxime Coste	0b9f782691	Take iterators by const-ref in ThreadedRegexVM::exec	2018-12-19 19:14:42 +11:00
Maxime Coste	021ba55b38	Small code tweak in DualThreadStack::swap_next	2018-11-14 17:50:17 +11:00
Maxime Coste	8c2c3d27ad	Fix memory leak in DualThreadStack Fixes #2556	2018-11-07 12:28:41 +11:00
Maxime Coste	7f83c41256	align ThreadedRegexVM::Thread to permit fused copy optimization Aligning makes gcc able to copy a Thread object with a single 32-bit mov instruction instead of two 16-bit ones.	2018-11-06 20:13:09 +11:00
Maxime Coste	05a9eb62f4	Never grow the DualThreadStack in push_next As we do at most one push_next per step_thread, and we pop_current before step_thread, we can avoid a branch there at the expense of sometimes growing unnecessarily (once).	2018-11-06 07:32:47 +11:00
Maxime Coste	7fbde0d44e	Various micro performance tweaks in ThreadedRegexVM	2018-11-05 21:54:29 +11:00
Maxime Coste	7959c7f731	Refactor ThreadedRegexVM::exec_program to avoid branching Moving logic into step_thread instead of returning an enum to select what to run avoids the switch logic and improves run time.	2018-11-05 19:46:53 +11:00
Maxime Coste	7463a0d449	Remove use of utf8::iterator in regex execution This avoids having two copies of the subject string bounds, one in the ExecConfig and one in the utf8 iterator.	2018-11-05 08:17:50 +11:00
Maxime Coste	4ac7df3842	Remove most regex impl special casing for backwards matching	2018-11-03 13:52:40 +11:00
Maxime Coste	ee74c2c2df	Use custom code instead of reverse_iterator in Regex VM	2018-11-02 08:23:39 +11:00
Maxime Coste	6fce8050ee	Use BufferCoord sentinel type for regex matching on BufferIterators BufferIterators are large-ish, and need to check the buffer pointer on comparison. Checking against a coord is just a 64 bit comparison.	2018-11-01 21:51:10 +11:00
Maxime Coste	4cd7583bbc	Improve regex vm to next start performance by avoiding iterator copies	2018-11-01 08:22:43 +11:00
Maxime Coste	d652ec9ce1	Cleanup regex lookarounds implementation and reject incompatible regex Fixes #2487	2018-10-10 22:47:59 +11:00
Maxime Coste	9024d41d64	Fix integer overflow leading to bad memory access in regex execution Fixes #2481 Fixes #2480	2018-10-08 12:43:12 +11:00
Maxime Coste	7cf3cbde8e	Cleanup some trailing whitespaces and double semicolon	2018-07-26 21:56:34 +10:00
Maxime Coste	0d6e04257b	Fix memory leak in regex execution	2018-07-25 20:57:11 +10:00
Maxime Coste	7ed5d53fe6	Fix RegexCompileFlags::Backwards having the same value as Optimize That means every Optimized regex had the Backwards version compiled as well, which doubled the time it took to compile them and doubled the memory usage of regex. This should improve #2152	2018-07-19 18:34:40 +10:00
Olivier Perret	67655de947	Use a dedicated vm op for dot when match-newline is false	2018-06-24 12:41:50 +02:00
Maxime Coste	787ca7f19b	Regex: small code style tweak	2018-04-29 19:58:18 +10:00
Maxime Coste	1e8026f143	Regex: Use only 128 characters in start desc and encode others as 0 Using 257 was using lots of memory for no good reason, as codepoints > 127 are not common enough to be treated specially.	2018-04-29 19:58:18 +10:00
Maxime Coste	528ecb7417	Regex: Use a custom 'DualThreadStack' structure to hold thread info Instead of using two vectors, we can hold both current and next threads in a single buffer, with stacks growing on each end. Benchmarking shows this to be slightly faster, and should use less memory.	2018-04-29 19:58:18 +10:00
Maxime Coste	8438b33175	Add a debug regex command to dump regex instructions	2018-04-27 08:35:09 +10:00
Maxime Coste	f10eb9faa3	Use indices instead of pointers for saves/instruction in ThreadedRegexVM Performance seems unaffected, but memory usage should be lowered as the Thread struct is 4 bytes instead of 16.	2018-04-27 08:35:09 +10:00
Maxime Coste	fa17c46653	Regex: Refactor ThreadedRegexVM state handling Remove ExecState to store threads inside the ThreadedRegexVM so that memory buffers can be reused between executions. Extract an ExecConfig struct with all the data that is execution specific to avoid storing it needlessly inside the ThreadedRegexVM.	2018-04-25 21:19:04 +10:00
Maxime Coste	fb65fa60f8	Regex: take the full subject range as a parameter To allow more general look arounds out of the actual search range, pass a second range (the actual subject). This allows us to remove various flags such as PrevAvailable or NotBeginOfSubject, which are now easy to check from the subject range. Fixes #1902	2018-03-05 05:48:10 +11:00
Maxime Coste	d9e44dfacf	Regex: Remove helper functions from regex_impl.hh They were close duplicates from the ones in regex.hh and not used anywhere else.	2018-03-05 03:10:47 +11:00
Maxime Coste	933ac4d3d5	Regex: Improve comments and constify some variables Reword various comments to make some tricky parts of the regex engine easier to understand.	2018-02-24 17:40:08 +11:00
Maxime Coste	af21d4ca1e	regex: track CompiledRegex::StartDesc in the Regex memory domain	2018-02-24 16:29:24 +11:00
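
Commit f115af7a57 above mentions taking advantage of sorted ranges to early out during character class matching. The idea can be sketched roughly as follows. This is an illustrative sketch only, not kakoune's actual code: the CharRange struct and the matches_class function are hypothetical names, and the ranges are assumed to be sorted by lower bound and non-overlapping.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not kakoune's actual definition.
struct CharRange { std::uint32_t min, max; };

// Assumption: `ranges` is sorted by `min` and ranges do not overlap.
inline bool matches_class(const std::vector<CharRange>& ranges, std::uint32_t cp)
{
    for (const auto& r : ranges)
    {
        if (cp < r.min)   // all remaining ranges start even higher: early out
            return false;
        if (cp <= r.max)  // r.min <= cp <= r.max: inside this range
            return true;
    }
    return false;         // cp is above every range
}
```

With ranges for 0-9 and a-f, a lookup for 'A' stops at the second range without scanning any further, because 'A' sorts below 'a'.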


114 Commits