Commit Graph

4413 Commits

Author SHA1 Message Date
Maxime Coste
a822bcd6e0 Do not specify utf8 InvalidPolicy when we are using the default value
It was specified only in two call sites, and everywhere now only uses
the pass policy, which is the default.
2018-02-11 17:39:19 +11:00
Maxime Coste
66fe2d84da Refuse modification of ReadOnly buffers and make Debug buffer readonly
The debug buffer is a bit special as lots of events might mutate it,
permitting it to be modified leads to some buggy behaviour:

For example, `pipe` uses a ForwardChangeTracker to track buffer
changes, but when applied on a debug buffer with the profile flag
on, each shell execution will trigger an additional modification
of the buffer while applying the changes, leading to an assertion
failing as changes might not be happening in a forward way anymore.

Trying to modify a debug buffer will now raise an error immediately.
2018-02-11 13:06:19 +11:00
Maxime Coste
3584e00d19 Regex: Use a template argument instead of a regular one for "forward"
forward (which controls if we are compiling for forward or backward
matching) is always statically known, and compilation will first
compile forward, then backward (if needed), so by having separate
compiled function we get rid of runtime branches.
2018-02-09 22:45:53 +11:00
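The idea of turning a statically known flag into a template argument can be sketched as follows. This is a minimal illustration only: `emit_literals` is a hypothetical function, not part of Kakoune's regex compiler, and it merely shows how each instantiation ends up with a constant condition instead of a runtime branch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a compilation step. `forward` is a template
// argument, so each instantiation sees it as a compile-time constant and
// the compiler can drop the branch that is not taken.
template<bool forward>
std::string emit_literals(const std::string& pattern)
{
    std::string program;
    if (forward)
        program.assign(pattern.begin(), pattern.end());   // forward matching order
    else
        program.assign(pattern.rbegin(), pattern.rend()); // backward matching order
    assert(program.size() == pattern.size());
    return program;
}
```

A caller would instantiate `emit_literals<true>` first and `emit_literals<false>` only when backward matching is needed, which mirrors the order described in the commit message.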
Maxime Coste
aa9f7753e8 Regex: minor code cleanup 2018-02-09 22:19:56 +11:00
Maxime Coste
cb16e52179 FaceRegistry: pass face names as StringViews instead of const String& 2018-02-09 22:08:29 +11:00
Maxime Coste
e46a5697e5 Add a limit to the size of selection with which we will try to diff on pipe
Limit to 100K of data for now, as we diff at the byte level.
2018-02-09 21:47:18 +11:00
Maxime Coste
bbf62d1779 diff: try to improve code readability 2018-02-09 21:31:10 +11:00
Maxime Coste
b7a3d80bde CommandManager: Use byte rather than columns for token positions
Not only are display columns rarely used to give error positions,
but they make the parsing much slower as for each token we need to
compute the column in the line.
2018-02-09 20:30:33 +11:00
Maxime Coste
e2c1d44a7f Fix parsing of percent tokens with unicode separators 2018-02-06 20:29:08 +11:00
Maxime Coste
4a96926c4b Handle errors while reloading buffer gracefully
Fixes #1831
2018-02-05 20:27:32 +11:00
Maxime Coste
f592768d3a Remove the New flag from a buffer after reloading it
If we reload a buffer, it means its underlying file exists, hence the
New flag does not make sense anymore. It could be that the file appeared
on the filesystem in the meantime.
2018-02-05 20:18:51 +11:00
Maxime Coste
05f57ace07 CommandManager: parse command lines as utf8 instead of ascii
Fixes #1829
2018-02-04 09:21:15 +11:00
Maxime Coste
52016d32bc Makefile: Only check for pkg-config when on a system that uses it
This fixes compilation on OSX where pkg-config is not installed
by default.
2018-02-03 13:37:09 +11:00
Maxime Coste
90c16d2b0d Profile the time it takes to source a file
`:source` command will now generate timings if profile is enabled
in the debug option, to help find which script can be slow to load.

This should help for #1823
2018-02-01 09:03:16 +11:00
Maxime Coste
e41b4ee65d Change m to search until the end of the buffer instead of end of line
Fixes #1774
2018-01-31 11:10:18 +11:00
Maxime Coste
81eb2ee428 Do not strip whitespaces with '*'
Stripping whitespaces there is a failed experiment as it breaks the
ability to use multi-selections consistently: Using '*' followed by some
`N` to add following matches, we end up with mismatched selections
due to whitespace stripping: the original selection still contains
whitespaces where all the new ones do not. Once we get to this state,
most selection commands will give different results for the initial
selection and the other ones, breaking predictable multiselection use,
one of the cornerstones of Kakoune's editing model.
2018-01-31 09:31:18 +11:00
Maxime Coste
c30a954dfc ncurses: change handling of <c-z> suspend to improve terminal state
reset the mouse state so that the terminal can take back control
of the mouse while Kakoune is suspended, and does not emit focus
events anymore.

Fixes #1816
2018-01-30 10:46:34 +11:00
Maxime Coste
9c25e955df Use '/' register as the default register for <a-k> and <a-K>
Fixes #1808
2018-01-26 14:15:18 +11:00
Maxime Coste
bf73cb0109 Reset normal mode before hiding the reload buffer info box
Resetting normal mode will enable normal mode, which will trigger
a check for buffer modification. We do not want that check to
happen as we are trying to close the info box. Doing that mode
reset first will prevent the check from happening (as the info
box is already displayed), and will correctly hide it afterwards.

Fixes #1809
2018-01-26 13:45:36 +11:00
aver-d
29a7cd3ab4 Insert complete: Remove path info with one buffer
This change is useful when using `set scope completers word=buffer`,
instead of the default word=all.

If candidates all come from the same buffer, then the path/filename
information is the same and therefore unnecessary. This change
prevents the same path from being repeated, and the buffer's
source code is less obscured.

More generally, there could be an option to disable the path
information entirely in all cases, but for now this change seems
a reasonable solution until any such option exists.
2018-01-25 17:49:02 +00:00
Maxime Coste
220be30f02 Support multiline selections in C/<a-C>
Fixes #1725
2018-01-24 10:33:22 +11:00
Maxime Coste
299e22ca7c Do not block when waiting for next event if we have pending input
Handle next event should never block if we have already accumulated
input that we want to process. We can accumulate new input in
lots of places: every time we run a shell process, for example, we
might end up reading input keys, and that can be triggered during the
mode line generation which takes place during display of the window.

Fixes #1804
2018-01-21 12:00:40 +11:00
Maxime Coste
d22c989984 Rename InputModeChange hook to ModeChange
InputModeChange is a bit long to type and it's pretty clear in Kakoune
that "Mode" means "Input mode", so use a shorter and as clear name.
2018-01-21 10:34:09 +11:00
Maxime Coste
07dfcd336d Fallback to getpwuid in the unlikely case $HOME is undefined
Add a homedir() helper function, and document the $kak_config
env var.
2018-01-20 11:19:23 +11:00
Maxime Coste
e7cbf38af7 Introduce a $kak_config env var containing the Kakoune user config dir
Makes it easier for users who want to locate their kakrc file, and
does not require to go through shell expansion to get it as
"${XDG_CONFIG_DIR:-${HOME}/.config}/kak"

Fixes #1740
2018-01-19 10:05:08 +11:00
Maxime Coste
55621fb4cc Do not save last command/pipe/regex in register when history is disabled 2018-01-19 09:48:57 +11:00
Maxime Coste
eeacb8b5a8 Use the _str and _sv string literals more often 2018-01-18 09:00:54 +11:00
Maxime Coste
b4f8497f8d Slight code refactor in InputHandler::handle_key 2018-01-15 10:25:58 +11:00
Maxime Coste
e74b581b0a Save/restore main selection from/to strings
Always consider that the first selection in the list is the main
one, save selections that way.

This approach was suggested by PR #1786 but the implementation here
is different, and is used more generally whenever we save selections
to strings.

This is also the preferred way to work only on the main selection:
save selections with Z, reduce to main with <space>, restore with z.

Closes #1786
Fixes #1750
2018-01-12 07:51:19 +11:00
Maxime Coste
2366af29e2 Slight refactor of jump collapsing code 2018-01-12 07:50:52 +11:00
Maxime Coste
827aab1386 Merge remote-tracking branch 'Delapouite/print_status' 2018-01-12 07:03:21 +11:00
Maxime Coste
af4cc11404 Merge remote-tracking branch 'lenormf/fix-makefile' 2018-01-12 07:02:40 +11:00
Delapouite
7ecc3d343f Remove extraneous face when clearing status line 2018-01-11 15:26:42 +01:00
Frank LENORMAND
202c977b3d src makefile: Abort compilation when pkg-config is not in PATH
Fixes #1792
2018-01-11 10:41:19 +03:00
Maxime Coste
bf66302d29 Small code style tweak 2018-01-11 13:57:33 +11:00
Maxime Coste
996d8abef4 Write new buffers even when unmodified
Fixes #1794
2018-01-08 09:42:26 +11:00
Maxime Coste
49e028b847 Add information on InputModeChange hook in the startup message 2017-12-29 10:00:45 +11:00
Maxime Coste
6333ae207f Correctly set the NotBeginOfSubject/NotEndOfSubject flags for regex matching
Fixes #1778
2017-12-29 09:55:53 +11:00
Maxime Coste
6851604546 Regex: Add a RegexExecFlags::NotEndOfSubject flag 2017-12-29 09:55:38 +11:00
Delapouite
74898120ed Add session name filtering for KakBegin hook 2017-12-21 22:22:33 +01:00
Maxime Coste
9b83589b18 Completion: Use a heap to gather the best matches instead of sorting
Generalize the behaviour of `shell-candidates` to insert completion,
gather the best 100 matches by using a heap and popping max a hundred
times.
2017-12-21 12:55:29 +11:00
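The heap-based selection described above can be sketched with the standard heap algorithms. This is a minimal sketch under stated assumptions: `Candidate` and `best_matches` are hypothetical names, and the integer `score` stands in for whatever ranking Kakoune actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical candidate type; a higher score means a better match.
struct Candidate
{
    std::string text;
    int score;
};

// Returns at most `limit` candidates, best first, without fully sorting.
std::vector<Candidate> best_matches(std::vector<Candidate> candidates, std::size_t limit)
{
    auto worse = [](const Candidate& a, const Candidate& b) { return a.score < b.score; };
    // Building the heap is linear in the number of candidates.
    std::make_heap(candidates.begin(), candidates.end(), worse);

    std::vector<Candidate> result;
    auto end = candidates.end();
    while (result.size() < limit && end != candidates.begin())
    {
        // Moves the current best candidate to the position just before `end`.
        std::pop_heap(candidates.begin(), end, worse);
        --end;
        result.push_back(std::move(*end));
    }
    assert(result.size() <= limit);
    return result;
}
```

With many candidates and a small limit, this costs one linear heap construction plus a logarithmic pop per returned match, instead of sorting the whole list.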
Maxime Coste
a38d6cc3f0 Highlighter: In general, highlight replaced ranges
Fixes #1251
2017-12-21 12:26:25 +11:00
Maxime Coste
0814bb2962 InputHandler: Preserve no-hooks on insert mode from single command normal mode
Fixes #1775
2017-12-21 10:30:45 +11:00
Maxime Coste
e0b28fa421 Introduce InputModeChange hook
InputModeChange <old mode>:<new mode> is intended to replace the various
<Mode>Begin/<Mode>End hooks.

Fixes #1772
2017-12-18 11:09:54 +11:00
Delapouite
b81c6b5840 Use existing window reference in view mode 2017-12-17 14:42:09 +01:00
Maxime Coste
d8dc7d7f39 Go back to getpwuid call to get user name from user id
Unfortunately, reading /etc/passwd is not enough.
2017-12-15 08:19:56 +11:00
Maxime Coste
cafecda230 Window: avoid positioning window on a negative column
Fixes #1741
2017-12-15 08:17:35 +11:00
Maxime Coste
4b06c09c68 Make edit command work fine when running from an empty context
This way, the kind of context we get from a piped command allows
for opening a buffer and working with it directly.
2017-12-12 18:22:05 +11:00
Maxime Coste
ce2c0e54f4 Detect invalid coordinates in selection_from_string
Fixes #1751
2017-12-12 18:08:40 +11:00
Maxime Coste
0033491d69 InsertCompleter: Respect ignored_filename option in filename completer 2017-12-09 22:03:19 +11:00
Maxime Coste
a33d18e125 Go back to getpwuid call on OSX
Reading /etc/passwd on OSX does not give us the full list of users.

Fixes #1758
2017-12-09 12:13:02 +08:00
Maxime Coste
bedb98220c Ranges: add unit test and fix corner case in split view 2017-12-07 01:58:19 +08:00
Maxime Coste
2f48bbf6ff Tweak unescape template function to unescape the escape char as well
Test that more thoroughly in the unit tests.
2017-12-07 01:56:02 +08:00
Maxime Coste
99636c6230 Remove Vector returning split functions, use range adaptor
Do not allocate temporary vectors to store split data, use the
'split' range adaptor along with transform(unescape) to provide the
same feature with fewer allocations.
2017-12-06 17:18:44 +08:00
Maxime Coste
70e2122ae6 InsertCompleter: only accept words matching the target buffer word definition
The words we store in the WordDB are dependent on the extra_word_chars
options, which can be different for different buffers. When completing
words in a buffer based on the WordDB from another buffer, some candidates
might contain characters that are not considered word characters for
the target buffer; ignore those words.
2017-12-06 14:15:36 +08:00
Maxime Coste
86fcc55e53 RankedMatch: Make punctuation ordered *before* alphanumeric characters 2017-12-06 13:58:34 +08:00
Maxime Coste
936b95ac34 Ensure that normal mode restores disabled hook status on disabled
Fixes #1744
2017-12-06 12:59:31 +08:00
Maxime Coste
274367116a Replace uses of getpwuid which is incompatible with static linking
Introduce a get_user_name function which parses '/etc/passwd' to find
the username associated with a user id.
2017-12-04 15:19:57 +08:00
Maxime Coste
73a239d3be Text-Objects: Use regex to select surroundings
Fixes #925
2017-12-03 17:15:24 +08:00
Maxime Coste
9a4b5de772 Regex: Introduce backward_regex_search helper function 2017-12-03 17:12:33 +08:00
Maxime Coste
b34bb6b794 Regex: make RegexIterator iterable and able to iter backwards 2017-12-02 14:02:41 +08:00
Maxime Coste
413f880e9e Regex: Support forward and backward matching code in the same CompiledRegex
No need to have two separate regexes to handle forward and backward
matching, just passing RegexCompileFlags::Backward will add support
for backward matching to the regex. For backward only regex, pass
RegexCompileFlags::NoForward as well to disable generation of
forward matching code.
2017-12-01 19:57:02 +08:00
Maxime Coste
7bfb695c45 Regex: Do not allow private use codepoints literals
We use them to encode non-literals in lookarounds, so they can
trigger bugs.

Fixes #1737
2017-12-01 16:37:18 +08:00
Maxime Coste
8d892eeb62 Regex: use StartDesc to early out when not searching
Early out as well if we do not find any potential start position.
2017-12-01 15:03:03 +08:00
Maxime Coste
65b057f261 Regex: rename StartChars to StartDesc
It only contains chars for now, but it's still more generally
describing where matches can start.
2017-12-01 14:46:18 +08:00
Maxime Coste
b91f43b031 Regex: optimize parsing a bit 2017-11-30 14:32:29 +08:00
Maxime Coste
c1f0efa3f4 Regex: smarter handling of start chars computation for character class 2017-11-30 14:19:41 +08:00
Maxime Coste
839da764e7 Regex: avoid unneeded allocations and moves by reusing MatchResults storage 2017-11-29 14:07:04 +08:00
Maxime Coste
380ff553b5 Wrap: try to rework and simplify the algorithms further
Fixes #1731
2017-11-28 19:04:21 +08:00
Maxime Coste
ae0911b533 Regex: Various small code tweaks 2017-11-28 01:03:54 +08:00
Maxime Coste
4598832ed5 Regex: optimize compilation by reserving data 2017-11-28 00:59:57 +08:00
Maxime Coste
a52da6fe34 Regex: Tweak is_ctype implementation style 2017-11-28 00:13:42 +08:00
Maxime Coste
d142db80f2 Fix compute_modified_ranges corner case that would crash on undo
Fixes #1506
Fixes #1215
2017-11-27 20:29:01 +08:00
Maxime Coste
8b40f57145 Regex: Replace generic 'Matchers' with specialized functionality
Introduce CharacterClass and CharacterType Regex Op, and optimize
their evaluation.
2017-11-25 18:14:15 +08:00
Maxime Coste
0d44cf9591 Regex: do not decode utf8 in accept calls as they always run on ascii 2017-11-25 18:13:27 +08:00
Maxime Coste
ec6ecd5772 Add an InsertCompletionSelect hook
InsertCompletionSelect will be called whenever the selected insert
completion changes. If the original text is selected back, the hook
parameter will be empty. If another candidate is selected, the hook
parameter will be its text content.

Fixes #1676
2017-11-25 13:57:47 +08:00
Maxime Coste
1ae96c977c Small formatting tweak 2017-11-25 13:46:55 +08:00
Maxime Coste
318e77b25e Highlighters: Introduce unique highlighter support
Some highlighters, such as wrap or line numbers, are not intended
to be used multiple times on the same display. Add support for unique
ids that are used by highlighters to disable themselves if another
unique highlighter with the same id is supposed to override them.

The usual highlighter "precedence" takes place, that is, the most
nested highlighter will be the one to run (window in priority to
buffer in priority to global).
2017-11-25 12:53:33 +08:00
Delapouite
66250a06eb Rename KeyMapInfo → KeymapInfo 2017-11-24 10:34:56 +01:00
Maxime Coste
6084490a6e Merge remote-tracking branch 'Delapouite/remaining-buffers' 2017-11-24 16:37:53 +08:00
Maxime Coste
5a0332ac87 Window: fix buffer_coord when a line buffer range is empty
Fixes #1711
2017-11-24 16:36:37 +08:00
Maxime Coste
c0cec3e7c1 Merge remote-tracking branch 'fsub/warnings' 2017-11-23 12:30:24 +08:00
fsub
66ca53466f Remove unused lambda captures
This eliminates some warnings emitted by clang++.
2017-11-22 18:43:54 +01:00
Maxime Coste
179a1f6aa1 dynregex: slight code refactor, moving a helper function to lambda 2017-11-22 15:57:59 +08:00
Maxime Coste
77b367b3e0 Wrap: simplify logic a bit and fix case where too many lines got displayed
Fixes #1710
2017-11-21 13:01:02 +08:00
Delapouite
be94505e46 Add modified buffers count in error message of non-force quit 2017-11-20 19:25:47 +01:00
Maxime Coste
b57a53dfbf Merge remote-tracking branch 'Delapouite/common_prefix' 2017-11-20 17:30:28 +08:00
Delapouite
62912c6586 Remove extraneous common_prefix in input_handler
Related to: 52525a156f
2017-11-20 10:21:23 +01:00
Delapouite
bf222a0628 Docs: add missing -i <suffix> command line flag 2017-11-19 11:43:08 +01:00
Kylie McClain
3e1a4df3fb Makefile: Add ability to disable compressing manpage
Some distributions don't compress them.
2017-11-19 01:53:40 -05:00
Kylie McClain
ab390a02dc Makefile: use PKG_CONFIG, not pkg-config 2017-11-17 23:11:06 -05:00
Maxime Coste
706c1672d5 Normal: add <a-S> to select first and last char of selection
Fixes #550
2017-11-13 17:36:04 +08:00
Maxime Coste
5f5188a89c Merge remote-tracking branch 'Delapouite/jump-count' 2017-11-13 16:37:24 +08:00
Delapouite
a071e5b226 Add count support to jumps (<c-o> and <c-i>). Add jumps tests 2017-11-13 08:38:43 +01:00
Maxime Coste
615fe0368c Options: rework conversion to string of prefixed lists
* use the list_separator variable instead of hard coding ':'
* fix trailing separator when converting empty prefixed list to string
* correctly escape the prefix in case it contains a separator
2017-11-13 11:45:28 +08:00
Maxime Coste
078f0b5c90 option_types.hh: fix unfulfilled dependencies of the header 2017-11-13 11:27:55 +08:00
Maxime Coste
ffb639bf96 Regex: add unit test for #1693 2017-11-13 01:12:05 +08:00
fsub
0dd8a9ba93 Fix #1693: typo in RegexParser::character_class() 2017-11-12 17:35:03 +01:00
Maxime Coste
b298e01390 NCurses: use the general face merging function to handle default face
Merge attributes as well, and reuse an existing function instead of
reimplementing the same logic again.

Closes #1684
2017-11-12 23:02:40 +08:00
Maxime Coste
208f9641ef Remote: when converting to client, suspend *after* connecting
Also, do not quit server while there is a connection being accepted
Fixes #1690
2017-11-12 22:28:13 +08:00
Maxime Coste
00e0630272 Move Array and ConstexprVector to a constexpr_utils.hh header 2017-11-12 13:01:18 +08:00
Maxime Coste
5cfccad39c Regex: Use MemoryDomain::Regex for captures and MatchResults contents 2017-11-12 12:30:21 +08:00
Maxime Coste
c9b43d3634 Regex: directly store instruction pointer in Thread struct 2017-11-11 15:15:13 +08:00
Maxime Coste
b1115f7469 Wrap: fix scrolling to keep cursor visible logic 2017-11-10 21:17:05 +08:00
Maxime Coste
0942cd5084 InputHandler: handle of last insert keys happening in nested modes
Move recording of keys to the input handler itself instead of the
Insert mode so that eventual nested modes (potentially introduced
by <a-;>) will get their keys recorded as well.

Fixes #1680
2017-11-08 14:39:52 +08:00
Maxime Coste
04993de687 Fix pipe logic in the case where the selections were accessed in the cmdline
When using an env var that needed the selections in the pipe command line,
say $kak_selection, the selection update code would run, modifying the
selections to adapt to eventual changes. But the rest of the pipe logic
was assuming the selections would not change, leading to bugs.
2017-11-08 00:02:49 +08:00
Maxime Coste
d45f16b6c8 Buffer: change clamp logic to preserve ordering
clamp could change ordering between a coordinate past the end.

Say in a buffer with 1 line of 2 char:
{0, 1} was clamped to {0, 1}
{1, 0} was clamped to {0, 0}

That was reversing their ordering, and might be the root cause
of the bug lurking in undo range computation.
2017-11-07 23:56:24 +08:00
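The ordering problem in the example above can be reproduced with a small sketch. Everything here is an assumption made for illustration: `Coord`, `clamp_independent` and `clamp_ordered` are hypothetical names, the buffer is reduced to a list of line lengths, and `clamp_ordered` is only one possible order-preserving strategy, not necessarily the one the commit implements.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical buffer coordinate, ordered by line and then by column.
struct Coord
{
    int line;
    int column;
};

bool operator<(Coord a, Coord b)
{
    return std::tie(a.line, a.column) < std::tie(b.line, b.column);
}

// Clamps line and column independently. On a buffer with one line of two
// characters this maps {1, 0} to {0, 0}, which sorts before {0, 1}.
Coord clamp_independent(Coord c, const std::vector<int>& line_lengths)
{
    assert(!line_lengths.empty());
    const int last_line = static_cast<int>(line_lengths.size()) - 1;
    c.line = std::min(std::max(c.line, 0), last_line);
    c.column = std::min(std::max(c.column, 0), std::max(0, line_lengths[c.line] - 1));
    return c;
}

// Sends anything past the end to the last valid coordinate first, so two
// coordinates never swap their relative order.
Coord clamp_ordered(Coord c, const std::vector<int>& line_lengths)
{
    assert(!line_lengths.empty());
    const int last_line = static_cast<int>(line_lengths.size()) - 1;
    const Coord back{last_line, std::max(0, line_lengths[last_line] - 1)};
    if (back < c)
        return back;
    c.line = std::max(c.line, 0);
    c.column = std::min(std::max(c.column, 0), std::max(0, line_lengths[c.line] - 1));
    return c;
}
```

With `line_lengths = {2}`, `clamp_independent` reverses the order of `{0, 1}` and `{1, 0}`, while `clamp_ordered` maps both to `{0, 1}`.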
Maxime Coste
80ce768994 Slight code cleanup in change update functions 2017-11-07 20:00:45 +08:00
Maxime Coste
2b6c84fc40 Add missing include in remote.cc
strerror is defined in <string.h>
2017-11-06 12:45:14 +08:00
Maxime Coste
0a081b9f31 Do not allow rename-session to introduce '/' in session names 2017-11-06 11:55:56 +08:00
Maxime Coste
52f4af6a83 Merge remote-tracking branch 'lenormf/fix-private-commands-in-register' 2017-11-05 12:22:28 +08:00
Maxime Coste
6bac767124 CommandManager: tweak naming 2017-11-04 16:02:21 +08:00
Maxime Coste
7f51e51fcb Introduce matching_pairs option that controls the pairs used by m 2017-11-04 15:53:53 +08:00
Frank LENORMAND
8900690288 src: Don't save whitespace-led commands in the : register 2017-11-04 09:18:26 +03:00
Maxime Coste
aa82a90c39 Remote: stricter validation of the session names
Creating a session will not accept any slashes in the session path,
connecting to an existing session will accept at most one slash to
allow for specifying the session of a different user.

Fixes #1635
2017-11-04 12:01:25 +08:00
Maxime Coste
aa9bcf08fc Code style tweak 2017-11-04 12:01:23 +08:00
Maxime Coste
9b216e0e79 Merge remote-tracking branch 'lenormf/fix-rc-aliases' 2017-11-03 19:32:30 +08:00
Maxime Coste
400ef6d48c Wrap: rework logic to avoid infinite loop with multiple wrap highlighters
The display is still going to be wrong, as wrapping is going to take place
multiple times, but Kakoune should not freeze anymore.
2017-11-03 19:30:31 +08:00
Maxime Coste
9d6420caae Remove unneeded forward declaration 2017-11-03 19:24:58 +08:00
Frank LENORMAND
9127ed0d55 src rc: Rename exec/eval into execute-keys/evaluate-commands 2017-11-03 11:09:45 +03:00
Maxime Coste
39e63cf518 Append '/' to highlighter group completion candidates 2017-11-02 18:05:18 +08:00
Maxime Coste
730e5725e9 Wrap: change indent atom to be a replaced empty buffer range
Avoid confusing the column highlighters.
2017-11-02 11:08:03 +08:00
Maxime Coste
fd95af0e3e Add information on -indent in wrap highlighter docstring 2017-11-02 11:04:15 +08:00
Maxime Coste
4fabba3d12 doc.kak: Render documentation internally instead of relying on man
doc.kak now behaves as a basic asciidoc renderer. Asciidoc is unfortunately
still a dependency to generate the manpage of the `kak` command.
2017-11-02 10:03:24 +08:00
Maxime Coste
53069bcb2d Ensure line-specs and range-specs options are sorted internally 2017-11-02 09:51:15 +08:00
Maxime Coste
329f5fca0e Fix trailing spaces in highlighters.cc 2017-11-02 01:28:28 +08:00
Maxime Coste
6f2088cbc4 Wrap: Add -indent switch support that wraps preserving line indent 2017-11-02 01:28:28 +08:00
Maxime Coste
ed65d86c72 Rename doc/manpages to doc/pages
The fact that we use man for these is an implementation detail.
2017-11-01 19:05:37 +08:00
Maxime Coste
25dac6b24e Document the regex impl switch in the startup message 2017-11-01 14:18:13 +08:00
Maxime Coste
51de90f366 Regex: Remove boost related code 2017-11-01 14:09:39 +08:00
Maxime Coste
c74becc6af Regex: fix RegexCompileFlags not being an enum class 2017-11-01 14:05:15 +08:00
Maxime Coste
2d901dc76f Regex: slight readability improvement and workaround a potential gcc bug 2017-11-01 14:05:15 +08:00
Maxime Coste
f07375fb27 Regex: remove dead code 2017-11-01 14:05:15 +08:00
Maxime Coste
2c2073b417 Regex: Tweak struct layouts of ParsedRegex data 2017-11-01 14:05:15 +08:00
Maxime Coste
bbd7e604dc Regex: Remove "Ast" from names in the ParsedRegex
It does not add much value, and makes names longer.
2017-11-01 14:05:15 +08:00
Maxime Coste
18a02ccacd Regex: Optimize parsing and compilation
AstNodes are now POD, stored in a single vector, accessed through
their index. The children list is implicit, with nodes storing only
the node index at which their child graph ends.

That makes reverse iteration slower, but that is only used for reverse
matching regex, which are uncommon. In the general case compilation
is now faster.
2017-11-01 14:05:15 +08:00
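The flat node layout described above can be sketched as follows. This is an illustration under assumptions: `Node`, its fields and `direct_children` are hypothetical names, not the actual ParsedRegex definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-data node. All nodes live in one vector, and a node's
// descendants occupy the indices right after it, up to `children_end`.
struct Node
{
    char op;                  // 'L' literal, 'S' sequence, 'A' alternation
    char value;               // literal character, unused for other ops
    std::size_t children_end; // index one past the last descendant
};

// Returns the indices of the direct children of nodes[index] by hopping
// from one child to the end of that child's own subtree.
std::vector<std::size_t> direct_children(const std::vector<Node>& nodes, std::size_t index)
{
    assert(index < nodes.size());
    std::vector<std::size_t> children;
    for (std::size_t child = index + 1; child < nodes[index].children_end;
         child = nodes[child].children_end)
        children.push_back(child);
    return children;
}
```

Forward iteration is a simple hop from one subtree end to the next. There is no stored link back to the previous sibling, which is consistent with the note above that reverse iteration becomes slower.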
Maxime Coste
aea2de885d Regex: minor cleanup of the regex parsing code 2017-11-01 14:05:15 +08:00
Maxime Coste
6e0275e550 Regex: small code cleanup in the Save compilation code 2017-11-01 14:05:15 +08:00
Maxime Coste
9e15207d2a Regex: put the other char boolean inside the general start char map 2017-11-01 14:05:15 +08:00
Maxime Coste
7c3bc48627 Fix ConstexprVector::resize 2017-11-01 14:05:15 +08:00
Maxime Coste
60e32d73ff Regex: Fix handling of all unicode codepoint as start chars 2017-11-01 14:05:15 +08:00
Maxime Coste
df2bf9601c Regex: fix wrong fallthrough in dump_regex 2017-11-01 14:05:15 +08:00
Maxime Coste
e9e9a08e7b Regex: refactor handling of Saves slightly, do not create them until really needed 2017-11-01 14:05:15 +08:00
Maxime Coste
d9b4076e3c Regex: Go back to instruction based search of next start
The previous method, which was a bit faster in the general use case,
can hit some cases where we get quadratic behaviour and very slow
matching.

By using an instruction, we can guarantee our complexity of O(N*M)
as we will never have more than N threads (N being the instruction
count) and we run the threads once per codepoint in the subject
string.

That slows down the general case slightly, but ensures we don't have
pathological cases.

This new version is much faster than the previous instruction based
search because it does not use a plain `.*` searcher, but a specific,
smarter instruction specialized for finding the next start if we are
in the correct conditions.
2017-11-01 14:05:15 +08:00
Maxime Coste
3f627058b0 Regex: add support for \0, \cX, \xXX and \uXXXX escapes 2017-11-01 14:05:15 +08:00
Maxime Coste
c423b47109 Regex: compute if codepoints outside of the start chars map can start 2017-11-01 14:05:15 +08:00
Maxime Coste
2c6c0be0c1 Regex: abort compilation as soon as we hit the instruction count limit 2017-11-01 14:05:15 +08:00
Maxime Coste
d44e160aa7 Regex: add a unit test for why lookaheads don't count for start chars anymore 2017-11-01 14:05:15 +08:00
Maxime Coste
87eec79d07 Regex: comment the mutables in CompiledRegex::Instruction and fix their init 2017-11-01 14:05:14 +08:00
Maxime Coste
8b2297f5ca Regex: Introduce a Regex memory domain to track usage separately 2017-11-01 14:05:14 +08:00
Maxime Coste
9ec175f2f8 Regex: use binary search for character class ranges check 2017-11-01 14:05:14 +08:00
Maxime Coste
6e65589a34 Regex: compute start chars from matchers, do not compute it from lookarounds
Computing potential start characters from lookarounds is more complex
than expected, and not worth the complexity.
2017-11-01 14:05:14 +08:00
Maxime Coste
621b0d3ab8 Regex: remove the need for a processed inst vector
Identify each step with a counter, and check if the instruction
was already processed this step. This makes the matching faster,
by removing the need to maintain a vector of instructions executed
this step.
2017-11-01 14:05:14 +08:00
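The step counter technique described above can be sketched like this. The names `Instruction`, `last_step` and `mark_processed` are assumptions made for illustration and do not come from Kakoune's source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical instruction record. `last_step` remembers the last step in
// which this instruction ran; 0 means it has never run.
struct Instruction
{
    std::uint32_t last_step = 0;
};

// Returns true the first time an instruction is seen during `current_step`
// and false on later visits within the same step. Nothing has to be cleared
// between steps, since bumping the counter invalidates all earlier marks.
bool mark_processed(Instruction& inst, std::uint32_t current_step)
{
    assert(current_step != 0); // step numbering is assumed to start at 1
    if (inst.last_step == current_step)
        return false;
    inst.last_step = current_step;
    return true;
}
```

A real implementation would also need to handle the counter wrapping around, which this sketch ignores.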
Maxime Coste
cfc52d7e6a Regex: use intrusive linked list for the free saves instead of a Vector 2017-11-01 14:05:14 +08:00
Maxime Coste
df16fea82d Regex: rename "flags" with the more common "modifiers" 2017-11-01 14:05:14 +08:00
Maxime Coste
52d443f764 Regex: Correctly handle ignore case mode for start chars computation 2017-11-01 14:05:14 +08:00
Maxime Coste
b8495f0953 Regex: Rework parsing, treat lookarounds as assertions, and flags separately 2017-11-01 14:05:14 +08:00
Maxime Coste
b0233262b8 Regex: Limit programs to std::numeric_limits<uint16_t>::max() instructions 2017-11-01 14:05:14 +08:00
Maxime Coste
8c8dcb3a84 Regex: Fix reverse searching behaviour, again 2017-11-01 14:05:14 +08:00
Maxime Coste
9753bcd0ad Regex: limit explicit quantifiers value (to 1000 for now)
Fixes #1628
2017-11-01 14:05:14 +08:00
Maxime Coste
2b97e4e124 Regex: Fix handling of ^ and $ in backward matching mode 2017-11-01 14:05:14 +08:00
Maxime Coste
3c999aba37 Regex: Only reset processed and scheduled flags on relevant instructions
On big regex, resetting all those flags on all instructions for each
character can become the dominant operation. Track the actual
instruction indices processed (the scheduled are already tracked in
the next_threads vector), and only reset these.
2017-11-01 14:05:14 +08:00
Maxime Coste
5bf4be645a Regex: Fix support for ignore case in lookarounds 2017-11-01 14:05:14 +08:00
Maxime Coste
80f6caee81 Regex: move try/catch blocks inside boost specific code 2017-11-01 14:05:14 +08:00
Maxime Coste
dd9e43e6f9 Regex: small code cleanup 2017-11-01 14:05:14 +08:00
Maxime Coste
23b3a221eb Regex: support more than two children in alternations
Avoid deep nested alternations, parse them flattened.
2017-11-01 14:05:14 +08:00
Maxime Coste
fb5243f710 Regex: print instruction index in dump_regex 2017-11-01 14:05:14 +08:00
Maxime Coste
c8966ca701 Regex: Assert that the regex direction matches the vm direction 2017-11-01 14:05:14 +08:00
Maxime Coste
74ed102cab Regex: Tweak definition of character class and control escape tables 2017-11-01 14:05:14 +08:00
Maxime Coste
df73b71dfc Regex: fix lookarounds handling when computing starting chars 2017-11-01 14:05:14 +08:00
Maxime Coste
1c95074657 Make use of custom regex backward searching support for reverse search 2017-11-01 14:05:14 +08:00
Maxime Coste
785cd34b4b Regex: Make boost checking disableable at compile time 2017-11-01 14:05:14 +08:00
Maxime Coste
065bbc8f59 Regex: switch to custom impl, use boost for checking 2017-11-01 14:05:14 +08:00
Maxime Coste
9305fa1369 Regex: Fix lookaround use in moon.kak
(?=[A-Z]\w*) is strictly the same as (?=[A-Z]) as \w* will always
at least match an empty string.
2017-11-01 14:05:14 +08:00
Maxime Coste
cca730193c Regex: Support any char and character classes in lookarounds
Lookarounds still need to be fixed size, but accept character classes
as well as plain literals.
2017-11-01 14:05:14 +08:00
Maxime Coste
b8cb65160a Regex: use std::conditional instead of custom template class to choose Utf8It 2017-11-01 14:05:14 +08:00
Maxime Coste
db06acdfab Regex: Fix computation of potential starts for lookaheads 2017-11-01 14:05:14 +08:00
Maxime Coste
34b1f1ccb6 Regex: detect when all characters can start and avoid allocating 2017-11-01 14:05:14 +08:00
Maxime Coste
ea85f79384 Regex: add elided braces to fix compilation on older gcc 2017-11-01 14:05:14 +08:00
Maxime Coste
bf3b50a543 Regex: Fix wrong size of character_class_escapes array 2017-11-01 14:05:14 +08:00
Maxime Coste
08ea68dc1f Regex: Fix handling of match_prev_avail for boost regex
We were passing around iterators that were not allowed to
go before the begin iterator.
2017-11-01 14:05:14 +08:00
Maxime Coste
9ec376135b Regex: Introduce RegexExecFlags::PrevAvailable
Rework assertion code as well.
2017-11-01 14:05:14 +08:00
Maxime Coste
73e177ec59 Regex: Do not use sized deallocation to support more compilers 2017-11-01 14:05:14 +08:00
Maxime Coste
30dacdade2 Regex: deallocate Saves memory on ThreadedRegexVM destruction 2017-11-01 14:05:14 +08:00
Maxime Coste
578640c8a4 Regex: Fix handling of control escapes inside character classes 2017-11-01 14:05:14 +08:00
Maxime Coste
f3736a4b48 Regex: tag instructions as scheduled as well instead of searching
And a few more code cleanup in the ThreadedRegexVM
2017-11-01 14:05:14 +08:00
Maxime Coste
6bc5823745 Regex: refactor ThreadedRegexVM::exec_from code 2017-11-01 14:05:14 +08:00
Maxime Coste
4ff655cc09 Regex: store the processed flag directly in CompiledRegex instructions 2017-11-01 14:05:14 +08:00
Maxime Coste
732b8bc2a4 Regex: abandon bytecode and just use a simple list of instructions
Makes the code simpler.
2017-11-01 14:05:14 +08:00
Maxime Coste
6434bca325 Regex: Add some comments, remove spurious semicolons 2017-11-01 14:05:14 +08:00
Maxime Coste
911a893225 Regex: fix get_base(std::reverse_iterator<...>) returning a ref to temporary 2017-11-01 14:05:14 +08:00
Maxime Coste
11abd544c6 Regex: avoid infinite loops 2017-11-01 14:05:14 +08:00
Maxime Coste
c47cdc06a7 Regex: Add support for backward matching
Regex can be compiled for backward matching instead of forward matching
and the ThreadedRegexVM is able to iterate in reverse on the subject
string to find the last match instead of the first.
2017-11-01 14:05:14 +08:00
Maxime Coste
071b897e00 Regex: Remove static RegexCompiler::compile 2017-11-01 14:05:14 +08:00
Maxime Coste
52ee62172a Regex: remove use of buffer_utils.hh from regex_impl.cc 2017-11-01 14:05:14 +08:00
Maxime Coste
c375268c2d Regex: Use memcpy to write/read offsets from bytecode
reinterpret_cast was undefined behaviour as we do not guarantee
that offsets are going to be stored properly aligned.
2017-11-01 14:05:14 +08:00
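
The alignment issue described in the commit above can be illustrated with a minimal sketch. This is not the code from the commit: the `Offset` type and the `write_offset`/`read_offset` helpers are hypothetical names chosen for illustration, and the bytecode is assumed to be a plain byte vector. Copying through `std::memcpy` is well defined at any byte position, whereas dereferencing a `reinterpret_cast`ed pointer requires suitable alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical offset type, not taken from the actual sources.
using Offset = std::uint32_t;

// Append an offset to the bytecode at whatever position the buffer is at.
// The bytes are copied one by one, so no alignment is required.
inline void write_offset(std::vector<char>& bytecode, Offset value)
{
    char bytes[sizeof(Offset)];
    std::memcpy(bytes, &value, sizeof(Offset));
    bytecode.insert(bytecode.end(), bytes, bytes + sizeof(Offset));
}

// Read an offset back from an arbitrary (possibly unaligned) position.
inline Offset read_offset(const std::vector<char>& bytecode, std::size_t pos)
{
    assert(pos + sizeof(Offset) <= bytecode.size());
    Offset value;
    std::memcpy(&value, bytecode.data() + pos, sizeof(Offset));
    return value;
}
```

Compilers typically turn such a fixed-size `memcpy` into a single load or store on platforms that tolerate unaligned access, so the safe form usually costs nothing.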
Maxime Coste
b53227d62c Regex: slight cleanup of the unit tests 2017-11-01 14:05:14 +08:00
Maxime Coste
337e58d4f9 Regex: Cleanup character class parsing a bit 2017-11-01 14:05:14 +08:00
Maxime Coste
236751cb84 Regex: Make ThreadedRegexVM a proper class, define a proper interface 2017-11-01 14:05:14 +08:00
Maxime Coste
3b69dda04e Regex: Find potential start position using a map of valid start chars
With this optimization we get close to performance parity with boost
regex on the common use cases in Kakoune.
2017-11-01 14:05:14 +08:00
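
A rough sketch of the idea behind the commit above, under stated assumptions: the subject is treated as raw bytes, and `StartChars` and `find_candidate` are hypothetical names, not the ones used in the actual implementation. The map records which characters can begin a match, so the search can skip every position that cannot start one before running the full matcher.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical table: start_chars[b] is true if byte b can begin a match.
using StartChars = std::array<bool, 256>;

// Return the first position at or after `from` whose byte may start a match,
// or subject.size() if there is none.
inline std::size_t find_candidate(const std::string& subject, std::size_t from,
                                  const StartChars& start_chars)
{
    assert(from <= subject.size());
    for (std::size_t i = from; i < subject.size(); ++i)
        if (start_chars[static_cast<unsigned char>(subject[i])])
            return i;
    return subject.size();
}
```

The full matcher would then only be started at the positions this scan returns.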
Maxime Coste
741772aef9 Regex: Optimize single char character classes as literals 2017-11-01 14:05:14 +08:00
Maxime Coste
fabeab1ee1 Regex: reorder lookaround ops, group by direction 2017-11-01 14:05:14 +08:00
Maxime Coste
854144c535 Regex: Fix handling of Save instruction in ThreadedRegexVM
When not saving, we were not fully reading the instruction stream,
leading to an out of sync instruction pointer.
2017-11-01 14:05:14 +08:00
Maxime Coste
f1b4931824 Regex: Fix handling of non capturing groups (?:...)
We were wrongly keeping the `:` as literal content of the group.
2017-11-01 14:05:14 +08:00
Maxime Coste
5f6e71c4dc Regex: More code tweaks and cleanups in ThreadedRegexVM 2017-11-01 14:05:14 +08:00
Maxime Coste
5f54e0de0e Regex: Code cleanup and refactor for Saves handling 2017-11-01 14:05:14 +08:00
Maxime Coste
dbb175841b Regex: do not write the search prefix inside the program bytecode
It's faster to have specialized code in the VM directly.
2017-11-01 14:05:14 +08:00
Maxime Coste
cf5055f68b Regex: small code tweak 2017-11-01 14:05:14 +08:00
Maxime Coste
e0fac20f6c Regex: Use a custom allocated buffer for Saves instead of a Vector 2017-11-01 14:05:14 +08:00
Maxime Coste
1399563e40 Regex: make m_current_threads and m_next_threads local variable of exec 2017-11-01 14:05:14 +08:00
Maxime Coste
54da8098ae Regex: Add a NoSaves RegexExecFlags to disable saving positions 2017-11-01 14:05:14 +08:00
Maxime Coste
119bc38254 Regex: small refactor of ThreadedRegexVM::clone_saves 2017-11-01 14:05:14 +08:00
Maxime Coste
9fbafba4cb Regex: Refactor thread handling in ThreadedRegexVM 2017-11-01 14:05:14 +08:00
Maxime Coste
589cde67f0 Regex: store saves in a copy on write structure 2017-11-01 14:05:14 +08:00
Maxime Coste
11b9c996ea Regex: small code style tweak 2017-11-01 14:05:14 +08:00
Maxime Coste
51ad8b4c85 Regex: fix handling of negative escaped character classes 2017-11-01 14:05:14 +08:00
Maxime Coste
adcd02b7d2 Regex: Replace boost regex_iterator impl with our own
Ensure we check the results from our own regex impl in all uses of
regexes in Kakoune.
2017-11-01 14:05:14 +08:00
Maxime Coste
f007794d9c Regex: introduce RegexExecFlags to control various behaviours 2017-11-01 14:05:14 +08:00
Maxime Coste
73b14b11be Regex: small code tweak in ThreadedRegexVM 2017-11-01 14:05:14 +08:00
Maxime Coste
630d078b6d Regex: Fix use of not-yet-constructed CompiledRegex in TestVM impl 2017-11-01 14:05:14 +08:00
Maxime Coste
5b0c2cbdc2 Regex: Ensure we don't have a thread explosion in ThreadedRegexVM
Always remove threads with lower priority that end up on the same
instruction as a higher priority thread (as we know they will behave
the same from now on)
2017-11-01 14:05:14 +08:00
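
The pruning rule in the commit above can be sketched as follows. This is an illustrative reduction, not the ThreadedRegexVM code: `Thread` and `dedup_by_instruction` are hypothetical names, and threads are assumed to arrive sorted from highest to lowest priority. Since at most one thread survives per instruction, the thread list can never grow beyond the number of instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical thread record: the instruction it sits on, plus an id so the
// surviving threads can be told apart.
struct Thread { std::size_t inst; int id; };

// Keep only the first (highest priority) thread for each instruction.
// `by_priority` must be ordered from highest to lowest priority.
inline std::vector<Thread> dedup_by_instruction(const std::vector<Thread>& by_priority,
                                                std::size_t inst_count)
{
    std::vector<bool> scheduled(inst_count, false);
    std::vector<Thread> kept;
    for (const Thread& t : by_priority)
    {
        assert(t.inst < inst_count);
        if (scheduled[t.inst])
            continue; // a higher priority thread is already on this instruction
        scheduled[t.inst] = true;
        kept.push_back(t);
    }
    return kept;
}
```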
Maxime Coste
b4f923b7fc Regex: min/max quantifiers can be non greedy as well 2017-11-01 14:05:14 +08:00
Maxime Coste
f02b2645da Regex: validate that our custom impl gets the same results as boost regex
In addition to running boost regex, run our custom regex and compare
the results to ensure the two regex engines agree.
2017-11-01 14:05:14 +08:00
Maxime Coste
76dcfd5c52 Regex: support escaping characters in character classes 2017-11-01 14:05:14 +08:00
Maxime Coste
3d2262bebf Regex: add support for case insensitive matching, controlled by (?i) 2017-11-01 14:05:14 +08:00
Maxime Coste
7673781751 Regex: use \A \z for subject start/end
This is the most common syntax in various regex variants.
2017-11-01 14:05:14 +08:00
Maxime Coste
0bdfdac5c5 Regex: Implement lookarounds for fixed literal strings
We do not support anything other than a plain literal string for
lookarounds.
2017-11-01 14:05:14 +08:00
Maxime Coste
e96cd29f0e Regex: Support non greedy quantifiers 2017-11-01 14:05:14 +08:00
Maxime Coste
e4004a7b7f Regex: Add support for \h and \H "horizontal blank" character classes 2017-11-01 14:05:14 +08:00
Maxime Coste
4ac0d35d1e Regex: Add support for \K that resets the start capture 2017-11-01 14:05:14 +08:00
Maxime Coste
2f450e0080 Regex: Add support for \Q...\E quoted parts 2017-11-01 14:05:14 +08:00
Maxime Coste
7a313ddafe Regex: small error message improvement 2017-11-01 14:05:14 +08:00
Maxime Coste
c282b699d7 Regex: fix support for - at end of a character class 2017-11-01 14:05:14 +08:00
Maxime Coste
e41d228af8 Regex: Disable dumping regex instructions by default in unit tests 2017-11-01 14:05:14 +08:00
Maxime Coste
d5048281a6 Regex: slight cleanup of the unit tests 2017-11-01 14:05:14 +08:00
Maxime Coste
f7468b576e Regex: Refactor regex compilation to a regular RegexCompiler class 2017-11-01 14:05:14 +08:00
Maxime Coste
d5717edc9d Regex: improve regex parse error reporting
Display the place where parsing failed, refactor code to make
RegexParser a regular object.
2017-11-01 14:05:14 +08:00
Maxime Coste
080160553c Regex: support escaped character classes 2017-11-01 14:05:14 +08:00
Maxime Coste
1a8ad3759f Regex: fix handling of strict quantifiers {N}
Previous behaviour was treating {N} as {N,}
2017-11-01 14:05:14 +08:00
Maxime Coste
be157453ad Regex: Use a std::function based "Matcher" op to implement character classes
This is more extensible and should allow easier support for non-range
classes.
2017-11-01 14:05:14 +08:00
Maxime Coste
eb1015cdfb Regex: whenever Kakoune compiles a regex, pass it to the custom impl as well
That way we can see which features are missing.
2017-11-01 14:05:14 +08:00
Maxime Coste
002aba562f Regex: work on unicode codepoints instead of raw bytes 2017-11-01 14:05:14 +08:00
Maxime Coste
75608ea223 Regex: when in full match mode, do not accept trailing data 2017-11-01 14:05:14 +08:00
Maxime Coste
490c130e41 Regex: Implement leftmost matching
Ensure threads are maintained in "priority" order, by having two
split instructions (prioritizing parent or child).
2017-11-01 14:05:14 +08:00
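
To make the role of the two split instructions in the commit above concrete, here is a small thread-list matcher written as a sketch under assumptions. The opcode names (`SplitPreferNext`, `SplitPreferTarget`), the `Inst` layout and `match_length` are hypothetical and do not mirror the real instruction set. Threads are kept in priority order, and once a thread reaches `Match`, all lower priority threads at that step are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical opcodes: both splits fork execution, they differ only in
// which branch gets the higher priority.
enum class Op { Char, Match, SplitPreferNext, SplitPreferTarget };
struct Inst { Op op; char c; std::size_t target; };

// Schedule the thread at pc, following splits in priority order.
inline void add_thread(const std::vector<Inst>& prog, std::size_t pc,
                       std::vector<std::size_t>& threads, std::vector<bool>& seen)
{
    assert(pc < prog.size());
    if (seen[pc])
        return; // a higher priority thread already reached this instruction
    seen[pc] = true;
    const Inst& inst = prog[pc];
    if (inst.op == Op::SplitPreferNext)
    {
        add_thread(prog, pc + 1, threads, seen);
        add_thread(prog, inst.target, threads, seen);
    }
    else if (inst.op == Op::SplitPreferTarget)
    {
        add_thread(prog, inst.target, threads, seen);
        add_thread(prog, pc + 1, threads, seen);
    }
    else
        threads.push_back(pc);
}

// Length of the preferred match anchored at the subject start, or -1.
inline long match_length(const std::vector<Inst>& prog, const std::string& subject)
{
    std::vector<std::size_t> current, next;
    std::vector<bool> seen(prog.size(), false);
    add_thread(prog, 0, current, seen);
    long result = -1;
    for (std::size_t pos = 0; ; ++pos)
    {
        seen.assign(prog.size(), false);
        next.clear();
        for (std::size_t pc : current)
        {
            const Inst& inst = prog[pc];
            if (inst.op == Op::Match)
            {
                result = static_cast<long>(pos);
                break; // lower priority threads are abandoned
            }
            if (pos < subject.size() && inst.c == subject[pos])
                add_thread(prog, pc + 1, next, seen);
        }
        if (next.empty())
            return result;
        current.swap(next);
    }
}
```

With the program `Char 'a'; split back to 0; Match`, using the split that prefers the jump target behaves like a greedy `a+`, while the split that prefers the next instruction behaves like a non greedy `a+?`.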
Maxime Coste
182b70cb0a Regex: Add initial support for character ranges 2017-11-01 14:05:14 +08:00
Maxime Coste
52678fafa1 Regex: Add support for searching
Always compile a `.*` as the first instructions in a regex bytecode.
Depending on the match or search mode, the RegexVM will either execute
this or skip it and start directly at the matching bytecode.
2017-11-01 14:05:14 +08:00
Maxime Coste
f7b8c1c79d Regex: cleanup and reorganize regex code and improve capture support
Introduce the CompiledRegex class, rename ThreadedExecutor to
ThreadedRegexVM, remove the RegexProgram namespace.
2017-11-01 14:05:14 +08:00
Maxime Coste
023511deff Regex: WIP support for saving captures 2017-11-01 14:05:14 +08:00
Maxime Coste
ad546e516a Regex: Small comment tweaks 2017-11-01 14:05:14 +08:00
Maxime Coste
46a113e10a Regex: Add support for curly braces count expressions 2017-11-01 14:05:14 +08:00