Commit Graph

5781 Commits

Author SHA1 Message Date
Maxime Coste
ee364d911f Only push a first instruction thread when on a potential start
There is no need to push threads for each codepoint when we know
they will fail as the current codepoint is not a start candidate.
2024-03-21 18:09:00 +11:00
Maxime Coste
f6762724ea Revert "Always allocate saves"
This crashes in unit tests

This reverts commit cde5f5a258.
2024-03-15 22:03:52 +11:00
Johannes Altmanninger
24d719bf13 Fix off-by-two error in max size of frameless infoboxes
Framed info boxes need one cell for the border and one for inner
space padding.  That's 4 extra columns when counting both sides.
Frameless boxes have neither border nor padding so 0 columns here.

Closes #5106
2024-03-15 21:54:59 +11:00
Johannes Altmanninger
bdca6760fe Fail "define-command -menu" unless a completer is given
The "define-command -menu" flag does not do anything unless there is
a completer flag.  Let's reject it.
2024-03-15 21:54:26 +11:00
Maxime Coste
cde5f5a258 Always allocate saves
This sometimes allocates saves too eagerly, but it removes a branch
in release saves that executes on every thread failing which seems
slightly better.
2024-03-15 21:53:17 +11:00
Maxime Coste
83f12fc8e9 Avoid clearing iterator buffer on saves allocation
When creating a new save, we had to clear all iterators to have valid
values. This operation is relatively costly because it gets optimized
to a memset whose call overhead is pretty high (as we usually have
less than 32 bytes to clear). Bypass this by storing a bitmap of
valid iterators.
2024-03-13 23:45:51 +11:00
Maxime Coste
c4df0fac52 Simplify and accelerate start desc map
Store values for all possible bytes and fill utf8 multi byte start
values when necessary.
2024-03-13 17:29:05 +11:00
Maxime Coste
c956413046 Fix quantifier parsing bug 2024-03-13 07:14:33 +11:00
Maxime Coste
b06834bf47 Small cleanup 2024-03-12 21:15:30 +11:00
Maxime Coste
6264b049d9 Simplify Quantifier logic in regex parsing
Remove redundant type enum
2024-03-12 20:39:39 +11:00
Maxime Coste
e06acd3dc8 Simplify Split regex op handling by swapping target 2024-03-11 21:47:14 +11:00
Maxime Coste
f25b3c005e flatten ThreadedRegexVM::codepoint
Profiling shows that this does not always get the utf8::read_codepoint
call inlined and that almost doubles the time spent in the function.
2024-03-11 20:52:56 +11:00
Johannes Altmanninger
ec44d98347 Send SIGTERM on <c-c>, to more reliably kill background jobs
Consider

    sh -c 'sleep 5 & sleep inf'

Since the shell is non-interactive, there is no job control.
This makes the shell spawn the "sleep 5" process in the shell's own
process group[1] - presumably, because only interactive shells have
a need to forward signals to all processes in its foreground job.

When this non-interactive shell process is cancelled with SIGINT,
"sleep 5" keeps running [2]. At least the dash shell implements this
by running "signal(SIGINT, SIG_IGN)" in the forked child.  Unless the
child process explicitly overrides that (to SIG_DFL for example), it
will ignore SIGINT. Probably the reason for this behavior is to feign
consistency with interactive shells, without needing to actually run
background jobs in a dedicated process group like interactive shells
do. Bash documents this behavior[3]:

> When job control is not in effect, asynchronous commands ignore
> SIGINT and SIGQUIT in addition to these inherited handlers.

Several of our scripts[4] - most prominently ":make" - use the
"</dev/null >/dev/null 2>&1 &" pattern to run potentially long-running
processes in the background, without blocking the editor.

On <c-c>, we send SIGINT to our process group.
As explained above, this will generally not terminate any background processes.

This problem has been masked by a behavior that is unique to using
both Bash and its "eval" builtin. Given

    nop %sh{
        rm -f /tmp/fifo
        mkfifo /tmp/fifo
        (
            eval make >/tmp/fifo 2>&1 &
        ) >/dev/null 2>&1 </dev/null
    }
    edit -fifo /tmp/fifo *my-fifo*

When running this and pressing Control+C, Bash actually terminates
the Make processes. However if I remove the "eval", it no longer does.
This doesn't seems like something we should rely on.
Other commands like ":git blame" don't use "eval" so they cannot be
cancelled today.

Fix these issues by sending SIGTERM instead of SIGINT, which should
apply to the whole process group with pretty much the same effect.
Barely tested, let's see if this breaks some weird build system.

In future we might allow more fine-grained control over which processes
are cancelled by <c-c>.

{{{

Alternative solution:

With the above fix, scripts can opt-out of being terminated by <c-c>
by using setsid (though that's not POSIX unfortunately, and may
require nesting quotes) or the classic Unix double-forking trick to
create a daemon process.

Though it is certainly possible that someone expects commands like
this to survive <c-c>:

    nop %sh{ tail -f my-log </dev/null 2>&1 | grep some-error > some-file 2>&1 & }

I think it would be ideal to stick to SIGINT and match semantics of
a noninteractive shell, to avoid muddying the waters.

Background processes could still **opt into** being terminated by
<c-c>. For example by providing a simple program in libexec/ that does

    // interruptible.c
    int main(int argc, char** argv) {
        signal(SIGINT, SIG_DFL);
        execv(argv[1], &argv[1]);
    }

used as

    diff --git a/rc/tools/make.kak b/rc/tools/make.kak
    index b88f7e538..f6e041908 100644
    --- a/rc/tools/make.kak
    +++ b/rc/tools/make.kak
    @@ -16,3 +16,3 @@ define-command -params .. \
          mkfifo ${output}
    -     ( eval "${kak_opt_makecmd}" "$@" > ${output} 2>&1 & ) > /dev/null 2>&1 < /dev/null
    +     ( eval "interruptible ${kak_opt_makecmd}" "$@" > ${output} 2>&1 & ) > /dev/null 2>&1 < /dev/null

Unfortunately, it's inconvenient to add "interruptible" to commands
like clang-parse and git-blame because they background a whole subshell
with many commands, so we'd need to nest quotes.  Also I'm not sure
if this brings any benefit.

So I didn't explore this further yet although we can definitely do that.

}}}

Fixes #3751

[1]: https://stackoverflow.com/questions/45106725/why-do-shells-ignore-sigint-and-sigquit-in-backgrounded-processes/45106961#45106961
[2]: https://unix.stackexchange.com/questions/372541/why-doesnt-sigint-work-on-a-background-process-in-a-script/677742#677742
[3]: https://www.gnu.org/software/bash/manual/html_node/Signals.html
[4]: clang-parse, ctags-*, git blame, git log, gopls references,
     grep, jedi-complete, lint-*, make; I don't think any of these
     should be uninterruptible.
2024-03-08 20:10:15 +11:00
Maxime Coste
a1c52e08a4 Reduce Save access indirections
Most Save access are to modify the refcount. Now that the freelist
is index based it is not necessary to keep Save objects at fixed
memory locations.
2024-03-07 20:40:24 +11:00
Maxime Coste
dda908c478 Slight simplification of ThreadedRegexVM::exec
Remove redundant checking for end and double indirection to get
instructions pointer.
2024-03-05 22:37:04 +11:00
Maxime Coste
dbda8d6dc8 Ensure ReadOnly buffer flag get reflected to readonly option
Fixes #5110
2024-03-04 09:01:19 +11:00
Maxime Coste
165cdba67c Merge remote-tracking branch 'svmhdvn/posix-make' 2024-02-29 18:56:07 +11:00
Maxime Coste
e64babee8c Use a size_t :debug memory total to avoid overflow 2024-02-28 21:56:22 +11:00
Maxime Coste
74ce9f3cfe Fix missing destructor call and simplify code slightly
Although the destructor call is a no-op, it is more correct to
have it to ensure the compiler knows the lifetime of that object
ended.
2024-02-28 21:48:58 +11:00
Maxime Coste
b8a151ab46 Fix unnecessary buffer line copy in BufferManager::create_buffer 2024-02-28 20:54:37 +11:00
Maxime Coste
068af3d9d4 Use std::find when detecting end of line format 2024-02-28 19:05:45 +11:00
Maxime Coste
3d5a0c672e Templatize StringData::create
This improves performance by letting the compiler optimize most use
cases where string count and length are known are compile time.
2024-02-28 19:02:29 +11:00
Maxime Coste
57b794ede3 compare against 0 instead of -1 in hash map code 2024-02-28 12:34:51 +11:00
Maxime Coste
e52f83bd50 Fix shell command completion fallback to filename 2024-02-28 12:34:20 +11:00
Maxime Coste
9a299b0016 improve :debug shared-strings 2024-02-28 12:33:25 +11:00
Siva Mahadevan
b05295637e build: switch to POSIX make 2024-02-27 17:31:58 -05:00
Tobias Pisani
dbe8528231 Make insert repeat (.) more consistent
Insert repeat will now only record non-synthesized keys, and when played back
execute mappings as well. Constructing some tests, and with the specific goal
of fixing https://github.com/alexherbo2/auto-pairs.kak/issues/38, this appeared to
be the best approach. Other options could be evaluating the maps only when recording,
but this gave other issues (see tests/normal/repeat-insert/repeat-insert-mapped)

At this point, repeat-insert may be essentially just a hardcoded macro, at least I
haven't identified the difference. If this really is the case, it may make sense to
give it a dedicated register, and implement it as a macro.

Fixes #3600
2024-02-24 05:18:56 +01:00
Johannes Altmanninger
22b1d4ec4a Try to fix crash due to fifo read regression
I saw a crash when running

	git log --oneline %arg{@}
	hook -once buffer NormalIdle .* %{
		execute-keys -draft \
		%{gk!} \
		%{git diff --quiet || echo "Unstaged changes";} \
		%{git diff --quiet --cached || echo "Staged changes";} \
		<ret>
	}

Backtrace (I still have GDB attached):

    #4  0x00006502c740b13f in Kakoune::operator- (rhs=..., lhs=...) at /home/johannes/git/kakoune/src/units.hh:33
    33	   { return RealType(lhs.m_value - rhs.m_value); }
    (gdb) up
    #5  Kakoune::Buffer::next (coord=..., this=0x6502c90d7ff0) at /home/johannes/git/kakoune/src/buffer.inl.hh:18
    18	   if (coord.column < m_lines[coord.line].length() - 1)
    (gdb) up
    #6  FifoWatcher::read_fifo (this=0x6502c90d9e48) at buffer_utils.cc:252
    252	                           m_buffer.erase(pos, m_buffer.next(pos));

This was introduced in 582c3c56b (Do not add trailing newline to
non-scrolling fifo buffers, 2024-01-28).

The problem seems to be that we call "m_buffer.next()" on a position
that is past-end the buffer, so m_lines[coord.line] is out-of-bounds.
Fix it.
For some reason I have not managed to reproduce the crash, not even
with sanitize=address.

There might be another problem: m_had_trailing_newline is intentionally
uninitialized because it is supposed to be read only on the second
read() with a positive return value. Unfortunately I think it's
possible that e.g. a NormalIdle hook inserts some text before the
first positive read(). Then, this line

        const bool is_first = pos == BufferCoord{0,0};
        if (not m_scroll and (is_first or m_had_trailing_newline))
            pos = m_buffer.next(pos);

will read uninitialized "m_had_trailing_newline".  Fix that too, to
be on the safe side. Sadly I don't have a test for this one either
so I'm not sure.
2024-02-18 10:14:17 +11:00
Maxime Coste
4101e18144 Early reject regex instructions that were already scheduled this step 2024-02-12 08:08:16 +11:00
Maxime Coste
e0c7a34bc1 Use unicode is_word/to_{lower/upper} function in ranked match 2024-02-12 07:50:04 +11:00
Maxime Coste
bd91a255e4 Do not decode utf8 while looking for next regex match start candidate
If the first byte in the multi-byte utf8 sequence does not match,
it means the "other" character is not set, so none of the sequence
byte will match (as they are all with the MSB set). This tightens
the critical loop which ends up running faster in most cases.
2024-02-11 12:17:21 +11:00
Maxime Coste
cf95043b14 Merge remote-tracking branch 'krobelus/changelog' 2024-02-10 21:55:09 +11:00
Maxime Coste
3ef68188b4 Avoid iswlower, iswupper, towlower and towupper for ascii codepoints
Avoid the costly shared object function call when most codepoints
will be ascii.

The regex benchmark gets a nice speedup:

Regex                                   Before     After
--------------------------------------+----------+---------
'Twain'                               |    25 ms | 15 ms
'(?i)Twain'                           |    74 ms | 57 ms
'[a-z]shing'                          |   323 ms | 303 ms
'Huck[a-zA-Z]+|Saw[a-zA-Z]+'          |    26 ms | 17 ms
'\b\w+nn\b'                           |   424 ms | 393 ms
'[a-q][^u-z]{13}x'                    |   869 ms | 815 ms
'Tom|Sawyer|Huckleberry|Finn'         |    33 ms | 24 ms
'(?i)Tom|Sawyer|Huckleberry|Finn'     |   319 ms | 281 ms
'.{0,2}(Tom|Sawyer|Huckleberry|Finn)' |  1294 ms | 1293 ms
'.{2,4}(Tom|Sawyer|Huckleberry|Finn)' |  1470 ms | 1429 ms
'Tom.{10,25}river|river.{10,25}Tom'   |    69 ms | 61 ms
'[a-zA-Z]+ing'                        |   447 ms | 408 ms
'\s[a-zA-Z]{0,12}ing\s'               |   539 ms | 543 ms
'([A-Za-z]awyer|[A-Za-z]inn)\s'       |   588 ms | 552 ms
'["'][^"']{0,30}[?!\.]["']'           |    92 ms | 81 ms
2024-02-06 22:16:08 +11:00
Maxime Coste
04a96b059f Use different hash algorithms for strings and file hashing
For hash map, using fnv1a is faster as it is a much simpler algorithm
we can afford to inline. For files murmur3 should win as it processes
bytes 4 by 4.
2024-02-06 21:57:17 +11:00
Maxime Coste
707904a91b Avoid calling iswalnum for ascii characters
iswalnum can be pretty expensive as its a shared library call.
2024-02-06 20:47:59 +11:00
Johannes Altmanninger
7cea09d327 Changelog entries for new blame features 2024-02-05 21:42:02 +11:00
Johannes Altmanninger
582c3c56b2 Do not add trailing newline to non-scrolling fifo buffers
Internally, all lines have a trailing "\n".
Buffers created empty (like fifo buffers) start with a single line.

When reading data into fifo buffers, we insert *before* the last line's
trailing newline ("last newline").  This enables autoscrolling (enabled
with "edit -scroll") as long as the cursor is on the last newline.

When autoscrolling is disabled, we have a special case to insert
*after* the last newline.  This means that a cursor on that newline
won't be moved.  Then we transplant the newline character from the
beginning to the end of the buffer. This special case happens only on
the very first fifo read; on subsequent reads, the cursor at position
1.1 will not be moved anway because insertions happen below 1.1.

Since we always insert (effectively) before the last newline, fifo
buffers have a trailing empty line.

For autoscrolling buffers this seems correct; it gives users an
obvious way to toggle autoscrolling.

For non-scrolling buffers the newline is redundant.  Remove it.
This requires keeping track of whether the last newline comes from
the fifo, or was added by us.  The shortest fix I could find
is to always append to the buffer if not scrolling, and then delete
the added newline character if applicable.

    m_buffer.insert(m_scroll ? pos : m_buffer.next(pos), StringView(data, data+count));
    if (not m_scroll and not m_had_trailing_newline)
        m_buffer.erase(pos, m_buffer.next(pos));

maybe that's the best fix overall; but erasing at the end seems better
than erasing in the middle, so do that whenever possible.

Reported in https://lists.sr.ht/~mawww/kakoune/%3CZbTK7qit9nzvrMkx@gmail.com%3E
2024-01-30 08:24:27 +11:00
Maxime Coste
c124c8f517 Support -after switch for flag-lines highlighter 2024-01-30 08:20:13 +11:00
Maxime Coste
5ea5c99c58 Fix using invalid strings for undo content strings
The code was wrongly using the write_it instead of the read_it to
access the removed lines content.

Fixes #5088
2024-01-21 10:24:08 +11:00
Johannes Altmanninger
20b0eadfc8 Don't modify prompt history when validating empty input
Fixes #5076
2024-01-15 15:08:10 +01:00
Maxime Coste
43e8aadcaa Fix performance of diff-based reloading of buffers
It turns out diffing was pretty fast, but applying the diff was
sub-optimal as it was constantly inserting/erasing lines which
led to lots of unnecessary shifting. Fix this by manually tracking
a read/write iterator and only shifting when necessary (on keeps,
and inserts).
2023-12-26 21:21:01 +11:00
Maxime Coste
68e73d8a24 Take eq predicate by reference in for_each_diff 2023-12-26 21:21:01 +11:00
Maxime Coste
3c81a4a253 Small code style tweak 2023-12-26 18:49:16 +11:00
Maxime Coste
ba7059a2dc Fix wrong name 2023-12-26 18:09:25 +11:00
Johannes Altmanninger
556c7633ba Update changelog 2023-12-16 12:19:09 +01:00
Maxime Coste
533f51c744 Merge remote-tracking branch 'krobelus/prefer-input-order-over-alphabet' 2023-12-12 21:27:31 +11:00
Maxime Coste
96b74d1ad6 Merge remote-tracking branch 'arachsys/hide-unmapped' 2023-12-12 21:24:47 +11:00
Maxime Coste
065e0229a9 Fix stray semicolon 2023-12-12 21:23:57 +11:00
Maxime Coste
5ed2433b9f Merge remote-tracking branch 'arachsys/unsigned-char' 2023-12-12 21:19:52 +11:00
Chris Webb
a8d5b8bd2c Fix compiler warnings when char is unsigned
In several places, we check for a control character with something like

  char c;
  [...]
  if (c >= 0 and c <= 0x1F)
    [...]

When char is signed (e.g. amd64) this is fine, but when char is unsigned
by default (e.g. arm32 and arm64) this generates warnings about the
tautologous check that an unsigned value is non-negative.

Write as

  if ((unsigned char) c <= 0x1F)
    [...]

which is both correct and not suspicious under both conventions.
2023-12-10 11:09:55 +00:00