Running %sYeti<ret>casdf on file
[example.journal.txt](https://github.com/mawww/kakoune/issues/4685#issuecomment-1193243588)
can cause noticeable lag. This is because we insert text at 6000
selections, which means we need to update highlighters in those lines.
The runtime for updating range highlighters is quadratic in the
number of selections: for each selection, we call on_new_range(),
which calls add_matches(), which calls std::rotate(), which needs
needs linear time.
Fix the quadratic runtime by calling std::inplace_merge() once instead
of repeatedly calling std::rotate(). This is works because ranges
are already sorted.
I used this script to benchmark the improvements.
(In hindsight I could have just used "-ui json" instead of tmux).
#!/bin/sh
set -ex
N=${1:-100}
kak=${2:-./kak.opt}
for i in $(seq "$N")
do
echo -n "\
2022-02-06 * Earth
expense:electronics:audio 116.7 USD
liability:card -116.7 USD
2022-02-06 * Blue Yeti USB Microphone
expense:electronics:audio 116.7 USD
liability:card -116.7 USD
"
done > big-journal.ledger
echo > .empty-tmux.conf 'set -sg escape-time 5'
test_tmux() {
tmux -S .tmux-socket -f .empty-tmux.conf "$@"
}
test_tmux new-session -d "$kak" big-journal.ledger
test_tmux send-keys '%sYeti' Enter c 1234567890
sleep .2
test_tmux send-keys Escape
while ! test_tmux capture-pane -p | grep 123
do
sleep .1
done
test_tmux send-keys ':wq' Enter
while test_tmux ls
do
sleep .1
done
rm -f .tmux-socket .empty-tmux.conf
This script's runtime used to grow super-linearly but now it grows
linearly:
kak.old kak.new
N=10000 1.142 0.897
N=20000 2.879 1.400
Detailed results:
$ hyperfine -w 1 './bench.sh 10000 ./kak.opt.'{old,new}
Benchmark 1: ./bench.sh 10000 ./kak.opt.old
Time (mean ± σ): 1.142 s ± 0.072 s [User: 0.252 s, System: 0.059 s]
Range (min … max): 1.060 s … 1.242 s 10 runs
Benchmark 2: ./bench.sh 10000 ./kak.opt.new
Time (mean ± σ): 897.2 ms ± 19.3 ms [User: 241.6 ms, System: 57.4 ms]
Range (min … max): 853.9 ms … 923.6 ms 10 runs
Summary
'./bench.sh 10000 ./kak.opt.new' ran
1.27 ± 0.09 times faster than './bench.sh 10000 ./kak.opt.old'
$ hyperfine -w 1 './bench.sh 20000 ./kak.opt.'{old,new}
Benchmark 1: ./bench.sh 20000 ./kak.opt.old
Time (mean ± σ): 2.879 s ± 0.065 s [User: 0.553 s, System: 0.126 s]
Range (min … max): 2.768 s … 2.963 s 10 runs
Benchmark 2: ./bench.sh 20000 ./kak.opt.new
Time (mean ± σ): 1.400 s ± 0.018 s [User: 0.428 s, System: 0.083 s]
Range (min … max): 1.374 s … 1.429 s 10 runs
Summary
'./bench.sh 20000 ./kak.opt.new' ran
2.06 ± 0.05 times faster than '../repro.sh 20000 ./kak.opt.old'
LineRangeSet::add_range() calls Vector::erase() in a loop over the
same vector. This could cause performance problems when there are many
selections. Fix this by only calling Vector::erase() once. I didn't
measure anything because my benchmark is dominated by another issue
(see next commit).
LineRangeSet::remove_range() also has a suspicious call to erase()
but that one is only used in test code, so it doesn't matter.
clang/clangd complain about the new HashSet type:
hash_map.cc:98:20: warning: braces around scalar initializer [-Wbraced-scalar-init]
set.insert({10});
^~~~
The argument to HashSet<int>::insert is just an int, so we don't
need braces. Only an actual HashMap would need braces to construct
a HashItem object.
When passing a filename parameter to "write", the -force parameter
allows overwriting an existing file.
The "write!" variant (which allows writing files where the current
user does not have write permissions) already implies -force.
All other variants (like write-quit or write-all) do not take a
file parameter.
Hence -force is relevant only for "write". Let's hide it from the
autoinfo of the other commands.
It's difficult to avoid duplication when constructing the constexpr
SwitchMap because String is not constexpr-enabled. Today, all our
SwitchMap objects are known at compile time, so we could make SwitchMap
use StringView to work around this. In future we might want to allow
adding switches at runtime, which would need String again to avoid
lifetime issues.
As reported in
https://github.com/mawww/kakoune/issues/4685#issuecomment-1200530001
ledger.kak defines a region end that matches every character of
the buffer. This causes performance issues for large buffers.
Since the affected regions are only ever filled with a single color,
just use a regex highlighter instead of a region highlighter.
This improves performance when loading the file for the first time.
Speedup on [example.journal.txt](https://github.com/mawww/kakoune/issues/4685#issuecomment-1193243588)
$ HOME=$PWD hyperfine -w 1 'git checkout HEAD'{~,}' -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy'
Benchmark 1: git checkout HEAD~ -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy
Time (mean ± σ): 362.1 ms ± 5.1 ms [User: 336.6 ms, System: 30.2 ms]
Range (min … max): 352.6 ms … 369.1 ms 10 runs
Benchmark 2: git checkout HEAD -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy
Time (mean ± σ): 271.2 ms ± 16.7 ms [User: 252.8 ms, System: 24.0 ms]
Range (min … max): 253.9 ms … 305.0 ms 10 runs
Summary
'git checkout HEAD -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy' ran
1.33 ± 0.08 times faster than 'git checkout HEAD~ -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy'
Instead of storing regexes in each regions, move them to the core
highlighter in a hash map so that shared regexes between different
regions are only applied once per update instead of once per region
Also change iteration logic to apply all regex together to each
changed lines to improve memory locality on big buffers.
For the big_markdown.md file described in #4685 this reduces
initial display time from 3.55s to 2.41s on my machine.
This wording was valid for an old version of that patch that foolishly
add a switch to "map" to allow recording history. Happily we found
a better solution. Now commands inside user mappings never add to
prompt history (unless you use "set-register"), so be clear about that.
When I wrote this line I wanted to avoid adding the array size but
I didn't know about make_array().
I had unsuccessfully tried some alternatives, for example
Array{"a", "b", "c"}
which doesn't work because we need StringView (c.f. git blame on
this line)
also
Array<StringView>{"a", "b", "c"}
doesn't work because it's missing a template argument.
This makes the function easier to find for newcomers because
to_string() is the obvious name. It enables format() to do the
conversion automatically which seems like good idea (since there is
no other obvious representation).
Of course this change makes it a bit harder to grep but that's not
a problem with clang tooling.
We need to cast the function in one place when calling transform()
but that's acceptable.
After opening a markdown file
```b
```
```c
int main() {}
```
markdown-load-languages will run an "evaluate-commands -itersel".
The first selection makes us run "require-module b", which fails
because that module can't be found. Since -itersel only ignores the
"no selection remaining" error we fail to run "require-module c". Fix
this by ignoring errors.
Commit d470bd2cc (Make numeric registers setable, 2017-02-14) removed
the user-provided StaticRegister::operator= in favor of a set()
member function, so this comment is no longer valid.
Dockerfiles of the form `Dockerfile.foo-bar` were not detected for syntax
highlighting.
Mainly meaning for this to capture _ and -, but I don't see why we wouldn't
capture any special character.