There have been proposals to add more language aliases to markdown.kak
(#4592) and allow users to add their own aliases (#4489).
To recap: various markdown implementations allow specifying aliases
for languages. For example, here is a code block that should be
highlighted as filetype "haskell" but isn't:
```hs
-- highlight as haskell
```
There are lots of aliases out in the wild - "pygmentize -L" lists
some but I don't think there is a canonical list.
Today we have a hardcoded list of supported filetypes. This is hard
to mainta, extend, and it can impact performance.
This patch simply attempts to load the module "hs" and the shared
highlighter "hs". This means that users can use this (obvious?) snippet
to add their own aliases:
provide-module hs %{
require-module haskell
add-highlighter shared/hs ref haskell
}
Untrusted Markdown files can load arbitrary modules, but that was
already true before, and modules are assumed to be trusted anyway.
Since language highlighters are now loaded *after* the generic
code-block highlighter, we need to make sure the language highlighters
take precedence. Do this by making them sub-regions of the generic one.
Closes#4489
This improves performance on the [5MB Markdown
file](https://github.com/mawww/kakoune/issues/4685#issuecomment-1208129806).
$ HOME=$PWD hyperfine -w 1 'git checkout HEAD'{~,}' -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'
Benchmark 1: git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
Time (mean ± σ): 3.225 s ± 0.074 s [User: 3.199 s, System: 0.027 s]
Range (min … max): 3.099 s … 3.362 s 10 runs
Benchmark 2: git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
Time (mean ± σ): 1.181 s ± 0.030 s [User: 1.162 s, System: 0.021 s]
Range (min … max): 1.149 s … 1.234 s 10 runs
Summary
'git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy' ran
2.73 ± 0.09 times faster than 'git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'
(These numbers depend on another optimization.)
Filetypes markdown and restructuredtext reuse highlighters from other
filetypes to highlight code blocks. For example, to highlight a code
block of language foo they essentially do
require-module foo
add-highlighter [...] ref foo
This works great if the module name matches the shared
highlighter. This is the case almost all scripts in rc/filetype*.
The only exception is kakrc.kak: the highlighter is named "kakrc"
(just like the filetype) but the module is named "kak".
This requires weird hacks in markdown/restructuredtext. Ideally we
could remove this inconsistency by renaming both the filetype and the
highlighter to "kak" but that's a breaking change. Until we do that,
let's add an alias so we can treat filetypes uniformly. This helps
the following commits, which otherwise would need to add ugly extra
code for kakrc highlighters.
The following commit will generalize this approach, allowing users
to add arbitrary aliases.
As reported in
https://github.com/mawww/kakoune/issues/4685#issuecomment-1200530001
ledger.kak defines a region end that matches every character of
the buffer. This causes performance issues for large buffers.
Since the affected regions are only ever filled with a single color,
just use a regex highlighter instead of a region highlighter.
This improves performance when loading the file for the first time.
Speedup on [example.journal.txt](https://github.com/mawww/kakoune/issues/4685#issuecomment-1193243588)
$ HOME=$PWD hyperfine -w 1 'git checkout HEAD'{~,}' -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy'
Benchmark 1: git checkout HEAD~ -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy
Time (mean ± σ): 362.1 ms ± 5.1 ms [User: 336.6 ms, System: 30.2 ms]
Range (min … max): 352.6 ms … 369.1 ms 10 runs
Benchmark 2: git checkout HEAD -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy
Time (mean ± σ): 271.2 ms ± 16.7 ms [User: 252.8 ms, System: 24.0 ms]
Range (min … max): 253.9 ms … 305.0 ms 10 runs
Summary
'git checkout HEAD -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy' ran
1.33 ± 0.08 times faster than 'git checkout HEAD~ -- :/rc/filetype/ledger.kak && ./kak.opt example.journal.txt -e "modeline-parse; hook global NormalIdle .* quit" -ui dummy'
After opening a markdown file
```b
```
```c
int main() {}
```
markdown-load-languages will run an "evaluate-commands -itersel".
The first selection makes us run "require-module b", which fails
because that module can't be found. Since -itersel only ignores the
"no selection remaining" error we fail to run "require-module c". Fix
this by ignoring errors.
Dockerfiles of the form `Dockerfile.foo-bar` were not detected for syntax
highlighting.
Mainly meaning for this to capture _ and -, but I don't see why we wouldn't
capture any special character.
We often use the pattern «map global normal ": foo"». The space
after the colon is unnecessary since execution of the mapping won't
add to history anyway, since 217dd6a1d (Disable history when executing
maps, 2015-11-10).
With the parent commit, the space is no longer necessary for user
mappings, so there is no reason to continue the cargo-cult.
Remove the space from mappings to set a good example.
`x` is often criticized as hard to predict due to its slightly complex
behaviour of selecting next line if the current one is fully selected.
Change `x` to use the previous `<a-x>` behaviour, and change `<a-x>` to
trim to fully selected lines as `<a-X>` did.
Adapt existing indentation script to the new behaviour