rc markdown: only add language highlighters for actual code blocks in buffer

There have been proposals to add more language aliases to markdown.kak
(#4592) and allow users to add their own aliases (#4489).

To recap: various markdown implementations allow specifying aliases
for languages. For example, here is a code block that should be
highlighted as filetype "haskell" but isn't:

	```hs
	-- highlight as haskell
	```

There are lots of aliases out in the wild - "pygmentize -L" lists
some but I don't think there is a canonical list.

Today we have a hardcoded list of supported filetypes. This is hard
to maintain and extend, and it can hurt performance.
For a code block tagged "hs", this patch simply attempts to load the
module "hs" and the shared highlighter "hs". This means that users can
use this (obvious?) snippet to add their own aliases:

	provide-module hs %{
		require-module haskell
		add-highlighter shared/hs ref haskell
	}
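
Other aliases could presumably be registered the same way. A minimal
sketch, assuming the stock "python" and "javascript" modules and their
shared highlighters of the same names are available; the alias names
"py" and "js" are only illustrative:

```kak
# Hypothetical aliases: each one loads the real module and exposes
# its shared highlighter under the alias name.
provide-module py %{
	require-module python
	add-highlighter shared/py ref python
}
provide-module js %{
	require-module javascript
	add-highlighter shared/js ref javascript
}
```

With these definitions in a user's kakrc, a code block tagged "py" or
"js" should resolve to the corresponding language highlighter.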

Untrusted Markdown files can load arbitrary modules, but that was
already true before, and modules are assumed to be trusted anyway.

Since language highlighters are now loaded *after* the generic
code-block highlighter, we need to make sure the language highlighters
take precedence. Do this by making them sub-regions of the generic one.
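
The new markdown-load-languages command takes one argument: the keys
that select the text to scan for code fences. A sketch of calling it by
hand, using only the two arguments that appear in this patch:

```kak
# Scan the whole buffer ('%' selects everything).
markdown-load-languages '%'
# Scan only the text currently visible in the window
# (window top, extended to window bottom and to the line end).
markdown-load-languages gtGbGl
```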

Closes #4489

This improves performance on the [5MB Markdown
file](https://github.com/mawww/kakoune/issues/4685#issuecomment-1208129806).

	$ HOME=$PWD hyperfine -w 1 'git checkout HEAD'{~,}' -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'
	Benchmark 1: git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
	  Time (mean ± σ):      3.225 s ±  0.074 s    [User: 3.199 s, System: 0.027 s]
	  Range (min … max):    3.099 s …  3.362 s    10 runs
	 
	Benchmark 2: git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
	  Time (mean ± σ):      1.181 s ±  0.030 s    [User: 1.162 s, System: 0.021 s]
	  Range (min … max):    1.149 s …  1.234 s    10 runs
	 
	Summary
	  'git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy' ran
	    2.73 ± 0.09 times faster than 'git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'

(These numbers depend on another optimization.)
Johannes Altmanninger 2022-08-16 19:47:23 +02:00
parent 647e568d3b
commit 907ad84f46


@@ -21,8 +21,12 @@ hook global WinSetOption filetype=markdown %{
 }
 hook -group markdown-load-languages global WinSetOption filetype=markdown %{
-    hook -group markdown-load-languages window NormalIdle .* markdown-load-languages
-    hook -group markdown-load-languages window InsertIdle .* markdown-load-languages
+    markdown-load-languages '%'
+}
+hook -group markdown-load-languages global WinSetOption filetype=markdown %{
+    hook -group markdown-load-languages window NormalIdle .* %{markdown-load-languages gtGbGl}
+    hook -group markdown-load-languages window InsertIdle .* %{markdown-load-languages gtGbGl}
 }
@@ -46,31 +50,16 @@ add-highlighter shared/markdown/listblock/g default-region group
 add-highlighter shared/markdown/listblock/g/ ref markdown/inline
 add-highlighter shared/markdown/listblock/g/marker regex ^\h*([-*])\s 1:bullet
-evaluate-commands %sh{
-    languages="
-        awk c cabal clojure coffee cpp crystal css cucumber d diff dockerfile elixir erlang fish
-        gas go haml haskell html ini java javascript json julia kak kickstart
-        latex lisp lua makefile markdown moon objc ocaml perl pug python ragel
-        ruby rust sass scala scss sh swift toml tupfile typescript yaml sql
-    "
-    for lang in ${languages}; do
-        printf 'add-highlighter shared/markdown/%s region -match-capture ^(\h*)```\h*(%s\\b|\\{[.=]?%s\\}) ^(\h*)``` regions\n' "${lang}" "${lang}" "${lang}"
-        printf 'add-highlighter shared/markdown/%s/ default-region fill meta\n' "${lang}"
-        printf 'add-highlighter shared/markdown/%s/inner region \A\h*```[^\\n]*\K (?=```) ref %s\n' "${lang}" "${lang}"
-        printf 'add-highlighter shared/markdown/listblock/%s region -match-capture ^(\h*)```\h*(%s\\b|\\{[.=]?%s\\}) ^(\h*)``` regions\n' "${lang}" "${lang}" "${lang}"
-        printf 'add-highlighter shared/markdown/listblock/%s/ default-region fill meta\n' "${lang}"
-        printf 'add-highlighter shared/markdown/listblock/%s/inner region \A\h*```[^\\n]*\K (?=```) ref %s\n' "${lang}" "${lang}"
-    done
-}
 add-highlighter shared/markdown/codeblock region -match-capture \
     ^(\h*)```\h* \
     ^(\h*)```\h*$ \
-    fill meta
+    regions
+add-highlighter shared/markdown/codeblock/ default-region fill meta
 add-highlighter shared/markdown/listblock/codeblock region -match-capture \
     ^(\h*)```\h* \
     ^(\h*)```\h*$ \
-    fill meta
+    regions
+add-highlighter shared/markdown/listblock/codeblock/ default-region fill meta
 add-highlighter shared/markdown/codeline region "^( {4}|\t)" "$" fill meta
 # https://spec.commonmark.org/0.29/#link-destination
@@ -102,6 +91,21 @@ add-highlighter shared/markdown/inline/text/ regex "\H( {2,})$" 1:+r@meta
 # Commands
 # ‾‾‾‾‾‾‾‾
+define-command markdown-load-languages -params 1 %{
+    evaluate-commands -draft %{ try %{
+        execute-keys "%arg{1}s```\h*\{?[.=]?\K\w+<ret>" # }
+        evaluate-commands -itersel %{ try %{
+            require-module %val{selection}
+            add-highlighter "shared/markdown/codeblock/%val{selection}" region -match-capture "^(\h*)```\h*(%val{selection}\b|\{[.=]?%val{selection}\})" ^(\h*)``` regions
+            add-highlighter "shared/markdown/codeblock/%val{selection}/" default-region fill meta
+            add-highlighter "shared/markdown/codeblock/%val{selection}/inner" region \A\h*```[^\n]*\K (?=```) ref %val{selection}
+            add-highlighter "shared/markdown/listblock/codeblock/%val{selection}" region -match-capture "^(\h*)```\h*(%val{selection}\b|\{[.=]?%val{selection}\})" ^(\h*)``` regions
+            add-highlighter "shared/markdown/listblock/codeblock/%val{selection}/" default-region fill meta
+            add-highlighter "shared/markdown/listblock/codeblock/%val{selection}/inner" region \A\h*```[^\n]*\K (?=```) ref %val{selection}
+        }}
+    }}
+}
 define-command -hidden markdown-trim-indent %{
     evaluate-commands -no-hooks -draft -itersel %{
         execute-keys x
@@ -123,11 +127,4 @@ define-command -hidden markdown-indent-on-new-line %{
     }
 }
-define-command -hidden markdown-load-languages %{
-    evaluate-commands -draft %{ try %{
-        execute-keys 'gtGbGls```\h*\{?[.=]?\K[^}\s]+<ret>'
-        evaluate-commands -itersel %{ try %{ require-module %val{selection} } }
-    }}
-}
 }