39 Commits

Author SHA1 Message Date
Chris Webb
ca50379771 Avoid semantically significant comments in kak files
Kakoune's balanced strings require that delimiter characters nested inside
them are also paired, so for example in %{ }, each nested { must occur
before a corresponding } to balance it out.

In general this will automatically be the case for code in common scripting
languages, but sometimes regular expressions used for syntax highlighting
do end up containing an unbalanced bracket of one type or another.

This problem is easily solved because there is a free choice of balanced
delimiter characters. However, it can also be worked around by adding
a comment which itself contains an unbalanced delimiter character, to
'balance out' the unpaired one in the regular expression.

These unbalanced comments are not ideal as the semantic role they perform
is easy for a casual reader to overlook. A good example is

    catch %{
        # indent after lines with an unclosed { or (
        try %< execute-keys -draft [c[({],[)}] <ret> <a-k> \A[({][^\n]*\n[^\n]*\n?\z <ret> j<a-gt> >
        # indent after a switch's case/default statements
        try %[ execute-keys -draft kx <a-k> ^\h*(case|default).*:$ <ret> j<a-gt> ]
        # deindent closing brace(s) when after cursor
        try %[ execute-keys -draft x <a-k> ^\h*[})] <ret> gh / [})] <ret> m <a-S> 1<a-&> ]
    }

in rc/filetype/go.kak. Here, it is not instantly obvious that the comment
containing an unmatched { is required for correctness. If you change the
comment, delete it or rearrange the contents of the catch block, go.kak
will fail to load, and if you cut-and-paste this code as the basis for
a new filetype, it is a loaded gun pointing at your feet.

Luckily, a careful audit of the standard kakoune library turned up only
three such instances, in go.kak, hare.kak and markdown.kak.

The examples in go.kak and hare.kak are easily made robust by replacing
a %{ } with %< > or %[ ] respectively. The example in markdown.kak is
least-intrusively fixed by rewriting the affected regular expression
slightly so it has balanced { and } anyway.
2023-12-13 16:40:48 +00:00
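
The balanced-string rule described in ca50379771 can be modelled outside Kakoune. The Python sketch below is only an illustration, not Kakoune's parser: the name `balanced_string_end` is hypothetical, escaping is ignored, and the only assumption is the one the message states, namely that each nested opening delimiter must be paired with a closing one and that comments get no special treatment.

```python
# Simplified model of Kakoune's %X ... Y balanced strings (illustrative only).
PAIRS = {"{": "}", "[": "]", "(": ")", "<": ">"}


def balanced_string_end(text, start=0):
    """Return the index of the delimiter that closes the balanced string
    beginning with '%' at `start`, or None if it is never closed."""
    opener = text[start + 1]
    closer = PAIRS[opener]
    depth = 0
    for i in range(start + 1, len(text)):
        if text[i] == opener:
            depth += 1
        elif text[i] == closer:
            depth -= 1
            if depth == 0:
                return i
    return None


# A regex with a lone '}' closes the %{ } string too early ...
early = "%{ <a-k> ^\\h*[})] <ret> }"
# ... unless a comment supplies a matching '{' (the fragile workaround) ...
comment = "%{ # unclosed {\n <a-k> ^\\h*[})] <ret> }"
# ... or a different delimiter pair is chosen (the robust fix).
robust = "%[ <a-k> ^\\h*[})] <ret> ]"
```

In this model `early` ends at the `}` inside the character class, while `comment` and `robust` both end at their final character. Only `robust` keeps doing so if the comment is edited or removed.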
Johannes Altmanninger
f7c3faa2e1 rc markdown taskpaper: require bare URL to start at word boundary
I can't think of a case where a URL would not start at a word boundary.
Let's add that to the regex. In addition to correctness, this also
slightly improves performance because matching can stop earlier.

	$ HOME=$PWD hyperfine -w 1 'git checkout HEAD'{~,}' -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'
	Benchmark 1: git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
	  Time (mean ± σ):      1.123 s ±  0.022 s    [User: 1.100 s, System: 0.027 s]
	  Range (min … max):    1.093 s …  1.174 s    10 runs
	 
	Benchmark 2: git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
	  Time (mean ± σ):      1.019 s ±  0.026 s    [User: 1.001 s, System: 0.021 s]
	  Range (min … max):    0.984 s …  1.051 s    10 runs
	 
	Summary
	  'git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy' ran
	    1.10 ± 0.04 times faster than 'git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'
2022-10-16 19:49:43 +02:00
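
The effect of the word-boundary anchor in f7c3faa2e1 can be seen with a small Python sketch. Both patterns are hypothetical simplifications, not the regex from markdown.kak, and Python's `re` module stands in for Kakoune's regex engine. With `\b` in front, a start position in the middle of a word is rejected immediately, so the engine never scans a run of letters from there.

```python
import re

# Hypothetical, simplified bare-URL patterns (assumptions for illustration).
UNANCHORED = re.compile(r"[a-z]+://\S+")
ANCHORED = re.compile(r"\b[a-z]+://\S+")


def find_url(pattern, text):
    """Return the first URL-like match of `pattern` in `text`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None
```

For ordinary prose such as "see https://kakoune.org now" both patterns find the same URL. They differ only when the scheme is glued to a preceding word character, as in "foo_https://kakoune.org", which the anchored pattern skips.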
Johannes Altmanninger
907ad84f46 rc markdown: only add language highlighters for actual code blocks in buffer
There have been proposals to add more language aliases to markdown.kak
(#4592) and allow users to add their own aliases (#4489).

To recap: various markdown implementations allow specifying aliases
for languages. For example, here is a code block that should be
highlighted as filetype "haskell" but isn't:

	```hs
	-- highlight as haskell
	```

There are lots of aliases out in the wild - "pygmentize -L" lists
some but I don't think there is a canonical list.

Today we have a hardcoded list of supported filetypes. This is hard
to maintain and extend, and it can impact performance.
This patch simply attempts to load the module "hs" and the shared
highlighter "hs". This means that users can use this (obvious?) snippet
to add their own aliases:

	provide-module hs %{
		require-module haskell
		add-highlighter shared/hs ref haskell
	}

Untrusted Markdown files can load arbitrary modules, but that was
already true before, and modules are assumed to be trusted anyway.

Since language highlighters are now loaded *after* the generic
code-block highlighter, we need to make sure the language highlighters
take precedence. Do this by making them sub-regions of the generic one.

Closes #4489

This improves performance on the [5MB Markdown
file](https://github.com/mawww/kakoune/issues/4685#issuecomment-1208129806).

	$ HOME=$PWD hyperfine -w 1 'git checkout HEAD'{~,}' -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'
	Benchmark 1: git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
	  Time (mean ± σ):      3.225 s ±  0.074 s    [User: 3.199 s, System: 0.027 s]
	  Range (min … max):    3.099 s …  3.362 s    10 runs
	 
	Benchmark 2: git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy
	  Time (mean ± σ):      1.181 s ±  0.030 s    [User: 1.162 s, System: 0.021 s]
	  Range (min … max):    1.149 s …  1.234 s    10 runs
	 
	Summary
	  'git checkout HEAD -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy' ran
	    2.73 ± 0.09 times faster than 'git checkout HEAD~ -- :/rc/filetype/markdown.kak && ./kak.opt big_markdown.md -e "hook global NormalIdle .* quit" -ui dummy'

(These numbers depend on another optimization.)
2022-09-10 07:35:29 +02:00
Johannes Altmanninger
647e568d3b rc kakrc: add kak=kakrc highlighter alias for markdown/restructuredtext
Filetypes markdown and restructuredtext reuse highlighters from other
filetypes to highlight code blocks. For example, to highlight a code
block of language foo they essentially do

	require-module foo
	add-highlighter [...] ref foo

This works great if the module name matches the shared
highlighter. This is the case for almost all scripts in rc/filetype*.
The only exception is kakrc.kak: the highlighter is named "kakrc"
(just like the filetype) but the module is named "kak".

This requires weird hacks in markdown/restructuredtext.  Ideally we
could remove this inconsistency by renaming both the filetype and the
highlighter to "kak" but that's a breaking change.  Until we do that,
let's add an alias so we can treat filetypes uniformly.  This helps
the following commits, which otherwise would need to add ugly extra
code for kakrc highlighters.

The following commit will generalize this approach, allowing users
to add arbitrary aliases.
2022-09-10 07:35:29 +02:00
Johannes Altmanninger
feb912fb9f rc markdown: use language highlighting also for indented code blocks inside lists 2022-08-17 00:38:58 +02:00
Johannes Altmanninger
615ec3ef7e rc markdown: use language highlighting also for indented code blocks 2022-08-17 00:38:58 +02:00
Johannes Altmanninger
9d362b8b3e rc markdown: fix loading language highlighter module given multiple code blocks
After opening a markdown file

	```b
	```
	```c
	int main() {}
	```

markdown-load-languages will run an "evaluate-commands -itersel".
The first selection makes us run "require-module b", which fails
because that module can't be found.  Since -itersel only ignores the
"no selection remaining" error we fail to run "require-module c". Fix
this by ignoring errors.
2022-08-17 00:38:58 +02:00
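
The control flow fixed in 9d362b8b3e can be sketched in Python. This is a model of the idea only: `load_languages` and the `require_module` callback are hypothetical names, and `LookupError` stands in for Kakoune's "module not found" failure. The point is that a failure for one code block must not stop the loop from reaching the next one.

```python
def load_languages(languages, require_module):
    """Try to load a module for every language, skipping unknown ones.

    `require_module` is a hypothetical callback that raises LookupError
    when no module exists for the given name.
    """
    loaded = []
    for lang in languages:
        try:
            require_module(lang)
        except LookupError:
            # Unknown language: ignore the error and keep iterating.
            continue
        loaded.append(lang)
    return loaded
```

With the two code blocks from the message, `b` fails and is skipped, and `c` is still loaded.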
Maxime Coste
ef8a11b3db Make x just select the full lines
`x` is often criticized as hard to predict due to its slightly complex
behaviour of selecting next line if the current one is fully selected.

Change `x` to use the previous `<a-x>` behaviour, and change `<a-x>` to
trim to fully selected lines as `<a-X>` did.

Adapt existing indentation script to the new behaviour
2022-07-05 08:43:40 +10:00
Johannes Altmanninger
6f28178b91 rc filetype: add trim-indent hooks to all languages that have indent hooks
An indent hook automatically adds whitespace, so it seems prudent to
add the hook to remove unwanted whitespace again. This is what we do
in most languages already.
2022-05-29 08:23:33 +02:00
Sidharth Kshatriya
b8981883ce markdown.kak: erlang, elixir and ocaml code should be highlighted in markdown 2021-11-17 20:38:12 +05:30
Lennard Hofmann
55b2b8c88d rc markdown: Fix fenced code blocks
The closing ``` in the following example was not detected because the
indented code block highlighter was higher up in the hierarchy than the
fenced code block highlighter:

```
    indented
```

The codeblock highlighter used to be inline so that it has an effect
inside listblocks. This commit adds a listblock/codeblock highlighter
as a replacement.

Fixes #4351
2021-09-27 17:34:31 +02:00
Hampus Fröjdholm
e0731b70cf Improve highlighting of markdown lists
Removes the inline code highlighter for lists to improve
readability in indented lists.
2021-07-06 13:32:23 +02:00
Taupiqueur
afc30a8940 Markdown: Add Crystal
https://crystal-lang.org
2021-04-26 22:06:48 +02:00
SeerLite
3397737b16 rc markdown: Fix code fence regex
The invalid regex `)\b` currently matches anything, so this didn't cause
any errors.
It is still invalid though, so I fixed it by moving the `\b` to the end
of the non-raw_attribute language name (like the original regex). The
raw_attribute one shouldn't need this because the `}` marks the end of
the language name anyway.

Fixes #4025
2021-04-01 22:53:44 -03:00
SeerLite
e84dd80244 rc markdown: Fix trailing whitespace removal
Modified the test cases accordingly too
2021-04-01 22:27:30 -03:00
SeerLite
5c03e2bd54 rc markdown: Add -insert hook 2021-04-01 22:22:08 -03:00
Lennard Hofmann
8d24041c1a rc markdown: Fix HTML highlighting in inline code
Because the HTML highlighter was higher up in the hierarchy than the code
highlighter, it took precedence. I fixed it by making it an inline region.
Using my new knowledge of "inline" I was able to remove one line of code.

Fixes #4091
2021-03-21 09:52:00 +01:00
Maxime Coste
69f1c8cae5 Merge remote-tracking branch 'Ordoviz/markdown' 2021-01-28 21:02:47 +11:00
Lennard Hofmann
61fabee03f rc markdown: Highlight HTML tags 2021-01-19 15:42:37 +01:00
Gregory Chamberlain
a49b1c4996 Adjust markdown code fences filetype regex
This highlighter (line 50 of markdown.kak) looks for the filetype
specified by the author at the top of the code fence, e.g.

``` python
print("hello")
```

and highlights the code within using Kakoune's relevant highlighter --
in this case Python.

Some flavours of markdown use curly braces and other characters in the
first line such as the following:

``` {=python}
print("hello")
```

Previously Kakoune recognised `{=python}` but not `{.python}`.  The latter
is Pandoc's flavour of markdown.  This patch adjusts the regex patterns
to recognise the dot notation as well.
2021-01-08 10:24:04 +00:00
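
The kind of matching described in a49b1c4996 can be sketched in Python. The pattern below is a hypothetical stand-in, not the one in markdown.kak, and `fence_language` is an illustrative name. It accepts a bare language name as well as the `{=lang}` and `{.lang}` attribute forms.

```python
import re

# Hypothetical pattern (assumption for illustration): three backticks,
# optional spaces, then either {=lang} / {.lang} or a bare language name.
FENCE = re.compile(r"^`{3}\s*(?:\{[=.](?P<attr>\w+)\}|(?P<plain>\w+)\b)")


def fence_language(line):
    """Return the language named on a code fence's opening line, or None."""
    match = FENCE.match(line)
    if match is None:
        return None
    return match.group("attr") or match.group("plain")
```

A fence with no language yields None, so no language highlighter would be selected for it.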
Lennard Hofmann
f65d5210f8 rc markdown: Prevent underscores in URLs from causing italic highlighting
Create regions to achieve that. Add support for inline links.
2021-01-05 18:51:37 +01:00
Frank LENORMAND
ae9088f192 rc markdown: Fix bullet highlighting
This commit prevents the lines following the one that holds the bullet
from being highlighted with the `bullet` face when they're indented:

- The bullet is highlighted properly, so is this sentence
  but this line and the ones that would follow are not

Fixes #3582
2020-11-10 08:44:17 +03:00
Frank LENORMAND
f8a2176ed1 rc markdown: Highlight inline code blocks properly
This commit allows code blocks prefixed with tabulation
characters to be picked up and highlighted by the editor.

Indenting caused by the inclusion of an inline code block into a
list item is also taken into account. However, that might cause false
positives, for example with a hard wrapped list item indented with
an amount of spaces congruent to 4.
2020-10-23 16:35:01 +03:00
SeerLite
a06dcf8c10 markdown.kak: Support pandoc's raw_attribute 2020-10-11 20:53:39 -03:00
Frank LENORMAND
3145e3e939 rc markdown: Highlight trailing spaces properly
This commit addresses the following issues:

* highlight trailing space characters with the `meta` face, instead of
  `PrimarySelection`
* make the regex more readable by using a capture group instead of
  `\K`
* specifically match space characters, not other horizontal whitespace
  characters
* match two or more space characters

Reference[1]:

> When you do want to insert a <br /> break tag using Markdown, you
> end a line with two or more spaces, then type return.

[1] https://daringfireball.net/projects/markdown/syntax#p

Note that the original reproducer doesn't seem to work anymore,
probably because of changes made to how lists are highlighted.

Fixes #911
2020-09-01 13:12:58 +03:00
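
The matching rules listed in 3145e3e939 can be tried out with a short Python sketch. The pattern is an assumption written for illustration, not the regex from markdown.kak. It uses a capture group for the trailing run, matches literal spaces only, and requires two or more of them.

```python
import re

# Hypothetical pattern: a non-whitespace character, then a captured run of
# two or more literal spaces at the end of a line.
TRAILING = re.compile(r"\S( {2,})$", re.MULTILINE)


def hard_break_spans(text):
    """Return (start, end) offsets of trailing runs of two or more spaces."""
    return [match.span(1) for match in TRAILING.finditer(text)]
```

A single trailing space and trailing tabs are both left alone, since neither produces a `<br />` under the quoted rule.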
Ivan Tham
34edb1a8e7 Highlight markdown code block with space 2020-07-11 21:06:12 +08:00
Maxime Coste
f939055e22 Merge remote-tracking branch 'lenormf/remove-bold-italic-faces' 2020-05-30 09:21:08 +10:00
Ivan Tham
752ccc0946 Fix regression on setext-style markdown header
Reproduce:

header1
header2
-------
2020-05-28 14:33:00 +08:00
Ivan Tham
09a45a2e96 Fix setext-style markdown header highlight
Reproduce:

    - item

    header
    ------
2020-05-27 18:23:53 +08:00
Frank LENORMAND
37706d7a95 colors: Retire the bold and italic faces
This commit removes declarations and mentions of the built-in `bold`
and `italic` faces.

While they could be a user-friendly way of customising how tokens
are emphasised in Markdown documents (similarly to the
`$LESS_TERMCAP_*` environment variables for `man` pagers), most other
markup languages do not have the concept of "strong" and "emphasis"
but refer directly to the font style/weight.

The faces were also not even set by default to highlight as their
names implied, so having markup language support scripts directly
use the +b and +i face attributes is more consistent.
2020-05-15 11:56:38 +03:00
Frank LENORMAND
21614cb06e src: Create a <semicolon> named key
This commit allows using the <semicolon> expansion in commands, instead
of `\;`.

It makes commands look more elegant, and prevents newcomers from
falling into the trap of using <a-;> without escaping the semicolon.
2019-10-22 11:02:06 +02:00
Maxime Coste
65327da4cf Merge remote-tracking branch 'laelath/markdown-lazy-load' 2019-07-24 17:38:00 +10:00
Justin Frank
8941002ce0 Give hooks a group so they're cleaned up 2019-07-22 19:03:04 -07:00
Justin Frank
89b50daa66 Use module alias pattern for markdown dynamic loading 2019-07-22 19:01:40 -07:00
Daniel Mulford
952f919214 Basic language support for Awk 2019-06-17 22:12:15 -07:00
Justin Frank
1adc5f080b Added wip markdown code lazy-loading hook 2019-05-24 09:41:05 -07:00
Justin Frank
6512eafa60 Update remaining files to new provide/require format 2019-04-11 15:54:58 -07:00
Justin Frank
1fab727f2b Modified a bunch of language support files to use modules 2019-04-08 17:02:44 -07:00
Alex Leferry 2
c0dccdd90d Add categories in rc/
Closes #2783
2019-03-21 01:06:16 +01:00