Previously, one of the syntaxes for italic was (greatly summarized) something
like this:
[^_]_[^_]+_[^_]
That is to say, the regex matched the blanks on both sides of the italic span,
as well as the actual span content. That means that if you had consecutive
italic words:
_some_ _italic_ _words_
...only the odd-numbered words would be highlighted: the space after "_some_"
was counted as part of that span, so it wasn't available as part of "_italic_"
and therefore "_italic_" wouldn't be highlighted. Likewise, if the first word
in a buffer was italic, it wouldn't be highlighted because the first underscore
was not preceded by a non-underscore character!
Now we use lookahead/lookbehind assertions, which don't count as part of the
matched span, so consecutive spans don't interfere with one another.
Fixes#2111.
Previously, a code block was anything between triple-backtics, including inline
blocks:
some text ```
not a codeblock, but highlighted as one
``` other text
and even if the closing backticks had the wrong indent:
```
this is a code block containing a triple backtick
```
this is still a code block, but Kakoune thinks otherwise
```
Now we use the -match-capture flag to ensure the start and end fences have
exactly the same indent.
Previously, the generic code-block region was defined first, which meant that
it took priority over all the language-specific highlighters. Now we define
the generic code-block highlighting *after* the others, which fixes#2304.
Previously, code-spans were defined as ordinary inline markup, but in Markdown
ordinary formatting doesn't work inside code-spans. Therefore, they are now
regions unto themselves, defined according to section 6.3 of the CommonMark
spec <https://spec.commonmark.org/0.28/#code-spans>, which addresses a comment
on #2111.
Automatic reparsing of %sh{...}, while convenient in many cases,
can be surprising as well, and can lead to security problems:
'echo %sh{ printf "foo\necho bar" }' runs 'echo foo', then 'echo bar'.
we make this danger explicit, and we fix the 'nop %sh{...}' pattern.
To reparse %sh{...} strings, they can be passed to evaluate-commands,
which has been fixed to work in every cases where %sh{...} reparsing
was used..
Previously, due to a typo, Kakoune would highlight backtick spans from the first backtick to the last backtick in a paragraph, no matter how many backticks were in between. Now spans correctly stop at the first backtick after the opening backtick.
Previously, two underscore characters in a document would mark the entire
space between them as italic, no matter how far apart. Now we only accept a
single consecutive newline within an inline-formatted span, so hard-wrapped
documents will still format nicely but stray characters won't mess up your
entire document.
Note that the highlighter for backtick-enclosed spans was modified,
even though it's redundant with the code-block highlighting introduced
in 5bc62c694.
As well as the traditional `[text](url)` syntax, Markdown allows the text to
be followed by a tag in square brackets. If the text is followed by nothing
at all, then the tag for that link is the text itself. The actual URL
is supplied later in the document, like a footnote at the bottom of the page:
Some text with [a link][tag] and [another link].
[tag]: http://www.example.com/link1
[another link]: http://www.example.com/link2
This adds the "link" face to the URL in such footnote lines.
Fixes#1735
We need \K to not interfer with languages own interpretation of ` like multiline strings in javascript
We need \b in e.g. java\b otherwise it blocks javascript
I couldn't get the bare ``` to not block the other highlighters when introducing \K any other way than negative lookahead of all possible highlighers
That means we can now have highlighters active at global, buffer, and
window scope. The add-highlighter and remove-highlighter syntax changed
to take the parent path (scope/group/...) as a mandatory argument,
superseeding the previous -group switch.
Previously, if you opened a new line after an underlined heading (what
the CommonMark spec calls a "Setext heading") or inserted a newline into
a line that started with `**strong emphasis**` the Markdown autoindent
hook would assume the leading symbols were list bullets and paste them
at the beginning of the new line.
However, the CommonMark specification says that list bullets must be
followed by at least one horizontal whitespace character, so Setext
heading underlines and strong emphasis are not valid list bullets and
should not be matched by the autoindent pattern.
This commit changes the regex that selects the pastable prefix of the
previous line so that it must match either:
- One or more `>` characters with optional whitespace between them
(a blockquote prefix), optionally followed by a list bullet; or
- An optional blockquote prefix and a list bullet
Since we don't strictly need either the blockquote prefix nor the list
bullet, we could concievably just make both optional... but for lines
without either, the regex would find a zero-length match, and for the
purposes of copy/paste Kakoune treats that as a one-character match.
Therefore, the regex is written to fail if neither pattern is found.
As specified at https://daringfireball.net/projects/markdown/syntax#em
italics are made with either single asterisks/underscores, and bold is
double asterisks/underscores. Before this, single asterisks were
understood as bold, and only underscores were understood as italics;
both of which behaviors are incorrect.
The option is now used as a fallback when detection by extension fails. Some
scripts like `base/mail.kak` and `base/html.kak` still rely heavily on it.