xenia/home

Author	SHA1	Message	Date
Maxime Coste	dbcddafbfd	Change utf8::to_next/to_previous so that they are more symmetrical. The previous implementation could yield different positions when iterating forward and backward, leading to confusion in boost regex. This makes an existing problem a bit more visible: iterating with to_next and with read_codepoint won't behave the same way, as read_codepoint will put the iterator onto the byte following the utf8 codepoint, whereas to_next will put it on the next utf8 character start byte, which might be different if the buffer content is not valid utf8. Fixes #1195	2017-04-20 16:18:49 +01:00
Maxime Coste	249ec4835e	Rename get_width to codepoint_width	2016-10-01 13:45:00 +01:00
Maxime Coste	35559b65dd	Support codepoints of variable width. Add a ColumnCount type and use it in place of CharCount whenever more appropriate, take column size of codepoints into account for vertical movements and docstring wrapping. Fixes #811	2016-10-01 13:45:00 +01:00
Maxime Coste	14f59d415d	Avoid underlying iterator copies in utf8_iterator	2016-07-27 21:36:32 +01:00
Maxime Coste	1401c55531	Faster implementation of utf8::advance not copying iterators at each step	2016-07-15 20:26:33 +01:00
Maxime Coste	73fdc726fb	Avoid postfix increment in utf8::distance	2016-07-15 20:07:47 +01:00
Maxime Coste	94cbd5a837	More string usage cleanup	2016-02-05 09:13:07 +00:00
Maxime Coste	4ea89def3b	Avoid (*it++) pattern in utf8.hh	2015-09-25 13:19:21 +01:00
Maxime Coste	aa4b98af7c	Add utf8::read_codepoint that both gets the codepoint and advances the iterator	2015-09-24 23:00:47 +01:00
Maxime Coste	e601bd5fe8	Minor additional cleanup in utf8.hh	2015-09-23 22:09:37 +01:00
Maxime Coste	ceafa5459a	Avoid unneeded iterator copies in utf8.hh	2015-09-23 19:48:15 +01:00
Maxime Coste	eb0d03f437	Use Pass as default policy for invalid utf8, avoid asserting on that	2014-10-13 21:07:23 +01:00
Maxime Coste	ed68d1ff28	utf8: use end of sequence iterators for more security	2014-07-05 12:10:06 +01:00
Maxime Coste	3f70d91f8c	Use unsigned char rather than char in utf8 decoding to avoid sign extension	2014-07-05 12:10:06 +01:00
Maxime Coste	db423e4a88	utf8::is_character_start takes directly the char value	2014-05-14 19:49:03 +01:00
Maxime Coste	2d96f853f8	Add utf8::codepoint_size function	2013-05-30 18:49:50 +02:00
Maxime Coste	270e950cf1	sort includes directives	2013-04-09 20:05:40 +02:00
Maxime Coste	5adee4a6a7	rename assert to kak_assert to avoid collisions	2013-04-09 20:04:11 +02:00
Maxime Coste	9f9ad58b39	utf8::dump uses a copy of the output iterator instead of a reference	2013-02-27 23:50:33 +01:00
Maxime Coste	7865223587	Add utf8::character_start function	2013-02-26 14:05:51 +01:00
Maxime Coste	ee882d9d02	utf8: use CharCount instead of size_t	2012-10-27 13:26:40 +02:00
Maxime Coste	df400f90ab	utf8: replace InvalidBytePolicy::Throw with InvalidBytePolicy::Assert	2012-10-17 17:01:51 +02:00
Maxime Coste	dfafcdb6e6	utf8::codepoint: configurable invalid byte policy	2012-10-13 19:05:14 +02:00
Maxime Coste	0ce6bd9bf5	use ByteCount instead of CharCount when we are really counting bytes (that is most of the time when we are not concerned with displaying)	2012-10-11 00:41:48 +02:00
Maxime Coste	571861bc7b	Return something in utf8::distance, thanks again gcc for letting this work	2012-10-11 00:39:17 +02:00
Maxime Coste	ffba94fcde	Actually return something in utf8::codepoint, thanks gcc for using rax	2012-10-10 19:14:18 +02:00
Maxime Coste	7a8366da2b	add a unicode.hh header for Codepoint related functions, s/utf8::Codepoint/Codepoint/	2012-10-09 19:15:05 +02:00
Maxime Coste	1af7465107	utf8: add dump(OutputIterator& it, Codepoint cp)	2012-10-09 14:29:37 +02:00
Maxime Coste	2db1d02329	add utf8 helpers in utf8.hh	2012-10-08 14:25:05 +02:00

29 Commits
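
The message of commit dbcddafbfd describes two ways of stepping forward that can disagree on invalid UTF-8. The sketch below only illustrates that difference and is not the code from utf8.hh: the names next_by_start, next_by_size and announced_size are hypothetical, positions are plain byte offsets rather than iterators, and the assumption that a read_codepoint-style step consumes exactly the number of bytes announced by the lead byte is a simplification made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; not the actual utf8.hh functions.

// A byte starts a character unless it is a continuation byte (10xxxxxx).
inline bool is_character_start(unsigned char c)
{
    return (c & 0xC0) != 0x80;
}

// Sequence length announced by a lead byte (assumed model).
inline std::size_t announced_size(unsigned char c)
{
    if ((c & 0x80) == 0x00) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1; // not a valid lead byte: step over a single byte
}

// to_next-style step: move past the current byte, then skip
// continuation bytes until the next character start byte.
inline std::size_t next_by_start(const std::string& s, std::size_t pos)
{
    if (pos < s.size())
        ++pos;
    while (pos < s.size() &&
           !is_character_start(static_cast<unsigned char>(s[pos])))
        ++pos;
    return pos;
}

// read_codepoint-style step: consume the number of bytes the lead
// byte announces, clamped to the end of the buffer.
inline std::size_t next_by_size(const std::string& s, std::size_t pos)
{
    if (pos >= s.size())
        return pos;
    const std::size_t n = announced_size(static_cast<unsigned char>(s[pos]));
    return pos + n < s.size() ? pos + n : s.size();
}
```

On valid input both steps land on the same offset. On a truncated sequence such as the bytes E2 82 followed by "ab", the start-byte scan stops at the 'a', while the size-based step also consumes it, which is the kind of mismatch the commit message points out.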