The previous implementation could yield different positions when
iterating forward and backward, leading to confusion in boost regex.
This makes an existing problem a bit more visible: iterating with
to_next and with read_codepoint wont behave the same way, as
read_codepoint will put the iterator onto the byte following the
utf8 codepoint, whereas to_next will put it on the next utf8
character start byte, which might be different if the buffer content
is not valid utf8.
Fixes#1195