Vladimir Davydov authored
Historically, we have encoded strings that contain invalid or
non-printable utf-8 sequences in YAML as binary base64 blobs. We did
that because of limitations/bugs of the YAML encoder, which refused to
encode invalid utf-8 strings. To work around this issue, we introduced
the helper utf8_check_printable, which is basically a copy of
yaml_check_utf8, and treated strings for which it fails as binary data
(MP_BIN).

This commit updates the YAML submodule to the version where all known
issues with encoding invalid/unprintable utf-8 strings are fixed and
removes special treatment of such strings (drops utf8_check_printable).
Now unprintable or invalid utf-8 sequences are emitted as escaped code
points, e.g. '\xFF' or '\uFFFF'. This change is a prerequisite for
introducing the new varbinary type to Lua. Without it, plain strings
would be implicitly converted to varbinary after decoding/encoding them
in YAML, which would be confusing.
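
The difference between the old and the new behaviour can be pictured
with a small Python sketch. It is an illustration only, not the code
from this commit or from the YAML submodule: the function names
(is_printable_utf8, encode_old, encode_new) are hypothetical, the
printability test relies on Python's str.isprintable rather than on
yaml_check_utf8, and the exact text the real encoder produces may
differ.

```python
import base64


def is_printable_utf8(data: bytes) -> bool:
    """Hypothetical stand-in for the old check: valid UTF-8 and printable."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(ch.isprintable() or ch in "\n\t" for ch in text)


def encode_old(data: bytes) -> str:
    """Old behaviour (sketch): anything that fails the check becomes a blob."""
    if is_printable_utf8(data):
        return data.decode("utf-8")
    return "!!binary " + base64.b64encode(data).decode("ascii")


def encode_new(data: bytes) -> str:
    """New behaviour (sketch): stay a string, escape what cannot be printed."""
    if is_printable_utf8(data):
        return data.decode("utf-8")
    out = []
    # surrogateescape maps every undecodable byte 0xNN to U+DCNN,
    # so invalid bytes can be told apart from decoded code points.
    for ch in data.decode("utf-8", errors="surrogateescape"):
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            out.append("\\x%02X" % (cp - 0xDC00))  # raw invalid byte
        elif ch in '"\\':
            out.append("\\" + ch)  # keep the quoted scalar well-formed
        elif ch.isprintable():
            out.append(ch)
        elif cp <= 0xFF:
            out.append("\\x%02X" % cp)
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)
        else:
            out.append("\\U%08X" % cp)
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    for sample in (b"hello", b"\xff", b"\xef\xbf\xbf"):
        print(repr(sample), "old:", encode_old(sample), "new:", encode_new(sample))
```

In this sketch the old path turns b"\xff" into a base64 blob, while the
new path keeps it a string and emits an escape, which is what lets a
plain string stay a plain string after an encode/decode round trip.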

Closes #8756

NO_DOC=bug fix

(cherry picked from commit 890a821c)
8caf1fff