gmarkup tests: tab character escape/unescape

"\t" is not escaped by g_markup_escape_text(), as per its documentation:

"Note that this function doesn't protect whitespace and line endings
from being processed according to the XML rules for normalization of
line endings and attribute values."

The relevant portion of the XML specification is:
https://www.w3.org/TR/xml/#AVNormalize

"For a character reference, append the referenced character to the
normalized value."
"For a white space character (#x20, #xD, #xA, #x9), append a space
character (#x20) to the normalized value."
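
The two rules quoted above can be sketched in plain C. This is only an illustration, not GMarkup's actual code: `normalize_attr_value()` and `digit_value()` are hypothetical helpers, the sketch handles only ASCII character references, it ignores entity references such as `&amp;`, and it assumes line endings were already normalized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of a digit in the given base, or -1. */
static int
digit_value (char ch, int base)
{
  int v;

  if (ch >= '0' && ch <= '9')
    v = ch - '0';
  else if (ch >= 'a' && ch <= 'f')
    v = ch - 'a' + 10;
  else if (ch >= 'A' && ch <= 'F')
    v = ch - 'A' + 10;
  else
    return -1;

  return v < base ? v : -1;
}

/* Hypothetical helper, not part of GLib: applies the two quoted
 * normalization rules to `in` and writes the result to `out`.
 * Returns 0 on success, -1 on malformed input, on a reference outside
 * the ASCII range covered by this sketch, or if `out` is too small. */
static int
normalize_attr_value (const char *in, char *out, size_t out_size)
{
  size_t n = 0;

  assert (in != NULL && out != NULL && out_size > 0);

  while (*in != '\0')
    {
      char c;

      if (in[0] == '&' && in[1] == '#')
        {
          /* Character reference: "&#9;" (decimal) or "&#x9;" (hex). */
          int base = (in[2] == 'x') ? 16 : 10;
          const char *p = in + (base == 16 ? 3 : 2);
          long code = 0;
          int ndigits = 0;
          int d;

          while ((d = digit_value (*p, base)) >= 0)
            {
              code = code * base + d;
              if (code > 127)
                return -1;
              ndigits++;
              p++;
            }

          if (ndigits == 0 || *p != ';' || code < 1)
            return -1;

          /* Rule 1: append the referenced character unchanged. */
          c = (char) code;
          in = p + 1;
        }
      else
        {
          c = *in++;
          /* Rule 2: a literal white space character becomes a space. */
          if (c == ' ' || c == '\r' || c == '\n' || c == '\t')
            c = ' ';
        }

      if (n + 1 >= out_size)
        return -1;
      out[n++] = c;
    }

  out[n] = '\0';
  return 0;
}
```

Under these assumptions, a literal tab in the input comes out as a space, while the character reference for a tab comes out as a real tab character.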

So the unescape code in GMarkup does the right thing, as can be verified
by the added valid-17.* data files for the markup-parse unit test.

(Note that the valid-13.* data files already tested a plain tab
character in an attribute value, among other white space characters).

Note that libxml2's xmlSetProp() function escapes "\t" into the
character reference "&#9;".

See https://gitlab.gnome.org/GNOME/glib/-/issues/2080
Sébastien Wilmet 2020-03-31 16:53:02 +02:00
parent 5a540c8bca
commit 50a3064933
3 changed files with 11 additions and 0 deletions


@ -37,6 +37,9 @@ static EscapeTest escape_tests[] =
{ "N\xc2\x80N", "N&#x80;N" },
{ "N\xc2\x79N", "N\xc2\x79N" },
{ "N\xc2\x9fN", "N&#x9f;N" },
/* As per g_markup_escape_text()'s documentation, whitespace is not escaped: */
{ "\t", "\t" },
};
static void


@ -0,0 +1,6 @@
ELEMENT 'foo'
tab=" "
END 'foo'
ELEMENT 'bar'
tab_character_reference=" "
END 'bar'


@ -0,0 +1,2 @@
<foo tab=" " />
<bar tab_character_reference="&#9;" />