- Update to 6.0.0:
* ``lxml.html.diff`` is faster and provides structurally better diffs.
* The factories ``Element`` and ``ElementTree`` can now be used in type
hints.
* Parsing from ``memoryview`` and other buffers is supported to allow
zero-copy parsing.
* ``lxml.html.builder`` was missing several HTML5 tag names.
* ``CDATA`` can now be written into the incremental ``xmlfile()`` writer.
* A new parser option ``decompress=False`` was added that controls the
automatic input decompression when using libxml2 2.15.0 or later.
* The set of compile time / runtime supported libxml2 feature names is
available as ``etree.LIBXML_COMPILED_FEATURES`` and
``etree.LIBXML_FEATURES``.
* Predicates in ``.find*()`` could mishandle tag indices if a default
namespace is provided.
* The ``head`` and ``body`` properties of ``lxml.html`` elements failed
if no such element was found. They now return ``None`` instead.
* Tag names provided by code (API, not data) that are longer than
``INT_MAX`` could be truncated or mishandled in other ways.
* ``.text_content()`` on ``lxml.html`` elements accidentally returned
a "smart string" without additional information. It now returns a plain
string.
* Support for Python < 3.8 was removed.
* Parsing directly from zlib (or lzma) compressed data is now considered
an optional feature in lxml.
* The ``Schematron`` class is deprecated and will become non-functional in
a future lxml version.
* Built using Cython 3.1.2.
* The debug methods ``MemDebug.dump()`` and ``MemDebug.show()`` were
removed completely.
OBS-URL: https://build.opensuse.org/request/show/1294982
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=114
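As a rough illustration of the type-hint item above, the following sketch annotates a helper with ``etree.Element``. The function name ``count_children`` and the sample document are made up for this example. On releases before 6.0.0 the annotation still evaluates at runtime, but static type checkers may not accept it.

```python
from lxml import etree


def count_children(element: etree.Element) -> int:
    """Return the number of direct child nodes of ``element``."""
    return len(element)


# Hypothetical sample document, used only for illustration.
root = etree.fromstring("<root><a/><b/></root>")
child_count = count_children(root)
```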
- 5.3.0 (2024-08-10)
Features added
- GH#421: Nested CDATA sections are no longer rejected but split
on output to represent ]]> correctly. Patch by Gertjan Klein.
Bugs fixed
- LP#2060160: Attribute values serialised differently in
xmlfile.element() and xmlfile.write().
- LP#2058177: The ISO-Schematron implementation could fail on
unknown prefixes. Patch by David Lakin.
Other changes
- LP#2067707: The strip_cdata option in HTMLParser() turned out
to be useless and is now deprecated.
- Built with Cython 3.0.11.
OBS-URL: https://build.opensuse.org/request/show/1203593
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=202
- Remove unneeded patch skip-test-under-libexpat-2.6.0.patch
- Update to 5.2.2:
- GH#417: The test_feed_parser test could fail if lxml_html_clean
was not installed. It is now skipped in that case.
- LP#2059910: The minimum CPU architecture for the Linux x86 binary
wheels was set back to "core2", without SSE 4.2.
- If libxml2 uses iconv, the compile time version is available as
etree.ICONV_COMPILED_VERSION.
- 5.2.1
- LP#2059910: The minimum CPU architecture for the Linux x86 binary
wheels was set back to "core2", but with SSE 4.2 enabled.
- LP#2059977: ``Element.iterfind("//absolute_path")`` failed with a
``SyntaxError`` where it should have issued a warning.
- GH#416: The documentation build was using the non-standard
``which`` command. Patch by Michał Górny.
- 5.2.0
- LP#1958539: The ``lxml.html.clean`` implementation suffered from
several security issues in the past (affecting only code that used
it) and has now been extracted into a separate library:
https://github.com/fedora-python/lxml_html_clean
Projects that use lxml without "lxml.html.clean" will not notice
any difference, except that they won't have potentially vulnerable
code installed. The module is available as an "extra" setuptools
dependency "lxml[html_clean]", so projects that need
"lxml.html.clean" must switch their requirements from
"lxml" to "lxml[html_clean]", or install the new library
themselves.
- The minimum CPU architecture for the Linux x86 binary wheels was
upgraded to "sandybridge" (launched 2011), and glibc 2.28 / gcc 12
(manylinux_2_28) wheels were added.
OBS-URL: https://build.opensuse.org/request/show/1180847
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=110
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=200
- update to 4.9.4:
* LP#2046398: Inserting/replacing an ancestor into a node's
children could loop indefinitely.
* LP#1980767, GH#379: ``TreeBuilder.close()`` could fail with a
``TypeError`` after parsing incorrect input.
* LP#1522052: A file-system specific test is now optional and
should no longer fail on systems that don't support it.
* Built with Cython 0.29.37.
- drop libxml2212-tests.patch (upstream)
- remove python 2.x from testing
- allow building against any libxml2 version in sle15
* Built with Cython 0.29.28.
* LP#1835708: ElementInclude incorrectly rejected repeated
* LP#1755825: iterwalk() failed to return the 'start' event for the initial
- ElementTree.write() has a new option doctype that writes out
a doctype string before the serialisation, in the same way as
- GH#220: xmlfile allows switching output methods at an element
- LP#1595781, GH#240: added a PyCapsule Python API and C-level
API for passing externally generated libxml2 documents into
- GH#244: error log entries have a new property path with an
XPath expression (if known, None otherwise) that points to the
- The namespace prefix mapping that can be used in ElementPath
- GH#238: Character escapes were not hex-encoded in the xmlfile
- GH#229: fix for externally created XML documents.
strips the option values specified in form attributes but only
- LP#1551797: revert previous fix for XSLT error logging as it
- LP#1673355, GH#233: fromstring() html5parser failed to parse
- The previously undocumented docstring option in
ElementTree.write() produces a deprecation warning and will
OBS-URL: https://build.opensuse.org/request/show/1134342
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=106
- remove patch lxml-fix-attribute-quoting.patch because it is now
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=191
- update to 4.9.3:
* ``lxml.objectify`` accepted non-decimal numbers like ``²²²``
as integers.
* A memory leak in ``lxml.html.clean`` was resolved by
switching to Cython 0.29.34+.
* GH#348: URL checking in the HTML cleaner was improved.
* GH#371, GH#373: Some regex strings were changed to raw
strings to fix Python warnings.
* Built with Cython 0.29.36 to adapt to changes in Python 3.12.
OBS-URL: https://build.opensuse.org/request/show/1103711
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=103
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=185
- update to version 4.9.1:
* Bugs fixed
+ A crash was resolved when using iterwalk() (or canonicalize())
after parsing certain incorrect input. Note that iterwalk() can
crash on valid input parsed with the same parser after failing
to parse the incorrect input.
Note: The doc pdf seems to be outdated, but I couldn't find any newer ones on their webpage. Perhaps the doc package should be deleted?
OBS-URL: https://build.opensuse.org/request/show/988040
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=158
- update to 4.8.0:
* GH#337: Path-like objects are now supported throughout the API instead of
just strings.
* The ``ElementMaker`` now supports ``QName`` values as tags, which always
override the default namespace of the factory.
* GH#338: In lxml.objectify, the XSI float annotations "nan" and "inf" were spelled in
lower case, whereas XML Schema datatypes define them as "NaN" and "INF" respectively.
* Built with Cython 0.29.28.
OBS-URL: https://build.opensuse.org/request/show/955744
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=89
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=155
- update to 4.7.1:
* Chunked Unicode string parsing via ``parser.feed()`` now encodes the input data
to the native UTF-8 encoding directly, instead of going through ``Py_UNICODE`` /
``wchar_t`` encoding first, which previously required duplicate recoding in most cases.
* The standard namespace prefixes were mishandled during "C14N2" serialisation on Python 3.
* ``lxml.objectify`` previously accepted non-XML numbers with underscores (like "1_000")
as integers or float values in Python 3.6 and later. It now adheres to the number
format of the XML spec again.
* LP#1939031: Static wheels of lxml now contain the header files of zlib and libiconv
(in addition to the already provided headers of libxml2/libxslt/libexslt).
* Wheels include libxml2 2.9.12+ and libxslt 1.1.34 (also on Windows).
OBS-URL: https://build.opensuse.org/request/show/945448
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=88
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=154
- update to 4.6.2:
* A vulnerability (CVE-2020-27783) was discovered in the HTML Cleaner by Yaniv Nizry,
which allowed JavaScript to pass through. The cleaner now removes more sneaky
"style" content.
* A vulnerability was discovered in the HTML Cleaner by Yaniv Nizry, which allowed
JavaScript to pass through. The cleaner now removes more sneaky "style" content.
* GH#310: ``lxml.html.InputGetter`` supports ``__len__()`` to count the number of input fields.
Patch by Aidan Woolley.
* ``lxml.html.InputGetter`` has a new ``.items()`` method to ease processing all input fields.
* ``lxml.html.InputGetter.keys()`` now returns the field names in document order.
* GH#309: The API documentation is now generated using ``sphinx-apidoc``.
* LP#1869455: C14N 2.0 serialisation failed for unprefixed attributes
when a default namespace was defined.
* ``TreeBuilder.close()`` raised ``AssertionError`` in some error cases where it
should have raised ``XMLSyntaxError``. It now raises a combined exception to
keep up backwards compatibility, while switching to ``XMLSyntaxError`` as an
interface.
OBS-URL: https://build.opensuse.org/request/show/866353
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=82
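The ``InputGetter`` additions listed above can be tried out roughly as follows. This is only a sketch: the form markup and field names are invented for the example, and it assumes a release that includes these changes.

```python
from lxml import html

# Hypothetical form, used only to exercise the InputGetter API.
page = html.fromstring(
    '<html><body><form action="/submit">'
    '<input type="text" name="user" value="alice">'
    '<input type="hidden" name="token" value="abc123">'
    '</form></body></html>'
)
inputs = page.forms[0].inputs  # an lxml.html.InputGetter

field_count = len(inputs)      # __len__() counts the input fields
field_names = inputs.keys()    # field names in document order
field_values = {name: field.value for name, field in inputs.items()}
```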
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=143
- update to 4.5.2:
* ``Cleaner()`` now validates that only known configuration options can be set.
* LP#1882606: ``Cleaner.clean_html()`` discarded comments and PIs regardless of the
corresponding configuration option, if ``remove_unknown_tags`` was set.
* LP#1880251: Instead of globally overwriting the document loader in libxml2, lxml now
sets it per parser run, which improves the interoperability with other users of libxml2
such as libxmlsec.
* LP#1881960: Fix build in CPython 3.10 by using Cython 0.29.21.
* The setup options "--with-xml2-config" and "--with-xslt-config" were accidentally renamed
to "--xml2-config" and "--xslt-config" in 4.5.1 and are now available again.
OBS-URL: https://build.opensuse.org/request/show/821439
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=81
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=141
* A new function indent() was added to insert tail whitespace
for pretty-printing an XML tree.
* LP#1857794: Tail text of nodes that get removed from a document
using item deletion disappeared silently instead of sticking with the node
that was removed.
* LP#1840234: The package version number is now available as lxml.__version__
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=135
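For orientation, a short sketch of ``indent()`` and ``lxml.__version__`` as described in the items above. The sample tree is made up, and the two-space indentation is passed explicitly.

```python
import lxml
from lxml import etree

# Hypothetical sample tree without any whitespace.
root = etree.fromstring("<root><a><b/></a></root>")

# indent() modifies the tree in place by inserting text/tail whitespace.
etree.indent(root, space="  ")
pretty = etree.tostring(root, encoding="unicode")

# The package version is available as a plain string.
version = lxml.__version__
```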
- version update to 4.4.0
* ``Element.clear()`` accepts a new keyword argument ``keep_tail=True`` to
clear everything but the tail text. This is helpful in some document-style
use cases.
* When creating attributes or namespaces from a dict in Python 3.6+, lxml now
preserves the original insertion order of that dict, instead of always sorting
the items by name. A similar change was made for ElementTree in CPython 3.8.
See https://bugs.python.org/issue34160
* Integer elements in ``lxml.objectify`` implement the ``__index__()`` special method.
* GH#269: Read-only elements in XSLT were missing the ``nsmap`` property.
Original patch by Jan Pazdziora.
* ElementInclude can now restrict the maximum inclusion depth via a ``max_depth``
argument to prevent content explosion. It is limited to 6 by default.
* The ``target`` object of the XMLParser can have ``start_ns()`` and ``end_ns()``
callback methods to listen to namespace declarations.
* The ``TreeBuilder`` has new arguments ``comment_factory`` and ``pi_factory`` to
pass factories for creating comments and processing instructions, as well as
flag arguments ``insert_comments`` and ``insert_pis`` to discard them from the
tree when set to false.
* A `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ implementation was added as
``etree.canonicalize()``, a corresponding ``C14NWriterTarget`` class, and
a ``c14n2`` serialisation method.
* bugfixes, see CHANGES.txt
- deleted sources
- lxmldoc-4.3.3.pdf (renamed)
- added sources
+ lxmldoc-4.4.0.pdf
+ world.txt
OBS-URL: https://build.opensuse.org/request/show/720214
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=127
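Two of the 4.4.0 features can be sketched as follows: ``Element.clear(keep_tail=True)`` and ``etree.canonicalize()``. The sample markup is invented for illustration, and the snippet assumes lxml 4.4.0 or later.

```python
from lxml import etree

# clear(keep_tail=True) drops text, attributes and children of the element
# but keeps the tail text that follows it in the parent.
para = etree.fromstring("<p><b>bold</b> and plain</p>")
bold = para[0]
bold.clear(keep_tail=True)
cleared = etree.tostring(para, encoding="unicode")

# canonicalize() returns the C14N 2.0 form as a string: attributes are
# sorted and empty elements are written with explicit end tags.
canonical = etree.canonicalize("<root b='2' a='1'><child/></root>")
```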
- Update to 4.3.2:
* Fixed a crash in 4.3.1 when appending a child subtree with certain text nodes.
- Update to v4.3.1
* Fixed crash when appending a child subtree that contains unsubstituted
entity references
- from v4.3.0
* Features
+ The module ``lxml.sax`` is compiled using Cython in order to speed it up.
+ lxml.sax.ElementTreeProducer now preserves the namespace prefixes.
If two prefixes point to the same URI, the first prefix in alphabetical
order is used.
+ Updated ISO-Schematron implementation to 2013 version (now MIT licensed)
and the corresponding schema to the 2016 version (with optional "properties").
* Other
+ Support for Python 2.6 and 3.3 was removed.
+ The minimum dependency versions were raised to libxml2 2.9.2 and libxslt 1.1.27,
which were released in 2014 and 2012 respectively.
- from v4.2.6
* Fix a DeprecationWarning in Py3.7+.
* Import warnings in Python 3.6+ were resolved.
- Remove no longer needed
0001-Make-test-more-resilient-against-changes-in-latest-l.patch
OBS-URL: https://build.opensuse.org/request/show/681724
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=70
- Update to 4.2.4 (2018-08-03)
+ Features added
* GH#259: Allow using ``pkg-config`` for build configuration.
Patch by Patrick Griffis.
+ Bugs fixed
* LP#1773749, GH#268: Crash when moving an element to another document with
``Element.insert()``.
Patch by Alexander Weggerle.
- Update to 4.2.3
+ Bugs fixed
* Reverted GH#265: lxml links against zlib as a shared library again.
- Update to 4.2.2
+ Bugs fixed
* GH#266: Fix sporadic crash during GC when parse-time schema validation is used
and the parser participates in a reference cycle.
Original patch by Julien Greard.
* GH#265: lxml no longer links against zlib as a shared library, only on static builds.
Patch by Nehal J Wani.
OBS-URL: https://build.opensuse.org/request/show/627950
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=109
- Version update to 4.2.0:
* GH#255: ``SelectElement.value`` returns more standard-compliant and
browser-like defaults for non-multi-selects. If no option is selected, the
value of the first option is returned (instead of None). If multiple options
are selected, the value of the last one is returned (instead of that of the
first one). If no options are present (not standard-compliant)
``SelectElement.value`` still returns ``None``.
* GH#261: The ``HTMLParser()`` now supports the ``huge_tree`` option.
Patch by stranac.
* LP#1551797: Some XSLT messages were not captured by the transform error log.
* LP#1737825: Crash at shutdown after an interrupted iterparse run with XMLSchema
validation.
- Add patch python-lxml-assert.patch to work around a test failure related to threading
OBS-URL: https://build.opensuse.org/request/show/588625
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=64
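The ``SelectElement.value`` defaults from GH#255 can be checked with a sketch like the one below. The form and its option values are hypothetical, and the snippet assumes lxml 4.2.0 or later.

```python
from lxml import html

# Hypothetical form with three non-multi selects.
page = html.fromstring(
    '<html><body><form>'
    '<select name="size">'
    '<option value="s">Small</option>'
    '<option value="m">Medium</option>'
    '</select>'
    '<select name="colour">'
    '<option value="r" selected="selected">Red</option>'
    '<option value="g" selected="selected">Green</option>'
    '</select>'
    '<select name="empty"></select>'
    '</form></body></html>'
)
inputs = page.forms[0].inputs

nothing_selected = inputs["size"].value   # first option is the default
two_selected = inputs["colour"].value     # last selected option wins
no_options = inputs["empty"].value        # still None without options
```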
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=105
- update to 4.1.1
- ElementPath supports text predicates for current node, like "[.='text']".
- ElementPath allows spaces in predicates.
- Custom Element classes and XPath functions can now be registered with
a decorator rather than explicit dict assignments.
- LP#1722776: Requesting non-Element objects like comments from
a document with PythonElementClassLookup could fail with a TypeError.
OBS-URL: https://build.opensuse.org/request/show/574238
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=103
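A small sketch of the ElementPath text predicate for the current node, using a made-up document. It assumes a release that includes the 4.1 ElementPath changes.

```python
from lxml import etree

# Hypothetical sample document.
root = etree.fromstring(
    "<list><item>apple</item><item>pear</item><item>apple</item></list>"
)

# "[.='text']" matches elements whose own text equals the given string.
apples = root.findall("item[.='apple']")
pears = root.findall("item[.='pear']")
```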
- update to 3.6.1:
* Separate option ``inline_style`` for Cleaner that only removes ``style``
attributes instead of all styles.
* Windows build support for Python 3.5.
* Exclude ``file`` fields from ``FormElement.form_values`` (as browsers do).
* Try to provide base URL from ``Resolver.resolve_string()``.
* More accurate float serialisation in ``objectify.FloatElement``.
* Repair XSLT error logging.
OBS-URL: https://build.opensuse.org/request/show/419562
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=90
- Drop lxml-dont-depend-on-URL-formatting-in-test.patch, merged upstream
- Update to 3.4.3:
* Expression cache in ElementPath was ignored. Fix by Changaco.
* LP#1426868: Passing a default namespace and a prefixed namespace mapping
as nsmap into ``xmlfile.element()`` raised a ``TypeError``.
* LP#1421927: DOCTYPE system URLs were incorrectly quoted when containing
double quotes. Patch by Olli Pottonen.
* LP#1419354: meta-redirect URLs were incorrectly processed by
``iterlinks()`` if preceded by whitespace.
* LP#1415907: Crash when creating an XMLSchema from a non-root element
of an XML document.
* LP#1369362: HTML cleaning failed when hitting processing instructions
with pseudo-attributes.
* ``CDATA()`` wrapped content was rejected for tail text.
* CDATA sections were not serialised as tail text of the top-level element.
* New ``htmlfile`` HTML generator to accompany the incremental ``xmlfile``
serialisation API. Patch by Burak Arslan.
* ``lxml.sax.ElementTreeContentHandler`` did not initialise its superclass.
OBS-URL: https://build.opensuse.org/request/show/298550
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=80
- Update to version 3.3.2:
- The properties resolvers and version, as well as the methods
set_element_class_lookup() and makeelement(), were lost from iterparse
objects.
- LP#1222132: instances of XMLSchema, Schematron and RelaxNG did not clear
their local error_log before running a validation.
- LP#1238500: lxml.doctestcompare mixed up "expected" and "actual" in
attribute values.
- Some file I/O tests were failing in MS-Windows due to incorrect temp file
usage. Initial patch by Gabi Davar.
- LP#910014: duplicate IDs in a document were not reported by DTD
validation.
- LP#1185332: tostring(method="html") did not use HTML serialisation
semantics for trailing tail text. Initial patch by Sylvain Viollon.
- LP#1281139: .attrib value of Comments lost its mutation methods in 3.3.0.
Even though it is empty and immutable, it should still provide the same
interface as that returned for Elements.
- Run tests during build
OBS-URL: https://build.opensuse.org/request/show/224393
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=73
- update to 3.2.1:
* The methods ``apply_templates()`` and ``process_children()`` of XSLT
extension elements have gained two new boolean options ``elements_only``
and ``remove_blank_text`` that discard either all strings or whitespace-only
strings from the result list.
* When moving Elements to another tree, the namespace cleanup mechanism
no longer drops namespace prefixes from attributes for which it finds
a default namespace declaration, to prevent them from appearing as
unnamespaced attributes after serialisation.
* Returning non-type objects from a custom class lookup method could lead
to a crash.
* Instantiating and using subtypes of Comments and ProcessingInstructions
crashed.
(forwarded request 175226 from dirkmueller)
OBS-URL: https://build.opensuse.org/request/show/175240
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=41
OBS-URL: https://build.opensuse.org/request/show/175226
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=56
- update to 3.2.0:
* Leading whitespace could change the behaviour of the string
parsing functions in ``lxml.html``.
* LP#599318: The string parsing functions in ``lxml.html`` are more robust
in the face of uncommon HTML content like framesets or missing body tags.
Patch by Stefan Seelmann.
* LP#712941: I/O errors while trying to access files with paths that contain
non-ASCII characters could raise ``UnicodeDecodeError`` instead of properly
reporting the ``IOError``.
* LP#673205: Parsing from in-memory strings disabled network access in the
default parser and made subsequent attempts to parse from a URL fail.
* LP#971754: lxml.html.clean appends 'nofollow' to 'rel' attributes instead
of overwriting the current value.
* LP#715687: lxml.html.clean no longer discards scripts that are explicitly
allowed by the user provided whitelist. Patch by Christine Koppelt.
OBS-URL: https://build.opensuse.org/request/show/174252
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=40
OBS-URL: https://build.opensuse.org/request/show/173959
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=54
- Update to version 2.3.3:
* lxml.html.tostring() gained new serialisation options with_tail and doctype.
* Fixed a crash when using iterparse() for HTML parsing and requesting start events.
* Fixed parsing of more selectors in cssselect. Whitespace before
pseudo-elements and pseudo-classes is significant as it is a descendant
combinator. "E :pseudo" should parse the same as "E *:pseudo", not "E:pseudo".
* lxml.html.diff no longer raises an exception when hitting 'img' tags without 'src' attribute.
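
The whitespace rule for pseudo-classes noted above can be illustrated with a
simplified standalone sketch. The function ``expand_implicit_universal`` is a
hypothetical helper, not the cssselect parser; it assumes selectors without
quoted strings and only makes the implied ``*`` explicit.

```python
import re


def expand_implicit_universal(selector):
    """Rewrite 'E :pseudo' as 'E *:pseudo'.

    Hypothetical helper for illustration only; it ignores quoted strings
    and is not how cssselect is implemented.
    """
    # A colon at the start of the selector or right after whitespace has no
    # element name in front of it, so the universal selector '*' is implied.
    return re.sub(r"(^|\s):", r"\1*:", selector)
```

A selector such as ``E:first-child`` has no whitespace before the colon and is
left unchanged, which is why it must not be treated like ``E :first-child``.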
- Changes from version 2.3.2:
* lxml.objectify.deannotate() has a new boolean option cleanup_namespaces to
remove the objectify namespace declarations (and generally clean up the
namespace declarations) after removing the type annotations.
* lxml.objectify gained its own SubElement() function as a copy of
etree.SubElement to avoid an otherwise redundant import of lxml.etree on the user side.
* Fixed the "descendant" bug in cssselect a second time
* Fixed parsing of some selectors in cssselect.
- Changes from version 2.3.1:
* New option kill_tags in lxml.html.clean to remove specific tags and their
content (i.e. their whole subtree).
* Added pi.get() and pi.attrib on processing instructions to parse
pseudo-attributes from their text content.
* lxml.get_include() returns a list of include paths that can be used to
compile external C code against lxml.etree.
* Resolver.resolve_file() takes an additional option close_file that
configures if the file(-like) object will be closed after reading or not.
* HTML cleaning didn't remove 'data:' links.
* The html5lib parser integration now uses the 'official' implementation in
html5lib itself, which makes it work with newer releases of the library.
* In lxml.sax, endElementNS() could incorrectly reject a plain tag name when
the corresponding start event inferred the same plain tag name to be in the default namespace.
* When an open file-like object is passed into parse() or iterparse(), the
OBS-URL: https://build.opensuse.org/request/show/108688
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-lxml?expand=0&rev=31
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=38
Features added
* When looking for children, lxml.objectify takes '{}tag' as
meaning an empty namespace, as opposed to the parent namespace.
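
The lookup rule above can be sketched without lxml. The function
``resolve_child_tag`` is a hypothetical helper that only mirrors the described
name resolution (a plain name inherits the parent namespace, ``{}tag`` means
no namespace); it is not the objectify implementation.

```python
def resolve_child_tag(parent_tag, child):
    """Return the fully qualified tag that a child lookup refers to.

    Hypothetical helper for illustration only; not part of lxml.objectify.
    Tags use the '{namespace}name' notation.
    """
    if child.startswith("{}"):
        # Explicitly empty namespace: do not inherit from the parent.
        return child[2:]
    if child.startswith("{"):
        # Already qualified with its own namespace.
        return child
    if parent_tag.startswith("{"):
        # Plain name: inherit the namespace of the parent element.
        namespace = parent_tag[1:].split("}", 1)[0]
        return "{%s}%s" % (namespace, child)
    return child
```

The namespace URI used when trying this out (for example
``http://example.com/ns``) is an arbitrary placeholder.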
Bugs fixed
* When finished reading from a file-like object, the parser
immediately calls its close() method.
* When finished parsing, iterparse() immediately closes the input
file.
* Work-around for libxml2 bug that can leave the HTML parser in a
non-functional state after parsing a severely broken document (fixed
in libxml2 2.7.8).
* marque tag in HTML cleanup code is correctly named marquee.
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-lxml?expand=0&rev=30