Name:           python-lxml
Version:        4.9.3
Release:        2%{?dist}
Summary:        XML processing library combining libxml2/libxslt with the ElementTree API

# The lxml project is licensed under BSD-3-Clause
# Some code is derived from ElementTree and cElementTree
# thus using the MIT-CMU elementtree license
# .xsl schematron files are under the MIT license
License:        BSD-3-Clause AND MIT-CMU AND MIT
URL:            https://github.com/lxml/lxml

# We use the get-lxml-source.sh script to generate the tarball
# without the isoschematron RNG validation file under a problematic license.
# See: https://gitlab.com/fedora/legal/fedora-license-data/-/issues/154
Source0:        lxml-%{version}-no-isoschematron-rng.tar.gz
Source1:        get-lxml-source.sh

# Make the validation of ISO-Schematron files optional in lxml,
# depending on the availability of the RNG validation file
# Rebased from https://github.com/lxml/lxml/commit/4bfab2c821961fb4c5ed8a04e329778c9b09a1df
# Will be included in lxml 5.0
Patch:          Make-the-validation-of-ISO-Schematron-files-optional.patch
# Skip test_isoschematron.test_schematron_invalid_schema_empty without the RNG file
Patch:          https://github.com/lxml/lxml/pull/380.patch

# Upstream issue: https://bugs.launchpad.net/lxml/+bug/2016939
Patch:          Skip-failing-test-test_html_prefix_nsmap.patch

# Cython 3 support backported from future lxml 5.0
Patch:          https://github.com/lxml/lxml/commit/dcbc0cc1cb0cedf8019184aaca805d2a649cd8de.patch
Patch:          https://github.com/lxml/lxml/commit/a03a4b3c6b906d33c5ef1a15f3d5ca5fff600c76.patch

BuildRequires:  gcc
BuildRequires:  libxml2-devel
BuildRequires:  libxslt-devel
BuildRequires:  python%{python3_pkgversion}-devel

# Some of the extras create a build dependency loop.
# - [cssselect] Requires cssselect BuildRequires lxml
# - [html5] Requires html5lib BuildRequires lxml
# - [htmlsoup] Requires beautifulsoup4 Requires lxml
# Hence we provide a bcond to disable the extras altogether.
# By default, the extras are disabled in RHEL, to avoid dependencies.
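# Usage sketch (an assumption about the local workflow, not part of the
# upstream packaging): the default can be overridden at build time with the
# standard rpmbuild bcond switches. The spec file name is assumed here.
#   rpmbuild -ba python-lxml.spec --without extras
#   rpmbuild -ba python-lxml.spec --with extras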
%bcond extras %{undefined rhel}

%global _description \
lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It\
provides safe and convenient access to these libraries using the ElementTree\
API. It extends the ElementTree API significantly to offer support for XPath,\
RelaxNG, XML Schema, XSLT, C14N and much more.

%description %{_description}

%package -n python%{python3_pkgversion}-lxml
Summary:        %{summary}
%if %{with extras}
Suggests:       python%{python3_pkgversion}-lxml+cssselect
Suggests:       python%{python3_pkgversion}-lxml+html5
Suggests:       python%{python3_pkgversion}-lxml+htmlsoup
%endif

%description -n python%{python3_pkgversion}-lxml %{_description}

Python 3 version.

%if %{with extras}
%pyproject_extras_subpkg -n python%{python3_pkgversion}-lxml cssselect html5 htmlsoup
%endif

%prep
%autosetup -n lxml-%{version} -p1
# Don't run html5lib tests --without extras
%{!?with_extras:rm src/lxml/html/tests/test_html5parser.py}

%generate_buildrequires
%pyproject_buildrequires -x source%{?with_extras:,cssselect,html5,htmlsoup}

# Remove pregenerated Cython C sources
# We need to do this after %%pyproject_buildrequires because setup.py errors
# without Cython and without the .c files.
find -type f -name '*.c' -print -delete >&2

%build
export WITH_CYTHON=true
%pyproject_wheel

%install
%pyproject_install
%pyproject_save_files lxml

%check
# The tests assume inplace build, so we copy the built library to source-dir.
# Otherwise, Python can import either the tests or the extension modules, but not both.
cp -a build/lib.%{python3_platform}-*/* src/

# The options are: verbose, unit, functional
%{python3} test.py -vuf

%files -n python%{python3_pkgversion}-lxml -f %{pyproject_files}
%license doc/licenses/BSD.txt doc/licenses/elementtree.txt
%doc README.rst

%changelog
* Fri Jul 28 2023 Miro Hrončok - 4.9.3-2
- Fix build with Cython 3

* Fri Jul 21 2023 Lumír Balhar - 4.9.3-1
- Update to 4.9.3 (rhbz#2219811)

* Fri Jul 21 2023 Fedora Release Engineering - 4.9.2-9
- Rebuilt for https://fedoraproject.org/wiki/Fedora_39_Mass_Rebuild

* Fri Jul 14 2023 Miro Hrončok - 4.9.2-8
- Bring back the isoschematron submodule, but without the validation of the schema file itself

* Fri Jun 16 2023 Python Maint - 4.9.2-7
- Rebuilt for Python 3.12

* Tue Jun 13 2023 Python Maint - 4.9.2-6
- Bootstrap for Python 3.12

* Wed May 31 2023 Miro Hrončok - 4.9.2-5
- Remove the isoschematron submodule

* Tue May 30 2023 Yaakov Selkowitz - 4.9.2-4
- Disable extra subpackages in RHEL builds

* Mon May 29 2023 Tomáš Hrnčiar - 4.9.2-3
- Skip failing test to avoid FTBFS

* Fri Jan 20 2023 Fedora Release Engineering - 4.9.2-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_38_Mass_Rebuild

* Wed Dec 14 2022 Lumír Balhar - 4.9.2-1
- Update to 4.9.2 (rhbz#2153063)

* Wed Sep 14 2022 Charalampos Stratakis - 4.9.1-1
- Update to 4.9.1
- Fix for CVE-2022-2309
- Resolves: rhbz#2107571, rhbz#2110131

* Wed Aug 31 2022 Miro Hrončok - 4.7.1-6
- Use SPDX license identifiers
- The schematron files are not Zlib licensed, but MIT
- Package the lxml[cssselect], lxml[html5] and lxml[htmlsoup] extras

* Fri Jul 22 2022 Fedora Release Engineering - 4.7.1-5
- Rebuilt for https://fedoraproject.org/wiki/Fedora_37_Mass_Rebuild

* Wed Jun 22 2022 Charalampos Stratakis - 4.7.1-4
- Fix FTBFS with setuptools >= 62.1
- Resolves: rhbz#2097102

* Mon Jun 13 2022 Python Maint - 4.7.1-3
- Rebuilt for Python 3.11

* Fri Jan 21 2022 Fedora Release Engineering - 4.7.1-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_36_Mass_Rebuild

* Thu Jan 06 2022 Charalampos Stratakis - 4.7.1-1
- Update to 4.7.1
- Fixes CVE-2021-43818
- Resolves: rhbz#2031686, rhbz#2032572

* Fri Nov 26 2021 Miro Hrončok - 4.6.3-5
- Run the tests during build
- Resolves: rhbz#2026941

* Fri Jul 23 2021 Fedora Release Engineering - 4.6.3-4
- Rebuilt for https://fedoraproject.org/wiki/Fedora_35_Mass_Rebuild

* Thu Jun 03 2021 Charalampos Stratakis - 4.6.3-3
- Update the license information

* Wed Jun 02 2021 Python Maint - 4.6.3-2
- Rebuilt for Python 3.10

* Thu May 20 2021 Charalampos Stratakis - 4.6.3-1
- Update to 4.6.3
- Fixes CVE-2021-28957
- Fixes: rhbz#1941773
- Fixes: rhbz#1941535

* Wed Jan 27 2021 Fedora Release Engineering - 4.6.2-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_34_Mass_Rebuild

* Tue Dec 01 2020 Miro Hrončok - 4.6.2-1
- Update to 4.6.2
- Fixes CVE-2020-27783 and another vulnerability in the HTML Cleaner
- Fixes: rhbz#1855415
- Fixes: rhbz#1901634

* Wed Jul 29 2020 Fedora Release Engineering - 4.5.1-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_33_Mass_Rebuild

* Mon Jun 01 2020 Igor Raits - 4.5.1-1
- Update to 4.5.1

* Fri May 22 2020 Miro Hrončok - 4.4.1-5
- Rebuilt for Python 3.9

* Thu Jan 30 2020 Fedora Release Engineering - 4.4.1-4
- Rebuilt for https://fedoraproject.org/wiki/Fedora_32_Mass_Rebuild

* Wed Nov 20 2019 Miro Hrončok - 4.4.1-3
- Subpackage python2-lxml has been removed
  See https://fedoraproject.org/wiki/Changes/Mass_Python_2_Package_Removal

* Sat Sep 07 2019 Igor Gnatenko - 4.4.1-2
- Generate C files using py3 Cython

* Sat Sep 07 2019 Igor Gnatenko - 4.4.1-1
- Update to 4.4.1

* Fri Aug 16 2019 Miro Hrončok - 4.4.0-2
- Rebuilt for Python 3.8

* Sat Aug 03 2019 Igor Gnatenko - 4.4.0-1
- Update to 4.4.0

* Fri Jul 26 2019 Fedora Release Engineering - 4.2.5-3
- Rebuilt for https://fedoraproject.org/wiki/Fedora_31_Mass_Rebuild

* Sat Feb 02 2019 Fedora Release Engineering - 4.2.5-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_30_Mass_Rebuild

* Tue Dec 18 2018 Igor Gnatenko - 4.2.5-1
- Update to 4.2.5

* Sun Sep 02 2018 Igor Gnatenko - 4.2.4-1
- Update to 4.2.4

* Sat Jul 14 2018 Fedora Release Engineering - 4.2.3-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_29_Mass_Rebuild

* Sat Jul 07 2018 Igor Gnatenko - 4.2.3-1
- Update to 4.2.3

* Sun Jun 17 2018 Miro Hrončok - 4.2.1-2
- Rebuilt for Python 3.7

* Wed Apr 25 2018 Igor Gnatenko - 4.2.1-1
- Update to 4.2.1

* Fri Feb 09 2018 Fedora Release Engineering - 4.1.1-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_28_Mass_Rebuild

* Sun Nov 05 2017 Igor Gnatenko - 4.1.1-1
- Update to 4.1.1

* Tue Oct 10 2017 Mikolaj Izdebski - 4.0.0-2
- Conditionally allow building without Cython

* Thu Oct 05 2017 Igor Gnatenko - 4.0.0-1
- Update to 4.0.0

* Sat Aug 12 2017 Kevin Fenzi - 3.8.0-1
- Update to 3.8.0. Fixes bug #1458529

* Thu Aug 03 2017 Fedora Release Engineering - 3.7.2-4
- Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Binutils_Mass_Rebuild

* Thu Jul 27 2017 Fedora Release Engineering - 3.7.2-3
- Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Mass_Rebuild

* Sat Feb 11 2017 Fedora Release Engineering - 3.7.2-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_26_Mass_Rebuild

* Mon Jan 09 2017 Fabio Alessandro Locati - 3.7.2-1
- Update to 3.7.2

* Sun Dec 25 2016 Fabio Alessandro Locati - 3.7.1-1
- Update to 3.7.1

* Tue Dec 13 2016 Stratakis Charalampos - 3.7.0-2
- Rebuild for Python 3.6

* Sun Dec 11 2016 Fabio Alessandro Locati - 3.7.0-1
- Update to 3.7.0

* Thu Sep 08 2016 Fabio Alessandro Locati - 3.6.4-1
- Update to 3.6.4

* Tue Jul 19 2016 Fedora Release Engineering - 3.4.4-5
- https://fedoraproject.org/wiki/Changes/Automatic_Provides_for_Python_RPM_Packages

* Thu Feb 04 2016 Fedora Release Engineering - 3.4.4-4
- Rebuilt for https://fedoraproject.org/wiki/Fedora_24_Mass_Rebuild

* Thu Jan 21 2016 Dan Horák - 3.4.4-3
- fix conditional

* Fri Nov 06 2015 Robert Kuska - 3.4.4-2
- Rebuilt for Python3.5 rebuild

* Fri Aug 28 2015 Peter Robinson 3.4.4-1
- Update to 3.4.4
- Use %%license, cleanup spec

* Thu Jun 18 2015 Fedora Release Engineering - 3.3.6-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_23_Mass_Rebuild

* Fri Aug 29 2014 Jeffrey C. Ollie - 3.3.6-1
- 3.3.6 (2014-08-28)
- ==================
-
- Bugs fixed
- ----------
-
- * Prevent tree cycle creation when adding Elements as siblings.
-
- * LP#1361948: crash when deallocating Element siblings without parent.
-
- * LP#1354652: crash when traversing internally loaded documents in XSLT
-   extension functions.

* Sun Aug 17 2014 Fedora Release Engineering
- Rebuilt for https://fedoraproject.org/wiki/Fedora_21_22_Mass_Rebuild

* Sat Jun 07 2014 Fedora Release Engineering - 3.3.5-3
- Rebuilt for https://fedoraproject.org/wiki/Fedora_21_Mass_Rebuild

* Wed May 14 2014 Bohuslav Kabrda - 3.3.5-2
- Rebuilt for https://fedoraproject.org/wiki/Changes/Python_3.4

* Mon Apr 28 2014 Jeffrey Ollie - 3.3.5-1
- 3.3.5 (2014-04-18)
- ==================
-
- Bugs fixed
- ----------
-
- * HTML cleaning could fail to strip javascript links that mix control
-   characters into the link scheme.

* Mon Apr 28 2014 Jeffrey Ollie - 3.3.4-1
- 3.3.4 (2014-04-03)
- ==================
-
- Features added
- --------------
-
- * Source line numbers above 65535 are available on Elements when
-   using libxml2 2.9 or later.
-
- Bugs fixed
- ----------
-
- * lxml.html.fragment_fromstring() failed for bytes input in Py3.

* Wed Mar 26 2014 Jeffrey Ollie - 3.3.3-4
- Fix macro definition

* Wed Mar 26 2014 Jeffrey Ollie - 3.3.3-3
- Add python3-cssselect to correct package

* Mon Mar 24 2014 Jeffrey Ollie - 3.3.3-3
- python3-cssselect is not available on F19

* Mon Mar 24 2014 Jeffrey Ollie - 3.3.3-2
- BZ#1075070 add requires and buildrequires for cssselect

* Tue Mar 11 2014 Jeffrey Ollie - 3.3.3-1
- 3.3.3 (2014-03-04)
- ==================
-
- Bugs fixed
- ----------
-
- * LP#1287118: Crash when using Element subtypes with ``__slots__``.
-
- Other changes
- -------------
-
- * The internal classes ``_LogEntry`` and ``_Attrib`` can no longer be
-   subclassed from Python code.

* Tue Mar 11 2014 Alexander Todorov - 3.3.2-2
- Add check section #1075070

* Fri Feb 28 2014 Jeffrey Ollie - 3.3.2-1
- 3.3.2 (2014-02-26)
- ==================
-
- Bugs fixed
- ----------
-
- * The properties ``resolvers`` and ``version``, as well as the methods
-   ``set_element_class_lookup()`` and ``makeelement()``, were lost from
-   ``iterparse`` objects.
-
- * LP#1222132: instances of ``XMLSchema``, ``Schematron`` and ``RelaxNG``
-   did not clear their local ``error_log`` before running a validation.
-
- * LP#1238500: lxml.doctestcompare mixed up "expected" and "actual" in
-   attribute values.
-
- * Some file I/O tests were failing in MS-Windows due to incorrect temp
-   file usage. Initial patch by Gabi Davar.
-
- * LP#910014: duplicate IDs in a document were not reported by DTD
-   validation.
-
- * LP#1185332: ``tostring(method="html")`` did not use HTML serialisation
-   semantics for trailing tail text. Initial patch by Sylvain Viollon.
-
- * LP#1281139: ``.attrib`` value of Comments lost its mutation methods
-   in 3.3.0. Even though it is empty and immutable, it should still
-   provide the same interface as that returned for Elements.

* Fri Feb 28 2014 Jeffrey Ollie - 3.3.2-1
- 3.3.1 (2014-02-12)
- ==================
-
- Bugs fixed
- ----------
-
- * LP#1014290: HTML documents parsed with ``parser.feed()`` failed to find
-   elements during tag iteration.
-
- * LP#1273709: Building in PyPy failed due to missing support for
-   ``PyUnicode_Compare()`` and ``PyByteArray_*()`` in PyPy's C-API.
-
- * LP#1274413: Compilation in MSVC failed due to missing "stdint.h" standard
-   header file.
-
- * LP#1274118: iterparse() failed to parse BOM prefixed files.

* Mon Jan 27 2014 Jeffrey Ollie - 3.3.0-2
- Update Cython requirement to >= 0.20

* Mon Jan 27 2014 Jeffrey Ollie - 3.3.0-1
- 3.3.0 (2014-01-26)
- ==================
-
- Features added
- --------------
-
- Bugs fixed
- ----------
-
- * The heuristic that distinguishes file paths from URLs was tightened
-   to produce less false negatives.
-
- Other changes
- -------------
-
-
- 3.3.0beta5 (2014-01-18)
- =======================
-
- Features added
- --------------
-
- * The PEP 393 unicode parsing support gained a fallback for wchar strings
-   which might still be somewhat common on Windows systems.
-
- Bugs fixed
- ----------
-
- * Several error handling problems were fixed throughout the code base that
-   could previously lead to exceptions being silently swallowed or not
-   properly reported.
-
- * The C-API function ``appendChild()`` is now deprecated as it does not
-   propagate exceptions (its return type is ``void``). The new function
-   ``appendChildToElement()`` was added as a safe replacement.
-
- * Passing a string into ``fromstringlist()`` raises an exception instead of
-   parsing the string character by character.
-
- Other changes
- -------------
-
- * Document cleanup code was simplified using the new GC features in
-   Cython 0.20.
-
-
- 3.3.0beta4 (2014-01-12)
- =======================
-
- Features added
- --------------
-
- Bugs fixed
- ----------
-
- * The (empty) value returned by the ``attrib`` property of Entity and
-   Comment objects was mutable.
-
- * Element class lookup wasn't available for the new pull parsers or when
-   using a custom parser target.
-
- * Setting Element attributes on instantiation with both the ``attrib``
-   argument and keyword arguments could modify the mapping passed as
-   ``attrib``.
-
- * LP#1266171: DTDs instantiated from internal/external subsets (i.e.
-   through the docinfo property) lost their attribute declarations.
-
- Other changes
- -------------
-
- * Built with Cython 0.20pre (gitrev 012ae82eb) to prepare support for
-   Python 3.4.
-
-
- 3.3.0beta3 (2014-01-02)
- =======================
-
- Features added
- --------------
-
- * Unicode string parsing was optimised for Python 3.3 (PEP 393).
-
- Bugs fixed
- ----------
-
- * HTML parsing of Unicode strings could misdecode the input on some
-   platforms.
-
- * Crash in xmlfile() when closing open elements out of order in an error
-   case.
-
- Other changes
- -------------
-
-
- 3.3.0beta2 (2013-12-20)
- =======================
-
- Features added
- --------------
-
- * ``iterparse()`` supports the ``recover`` option.
-
- Bugs fixed
- ----------
-
- * Crash in ``iterparse()`` for HTML parsing.
-
- * Crash in target parsing with attributes.
-
- Other changes
- -------------
-
- * The safety check in the read-only tree implementation (e.g. used by
-   ``PythonElementClassLookup``) raises a more appropriate
-   ``ReferenceError`` for illegal access after tree disposal instead of
-   an ``AssertionError``. This should only impact test code that
-   specifically checks the original behaviour.
-
-
- 3.3.0beta1 (2013-12-12)
- =======================
-
- Features added
- --------------
-
- * New option ``handle_failures`` in ``make_links_absolute()`` and
-   ``resolve_base_href()`` (lxml.html) that enables ignoring or
-   discarding links that fail to parse as URLs.
-
- * New parser classes ``XMLPullParser`` and ``HTMLPullParser`` for
-   incremental parsing, as implemented for ElementTree in Python 3.4.
-
- * ``iterparse()`` enables recovery mode by default for HTML parsing
-   (``html=True``).
-
- Bugs fixed
- ----------
-
- * LP#1255132: crash when trying to run validation over non-Element (e.g.
-   comment or PI).
-
- * Error messages in the log and in exception messages that originated
-   from libxml2 could accidentally be picked up from preceding warnings
-   instead of the actual error.
-
- * The ``ElementMaker`` in lxml.objectify did not accept a dict as
-   argument for adding attributes to the element it's building. This
-   works as in lxml.builder now.
-
- * LP#1228881: ``repr(XSLTAccessControl)`` failed in Python 3.
-
- * Raise ``ValueError`` when trying to append an Element to itself or
-   to one of its own descendants, instead of running into an infinite
-   loop.
-
- * LP#1206077: htmldiff discarded whitespace from the output.
-
- * Compressed plain-text serialisation to file-like objects was broken.
-
- * lxml.html.formfill: Fix textarea form filling.
-   The textarea used to be cleared before the new content was set,
-   which removed the name attribute.
-
- Other changes
- -------------
-
- * Some basic API classes use freelists internally for faster
-   instantiation. This can speed up some ``iterparse()`` scenarios,
-   for example.
-
- * ``iterparse()`` was rewritten to use the new ``*PullParser``
-   classes internally instead of being a parser itself.

* Mon Nov 11 2013 Jeffrey Ollie - 3.2.4-1
- 3.2.4 (2013-11-07)
- ==================
-
- Bugs fixed
- ----------
-
- * Memory leak when creating an XPath evaluator in a thread.
-
- * LP#1228881: ``repr(XSLTAccessControl)`` failed in Python 3.
-
- * Raise ``ValueError`` when trying to append an Element to itself or
-   to one of its own descendants.
-
- * LP#1206077: htmldiff discarded whitespace from the output.
-
- * Compressed plain-text serialisation to file-like objects was broken.

* Wed Sep 18 2013 Jeffrey Ollie - 3.2.3-2
- Add requirement on python-cssselect for the python2 version

* Sun Jul 28 2013 Jeffrey Ollie - 3.2.3-1
- and here's a version 3.2.3. The last release accidentally lost the ability
- to work on Python 2.4. There are no other changes over 3.2.2.
-
- 3.2.2 (2013-07-28)
- ==================
-
- Features added
- --------------
-
- Bugs fixed
- ----------
-
- * LP#1185701: spurious XMLSyntaxError after finishing iterparse().
-
- * Crash in lxml.objectify during xsi annotation.
-
- Other changes
- -------------
-
- * Return values of user provided element class lookup methods are now
-   validated against the type of the XML node they represent to prevent
-   API class mismatches.

* Sun May 12 2013 Jeffrey Ollie - 3.2.1-1
- 3.2.1 (2013-05-11)
- ==================
-
- Features added
- --------------
-
- * The methods ``apply_templates()`` and ``process_children()`` of XSLT
-   extension elements have gained two new boolean options ``elements_only``
-   and ``remove_blank_text`` that discard either all strings or
-   whitespace-only strings from the result list.
-
- Bugs fixed
- ----------
-
- * When moving Elements to another tree, the namespace cleanup mechanism
-   no longer drops namespace prefixes from attributes for which it finds
-   a default namespace declaration, to prevent them from appearing as
-   unnamespaced attributes after serialisation.
-
- * Returning non-type objects from a custom class lookup method could lead
-   to a crash.
-
- * Instantiating and using subtypes of Comments and ProcessingInstructions
-   crashed.

* Fri May 10 2013 Jeffrey Ollie - 3.2.0-1
- 3.2.0 (2013-04-28)
- ==================
-
- Features added
- --------------
-
- Bugs fixed
- ----------
-
- * LP#690319: Leading whitespace could change the behaviour of the string
-   parsing functions in ``lxml.html``.
-
- * LP#599318: The string parsing functions in ``lxml.html`` are more robust
-   in the face of uncommon HTML content like framesets or missing body tags.
-   Patch by Stefan Seelmann.
-
- * LP#712941: I/O errors while trying to access files with paths that
-   contain non-ASCII characters could raise ``UnicodeDecodeError`` instead
-   of properly reporting the ``IOError``.
-
- * LP#673205: Parsing from in-memory strings disabled network access in the
-   default parser and made subsequent attempts to parse from a URL fail.
-
- * LP#971754: lxml.html.clean appends 'nofollow' to 'rel' attributes instead
-   of overwriting the current value.
-
- * LP#715687: lxml.html.clean no longer discards scripts that are explicitly
-   allowed by the user provided whitelist. Patch by Christine Koppelt.
-
- 3.1.2 (2013-04-12)
- ==================
-
- Bugs fixed
- ----------
-
- * LP#1136509: Passing attributes through the namespace-unaware API of
-   the sax bridge (i.e. the ``handler.startElement()`` method) failed
-   with a ``TypeError``. Patch by Mike Bayer.
-
- * LP#1123074: Fix serialisation error in XSLT output when converting
-   the result tree to a Unicode string.
-
- * GH#105: Replace illegal usage of ``xmlBufLength()`` in libxml2 2.9.0
-   by properly exported API function ``xmlBufUse()``.
-
- 3.1.1 (2013-03-29)
- ==================
-
- Features added
- --------------
-
- Bugs fixed
- ----------
-
- * LP#1160386: Write access to ``lxml.html.FormElement.fields`` raised
-   an AttributeError in Py3.
-
- * Illegal memory access during cleanup in incremental xmlfile writer.
-
- Other changes
- -------------
-
- * The externally useless class ``lxml.etree._BaseParser`` was removed
-   from the module dict.

* Fri Mar 8 2013 Jeffrey Ollie - 3.1.0-1
- 3.1.0 (2013-02-10)
- ==================
-
- Features added
- --------------
-
- * GH#89: lxml.html.clean allows overriding the set of attributes that it
-   considers 'safe'. Patch by Francis Devereux.
-
- Bugs fixed
- ----------
-
- * LP#1104370: ``copy.copy(el.attrib)`` raised an exception. It now returns
-   a copy of the attributes as a plain Python dict.
-
- * GH#95: When used with namespace prefixes, the ``el.find*()`` methods
-   always used the first namespace mapping that was provided for each
-   path expression instead of using the one that was actually passed
-   in for the current run.
-
- * LP#1092521, GH#91: Fix undefined C symbol in Python runtimes compiled
-   without threading support. Patch by Ulrich Seidl.
-
- Other changes
- -------------
-
-
- 3.1beta1 (2012-12-21)
- =====================
-
- Features added
- --------------
-
- * New build-time option ``--with-unicode-strings`` for Python 2 that
-   makes the API always return Unicode strings for names and text
-   instead of byte strings for plain ASCII content.
-
- * New incremental XML file writing API ``etree.xmlfile()``.
-
- * E factory in lxml.objectify is callable to simplify the creation of
-   tags with non-identifier names without having to resort to getattr().
-
- Bugs fixed
- ----------
-
- * When starting from a non-namespaced element in lxml.objectify, searching
-   for a child without explicitly specifying a namespace incorrectly found
-   namespaced elements with the requested local name, instead of restricting
-   the search to non-namespaced children.
-
- * GH#85: Deprecation warnings were fixed for Python 3.x.
-
- * GH#33: lxml.html.fromstring() failed to accept bytes input in Py3.
-
- * LP#1080792: Static build of libxml2 2.9.0 failed due to missing file.
-
- Other changes
- -------------
-
- * The externally useless class ``_ObjectifyElementMakerCaller`` was
-   removed from the module API of lxml.objectify.
-
- * LP#1075622: lxml.builder is faster for adding text to elements with
-   many children. Patch by Anders Hammarquist.

* Thu Feb 14 2013 Fedora Release Engineering - 3.0.1-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_19_Mass_Rebuild

* Mon Oct 15 2012 Jeffrey Ollie - 3.0.1-1
- 3.0.1 (2012-10-14)
- Bugs fixed
-
- * LP#1065924: Element proxies could disappear during garbage collection
-   in PyPy without proper cleanup.
- * GH#71: Failure to work with libxml2 2.6.x.
- * LP#1065139: static MacOS-X build failed in Py3.

* Wed Oct 10 2012 Jeffrey Ollie - 3.0-1
- 3.0 (2012-10-08)
- ================
-
- Features added
- --------------
-
- Bugs fixed
- ----------
-
- * End-of-file handling was incorrect in iterparse() when reading from
-   a low-level C file stream and failed in libxml2 2.9.0 due to its
-   improved consistency checks.
-
- Other changes
- -------------
-
- * The build no longer uses Cython by default unless the generated C files
-   are missing. To use Cython, pass the option "--with-cython". To ignore
-   the fatal build error when Cython is required but not available (e.g. to
-   run special setup.py commands that do not actually run a build), pass
-   "--without-cython".
-
-
- 3.0beta1 (2012-09-26)
- =====================
-
- Features added
- --------------
-
- * Python level access to (optional) libxml2 memory debugging features
-   to simplify debugging of memory leaks etc.
-
- Bugs fixed
- ----------
-
- * Fix a memory leak in XPath by switching to Cython 0.17.1.
-
- * Some tests were adapted to work with PyPy.
-
- Other changes
- -------------
-
- * The code was adapted to work with the upcoming libxml2 2.9.0 release.
-
-
- 3.0alpha2 (2012-08-23)
- ======================
-
- Features added
- --------------
-
- * The .iter() method of elements now accepts tag arguments like "{*}name"
-   to search for elements with a given local name in any namespace. With
-   this addition, all combinations of wildcards now work as expected:
-   "{ns}name", "{}name", "{*}name", "{ns}*", "{}*" and "{*}*". Note that
-   "name" is equivalent to "{}name", but "*" is "{*}*". The same change
-   applies to the .getiterator(), .itersiblings(), .iterancestors(),
-   .iterdescendants(), .iterchildren() and .itertext() methods, the
-   strip_attributes(), strip_elements() and strip_tags() functions as well
-   as the iterparse() function.
-
- * C14N allows specifying the inclusive prefixes to be promoted to
-   top-level during exclusive serialisation.
-
- Bugs fixed
- ----------
-
- * Passing long Unicode strings into the feed() parser interface failed to
-   read the entire string.
-
- Other changes
- -------------
-
-
- 3.0alpha1 (2012-07-31)
- ======================
-
- Features added
- --------------
-
- * Initial support for building in PyPy (through cpyext).
-
- * DTD objects gained an API that allows read access to their
-   declarations.
-
- * xpathgrep.py gained support for parsing line-by-line (e.g.
-   from grep output) and for surrounding the output with a new root
-   tag.
-
- * E-factory in lxml.builder accepts subtypes of known data
-   types (such as string subtypes) when building elements around them.
-
- * Tree iteration and iterparse() with a selective tag
-   argument supports passing a set of tags. Tree nodes will be
-   returned by the iterators if they match any of the tags.
-
- Bugs fixed
- ----------
-
- * The .find*() methods in lxml.objectify no longer use XPath
-   internally, which makes them faster in many cases (especially when
-   short circuiting after a single or couple of elements) and fixes
-   some behavioural differences compared to lxml.etree. Note that
-   this means that they no longer support arbitrary XPath expressions
-   but only the subset that the ElementPath language supports.
-   The previous implementation was also redundant with the normal
-   XPath support, which can be used as a replacement.
-
- * el.find('*') could accidentally return a comment or processing
-   instruction that happened to be in the wrong spot. (Same for the
-   other .find*() methods.)
-
- * The error logging is less intrusive and avoids a global setup where
-   possible.
-
- * Fixed undefined names in html5lib parser.
-
- * xpathgrep.py did not work in Python 3.
-
- * Element.attrib.update() did not accept an attrib of
-   another Element as parameter.
-
- * For subtypes of ElementBase that make the .text or .tail
-   properties immutable (as in objectify, for example), inserting text
-   when creating Elements through the E-Factory feature of the class
-   constructor would fail with an exception, stating that the text
-   cannot be modified.
-
- Other changes
- --------------
-
- * The code base was overhauled to properly use 'const' where the API
-   of libxml2 and libxslt requests it. This also has an impact on the
-   public C-API of lxml itself, as defined in etreepublic.pxd, as
-   well as the provided declarations in the lxml/includes/ directory.
-   Code that uses these declarations may have to be adapted. On the
-   plus side, this fixes several C compiler warnings, also for user
-   code, thus making it easier to spot real problems again.
-
- * The functionality of "lxml.cssselect" was moved into a separate PyPI
-   package called "cssselect". To continue using it, you must install
-   that package separately. The "lxml.cssselect" module is still
-   available and provides the same interface, provided the "cssselect"
-   package can be imported at runtime.
-
- * Element attributes passed in as an attrib dict or as keyword
-   arguments are now sorted by (namespaced) name before being created
-   to make their order predictable for serialisation and iteration.
-   Note that adding or deleting attributes afterwards does not take
-   that order into account, i.e. setting a new attribute appends it
-   after the existing ones.
-
- * Several classes that are for internal use only were removed
-   from the lxml.etree module dict:
-   _InputDocument, _ResolverRegistry, _ResolverContext, _BaseContext,
-   _ExsltRegExp, _IterparseContext, _TempStore, _ExceptionContext,
-   __ContentOnlyElement, _AttribIterator, _NamespaceRegistry,
-   _ClassNamespaceRegistry, _FunctionNamespaceRegistry,
-   _XPathFunctionNamespaceRegistry, _ParserDictionaryContext,
-   _FileReaderContext, _ParserContext, _PythonSaxParserTarget,
-   _TargetParserContext, _ReadOnlyProxy, _ReadOnlyPIProxy,
-   _ReadOnlyEntityProxy, _ReadOnlyElementProxy, _OpaqueNodeWrapper,
-   _OpaqueDocumentWrapper, _ModifyContentOnlyProxy,
-   _ModifyContentOnlyPIProxy, _ModifyContentOnlyEntityProxy,
-   _AppendOnlyElementProxy, _SaxParserContext, _FilelikeWriter,
-   _ParserSchemaValidationContext, _XPathContext,
-   _XSLTResolverContext, _XSLTContext, _XSLTQuotedStringParam
-
- * Several internal classes can no longer be inherited from:
-   _InputDocument, _ResolverRegistry, _ExsltRegExp, _ElementUnicodeResult,
-   _IterparseContext, _TempStore, _AttribIterator, _ClassNamespaceRegistry,
-   _XPathFunctionNamespaceRegistry, _ParserDictionaryContext,
-   _FileReaderContext, _PythonSaxParserTarget, _TargetParserContext,
-   _ReadOnlyPIProxy, _ReadOnlyEntityProxy, _OpaqueDocumentWrapper,
-   _ModifyContentOnlyPIProxy, _ModifyContentOnlyEntityProxy,
-   _AppendOnlyElementProxy, _FilelikeWriter, _ParserSchemaValidationContext,
-   _XPathContext, _XSLTResolverContext, _XSLTContext,
-   _XSLTQuotedStringParam, _XSLTResultTree, _XSLTProcessingInstruction

* Thu Sep 27 2012 Jeffrey Ollie - 2.3.5-1
- Bugs fixed
-
- * Crash when merging text nodes in element.remove().
- * Crash in sax/target parser when reporting empty doctype.

* Thu Sep 27 2012 Jeffrey Ollie - 2.3.4-1
- Bugs fixed
-
- * Crash when building an nsmap (Element property) with empty namespace
-   URIs.
- * Crash due to race condition when errors (or user messages) occur during
-   threaded XSLT processing (or compilation).
- * XSLT stylesheet compilation could ignore compilation errors.

* Sat Aug 04 2012 David Malcolm - 2.3.3-4
- rebuild for https://fedoraproject.org/wiki/Features/Python_3.3

* Fri Aug 3 2012 David Malcolm - 2.3.3-3
- remove rhel logic from with_python3 conditional

* Sat Jul 21 2012 Fedora Release Engineering - 2.3.3-2
- Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild

* Thu Jan 5 2012 Jeffrey C. Ollie - 2.3.3-1
- 2.3.3 (2012-01-04)
- Features added
-
- * lxml.html.tostring() gained new serialisation options with_tail and
-   doctype.
-
- Bugs fixed
-
- * Fixed a crash when using iterparse() for HTML parsing and requesting
-   start events.
- * Fixed parsing of more selectors in cssselect. Whitespace before
-   pseudo-elements and pseudo-classes is significant as it is a descendant
-   combinator. "E :pseudo" should parse the same as "E *:pseudo", not
-   "E:pseudo". Patch by Simon Sapin.
- * lxml.html.diff no longer raises an exception when hitting 'img' tags
-   without 'src' attribute.

* Mon Nov 14 2011 Jeffrey C. Ollie - 2.3.2-1
- 2.3.2 (2011-11-11)
- Features added
-
- * lxml.objectify.deannotate() has a new boolean option
-   cleanup_namespaces to remove the objectify namespace declarations
-   (and generally clean up the namespace declarations) after removing
-   the type annotations.
- * lxml.objectify gained its own SubElement() function as a copy of
-   etree.SubElement to avoid an otherwise redundant import of
-   lxml.etree on the user side.
-
- Bugs fixed
-
- * Fixed the "descendant" bug in cssselect a second time (after a first
-   fix in lxml 2.3.1). The previous change resulted in a serious
-   performance regression for the XPath based evaluation of the
-   translated expression. Note that this breaks the usage of some
-   of the generated XPath expressions as XSLT location paths that
-   previously worked in 2.3.1.
- * Fixed parsing of some selectors in cssselect. Whitespace after
-   combinators ">", "+" and "~" is now correctly ignored. Previously
-   it was parsed as a descendant combinator. For example, "div> .foo"
-   was parsed the same as "div>* .foo" instead of "div>.foo". Patch by
-   Simon Sapin.

* Sun Sep 25 2011 Jeffrey C. Ollie - 2.3.1-1
- Features added
- --------------
-
- * New option kill_tags in lxml.html.clean to remove specific
-   tags and their content (i.e. their whole subtree).
-
- * pi.get() and pi.attrib on processing instructions to parse
-   pseudo-attributes from the text content of processing instructions.
-
- * lxml.get_include() returns a list of include paths that can be
-   used to compile external C code against lxml.etree. This is
-   specifically required for statically linked lxml builds when code
-   needs to compile against the exact same header file versions as lxml
-   itself.
-
- * Resolver.resolve_file() takes an additional option
-   close_file that configures if the file(-like) object will be
-   closed after reading or not. By default, the file will be closed,
-   as the user is not expected to keep a reference to it.
-
- Bugs fixed
- ----------
-
- * HTML cleaning didn't remove 'data:' links.
-
- * The html5lib parser integration now uses the 'official'
-   implementation in html5lib itself, which makes it work with newer
-   releases of the library.
-
- * In lxml.sax, endElementNS() could incorrectly reject a plain
-   tag name when the corresponding start event inferred the same plain
-   tag name to be in the default namespace.
-
- * When an open file-like object is passed into parse() or
-   iterparse(), the parser will no longer close it after use. This
-   reverts a change in lxml 2.3 where all files would be closed. It is
-   the user's responsibility to properly close the file(-like) object,
-   also in error cases.
-
- * Assertion error in lxml.html.cleaner when discarding top-level elements.
-
- * In lxml.cssselect, use the xpath 'A//B' (short for
-   'A/descendant-or-self::node()/B') instead of 'A/descendant::B' for the
-   css descendant selector ('A B'). This makes a few edge cases to be
-   consistent with the selector behavior in WebKit and Firefox, and makes
-   more css expressions valid location paths (for use in xsl:template
-   match).
-
- * In lxml.html, non-selected