From add99b967cede87dc10f333787fbabd5938fb9450c312374ccd848eaa0752786 Mon Sep 17 00:00:00 2001
From: Dirk Mueller
Date: Thu, 11 Jul 2024 10:53:38 +0000
Subject: [PATCH] - update to 2.11.2 (bsc#1224474, CVE-2024-1968):
  * Redirects to non-HTTP protocols are no longer followed.
    Please, see the 23j4-mw76-5v7h security advisory for more
    information. (:issue:`457`)
  * The Authorization header is now dropped on redirects to a
    different scheme (http:// or https://) or port, even if the
    domain is the same. Please, see the 4qqq-9vqf-3h3f security
    advisory for more information.
  * When using system proxy settings that are different for
    http:// and https://, redirects to a different URL scheme
    will now also trigger the corresponding change in proxy
    settings for the redirected request. Please, see the
    jm3v-qxmh-hxwv security advisory for more information.
    (:issue:`767`)
  * :attr:`Spider.allowed_domains
    ` is now enforced for all
    requests, and not only requests from spider callbacks.
  * :func:`~scrapy.utils.iterators.xmliter_lxml` no longer
    resolves XML entities.
  * defusedxml is now used to make
    :class:`scrapy.http.request.rpc.XmlRpcRequest` more secure.
  * Restored support for brotlipy_, which had been dropped in
    Scrapy 2.11.1 in favor of brotli. (:issue:`6261`) Note
    brotlipy is deprecated, both in Scrapy and upstream. Use
    brotli instead if you can.
  * Make :setting:`METAREFRESH_IGNORE_TAGS` ["noscript"] by
    default. This prevents :class:`~scrapy.downloadermiddlewares.
    redirect.MetaRefreshMiddleware` from following redirects that
    would not be followed by web browsers with JavaScript
    enabled.
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=41
---
 Scrapy-2.11.1.tar.gz  |  3 --
 python-Scrapy.changes | 66 +++++++++++++++++++++++++++++++++++++++++++
 python-Scrapy.spec    | 11 +++++---
 scrapy-2.11.2.tar.gz  |  3 ++
 4 files changed, 76 insertions(+), 7 deletions(-)
 delete mode 100644 Scrapy-2.11.1.tar.gz
 create mode 100644 scrapy-2.11.2.tar.gz

diff --git a/Scrapy-2.11.1.tar.gz b/Scrapy-2.11.1.tar.gz
deleted file mode 100644
index 3a26761..0000000
--- a/Scrapy-2.11.1.tar.gz
+++ /dev/null
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:733a039c7423e52b69bf2810b5332093d4e42a848460359c07b02ecff8f73ebe
-size 1176726
diff --git a/python-Scrapy.changes b/python-Scrapy.changes
index 41df8b1..79e931b 100644
--- a/python-Scrapy.changes
+++ b/python-Scrapy.changes
@@ -1,3 +1,69 @@
+-------------------------------------------------------------------
+Thu Jul 11 10:38:36 UTC 2024 - Dirk Müller
+
+- update to 2.11.2 (bsc#1224474, CVE-2024-1968):
+  * Redirects to non-HTTP protocols are no longer followed.
+    Please, see the 23j4-mw76-5v7h security advisory for more
+    information. (:issue:`457`)
+  * The Authorization header is now dropped on redirects to a
+    different scheme (http:// or https://) or port, even if the
+    domain is the same. Please, see the 4qqq-9vqf-3h3f security
+    advisory for more information.
+  * When using system proxy settings that are different for
+    http:// and https://, redirects to a different URL scheme
+    will now also trigger the corresponding change in proxy
+    settings for the redirected request. Please, see the
+    jm3v-qxmh-hxwv security advisory for more information.
+    (:issue:`767`)
+  * :attr:`Spider.allowed_domains
+    ` is now enforced for all
+    requests, and not only requests from spider callbacks.
+  * :func:`~scrapy.utils.iterators.xmliter_lxml` no longer
+    resolves XML entities.
+  * defusedxml is now used to make
+    :class:`scrapy.http.request.rpc.XmlRpcRequest` more secure.
+  * Restored support for brotlipy_, which had been dropped in
+    Scrapy 2.11.1 in favor of brotli. (:issue:`6261`) Note
+    brotlipy is deprecated, both in Scrapy and upstream. Use
+    brotli instead if you can.
+  * Make :setting:`METAREFRESH_IGNORE_TAGS` ["noscript"] by
+    default. This prevents :class:`~scrapy.downloadermiddlewares.
+    redirect.MetaRefreshMiddleware` from following redirects that
+    would not be followed by web browsers with JavaScript
+    enabled.
+  * During :ref:`feed export `, do not close
+    the underlying file from :ref:`built-in post-processing
+    plugins `.
+  * :class:`LinkExtractor
+    ` now
+    properly applies the unique and canonicalize parameters.
+  * Do not initialize the scheduler disk queue if
+    :setting:`JOBDIR` is an empty string.
+  * Fix :attr:`Spider.logger ` not logging
+    custom extra information.
+  * robots.txt files with a non-UTF-8 encoding no longer prevent
+    parsing the UTF-8-compatible (e.g. ASCII) parts of the
+    document.
+  * :meth:`scrapy.http.cookies.WrappedRequest.get_header` no
+    longer raises an exception if default is None.
+    :func:`scrapy.utils.response.get_base_url` to determine the
+    base URL of a given :class:`~scrapy.http.Response`.
+  * :class:`~scrapy.selector.Selector` now uses
+    :func:`scrapy.utils.response.get_base_url` to determine the
+    base URL of a given :class:`~scrapy.http.Response`.
+    (:issue:`6265`)
+  * The :meth:`media_to_download` method of :ref:`media pipelines
+    ` now logs exceptions before stripping
+    them.
+  * When passing a callback to the :command:`parse` command,
+    build the callback callable with the right signature.
+  * Add a FAQ entry about :ref:`creating blank requests `.
+  * Document that :attr:`scrapy.selector.Selector.type` can be
+    "json".
+  * Make builds reproducible.
+  * Packaging and test fixes
+
 -------------------------------------------------------------------
 Mon Mar 25 14:12:20 UTC 2024 - Dirk Müller
diff --git a/python-Scrapy.spec b/python-Scrapy.spec
index 218615d..3f01d4d 100644
--- a/python-Scrapy.spec
+++ b/python-Scrapy.spec
@@ -18,15 +18,16 @@
 %{?sle15_python_module_pythons}
 Name: python-Scrapy
-Version: 2.11.1
+Version: 2.11.2
 Release: 0
 Summary: A high-level Python Screen Scraping framework
 License: BSD-3-Clause
 Group: Development/Languages/Python
 URL: https://scrapy.org
-Source: https://files.pythonhosted.org/packages/source/S/Scrapy/Scrapy-%{version}.tar.gz
+Source: https://files.pythonhosted.org/packages/source/S/Scrapy/scrapy-%{version}.tar.gz
+BuildRequires: %{python_module Brotli}
 BuildRequires: %{python_module Pillow}
-BuildRequires: %{python_module Protego >= 0.1.15}
+BuildRequires: %{python_module Protego}
 BuildRequires: %{python_module PyDispatcher >= 2.0.5}
 BuildRequires: %{python_module Twisted >= 18.9.0}
 BuildRequires: %{python_module attrs}
@@ -35,6 +36,7 @@ BuildRequires: %{python_module botocore >= 1.4.87}
 BuildRequires: %{python_module cryptography >= 36.0.0}
 BuildRequires: %{python_module cssselect >= 0.9.1}
 BuildRequires: %{python_module dbm}
+BuildRequires: %{python_module defusedxml >= 0.7.1}
 BuildRequires: %{python_module itemadapter >= 0.1.0}
 BuildRequires: %{python_module itemloaders >= 1.0.1}
 BuildRequires: %{python_module lxml >= 4.4.1}
@@ -63,6 +65,7 @@ Requires: python-PyDispatcher >= 2.0.5
 Requires: python-Twisted >= 18.9.0
 Requires: python-cryptography >= 36.0.0
 Requires: python-cssselect >= 0.9.1
+Requires: python-defusedxml >= 0.7.1
 Requires: python-itemadapter >= 0.1.0
 Requires: python-itemloaders >= 1.0.1
 Requires: python-lxml >= 4.4.1
@@ -93,7 +96,7 @@ Group: Documentation/HTML
 Provides documentation for %{name}.
 %prep
-%autosetup -p1 -n Scrapy-%{version}
+%autosetup -p1 -n scrapy-%{version}
 sed -i -e 's:= python:= python3:g' docs/Makefile
diff --git a/scrapy-2.11.2.tar.gz b/scrapy-2.11.2.tar.gz
new file mode 100644
index 0000000..dec29ba
--- /dev/null
+++ b/scrapy-2.11.2.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dfbd565384fc3fffeba121f5a3a2d0899ac1f756d41432ca0879933fbfb3401d
+size 1187710