From a84bf5033fd1050072e56a322a1d7152610b5f531731baa2f4b7da37abcbe34e Mon Sep 17 00:00:00 2001
From: Dirk Mueller
Date: Mon, 12 Sep 2022 08:00:07 +0000
Subject: [PATCH] Accepting request 1002338 from home:yarunachalam:branches:devel:languages:python

- Update to v2.6.2
  Security bug fix:
  * When HttpProxyMiddleware processes a request with proxy metadata, and that proxy metadata includes proxy credentials,
    HttpProxyMiddleware sets the Proxy-Authorization header, but only if that header is not already set.
  * There are third-party proxy-rotation downloader middlewares that set different proxy metadata every time they process a request.
  * Because of request retries and redirects, the same request can be processed by downloader middlewares more than once,
    including both HttpProxyMiddleware and any third-party proxy-rotation downloader middleware.
  * These third-party proxy-rotation downloader middlewares could change the proxy metadata of a request to a new value,
    but fail to remove the Proxy-Authorization header from the previous value of the proxy metadata, causing the credentials of one
    proxy to be sent to a different proxy.
  * To prevent the unintended leaking of proxy credentials, the behavior of HttpProxyMiddleware is now as follows when processing a request:
    + If the request being processed defines proxy metadata that includes credentials, the Proxy-Authorization header is always updated
      to feature those credentials.
    + If the request being processed defines proxy metadata without credentials, the Proxy-Authorization header is removed unless
      it was originally defined for the same proxy URL.
    + To remove proxy credentials while keeping the same proxy URL, remove the Proxy-Authorization header.
    + If the request has no proxy metadata, or that metadata is a falsy value (e.g. None), the Proxy-Authorization header is removed.
    + It is no longer possible to set a proxy URL through the proxy metadata but set the credentials through the Proxy-Authorization header.
      Set proxy credentials through the proxy metadata instead.
  * Also fixes the following regressions introduced in 2.6.0:
    + CrawlerProcess once again supports crawling multiple spiders (issue 5435, issue 5436)
    + Installing a Twisted reactor before Scrapy does (e.g. importing twisted.internet.reactor somewhere at the module level)
      no longer prevents Scrapy from starting, as long as a different reactor is not specified in TWISTED_REACTOR (issue 5525, issue 5528)
    + Fixed an exception that was being logged after the spider finished under certain conditions (issue 5437, issue 5440)
    + The --output/-o command-line parameter once again supports a value starting with a hyphen (issue 5444, issue 5445)
    + The scrapy parse -h command no longer throws an error (issue 5481, issue 5482)

OBS-URL: https://build.opensuse.org/request/show/1002338
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=28
---
 Scrapy-2.6.1.tar.gz | 3 ---
 Scrapy-2.6.2.tar.gz | 3 +++
 python-Scrapy.changes | 30 ++++++++++++++++++++++++++++++
 python-Scrapy.spec | 2 +-
 4 files changed, 34 insertions(+), 4 deletions(-)
 delete mode 100644 Scrapy-2.6.1.tar.gz
 create mode 100644 Scrapy-2.6.2.tar.gz

diff --git a/Scrapy-2.6.1.tar.gz b/Scrapy-2.6.1.tar.gz
deleted file mode 100644
index af6599e..0000000
--- a/Scrapy-2.6.1.tar.gz
+++ /dev/null
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:56fd55a59d0f329ce752892358abee5a6b50b4fc55a40420ea317dc617553827
-size 1103155
diff --git a/Scrapy-2.6.2.tar.gz b/Scrapy-2.6.2.tar.gz
new file mode 100644
index 0000000..5bf4044
--- /dev/null
+++ b/Scrapy-2.6.2.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:55e21181165f25337105fff1efc8393296375cea7de699a7e703bbd265595f26
+size 1107021
diff --git a/python-Scrapy.changes b/python-Scrapy.changes
index cc96040..e05ba83 100644
--- a/python-Scrapy.changes
+++ b/python-Scrapy.changes
@@ -1,3 +1,33 @@
+-------------------------------------------------------------------
+Fri Sep 9 15:21:20 UTC 2022 - Yogalakshmi Arunachalam
+
+- Update to v2.6.2
+  Security bug fix:
+  * When HttpProxyMiddleware processes a request with proxy metadata, and that proxy metadata includes proxy credentials,
+    HttpProxyMiddleware sets the Proxy-Authorization header, but only if that header is not already set.
+  * There are third-party proxy-rotation downloader middlewares that set different proxy metadata every time they process a request.
+  * Because of request retries and redirects, the same request can be processed by downloader middlewares more than once,
+    including both HttpProxyMiddleware and any third-party proxy-rotation downloader middleware.
+  * These third-party proxy-rotation downloader middlewares could change the proxy metadata of a request to a new value,
+    but fail to remove the Proxy-Authorization header from the previous value of the proxy metadata, causing the credentials of one
+    proxy to be sent to a different proxy.
+  * To prevent the unintended leaking of proxy credentials, the behavior of HttpProxyMiddleware is now as follows when processing a request:
+    + If the request being processed defines proxy metadata that includes credentials, the Proxy-Authorization header is always updated
+      to feature those credentials.
+    + If the request being processed defines proxy metadata without credentials, the Proxy-Authorization header is removed unless
+      it was originally defined for the same proxy URL.
+    + To remove proxy credentials while keeping the same proxy URL, remove the Proxy-Authorization header.
+    + If the request has no proxy metadata, or that metadata is a falsy value (e.g. None), the Proxy-Authorization header is removed.
+    + It is no longer possible to set a proxy URL through the proxy metadata but set the credentials through the Proxy-Authorization header.
+      Set proxy credentials through the proxy metadata instead.
+  * Also fixes the following regressions introduced in 2.6.0:
+    + CrawlerProcess once again supports crawling multiple spiders (issue 5435, issue 5436)
+    + Installing a Twisted reactor before Scrapy does (e.g. importing twisted.internet.reactor somewhere at the module level)
+      no longer prevents Scrapy from starting, as long as a different reactor is not specified in TWISTED_REACTOR (issue 5525, issue 5528)
+    + Fixed an exception that was being logged after the spider finished under certain conditions (issue 5437, issue 5440)
+    + The --output/-o command-line parameter once again supports a value starting with a hyphen (issue 5444, issue 5445)
+    + The scrapy parse -h command no longer throws an error (issue 5481, issue 5482)
+
 -------------------------------------------------------------------
 Fri Mar 4 00:06:54 UTC 2022 - Ben Greiner

diff --git a/python-Scrapy.spec b/python-Scrapy.spec
index f85c82e..f3b8a82 100644
--- a/python-Scrapy.spec
+++ b/python-Scrapy.spec
@@ -19,7 +19,7 @@
 %{?!python_module:%define python_module() python3-%{**}}
 %define skip_python2 1
 Name: python-Scrapy
-Version: 2.6.1
+Version: 2.6.2
 Release: 0
 Summary: A high-level Python Screen Scraping framework
 License: BSD-3-Clause
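
Illustration of the HttpProxyMiddleware rules described in the changelog entry. This is a minimal sketch that models a request as two plain dicts (meta and headers); it is not Scrapy's actual implementation and does not import Scrapy. The function names `split_proxy` and `apply_proxy_rules`, the bookkeeping key `auth_proxy_url`, and the example proxy hosts are all hypothetical names chosen for this sketch.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit

AUTH_HEADER = "Proxy-Authorization"


def split_proxy(proxy):
    """Return (proxy URL without credentials, Basic token or None)."""
    parts = urlsplit(proxy)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    if parts.username is None:
        return clean_url, None
    user_pass = f"{unquote(parts.username)}:{unquote(parts.password or '')}"
    return clean_url, base64.b64encode(user_pass.encode()).decode()


def apply_proxy_rules(meta, headers):
    """Update meta and headers in place following the changelog rules.

    "auth_proxy_url" is a hypothetical key used by this sketch to remember
    which proxy URL the current Proxy-Authorization header belongs to.
    """
    proxy = meta.get("proxy")
    if not proxy:
        # No proxy metadata, or a falsy value such as None: drop the header.
        headers.pop(AUTH_HEADER, None)
        meta.pop("auth_proxy_url", None)
        return
    url, token = split_proxy(proxy)
    meta["proxy"] = url
    if token is not None:
        # Credentials in the metadata always overwrite the header.
        headers[AUTH_HEADER] = f"Basic {token}"
        meta["auth_proxy_url"] = url
    elif meta.get("auth_proxy_url") != url:
        # No credentials and the header was not defined for this proxy URL.
        headers.pop(AUTH_HEADER, None)
        meta.pop("auth_proxy_url", None)


if __name__ == "__main__":
    meta = {"proxy": "http://user:pass@proxy-a.example:8080"}
    headers = {}
    apply_proxy_rules(meta, headers)   # first pass: header is set
    apply_proxy_rules(meta, headers)   # retry via same proxy: header is kept
    print(AUTH_HEADER in headers)

    meta["proxy"] = "http://proxy-b.example:8080"  # rotation to another proxy
    apply_proxy_rules(meta, headers)   # stale credentials are removed
    print(AUTH_HEADER in headers)
```

The second pass in the usage portion mimics a retry or redirect, where the same request goes through the middleware again. The final pass mimics a proxy-rotation middleware switching proxies without cleaning up the header, which is the leak scenario the update addresses.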