python-Scrapy/python-Scrapy.spec
Dirk Mueller a84bf5033f Accepting request 1002338 from home:yarunachalam:branches:devel:languages:python
- Update to v2.6.2 
  Security bug fix:
  * When HttpProxyMiddleware processes a request with proxy metadata, and that proxy metadata includes proxy credentials,
    HttpProxyMiddleware sets the Proxy-Authorization header, but only if that header is not already set.
  * There are third-party proxy-rotation downloader middlewares that set different proxy metadata every time they process a request.
  * Because of request retries and redirects, the same request can be processed by downloader middlewares more than once,
    including both HttpProxyMiddleware and any third-party proxy-rotation downloader middleware.
  * These third-party proxy-rotation downloader middlewares could change the proxy metadata of a request to a new value,
    but fail to remove the Proxy-Authorization header derived from the previous value of the proxy metadata, causing the credentials of one
    proxy to be sent to a different proxy.
  * To prevent the unintended leaking of proxy credentials, the behavior of HttpProxyMiddleware is now as follows when processing a request (see the sketch after this list):
    + If the request being processed defines proxy metadata that includes credentials, the Proxy-Authorization header is always updated 
    to feature those credentials.
    + If the request being processed defines proxy metadata without credentials, the Proxy-Authorization header is removed unless
    it was originally defined for the same proxy URL.
    + To remove proxy credentials while keeping the same proxy URL, remove the Proxy-Authorization header.
    + If the request has no proxy metadata, or that metadata is a falsy value (e.g. None), the Proxy-Authorization header is removed.
    + It is no longer possible to set a proxy URL through the proxy metadata but set the credentials through the Proxy-Authorization header.
    Set proxy credentials through the proxy metadata instead.
  * Also fixes the following regressions introduced in 2.6.0:
    + CrawlerProcess once again supports crawling multiple spiders (issue 5435, issue 5436)
    + Installing a Twisted reactor before Scrapy does (e.g. importing twisted.internet.reactor somewhere at the module level)
    no longer prevents Scrapy from starting, as long as a different reactor is not specified in TWISTED_REACTOR (issue 5525, issue 5528)
    + Fixed an exception that was being logged after the spider finished under certain conditions (issue 5437, issue 5440)
    + The --output/-o command-line parameter once again supports values starting with a hyphen (issue 5444, issue 5445)
    + The scrapy parse -h command no longer throws an error (issue 5481, issue 5482)
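  A minimal sketch of the fixed behavior (hypothetical spider name and proxy
  hosts, not taken from the upstream changelog): credentials are carried in
  the proxy metadata, and HttpProxyMiddleware keeps the Proxy-Authorization
  header consistent with it:

    import scrapy

    class ProxiedSpider(scrapy.Spider):
        name = "proxied"

        def start_requests(self):
            # Credentials go into the proxy metadata; the middleware
            # derives the Proxy-Authorization header from them.
            yield scrapy.Request(
                "https://example.com",
                meta={"proxy": "http://user:pass@proxy1.example:8080"},
            )

        def parse(self, response):
            # If a proxy-rotation middleware later rewrites
            # request.meta["proxy"] to a different proxy without
            # credentials, Scrapy 2.6.2 drops the stale
            # Proxy-Authorization header instead of sending the old
            # credentials to the new proxy.
            pass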

OBS-URL: https://build.opensuse.org/request/show/1002338
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=28
2022-09-12 08:00:07 +00:00

#
# spec file for package python-Scrapy
#
# Copyright (c) 2022 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.
# Please submit bugfixes or comments via https://bugs.opensuse.org/
#
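# Define %python_module when the build target does not provide it:
# %{python_module foo} then expands to python3-foo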
%{?!python_module:%define python_module() python3-%{**}}
%define skip_python2 1
Name: python-Scrapy
Version: 2.6.2
Release: 0
Summary: A high-level Python Screen Scraping framework
License: BSD-3-Clause
Group: Development/Languages/Python
URL: https://scrapy.org
Source: https://files.pythonhosted.org/packages/source/S/Scrapy/Scrapy-%{version}.tar.gz
BuildRequires: %{python_module Pillow}
BuildRequires: %{python_module Protego >= 0.1.15}
BuildRequires: %{python_module PyDispatcher >= 2.0.5}
BuildRequires: %{python_module Twisted >= 17.9.0}
BuildRequires: %{python_module botocore}
BuildRequires: %{python_module cryptography >= 2.0}
BuildRequires: %{python_module cssselect >= 0.9.1}
BuildRequires: %{python_module dbm}
BuildRequires: %{python_module itemadapter >= 0.1.0}
BuildRequires: %{python_module itemloaders >= 1.0.1}
BuildRequires: %{python_module jmespath}
BuildRequires: %{python_module lxml >= 3.5.0}
BuildRequires: %{python_module parsel >= 1.5.0}
BuildRequires: %{python_module pyOpenSSL >= 16.2.0}
BuildRequires: %{python_module pyftpdlib}
BuildRequires: %{python_module pytest-xdist}
BuildRequires: %{python_module pytest}
BuildRequires: %{python_module queuelib >= 1.4.2}
BuildRequires: %{python_module service_identity >= 16.0.0}
BuildRequires: %{python_module setuptools}
BuildRequires: %{python_module sybil}
BuildRequires: %{python_module testfixtures >= 6.0.0}
BuildRequires: %{python_module tldextract}
BuildRequires: %{python_module uvloop}
BuildRequires: %{python_module w3lib >= 1.17.0}
BuildRequires: %{python_module zope.interface >= 4.1.3}
BuildRequires: fdupes
BuildRequires: python-rpm-macros
BuildRequires: python3-Sphinx
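# dataclasses is in the standard library since Python 3.7; pull in the
# backport only on older interpreters (boolean dependency)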
BuildRequires: (python3-dataclasses if python3-base < 3.7)
Requires: python-Protego >= 0.1.15
Requires: python-PyDispatcher >= 2.0.5
Requires: python-Twisted >= 17.9.0
Requires: python-cryptography >= 2.0
Requires: python-cssselect >= 0.9.1
Requires: python-itemadapter >= 0.1.0
Requires: python-itemloaders >= 1.0.1
Requires: python-lxml >= 3.5.0
Requires: python-parsel >= 1.5.0
Requires: python-pyOpenSSL >= 16.2.0
Requires: python-queuelib >= 1.4.2
Requires: python-service_identity >= 16.0.0
Requires: python-setuptools
Requires: python-tldextract
Requires: python-w3lib >= 1.17.2
Requires: python-zope.interface >= 4.1.3
Requires(post): update-alternatives
Requires(postun): update-alternatives
BuildArch: noarch

%python_subpackages

%description
Scrapy is a high level scraping and web crawling framework for writing spiders
to crawl and parse web pages for all kinds of purposes, from information
retrieval to monitoring or testing web sites.

%package -n %{name}-doc
Summary: Documentation for %{name}
Group: Documentation/HTML

%description -n %{name}-doc
Provides documentation for %{name}.

%prep
%setup -n Scrapy-%{version}
%autopatch -p1
sed -i -e 's:= python:= python3:g' docs/Makefile

%build
%python_build
pushd docs
%make_build html && rm -r build/html/.buildinfo
popd

%install
%python_install
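# prepare /usr/bin/scrapy for update-alternatives handling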
%python_clone -a %{buildroot}%{_bindir}/scrapy
%python_expand %fdupes %{buildroot}%{$python_sitelib}

%check
# no color in obs chroot console
skiplist="test_pformat"
# no online connection to toscrape.com
skiplist="$skiplist or CheckCommandTest"
%{pytest \
  -k "not (${skiplist})" \
  -W ignore::DeprecationWarning \
  tests}

%post
%python_install_alternative scrapy

%postun
%python_uninstall_alternative scrapy

%files %{python_files}
%license LICENSE
%doc AUTHORS README.rst
%{python_sitelib}/scrapy
%{python_sitelib}/Scrapy-%{version}*-info
%python_alternative %{_bindir}/scrapy

%files -n %{name}-doc
%doc docs/build/html

%changelog