14
0
forked from pool/python-Scrapy
Files
python-Scrapy/python-Scrapy.spec

145 lines
5.2 KiB
RPMSpec
Raw Normal View History

#
# spec file for package python-Scrapy
#
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
# Copyright (c) 2021 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.
# Please submit bugfixes or comments via https://bugs.opensuse.org/
#
%{?!python_module:%define python_module() python3-%{**}}
%define skip_python2 1
Name: python-Scrapy
Accepting request 923811 from home:bnavigator:branches:devel:languages:python - Update to 2.5.1, Security bug fix * boo#1191446, CVE-2021-41125 * If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for HTTP authentication, any request exposes your credentials to the request target. * To prevent unintended exposure of authentication credentials to unintended domains, you must now additionally set a new, additional spider attribute, http_auth_domain, and point it to the specific domain to which the authentication credentials must be sent. * If the http_auth_domain spider attribute is not set, the domain of the first request will be considered the HTTP authentication target, and authentication credentials will only be sent in requests targeting that domain. * If you need to send the same HTTP authentication credentials to multiple domains, you can use w3lib.http.basic_auth_header instead to set the value of the Authorization header of your requests. * If you really want your spider to send the same HTTP authentication credentials to any domain, set the http_auth_domain spider attribute to None. * Finally, if you are a user of scrapy-splash, know that this version of Scrapy breaks compatibility with scrapy-splash 0.7.2 and earlier. You will need to upgrade scrapy-splash to a greater version for it to continue to work. OBS-URL: https://build.opensuse.org/request/show/923811 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=21
2021-10-07 16:57:39 +00:00
Version: 2.5.1
Release: 0
Summary: A high-level Python Screen Scraping framework
License: BSD-3-Clause
Group: Development/Languages/Python
URL: https://scrapy.org
Source: https://files.pythonhosted.org/packages/source/S/Scrapy/Scrapy-%{version}.tar.gz
# PATCH-FIX-OPENSUSE remove-h2-version-restriction.patch boo#1190035 -- run scrapy with h2 >= 4.0.0
Patch0: remove-h2-version-restriction.patch
# PATCH-FIX-UPSTREAM add-peak-method-to-queues.patch https://github.com/scrapy/scrapy/commit/68379197986ae3deb81a545b5fd6920ea3347094
Patch1: add-peak-method-to-queues.patch
BuildRequires: %{python_module Pillow}
BuildRequires: %{python_module Protego >= 0.1.15}
BuildRequires: %{python_module PyDispatcher >= 2.0.5}
BuildRequires: %{python_module Twisted >= 17.9.0}
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
BuildRequires: %{python_module botocore}
BuildRequires: %{python_module cryptography >= 2.0}
BuildRequires: %{python_module cssselect >= 0.9.1}
BuildRequires: %{python_module dbm}
BuildRequires: %{python_module itemadapter >= 0.1.0}
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
BuildRequires: %{python_module itemloaders >= 1.0.1}
BuildRequires: %{python_module jmespath}
BuildRequires: %{python_module lxml >= 3.5.0}
BuildRequires: %{python_module parsel >= 1.5.0}
BuildRequires: %{python_module pyOpenSSL >= 16.2.0}
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
BuildRequires: %{python_module pyftpdlib}
BuildRequires: %{python_module pytest-xdist}
BuildRequires: %{python_module pytest}
BuildRequires: %{python_module queuelib >= 1.4.2}
BuildRequires: %{python_module service_identity >= 16.0.0}
BuildRequires: %{python_module setuptools}
BuildRequires: %{python_module sybil}
BuildRequires: %{python_module testfixtures >= 6.0.0}
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
BuildRequires: %{python_module uvloop}
BuildRequires: %{python_module w3lib >= 1.17.2}
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
BuildRequires: %{python_module zope.interface >= 4.1.3}
BuildRequires: fdupes
BuildRequires: python-rpm-macros
BuildRequires: python3-Sphinx
BuildRequires: (python3-dataclasses if python3-base < 3.7)
Requires: python-Protego >= 0.1.15
Requires: python-PyDispatcher >= 2.0.5
Requires: python-Twisted >= 17.9.0
Requires: python-cryptography >= 2.0
Requires: python-cssselect >= 0.9.1
Requires: python-itemadapter >= 0.1.0
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
Requires: python-itemloaders >= 1.0.1
Requires: python-lxml >= 3.5.0
Requires: python-parsel >= 1.5.0
Requires: python-pyOpenSSL >= 16.2.0
Requires: python-queuelib >= 1.4.2
Requires: python-service_identity >= 16.0.0
Requires: python-setuptools
Requires: python-w3lib >= 1.17.2
Requires: python-zope.interface >= 4.1.3
Requires(post): update-alternatives
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
Requires(postun):update-alternatives
BuildArch: noarch
%python_subpackages
%description
Scrapy is a high level scraping and web crawling framework for writing spiders
to crawl and parse web pages for all kinds of purposes, from information
retrieval to monitoring or testing web sites.
%package -n %{name}-doc
Summary: Documentation for %{name}
Group: Documentation/HTML
%description -n %{name}-doc
Provides documentation for %{name}.
%prep
%setup -n Scrapy-%{version}
%autopatch -p1
sed -i -e 's:= python:= python3:g' docs/Makefile
%build
%python_build
pushd docs
%make_build html && rm -r build/html/.buildinfo
popd
%install
%python_install
%python_clone -a %{buildroot}%{_bindir}/scrapy
%python_expand %fdupes %{buildroot}%{$python_sitelib}
%check
# tests/test_proxy_connect.py: requires mitmproxy == 0.10.1
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
# tests/test_downloader_handlers_*.py and test_http2_client_protocol.py: no network
# tests/test_command_check.py: twisted dns resolution of example.com error
# no color in obs chroot console
skiplist="test_pformat"
# correct exception but not recognized due to different format
python310_skiplist=" or test_callback_kwargs"
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
%{pytest \
--ignore tests/test_proxy_connect.py \
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
--ignore tests/test_command_check.py \
--ignore tests/test_downloader_handlers.py \
Accepting request 889030 from home:bnavigator:branches:devel:languages:python - Update to 2.5.0: * Official Python 3.9 support * Experimental HTTP/2 support * New get_retry_request() function to retry requests from spider callbacks * New headers_received signal that allows stopping downloads early * New Response.protocol attribute - Release 2.4.1: * Fixed feed exports overwrite support * Fixed the asyncio event loop handling, which could make code hang * Fixed the IPv6-capable DNS resolver CachingHostnameResolver for download handlers that call reactor.resolve * Fixed the output of the genspider command showing placeholders instead of the import part of the generated spider module (issue 4874) - Release 2.4.0: * Python 3.5 support has been dropped. * The file_path method of media pipelines can now access the source item. * This allows you to set a download file path based on item data. * The new item_export_kwargs key of the FEEDS setting allows to define keyword parameters to pass to item exporter classes. * You can now choose whether feed exports overwrite or append to the output file. * For example, when using the crawl or runspider commands, you can use the -O option instead of -o to overwrite the output file. * Zstd-compressed responses are now supported if zstandard is installed. * In settings, where the import path of a class is required, it is now possible to pass a class object instead. - Release 2.3.0: * Feed exports now support Google Cloud Storage as a storage backend * The new FEED_EXPORT_BATCH_ITEM_COUNT setting allows to deliver output items in batches of up to the specified number of items. * It also serves as a workaround for delayed file delivery, which causes Scrapy to only start item delivery after the crawl has finished when using certain storage backends (S3, FTP, and now GCS). * The base implementation of item loaders has been moved into a separate library, itemloaders, allowing usage from outside Scrapy and a separate release schedule - Release 2.2.1: * The startproject command no longer makes unintended changes to the permissions of files in the destination folder, such as removing execution permissions. OBS-URL: https://build.opensuse.org/request/show/889030 OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=18
2021-04-28 13:47:21 +00:00
--ignore tests/test_downloader_handlers_http2.py \
--ignore tests/test_http2_client_protocol.py \
-k "not (${skiplist} ${$python_skiplist})" \
-W ignore::DeprecationWarning \
tests}
%post
%python_install_alternative scrapy
%postun
%python_uninstall_alternative scrapy
%files %{python_files}
%license LICENSE
%doc AUTHORS README.rst
%{python_sitelib}/scrapy
%{python_sitelib}/Scrapy-%{version}*-info
%python_alternative %{_bindir}/scrapy
%files -n %{name}-doc
%doc docs/build/html
%changelog