forked from pool/python-Scrapy
- update to 2.11.1 (bsc#1220514, CVE-2024-1892):
  * Addressed `ReDoS vulnerabilities` (bsc#1220514, CVE-2024-1892)
    - ``scrapy.utils.iterators.xmliter`` is now deprecated in favor of
      :func:`~scrapy.utils.iterators.xmliter_lxml`, which
      :class:`~scrapy.spiders.XMLFeedSpider` now uses.
      To minimize the impact of this change on existing code,
      :func:`~scrapy.utils.iterators.xmliter_lxml` now supports indicating
      the node namespace with a prefix in the node name, and big files with
      highly nested trees when using libxml2 2.7+.
    - Fixed regular expressions in the implementation of the
      :func:`~scrapy.utils.response.open_in_browser` function.
    .. _ReDoS vulnerabilities: https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
  * :setting:`DOWNLOAD_MAXSIZE` and :setting:`DOWNLOAD_WARNSIZE` now also apply
    to the decompressed response body. Please, see the `7j7m-v7m3-jqm7 security
    advisory`_ for more information.
    .. _7j7m-v7m3-jqm7 security advisory: https://github.com/scrapy/scrapy/security/advisories/GHSA-7j7m-v7m3-jqm7
  * Also in relation with the `7j7m-v7m3-jqm7 security advisory`_, the
    deprecated ``scrapy.downloadermiddlewares.decompression`` module has been
    removed.
  * The ``Authorization`` header is now dropped on redirects to a different
    domain. Please, see the `cw9j-q3vf-hrrv security advisory`_ for more
    information.
  * The OS signal handling code was refactored to no longer use private Twisted
    functions. (:issue:`6024`, :issue:`6064`, :issue:`6112`)
  * Improved documentation for :class:`~scrapy.crawler.Crawler` initialization
    changes made in the 2.11.0 release. (:issue:`6057`, :issue:`6147`)
  * Extended documentation for :attr:`Request.meta <scrapy.http.Request.meta>`.
  * Fixed the :reqmeta:`dont_merge_cookies` documentation. (:issue:`5936`,
  * Added a link to Zyte's export guides to the :ref:`feed exports
  * Added a missing note about backward-incompatible changes in
    :class:`~scrapy.exporters.PythonItemExporter` to the 2.11.0 release notes.

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Scrapy?expand=0&rev=37
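For illustration, a minimal sketch of the xmliter -> xmliter_lxml migration described above; the feed body and node names are invented, and only the documented xmliter_lxml(obj, nodename, namespace=None, prefix="x") signature from scrapy.utils.iterators is assumed:

# Illustrative sketch only: the feed body and node names below are invented.
from scrapy.http import XmlResponse
from scrapy.utils.iterators import xmliter_lxml  # replacement for the deprecated xmliter

body = b"""<?xml version="1.0" encoding="UTF-8"?>
<products>
  <product><id>1</id></product>
  <product><id>2</id></product>
</products>"""
response = XmlResponse(url="https://example.com/feed.xml", body=body)

# xmliter_lxml() streams the document and yields one Selector per matching node.
for node in xmliter_lxml(response, "product"):
    print(node.xpath("id/text()").get())

Per the changelog, a node name carrying a namespace prefix is also accepted now, which keeps existing XMLFeedSpider itertag values working.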
Scrapy-2.11.0.tar.gz (deleted)
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3cbdedce0c3f0e0482d61be2d7458683be7cd7cf14b0ee6adfbaddb80f5b36a5
-size 1171092
Scrapy-2.11.1.tar.gz (new file)
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:733a039c7423e52b69bf2810b5332093d4e42a848460359c07b02ecff8f73ebe
+size 1176726
python-Scrapy.changes
@@ -1,3 +1,48 @@
+-------------------------------------------------------------------
+Mon Mar 25 14:12:20 UTC 2024 - Dirk Müller <dmueller@suse.com>
+
+- update to 2.11.1 (bsc#1220514, CVE-2024-1892):
+  * Addressed `ReDoS vulnerabilities` (bsc#1220514, CVE-2024-1892)
+    - ``scrapy.utils.iterators.xmliter`` is now deprecated in favor of
+      :func:`~scrapy.utils.iterators.xmliter_lxml`, which
+      :class:`~scrapy.spiders.XMLFeedSpider` now uses.
+
+      To minimize the impact of this change on existing code,
+      :func:`~scrapy.utils.iterators.xmliter_lxml` now supports indicating
+      the node namespace with a prefix in the node name, and big files with
+      highly nested trees when using libxml2 2.7+.
+
+    - Fixed regular expressions in the implementation of the
+      :func:`~scrapy.utils.response.open_in_browser` function.
+    .. _ReDoS vulnerabilities: https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
+
+  * :setting:`DOWNLOAD_MAXSIZE` and :setting:`DOWNLOAD_WARNSIZE` now also apply
+    to the decompressed response body. Please, see the `7j7m-v7m3-jqm7 security
+    advisory`_ for more information.
+
+    .. _7j7m-v7m3-jqm7 security advisory: https://github.com/scrapy/scrapy/security/advisories/GHSA-7j7m-v7m3-jqm7
+
+  * Also in relation with the `7j7m-v7m3-jqm7 security advisory`_, the
+    deprecated ``scrapy.downloadermiddlewares.decompression`` module has been
+    removed.
+  * The ``Authorization`` header is now dropped on redirects to a different
+    domain. Please, see the `cw9j-q3vf-hrrv security advisory`_ for more
+    information.
+  * The OS signal handling code was refactored to no longer use private Twisted
+    functions. (:issue:`6024`, :issue:`6064`, :issue:`6112`)
+  * Improved documentation for :class:`~scrapy.crawler.Crawler` initialization
+    changes made in the 2.11.0 release. (:issue:`6057`, :issue:`6147`)
+  * Extended documentation for :attr:`Request.meta <scrapy.http.Request.meta>`.
+  * Fixed the :reqmeta:`dont_merge_cookies` documentation. (:issue:`5936`,
+  * Added a link to Zyte's export guides to the :ref:`feed exports
+  * Added a missing note about backward-incompatible changes in
+    :class:`~scrapy.exporters.PythonItemExporter` to the 2.11.0 release notes.
+  * Added a missing note about removing the deprecated
+    ``scrapy.utils.boto.is_botocore()`` function to the 2.8.0 release notes.
+  * Other documentation improvements. (:issue:`6128`, :issue:`6144`,
+    :issue:`6163`, :issue:`6190`, :issue:`6192`)
+- drop twisted-23.8.0-compat.patch (upstream)
+
 -------------------------------------------------------------------
 Wed Jan 10 07:50:52 UTC 2024 - Daniel Garcia <daniel.garcia@suse.com>
 
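A short sketch of the two download-size settings mentioned in the changelog above; the spider name, URL, and limit values are invented, and as of 2.11.1 both limits also apply to the decompressed response body:

import scrapy

class SizeLimitedSpider(scrapy.Spider):
    # Name, URL, and limits are illustrative only.
    name = "size_limited"
    start_urls = ["https://example.com/"]

    custom_settings = {
        # Hard cap: responses larger than this are dropped by the downloader.
        "DOWNLOAD_MAXSIZE": 32 * 1024 * 1024,
        # Soft cap: responses larger than this only trigger a warning.
        "DOWNLOAD_WARNSIZE": 8 * 1024 * 1024,
    }

    def parse(self, response):
        yield {"url": response.url, "bytes": len(response.body)}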
python-Scrapy.spec
@@ -16,21 +16,21 @@
 #
 
 
+%{?sle15_python_module_pythons}
 Name:           python-Scrapy
-Version:        2.11.0
+Version:        2.11.1
 Release:        0
 Summary:        A high-level Python Screen Scraping framework
 License:        BSD-3-Clause
 Group:          Development/Languages/Python
 URL:            https://scrapy.org
 Source:         https://files.pythonhosted.org/packages/source/S/Scrapy/Scrapy-%{version}.tar.gz
-# PATCH-FIX-UPSTREAM twisted-23.8.0-compat.patch gh#scrapy/scrapy#6064
-Patch1:         twisted-23.8.0-compat.patch
 BuildRequires:  %{python_module Pillow}
 BuildRequires:  %{python_module Protego >= 0.1.15}
 BuildRequires:  %{python_module PyDispatcher >= 2.0.5}
 BuildRequires:  %{python_module Twisted >= 18.9.0}
 BuildRequires:  %{python_module attrs}
+BuildRequires:  %{python_module base >= 3.8}
 BuildRequires:  %{python_module botocore >= 1.4.87}
 BuildRequires:  %{python_module cryptography >= 36.0.0}
 BuildRequires:  %{python_module cssselect >= 0.9.1}
@@ -40,8 +40,9 @@ BuildRequires:  %{python_module itemloaders >= 1.0.1}
 BuildRequires:  %{python_module lxml >= 4.4.1}
 BuildRequires:  %{python_module parsel >= 1.5.0}
 BuildRequires:  %{python_module pexpect >= 4.8.1}
+BuildRequires:  %{python_module pip}
 BuildRequires:  %{python_module pyOpenSSL >= 21.0.0}
-BuildRequires:  %{python_module pyftpdlib}
+BuildRequires:  %{python_module pyftpdlib >= 1.5.8}
 BuildRequires:  %{python_module pytest-xdist}
 BuildRequires:  %{python_module pytest}
 BuildRequires:  %{python_module queuelib >= 1.4.2}
@@ -52,11 +53,11 @@ BuildRequires:  %{python_module testfixtures}
 BuildRequires:  %{python_module tldextract}
 BuildRequires:  %{python_module uvloop}
 BuildRequires:  %{python_module w3lib >= 1.17.0}
+BuildRequires:  %{python_module wheel}
 BuildRequires:  %{python_module zope.interface >= 5.1.0}
 BuildRequires:  fdupes
 BuildRequires:  python-rpm-macros
 BuildRequires:  python3-Sphinx
-BuildRequires:  (python3-dataclasses if python3-base < 3.7)
 Requires:       python-Protego >= 0.1.15
 Requires:       python-PyDispatcher >= 2.0.5
 Requires:       python-Twisted >= 18.9.0
@@ -65,6 +66,7 @@ Requires:       python-cssselect >= 0.9.1
 Requires:       python-itemadapter >= 0.1.0
 Requires:       python-itemloaders >= 1.0.1
 Requires:       python-lxml >= 4.4.1
+Requires:       python-packaging
 Requires:       python-parsel >= 1.5.0
 Requires:       python-pyOpenSSL >= 21.0.0
 Requires:       python-queuelib >= 1.4.2
@@ -74,7 +76,7 @@ Requires:       python-tldextract
 Requires:       python-w3lib >= 1.17.2
 Requires:       python-zope.interface >= 5.1.0
 Requires(post): update-alternatives
-Requires(postun):update-alternatives
+Requires(postun): update-alternatives
 BuildArch:      noarch
 %python_subpackages
 
@@ -96,13 +98,13 @@ Provides documentation for %{name}.
 sed -i -e 's:= python:= python3:g' docs/Makefile
 
 %build
-%python_build
+%pyproject_wheel
 pushd docs
 %make_build html && rm -r build/html/.buildinfo
 popd
 
 %install
-%python_install
+%pyproject_install
 %python_clone -a %{buildroot}%{_bindir}/scrapy
 %python_expand %fdupes %{buildroot}%{$python_sitelib}
 
@@ -128,7 +130,7 @@ skiplist="$skiplist or test_start_requests_laziness"
 %license LICENSE
 %doc AUTHORS README.rst
 %{python_sitelib}/scrapy
-%{python_sitelib}/Scrapy-%{version}*-info
+%{python_sitelib}/Scrapy-%{version}.dist-info
 %python_alternative %{_bindir}/scrapy
 
 %files -n %{name}-doc
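One more illustration of the changelog quoted above, this time for the ``Authorization`` header change; the URL and token below are placeholders:

import scrapy

# Placeholder URL and token; illustrative only.
request = scrapy.Request(
    "https://api.example.com/export",
    headers={"Authorization": "Bearer <token>"},
)
# As of 2.11.1, if the server answers this request with a redirect to a
# different domain, Scrapy's redirect handling drops the Authorization header
# instead of forwarding the credential to the new host.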
twisted-23.8.0-compat.patch (deleted)
@@ -1,254 +0,0 @@
Index: Scrapy-2.11.0/scrapy/crawler.py
===================================================================
--- Scrapy-2.11.0.orig/scrapy/crawler.py
+++ Scrapy-2.11.0/scrapy/crawler.py
@@ -404,8 +404,8 @@ class CrawlerProcess(CrawlerRunner):
         :param bool stop_after_crawl: stop or not the reactor when all
             crawlers have finished

-        :param bool install_signal_handlers: whether to install the shutdown
-            handlers (default: True)
+        :param bool install_signal_handlers: whether to install the OS signal
+            handlers from Twisted and Scrapy (default: True)
         """
         from twisted.internet import reactor

@@ -416,15 +416,17 @@ class CrawlerProcess(CrawlerRunner):
                 return
             d.addBoth(self._stop_reactor)

-        if install_signal_handlers:
-            install_shutdown_handlers(self._signal_shutdown)
         resolver_class = load_object(self.settings["DNS_RESOLVER"])
         resolver = create_instance(resolver_class, self.settings, self, reactor=reactor)
         resolver.install_on_reactor()
         tp = reactor.getThreadPool()
         tp.adjustPoolsize(maxthreads=self.settings.getint("REACTOR_THREADPOOL_MAXSIZE"))
         reactor.addSystemEventTrigger("before", "shutdown", self.stop)
-        reactor.run(installSignalHandlers=False)  # blocking call
+        if install_signal_handlers:
+            reactor.addSystemEventTrigger(
+                "after", "startup", install_shutdown_handlers, self._signal_shutdown
+            )
+        reactor.run(installSignalHandlers=install_signal_handlers)  # blocking call

     def _graceful_stop_reactor(self) -> Deferred:
         d = self.stop()
Index: Scrapy-2.11.0/scrapy/utils/ossignal.py
===================================================================
--- Scrapy-2.11.0.orig/scrapy/utils/ossignal.py
+++ Scrapy-2.11.0/scrapy/utils/ossignal.py
@@ -19,13 +19,10 @@ def install_shutdown_handlers(
     function: SignalHandlerT, override_sigint: bool = True
 ) -> None:
     """Install the given function as a signal handler for all common shutdown
-    signals (such as SIGINT, SIGTERM, etc). If override_sigint is ``False`` the
-    SIGINT handler won't be install if there is already a handler in place
-    (e.g. Pdb)
+    signals (such as SIGINT, SIGTERM, etc). If ``override_sigint`` is ``False`` the
+    SIGINT handler won't be installed if there is already a handler in place
+    (e.g. Pdb)
     """
-    from twisted.internet import reactor
-
-    reactor._handleSignals()
     signal.signal(signal.SIGTERM, function)
     if signal.getsignal(signal.SIGINT) == signal.default_int_handler or override_sigint:
         signal.signal(signal.SIGINT, function)
Index: Scrapy-2.11.0/scrapy/utils/testproc.py
===================================================================
--- Scrapy-2.11.0.orig/scrapy/utils/testproc.py
+++ Scrapy-2.11.0/scrapy/utils/testproc.py
@@ -2,7 +2,7 @@ from __future__ import annotations

 import os
 import sys
-from typing import Iterable, Optional, Tuple, cast
+from typing import Iterable, List, Optional, Tuple, cast

 from twisted.internet.defer import Deferred
 from twisted.internet.error import ProcessTerminated
@@ -26,14 +26,15 @@ class ProcessTest:
         env = os.environ.copy()
         if settings is not None:
             env["SCRAPY_SETTINGS_MODULE"] = settings
+        assert self.command
         cmd = self.prefix + [self.command] + list(args)
         pp = TestProcessProtocol()
-        pp.deferred.addBoth(self._process_finished, cmd, check_code)
+        pp.deferred.addCallback(self._process_finished, cmd, check_code)
         reactor.spawnProcess(pp, cmd[0], cmd, env=env, path=self.cwd)
         return pp.deferred

     def _process_finished(
-        self, pp: TestProcessProtocol, cmd: str, check_code: bool
+        self, pp: TestProcessProtocol, cmd: List[str], check_code: bool
     ) -> Tuple[int, bytes, bytes]:
         if pp.exitcode and check_code:
             msg = f"process {cmd} exit with code {pp.exitcode}"
Index: Scrapy-2.11.0/setup.py
===================================================================
--- Scrapy-2.11.0.orig/setup.py
+++ Scrapy-2.11.0/setup.py
@@ -6,8 +6,7 @@ version = (Path(__file__).parent / "scra


 install_requires = [
-    # 23.8.0 incompatibility: https://github.com/scrapy/scrapy/issues/6024
-    "Twisted>=18.9.0,<23.8.0",
+    "Twisted>=18.9.0",
     "cryptography>=36.0.0",
     "cssselect>=0.9.1",
     "itemloaders>=1.0.1",
Index: Scrapy-2.11.0/tests/CrawlerProcess/sleeping.py
===================================================================
--- /dev/null
+++ Scrapy-2.11.0/tests/CrawlerProcess/sleeping.py
@@ -0,0 +1,24 @@
+from twisted.internet.defer import Deferred
+
+import scrapy
+from scrapy.crawler import CrawlerProcess
+from scrapy.utils.defer import maybe_deferred_to_future
+
+
+class SleepingSpider(scrapy.Spider):
+    name = "sleeping"
+
+    start_urls = ["data:,;"]
+
+    async def parse(self, response):
+        from twisted.internet import reactor
+
+        d = Deferred()
+        reactor.callLater(3, d.callback, None)
+        await maybe_deferred_to_future(d)
+
+
+process = CrawlerProcess(settings={})
+
+process.crawl(SleepingSpider)
+process.start()
Index: Scrapy-2.11.0/tests/requirements.txt
===================================================================
--- Scrapy-2.11.0.orig/tests/requirements.txt
+++ Scrapy-2.11.0/tests/requirements.txt
@@ -1,5 +1,6 @@
 # Tests requirements
 attrs
+pexpect >= 4.8.0
 # https://github.com/giampaolo/pyftpdlib/issues/560
 pyftpdlib; python_version < "3.12"
 pytest
Index: Scrapy-2.11.0/tests/test_command_shell.py
===================================================================
--- Scrapy-2.11.0.orig/tests/test_command_shell.py
+++ Scrapy-2.11.0/tests/test_command_shell.py
@@ -1,11 +1,15 @@
+import sys
+from io import BytesIO
 from pathlib import Path

+from pexpect.popen_spawn import PopenSpawn
 from twisted.internet import defer
 from twisted.trial import unittest

 from scrapy.utils.testproc import ProcessTest
 from scrapy.utils.testsite import SiteTest
 from tests import NON_EXISTING_RESOLVABLE, tests_datadir
+from tests.mockserver import MockServer


 class ShellTest(ProcessTest, SiteTest, unittest.TestCase):
@@ -133,3 +137,25 @@ class ShellTest(ProcessTest, SiteTest, u
         args = ["-c", code, "--set", f"TWISTED_REACTOR={reactor_path}"]
         _, _, err = yield self.execute(args, check_code=True)
         self.assertNotIn(b"RuntimeError: There is no current event loop in thread", err)
+
+
+class InteractiveShellTest(unittest.TestCase):
+    def test_fetch(self):
+        args = (
+            sys.executable,
+            "-m",
+            "scrapy.cmdline",
+            "shell",
+        )
+        logfile = BytesIO()
+        p = PopenSpawn(args, timeout=5)
+        p.logfile_read = logfile
+        p.expect_exact("Available Scrapy objects")
+        with MockServer() as mockserver:
+            p.sendline(f"fetch('{mockserver.url('/')}')")
+            p.sendline("type(response)")
+            p.expect_exact("HtmlResponse")
+        p.sendeof()
+        p.wait()
+        logfile.seek(0)
+        self.assertNotIn("Traceback", logfile.read().decode())
Index: Scrapy-2.11.0/tests/test_crawler.py
===================================================================
--- Scrapy-2.11.0.orig/tests/test_crawler.py
+++ Scrapy-2.11.0/tests/test_crawler.py
@@ -1,13 +1,16 @@
 import logging
 import os
 import platform
+import signal
 import subprocess
 import sys
 import warnings
 from pathlib import Path
+from typing import List

 import pytest
 from packaging.version import parse as parse_version
+from pexpect.popen_spawn import PopenSpawn
 from pytest import mark, raises
 from twisted.internet import defer
 from twisted.trial import unittest
@@ -289,9 +292,12 @@ class ScriptRunnerMixin:
     script_dir: Path
     cwd = os.getcwd()

-    def run_script(self, script_name: str, *script_args):
+    def get_script_args(self, script_name: str, *script_args: str) -> List[str]:
         script_path = self.script_dir / script_name
-        args = [sys.executable, str(script_path)] + list(script_args)
+        return [sys.executable, str(script_path)] + list(script_args)
+
+    def run_script(self, script_name: str, *script_args: str) -> str:
+        args = self.get_script_args(script_name, *script_args)
         p = subprocess.Popen(
             args,
             env=get_mockserver_env(),
@@ -517,6 +523,29 @@ class CrawlerProcessSubprocess(ScriptRun
         self.assertIn("Spider closed (finished)", log)
         self.assertIn("The value of FOO is 42", log)

+    def test_shutdown_graceful(self):
+        sig = signal.SIGINT if sys.platform != "win32" else signal.SIGBREAK
+        args = self.get_script_args("sleeping.py")
+        p = PopenSpawn(args, timeout=5)
+        p.expect_exact("Spider opened")
+        p.expect_exact("Crawled (200)")
+        p.kill(sig)
+        p.expect_exact("shutting down gracefully")
+        p.expect_exact("Spider closed (shutdown)")
+        p.wait()
+
+    def test_shutdown_forced(self):
+        sig = signal.SIGINT if sys.platform != "win32" else signal.SIGBREAK
+        args = self.get_script_args("sleeping.py")
+        p = PopenSpawn(args, timeout=5)
+        p.expect_exact("Spider opened")
+        p.expect_exact("Crawled (200)")
+        p.kill(sig)
+        p.expect_exact("shutting down gracefully")
+        p.kill(sig)
+        p.expect_exact("forcing unclean shutdown")
+        p.wait()
+

 class CrawlerRunnerSubprocess(ScriptRunnerMixin, unittest.TestCase):
     script_dir = Path(__file__).parent.resolve() / "CrawlerRunner"
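The dropped backport above, now part of 2.11.1 upstream, reworks how CrawlerProcess.start() installs OS signal handlers. A minimal usage sketch follows; the spider is invented, while install_signal_handlers is the documented start() parameter the patch touches:

import scrapy
from scrapy.crawler import CrawlerProcess

class TitleSpider(scrapy.Spider):
    # Invented spider for illustration.
    name = "title"
    start_urls = ["https://example.com/"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}

process = CrawlerProcess(settings={"LOG_LEVEL": "INFO"})
process.crawl(TitleSpider)
# Pass install_signal_handlers=False when the embedding application manages
# SIGINT/SIGTERM itself; otherwise, with the reworked code, Scrapy installs
# its shutdown handlers after reactor startup instead of via private Twisted APIs.
process.start(install_signal_handlers=False)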