forked from pool/python-beautifulsoup4
so keep just one around
- Update to 4.7.1:
  * Fixed a significant performance problem introduced in 4.7.0. [bug=1810617]
  * Fixed an incorrectly raised exception when inserting a tag before or
    after an identical tag. [bug=1810692]
  * Beautiful Soup will no longer try to keep track of namespaces that
    are not defined with a prefix; this can confuse soupselect. [bug=1810680]
  * Tried even harder to avoid the deprecation warning originally fixed in
    4.6.1. [bug=1778909]
  * Beautiful Soup's CSS Selector implementation has been replaced by a
    dependency on Isaac Muse's SoupSieve project (the soupsieve package
    on PyPI). The good news is that SoupSieve has a much more robust and
    complete implementation of CSS selectors, resolving a large number
    of longstanding issues. The bad news is that from this point onward,
    SoupSieve must be installed if you want to use the select() method.
  * Added the PageElement.extend() method, which works like list.extend().
    [bug=1514970]
  * PageElement.insert_before() and insert_after() now take a variable
    number of arguments. [bug=1514970]
  * Fix a number of problems with the tree builder that caused
    trees that were superficially okay, but which fell apart when bits
    were extracted. Patch by Isaac Muse. [bug=1782928,1809910]
  * Fixed a problem with the tree builder in which elements that
    contained no content (such as empty comments and all-whitespace
    elements) were not being treated as part of the tree. Patch by Isaac
    Muse. [bug=1798699]
  * Fixed a problem with multi-valued attributes where the value
    contained whitespace. Thanks to Jens Svalgaard for the
    fix. [bug=1787453]
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-beautifulsoup4?expand=0&rev=66
105 lines
4.0 KiB
RPMSpec
#
# spec file for package python-beautifulsoup4
#
# Copyright (c) 2019 SUSE LINUX GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#


%{?!python_module:%define python_module() python-%{**} python3-%{**}}
Name:           python-beautifulsoup4
Version:        4.7.1
Release:        0
Summary:        HTML/XML Parser for Quick-Turnaround Applications Like Screen-Scraping
License:        MIT
Group:          Development/Libraries/Python
URL:            https://www.crummy.com/software/BeautifulSoup/
Source:         https://files.pythonhosted.org/packages/source/b/beautifulsoup4/beautifulsoup4-%{version}.tar.gz
# PATCH-FIX-UPSTREAM speilicke@suse.com -- Backport of https://code.launchpad.net/~saschpe/beautifulsoup/beautifulsoup/+merge/200849
Patch0:         beautifulsoup4-lxml-fixes.patch
BuildRequires:  %{python_module pytest}
BuildRequires:  %{python_module setuptools}
BuildRequires:  %{python_module soupsieve}
BuildRequires:  fdupes
BuildRequires:  python-rpm-macros
BuildRequires:  python3-Sphinx
Requires:       python-soupsieve
Suggests:       python-html5lib >= 0.999999
Suggests:       python-lxml >= 3.4.4
BuildArch:      noarch
%python_subpackages

%description
Beautiful Soup is a Python HTML/XML parser designed for quick turnaround
projects like screen-scraping. Three features make it powerful:

* Beautiful Soup won't choke if you give it bad markup. It yields a parse tree
  that makes approximately as much sense as your original document. This is
  usually good enough to collect the data you need and run away

* Beautiful Soup provides a few simple methods and Pythonic idioms for
  navigating, searching, and modifying a parse tree: a toolkit for dissecting a
  document and extracting what you need. You don't have to create a custom
  parser for each application

* Beautiful Soup automatically converts incoming documents to Unicode and
  outgoing documents to UTF-8. You don't have to think about encodings, unless
  the document doesn't specify an encoding and Beautiful Soup can't autodetect
  one. Then you just have to specify the original encoding

Beautiful Soup parses anything you give it, and does the tree traversal stuff
for you. You can tell it "Find all the links", or "Find all the links of class
externalLink", or "Find all the links whose urls match "foo.com", or "Find the
table heading that's got bold text, then give me that text."

Valuable data that was once locked up in poorly-designed websites is now within
your reach. Projects that would have taken hours take only minutes with
Beautiful Soup.

%package -n python-beautifulsoup4-doc
Summary:        Documentation for %{name}
Group:          Development/Libraries/Python
Recommends:     %{name} = %{version}
Obsoletes:      python2-beautifulsoup4-doc
Obsoletes:      python3-beautifulsoup4-doc

%description -n python-beautifulsoup4-doc
Documentation and help files for %{name}

%prep
%setup -q -n beautifulsoup4-%{version}
%patch0 -p1

%build
%python_build
pushd doc && make html && rm build/html/.buildinfo build/html/objects.inv && popd

%install
%python_install
%python_expand %fdupes -s %{buildroot}%{$python_sitelib}

%check
export LANG=en_US.UTF-8
export PYTHONDONTWRITEBYTECODE=1
%python_expand PYTHONPATH=%{buildroot}%{$python_sitelib} py.test-%{$python_bin_suffix} %{buildroot}%{$python_sitelib}/bs4/tests

%files %{python_files}
%license COPYING.txt
%{python_sitelib}/bs4/
%{python_sitelib}/beautifulsoup4-%{version}-py*.egg-info

%files -n python-beautifulsoup4-doc
%doc NEWS.txt README.md TODO.txt doc/build/html

%changelog
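
A rough illustration of what the fallback `python_module` definition near the top of this spec does: `%{**}` stands for the whole argument string, so every `BuildRequires` line that uses the macro names both a `python-` and a `python3-` package. The Python function below only models that text substitution. Its name and its `flavors` parameter are made up for this sketch, and the macro shipped by python-rpm-macros, when present, takes precedence over the fallback and may expand differently.

```python
def python_module(args: str, flavors=("python", "python3")) -> str:
    """Model the fallback macro: prefix the full argument string with each flavor.

    `flavors` is a hypothetical parameter; the fallback in the spec hard-codes
    the two prefixes "python-" and "python3-".
    """
    return " ".join(f"{flavor}-{args}" for flavor in flavors)


# The three macro uses found in the BuildRequires lines of the spec.
for module in ("pytest", "setuptools", "soupsieve"):
    print(f"BuildRequires:  {python_module(module)}")
```

Under the fallback, `%{python_module pytest}` would therefore read as `python-pytest python3-pytest`.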