#
# spec file for package python-pdfminer.six
#
# Copyright (c) 2021 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#


%{?!python_module:%define python_module() python-%{**} python3-%{**}}
%define skip_python2 1
Name:           python-pdfminer.six
Version:        20200726
Release:        0
Summary:        PDF parser and analyzer
License:        MIT
URL:            https://github.com/pdfminer/pdfminer.six
Source:         https://github.com/pdfminer/pdfminer.six/archive/%{version}.tar.gz#/pdfminer.six-%{version}.tar.gz
# https://github.com/pdfminer/pdfminer.six/pull/489
Patch0:         python-pdfminer.six-remove-nose.patch
Patch1:         import-from-non-pythonpath-files.patch
BuildRequires:  %{python_module chardet}
BuildRequires:  %{python_module cryptography}
BuildRequires:  %{python_module pycryptodome}
BuildRequires:  %{python_module pytest}
BuildRequires:  %{python_module setuptools}
BuildRequires:  %{python_module six}
BuildRequires:  %{python_module sortedcontainers}
BuildRequires:  fdupes
BuildRequires:  python-rpm-macros
Requires:       python-chardet
Requires:       python-cryptography
Requires:       python-pycryptodome
Requires:       python-six
Requires:       python-sortedcontainers
Requires(post): update-alternatives
Requires(postun):update-alternatives
Provides:       python-pdfminer3k = %{version}
Obsoletes:      python-pdfminer3k < %{version}
BuildArch:      noarch
%python_subpackages

%description
Fork of PDFMiner using six for Python3 compatibility.

PDFMiner is a tool for extracting information from PDF documents.
Unlike other PDF-related tools, it focuses entirely on getting
and analyzing text data. PDFMiner allows obtaining the exact
location of text on a page, as well as other information such
as fonts or lines. It includes a PDF converter that can transform
PDF files into other text formats (such as HTML). It has an
extensible PDF parser that can be used for purposes other than
text analysis.

%prep
%setup -q -n pdfminer.six-%{version}
%autopatch -p1
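# Strip the shebang line from the library module psparser.py
sed -i -e '/^#!\//, 1d' pdfminer/psparser.py
# Insert a python3 shebang as the first line of the command-line tools
sed -i '1i #!%{_bindir}/python3' tools/dumppdf.py tools/pdf2txt.py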

%build
%python_build

%install
%python_install
%python_expand %fdupes %{buildroot}%{$python_sitelib}

mv %{buildroot}%{_bindir}/dumppdf.py %{buildroot}%{_bindir}/dumppdf
mv %{buildroot}%{_bindir}/pdf2txt.py %{buildroot}%{_bindir}/pdf2txt
%python_clone -a %{buildroot}%{_bindir}/pdf2txt
%python_clone -a %{buildroot}%{_bindir}/dumppdf

%check
%pytest

%post
%python_install_alternative pdf2txt
%python_install_alternative dumppdf

%postun
%python_uninstall_alternative pdf2txt
%python_uninstall_alternative dumppdf

%files %{python_files}
%license LICENSE
%doc README.md
%python_alternative %{_bindir}/dumppdf
%python_alternative %{_bindir}/pdf2txt
%{python_sitelib}/pdfminer*

%changelog