Commit Graph

9 Commits

Author SHA256 Message Date
a0f2e1025c - Update to 20250327:
  * Added
    + Support for Python 3.13
    + Support for zipped JPEGs
    + Fuzzing harnesses for integration into Google's OSS-Fuzz
    + Support for setuptools-git-versioning version 2.0.0
  * Changed
    + Reduce memory overhead on run-length encoding by using lists
    + Using pyproject.toml instead of setup.py
    + Updated Python 3.7 syntax to 3.8
    + Updated all Python version specifications to a minimum of 3.8
    + Using absolute instead of relative imports
    + Using standard library functions for ascii85 and asciihex
  * Fixed
    + TypeError when CID character widths are not parseable as floats
    + TypeError raised by extract_text method with compressed PDF file
    + PSBaseParser can't handle tokens split across end of buffer
    + TypeError when CropBox is an indirect object reference
    + Remove redundant line to be able to recognize rectangles
    + Support indirect objects for filters
    + Make sure bytes is bytes where it counts
    + TypeError when corrupt PDF object reference cannot be parsed as int
    + TypeError when corrupt PDF literal cannot be converted to str
    + ValueError when corrupt PDF specifies a negative xref location
    + ValueError when corrupt PDF specifies an invalid mediabox
    + RecursionError when corrupt PDF specifies a recursive /Pages object
    + TypeError when corrupt PDF specifies text-positioning operators with
      invalid values
    + Inline image parsing fails when stream data contains "EI\n"
    + TypeError when parsing object reference as mediabox

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=16
2025-04-07 05:36:55 +00:00
024a4eba22 - update to 20231228:
  * Removed support for Python 3.6 and 3.7
  * Output converter for the hOCR format
  * Font name aliases for Arial, Courier New and Times New Roman
  * Documentation on why special characters can sometimes not be
    extracted
  * Storing Bezier path and dashing style of line in LTCurve
  * Broken CI/CD pipeline by setting upper version limit for
    black, mypy, pip and setuptools
  * `flake8` failures
  * `ValueError` when bmp images with 1 bit channel are decoded
  * `ValueError` when trying to decrypt empty metadata values
  * Sphinx errors during building of documentation
  * `TypeError` when getting default width of font
  * Installing typing-extensions on Python 3.6 and 3.7
  * `TypeError` in cmapdb.py when parsing null characters
  * Color "convenience operators" now (per spec) also set color
    space
  * `ValueError` when extracting images, due to breaking changes
    in Pillow
  * Small typos and issues in the documentation
  * Ignore non-Unicode cmaps in TrueType fonts
  * Using non-hardcoded version string and
    setuptools-git-versioning to enable installation from source
    and building on Python 3.12
  * Usage of `if __name__ == "__main__"` where it was only
    intended for testing purposes
  - Option to disable boxes flow layout analysis when using pdf2txt
  - Exporting images without any specific encoding
  - Rename PDFTextExtractionNotAllowedError to PDFTextExtractionNotAllowed to revert breaking change

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=14
2024-01-07 20:38:38 +00:00
aa03b1dd03 Accepting request 1132937 from home:jonapap
- Update to 20221105
  - Option to disable boxes flow layout analysis when using pdf2txt 
  - Add support for PDF 2.0 (ISO 32000-2) AES-256 encryption
  - Support for Paeth PNG filter compression (predictor value = 4)
  - Type annotations
  - Export type annotations from pypi package per PEP561
  - Support for identity cmaps
  - Add support for PDF page labels
  - Installation of Pillow as an optional extra dependency
  - Exporting images without any specific encoding 
  - Output converter for the hOCR format
  - Font name aliases for Arial, Courier New and Times New Roman
  - Documentation on why special characters can sometimes not be extracted
- Remove patch python-pdfminer.six-remove-nose.patch
- Update dependencies

OBS-URL: https://build.opensuse.org/request/show/1132937
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=12
2023-12-14 09:40:54 +00:00
f4809546ff Accepting request 1105920 from home:ecsos:python
- Add %{?sle15_python_module_pythons}

OBS-URL: https://build.opensuse.org/request/show/1105920
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=10
2023-08-25 18:45:00 +00:00
42871aca5a Accepting request 1078469 from home:pgajdos:python
- python-six is not required
- python-pycryptodome is not required

OBS-URL: https://build.opensuse.org/request/show/1078469
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=8
2023-04-11 18:23:03 +00:00
41ebe191d8 - Use pytest to run the testsuite.
- Add patch import-from-non-pythonpath-files.patch:
  * Allow the test suite to find modules not shipped as modules.

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=7
2021-11-09 07:33:46 +00:00
Tomáš Chvátal
74eddac484 Accepting request 833056 from home:pgajdos:python
- version update to 20200726
  - Rename PDFTextExtractionNotAllowedError to PDFTextExtractionNotAllowed to revert breaking change 
  - Always try to get CMap, not only for identity encodings 
  - Support for painting multiple rectangles at once 
  - Validate image object in do_EI is a PDFStream 
  - Hiding fallback xref by default from dumppdf.py output 
  - Raise a warning instead of an error when extracting text from a non-extractable PDF 
  - Switched from pycryptodome to cryptography package for AES decryption 
  - Python3 shebang line to script in tools 
  - Fix ordering of textlines within a textbox when `boxes_flow=None` 
  - Allow boxes_flow LAParam to be passed as None, validate the input, and update documentation 
  - Also accept file-like objects in high level functions `extract_text` and `extract_pages` 
  - Text no longer comes in reverse order when advanced layout analysis is disabled 
  - Updated misleading documentation for `word_margin` and `char_margin` 
  - Ignore ValueError when converting font encoding differences 
  - Grouping of text lines outside of parent container bounding box 
  - Group text lines if they are centered 
- do not require nose for testing
- added patches
  fix https://github.com/pdfminer/pdfminer.six/pull/489
  + python-pdfminer.six-remove-nose.patch

OBS-URL: https://build.opensuse.org/request/show/833056
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=5
2020-09-08 18:34:28 +00:00
Tomáš Chvátal
e00946054d Accepting request 807600 from home:pgajdos:python
submit

OBS-URL: https://build.opensuse.org/request/show/807600
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=3
2020-05-20 11:28:27 +00:00
7eac542c34 Accepting request 774365 from home:mnhauke
Initial package for python-pdfminer.six

OBS-URL: https://build.opensuse.org/request/show/774365
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=1
2020-02-14 14:52:12 +00:00