* Added
+ Support for Python 3.13
+ Support for zipped JPEGs
+ Fuzzing harnesses for integration into Google's OSS-Fuzz
+ Support for setuptools-git-versioning version 2.0.0
* Changed
+ Reduce memory overhead of run-length encoding by using lists
+ Using pyproject.toml instead of setup.py
+ Updated Python 3.7 syntax to 3.8
+ Updated all Python version specifications to a minimum of 3.8
+ Using absolute instead of relative imports
+ Using standard library functions for ascii85 and asciihex
* Fixed
+ TypeError when CID character widths are not parseable as floats
+ TypeError raised by extract_text method with compressed PDF file
+ PSBaseParser can't handle tokens split across end of buffer
+ TypeError when CropBox is an indirect object reference
+ Remove redundant line to be able to recognize rectangles
+ Support indirect objects for filters
+ Make sure bytes is bytes where it counts
+ TypeError when corrupt PDF object reference cannot be parsed as int
+ TypeError when corrupt PDF literal cannot be converted to str
+ ValueError when corrupt PDF specifies a negative xref location
+ ValueError when corrupt PDF specifies an invalid mediabox
+ RecursionError when corrupt PDF specifies a recursive /Pages object
+ TypeError when corrupt PDF specifies text-positioning operators with
invalid values
+ Inline image parsing fails when stream data contains "EI\n"
+ TypeError when parsing object reference as mediabox
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=16
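
The "standard library functions for ascii85 and asciihex" item above can be illustrated with a minimal sketch. This is not pdfminer.six's actual code: the helper names `ascii85_decode` and `asciihex_decode` are assumptions chosen for illustration, and only the `base64.a85decode` and `binascii.unhexlify` calls come from the Python standard library.

```python
import base64
import binascii


def ascii85_decode(data: bytes) -> bytes:
    """Decode ASCII85 data, tolerating optional <~ ... ~> framing (hypothetical helper)."""
    stripped = b"".join(data.split())  # drop all ASCII whitespace
    if stripped.startswith(b"<~"):
        stripped = stripped[2:]
    end = stripped.find(b"~>")  # "~>" marks end of data
    if end != -1:
        stripped = stripped[:end]
    return base64.a85decode(stripped)


def asciihex_decode(data: bytes) -> bytes:
    """Decode ASCIIHex data; ">" ends the data (hypothetical helper)."""
    digits = b"".join(data.split())  # whitespace between digits is ignored
    end = digits.find(b">")
    if end != -1:
        digits = digits[:end]
    if len(digits) % 2:
        digits += b"0"  # an odd final digit is treated as if followed by 0
    return binascii.unhexlify(digits)


if __name__ == "__main__":
    print(ascii85_decode(base64.a85encode(b"hello world") + b"~>"))
    print(asciihex_decode(b"48 65 6C 6C 6F>"))
```

Stripping the framing by hand keeps the sketch independent of how the `adobe=True` option of `base64.a85decode` treats a missing `<~` prefix.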
* Removed support for Python 3.6 and 3.7
* Output converter for the hOCR format
* Font name aliases for Arial, Courier New and Times New Roman
* Documentation on why special characters can sometimes not be
extracted
* Storing Bezier path and dashing style of line in LTCurve
* Broken CI/CD pipeline by setting upper version limit for
black, mypy, pip and setuptools
* `flake8` failures
* `ValueError` when bmp images with 1 bit channel are decoded
* `ValueError` when trying to decrypt empty metadata values
* Sphinx errors during building of documentation
* `TypeError` when getting default width of font
* Installing typing-extensions on Python 3.6 and 3.7
* `TypeError` in cmapdb.py when parsing null characters
* Color "convenience operators" now (per spec) also set color
space
* `ValueError` when extracting images, due to breaking changes
in Pillow
* Small typos and issues in the documentation
* Ignore non-Unicode cmaps in TrueType fonts
* Using non-hardcoded version string and
setuptools-git-versioning to enable installation from source
and building on Python 3.12
* Usage of `if __name__ == "__main__"` where it was only
intended for testing purposes
- Option to disable boxes flow layout analysis when using pdf2txt
- Exporting images without any specific encoding
- Rename PDFTextExtractionNotAllowedError to PDFTextExtractionNotAllowed to revert breaking change
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=14
- Update to 20221105
- Option to disable boxes flow layout analysis when using pdf2txt
- Add support for PDF 2.0 (ISO 32000-2) AES-256 encryption
- Support for Paeth PNG filter compression (predictor value = 4)
- Type annotations
- Export type annotations from PyPI package per PEP 561
- Support for identity cmaps
- Add support for PDF page labels
- Installation of Pillow as an optional extra dependency
- Exporting images without any specific encoding
- Output converter for the hOCR format
- Font name aliases for Arial, Courier New and Times New Roman
- Documentation on why special characters can sometimes not be extracted
- Remove patch python-pdfminer.six-remove-nose.patch
- Update dependencies
OBS-URL: https://build.opensuse.org/request/show/1132937
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=12
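
For context on the "Paeth PNG filter" item above, the following is a minimal sketch of how a Paeth-filtered row is reconstructed, following the PNG specification. It is not taken from pdfminer.six; the function names and the `bpp` (bytes per pixel) parameter are assumptions made for illustration.

```python
def paeth_predictor(left: int, above: int, upper_left: int) -> int:
    """Return whichever neighbour is closest to left + above - upper_left."""
    estimate = left + above - upper_left
    dist_left = abs(estimate - left)
    dist_above = abs(estimate - above)
    dist_upper_left = abs(estimate - upper_left)
    # Ties are broken in the order left, above, upper-left.
    if dist_left <= dist_above and dist_left <= dist_upper_left:
        return left
    if dist_above <= dist_upper_left:
        return above
    return upper_left


def undo_paeth_row(filtered: bytes, prior: bytes, bpp: int) -> bytes:
    """Reconstruct one row from its Paeth-filtered bytes and the prior row."""
    recon = bytearray(len(filtered))
    for i, value in enumerate(filtered):
        # Bytes to the left of the first pixel count as zero.
        left = recon[i - bpp] if i >= bpp else 0
        upper_left = prior[i - bpp] if i >= bpp else 0
        recon[i] = (value + paeth_predictor(left, prior[i], upper_left)) & 0xFF
    return bytes(recon)


if __name__ == "__main__":
    print(list(undo_paeth_row(bytes([1, 2, 3]), bytes([0, 0, 0]), bpp=1)))
```

The caller is expected to pass a `prior` row of the same length as `filtered`, using all zeros for the first row of an image.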
- version update to 20200726
- Rename PDFTextExtractionNotAllowedError to PDFTextExtractionNotAllowed to revert breaking change
- Always try to get CMap, not only for identity encodings
- Support for painting multiple rectangles at once
- Validate image object in do_EI is a PDFStream
- Hiding fallback xref by default from dumppdf.py output
- Raise a warning instead of an error when extracting text from a non-extractable PDF
- Switched from pycryptodome to cryptography package for AES decryption
- Python3 shebang line to script in tools
- Fix ordering of textlines within a textbox when `boxes_flow=None`
- Allow boxes_flow LAParam to be passed as None, validate the input, and update documentation
- Also accept file-like objects in high level functions `extract_text` and `extract_pages`
- Text no longer comes in reverse order when advanced layout analysis is disabled
- Updated misleading documentation for `word_margin` and `char_margin`
- Ignore ValueError when converting font encoding differences
- Grouping of text lines outside of parent container bounding box
- Group text lines if they are centered
- do not require nose for testing
- added patches
fix https://github.com/pdfminer/pdfminer.six/pull/489
+ python-pdfminer.six-remove-nose.patch
OBS-URL: https://build.opensuse.org/request/show/833056
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pdfminer.six?expand=0&rev=5