From e8d71ece1b803685c395c24fd68190220243bd618cfc2872d8d62e14c8219745 Mon Sep 17 00:00:00 2001
From: Nico Krapp
Date: Mon, 12 May 2025 11:44:19 +0000
Subject: [PATCH] - Convert to pip-based build

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-PyPDF2?expand=0&rev=24
---
 .gitattributes | 23 ++
 .gitignore | 1 +
 2.11.1.tar.gz | 3 +
 python-PyPDF2.changes | 933 ++++++++++++++++++++++++++++++++++++++++++
 python-PyPDF2.spec | 70 ++++
 5 files changed, 1030 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 .gitignore
 create mode 100644 2.11.1.tar.gz
 create mode 100644 python-PyPDF2.changes
 create mode 100644 python-PyPDF2.spec

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..9b03811
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,23 @@
+## Default LFS
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.bsp filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.gem filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.jar filter=lfs diff=lfs merge=lfs -text
+*.lz filter=lfs diff=lfs merge=lfs -text
+*.lzma filter=lfs diff=lfs merge=lfs -text
+*.obscpio filter=lfs diff=lfs merge=lfs -text
+*.oxt filter=lfs diff=lfs merge=lfs -text
+*.pdf filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.rpm filter=lfs diff=lfs merge=lfs -text
+*.tbz filter=lfs diff=lfs merge=lfs -text
+*.tbz2 filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.ttf filter=lfs diff=lfs merge=lfs -text
+*.txz filter=lfs diff=lfs merge=lfs -text
+*.whl filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..57affb6
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+.osc
diff --git a/2.11.1.tar.gz b/2.11.1.tar.gz
new file mode 100644
index 0000000..05f1f20
--- /dev/null
+++ b/2.11.1.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4e323120adc5103d53c370782bfc86381143ea7b69e9213eb1263c7aaf39df8
+size 6460157
diff --git a/python-PyPDF2.changes b/python-PyPDF2.changes
new file mode 100644
index 0000000..ca001c0
--- /dev/null
+++ b/python-PyPDF2.changes
@@ -0,0 +1,933 @@
+-------------------------------------------------------------------
+Mon May 12 10:44:03 UTC 2025 - Markéta Machová
+
+- Convert to pip-based build
+
+-------------------------------------------------------------------
+Fri Aug 25 14:08:04 UTC 2023 - ecsos
+
+- Add %{?sle15_python_module_pythons}
+
+-------------------------------------------------------------------
+Thu Oct 27 21:04:36 UTC 2022 - Yogalakshmi Arunachalam
+
+- Update to version 2.11.1
+ Bug Fixes (BUG)
+ * td matrix (#1373)
+ * Cope with cmap from #1322 (#1372)
+ Robustness (ROB)
+ * Cope with str returned from get_data in cmap (#1380)
+ Full Changelog: https://github.com/py-pdf/PyPDF2/compare/2.11.0…2.11.1
+
+-------------------------------------------------------------------
+Wed Oct 12 02:36:06 UTC 2022 - Yogalakshmi Arunachalam
+
+- Update to version 2.11.0
+ * New Features (ENH)
+ Addition of optional visitor-functions in extract_text() (#1252)
+ Add metadata.creation_date and modification_date (#1364)
+ Add PageObject.images attribute (#1330)
+ * Bug Fixes (BUG)
+ Lookup index in _xobj_to_image can be ByteStringObject (#1366)
+ ‘IndexError: index out of range’ when using extract_text (#1361)
+ Errors in transfer_rotation_to_content() (#1356)
+ * Robustness (ROB)
+ Ensure update_page_form_field_values does not fail if no fields (#1346)
+ Full Changelog: https://github.com/py-pdf/PyPDF2/compare/2.10.9…2.11.0
+
+-------------------------------------------------------------------
+Wed Sep 7 18:19:10 UTC 2022 - Yogalakshmi Arunachalam
+
+- Spec changes:
+ Changed the source to github
+ Renamed CHANGELOG to CHANGELOG.md
+
+-------------------------------------------------------------------
+Wed Sep 7 16:36:36 UTC 2022 - Yogalakshmi Arunachalam
+
+- Update to version 2.6.0:
+ New Features (ENH):
+ - Add color and font_format to PdfReader.outlines[i] (#1104)
+ - Extract Text Enhancement (whitespaces) (#1084)
+ Bug Fixes (BUG):
+ - Use `build_destination` for named destination outlines (#1128)
+ - Avoid a crash when a ToUnicode CMap has an empty dstString in beginbfchar (#1118)
+ - Prevent deduplication of PageObject (#1105)
+ - None-check in DictionaryObject.read_from_stream (#1113)
+ - Avoid IndexError in _cmap.parse_to_unicode (#1110)
+ Documentation (DOC): - Explanation for git submodule - Watermark and stamp (#1095) Maintenance (MAINT):
+ - Text extraction improvements (#1126)
+ - Destination.color returns ArrayObject instead of tuple as fallback (#1119)
+ - Use add_bookmark_destination in add_bookmark (#1100)
+ - Use add_bookmark_destination in add_bookmark_dict (#1099)
+ Testing (TST):
+ - Remove xfail from test_outline_title_issue_1121
+ - Add test for arab text (#1127)
+ - Add xfail for decryption fail (#1125)
+ - Add xfail test for IndexError when extracting text (#1124)
+ - Add MCVE showing outline title issue (#1123)
+ Code Style (STY):
+ - Apply black and isort
+ - Use IntFlag for permissions_flag / update_page_form_field_values (#1094)
+ - Simplify code (#1101)
+
+- Update to version 2.5.0:
+ New Features (ENH):
+ - Add PageObject._get_fonts (#1083)
+ - Add support for indexed color spaces / BitsPerComponent for decoding PNGs (#1067)
+ Performance Improvements (PI):
+ - Use iterative DFS in PdfWriter._sweep_indirect_references (#1072)
+ Bug Fixes (BUG):
+ - Let Page.scale also scale the crop-/trim-/bleed-/artbox (#1066)
+ - Column default for CCITTFaxDecode (#1079)
+ Robustness (ROB):
+ - Guard against None-value in _get_outlines (#1060)
+ Documentation (DOC):
+ - Stamps and watermarks (#1082)
+ - OCR vs PDF text extraction (#1081)
+ - Python Version support
+ - Formatting of CHANGELOG
+ Developer Experience (DEV):
+ - Cache downloaded files (#1070)
+ - Speed-up for CI (#1069)
+ Maintenance (MAINT):
+ - Set page.rotate(angle: int) (#1092)
+ - Issue #416 was fixed by #1015 (#1078)
+ Testing (TST):
+ - Image extraction (#1080)
+ - Image extraction (#1077)
+ Code Style (STY):
+ - Apply black
+ - Typo in Changelog
+
+- Update to version 2.4.2:
+ New Features (ENH):
+ - Add PdfReader.xfa attribute (#1026)
+ Bug Fixes (BUG):
+ - Wrong page inserted when PdfMerger.merge is done (#1063)
+ - Resolve IndirectObject when it refers to a free entry (#1054)
+ Developer Experience (DEV):
+ - Added {posargs} to tox.ini (#1055)
+ Maintenance (MAINT):
+ - Remove PyPDF2._utils.bytes_type (#1053)
+ Testing (TST):
+ - Scale page (indirect rect object) (#1057)
+ - Simplify pathlib PdfReader test (#1056)
+ - IndexError of VirtualList (#1052)
+ - Invalid XML in xmp information (#1051)
+ - No pycryptodome (#1050)
+ - Increase test coverage (#1045)
+ Code Style (STY):
+ - DOC of compress_content_streams (#1061)
+ - Minimize diff for #879 (#1049)
+
+- Update to version 2.4.1:
+ New Features (ENH):
+ - Add writer.pdf_header property (getter and setter) (#1038)
+ Performance Improvements (PI):
+ - Remove b_ call in FloatObject.write_to_stream (#1044)
+ - Check duplicate objects in writer._sweep_indirect_references (#207)
+ Documentation (DOC):
+ - How to surppress exceptions/warnings/log messages (#1037)
+ - Remove hyphen from lossless (#1041)
+ - Compression of content streams (#1040)
+ - Fix inconsistent variable names in add-watermark.md (#1039)
+ - File size reduction
+ - Add CHANGELOG to the rendered docs (#1023)
+ Maintenance (MAINT):
+ - Handle XML error when reading XmpInformation (#1030)
+ - Deduplicate Code / add mutmut config (#1022)
+ Code Style (STY):
+ - Use unnecessary one-line function / class attribute (#1043)
+ - Docstring formatting (#1033)
+
+- Update to version 2.4.0:
+ New Features (ENH):
+ - Support R6 decrypting (#1015)
+ - Add PdfReader.pdf_header (#1013)
+ Performance Improvements (PI):
+ - Remove ord_ calls (#1014)
+ Bug Fixes (BUG):
+ - Fix missing page for bookmark (#1016)
+ Robustness (ROB):
+ - Deal with invalid Destinations (#1028)
+ Documentation (DOC):
+ - get_form_text_fields does not extract dropdown data (#1029)
+ - Adjust PdfWriter.add_uri docstring
+ - Mention crypto extra_requires for installation (#1017)
+ Developer Experience (DEV):
+ - Use /n line endings everywhere (#1027)
+ - Adjust string formatting to be able to use mutmut (#1020)
+ - Update Bug report template
+
+- Update to version 2.3.1:
+ BUG: Forgot to add the interal `_codecs` subpackage.
+
+- Update to version 2.3.0:
+ The highlight of this release is improved support for file encryption
+ (AES-128 and AES-256, R5 only). See #749 for the amazing work of
+ @exiledkingcc confetti_ball Thank you hugs
+ Deprecations (DEP):
+ - Rename names to be PEP8-compliant (#967)
+ - `PdfWriter.get_page`: the pageNumber parameter is renamed to page_number
+ - `PyPDF2.filters`:
+ * For all classes, a parameter rename: decodeParms ➔ decode_parms
+ * decodeStreamData ➔ decode_stream_data
+ - `PyPDF2.xmp`:
+ * XmpInformation.rdfRoot ➔ XmpInformation.rdf_root
+ * XmpInformation.xmp_createDate ➔ XmpInformation.xmp_create_date
+ * XmpInformation.xmp_creatorTool ➔ XmpInformation.xmp_creator_tool
+ * XmpInformation.xmp_metadataDate ➔ XmpInformation.xmp_metadata_date
+ * XmpInformation.xmp_modifyDate ➔ XmpInformation.xmp_modify_date
+ * XmpInformation.xmpMetadata ➔ XmpInformation.xmp_metadata
+ * XmpInformation.xmpmm_documentId ➔ XmpInformation.xmpmm_document_id
+ * XmpInformation.xmpmm_instanceId ➔ XmpInformation.xmpmm_instance_id
+ - `PyPDF2.generic`:
+ * readHexStringFromStream ➔ read_hex_string_from_stream
+ * initializeFromDictionary ➔ initialize_from_dictionary
+ * createStringObject ➔ create_string_object
+ * TreeObject.hasChildren ➔ TreeObject.has_children
+ * TreeObject.emptyTree ➔ TreeObject.empty_tree
+ New Features (ENH):
+ - Add decrypt support for V5 and AES-128, AES-256 (R5 only) (#749)
+ Robustness (ROB):
+ - Fix corrupted (wrongly) linear PDF (#1008)
+ Maintenance (MAINT):
+ - Move PDF_Samples folder into ressources
+ - Fix typos (#1007)
+ Testing (TST):
+ - Improve encryption/decryption test (#1009)
+ - Add merger test cases with real PDFs (#1006)
+ - Add mutmut config
+ Code Style (STY):
+ - Put pure data mappings in separate files (#1005)
+ - Make encryption module private, apply pre-commit (#1010)
+
+- Update to version 2.2.1:
+ Performance Improvements (PI):
+ - Remove b_ calls (#992, #986)
+ - Apply improvements to _utils suggested by perflint (#993)
+ Robustness (ROB):
+ - utf-16-be\' codec can\'t decode (...) (#995)
+ Documentation (DOC):
+ - Remove reference to Scripts (#987)
+ Developer Experience (DEV):
+ - Fix type annotations for add_bookmarks (#1000)
+ Testing (TST):
+ - Add test for PdfMerger (#1001)
+ - Add tests for XMP information (#996)
+ - reader.get_fields / zlib issue / LZW decode issue (#1004)
+ - reader.get_fields with report generation (#1002)
+ - Improve test coverage by extracting texts (#998)
+ Code Style (STY):
+ - Apply fixes suggested by pylint (#999)
+
+- Update to version 2.2.0:
+ The 2.2.0 release improves text extraction again via (#969):
+ * Improvements around /Encoding / /ToUnicode
+ * Extraction of CMaps improved
+ * Fallback for font def missing
+ * Support for /Identity-H and /Identity-V: utf-16-be
+ * Support for /GB-EUC-H / /GB-EUC-V / GBp/c-EUC-H / /GBpc-EUC-V (beta release for evaluation)
+ * Arabic (for evaluation)
+ * Whitespace extraction improvements
+ Those changes should mainly improve the text extraction for non-ASCII alphabets,
+ e.g. Russian / Chinese / Japanese / Korean / Arabic.
+
+- Update to version 2.1.1:
+ New Features (ENH):
+ - Add support for pathlib as input for PdfReader (#979)
+ Performance Improvements (PI):
+ - Optimize read_next_end_line (#646)
+ Bug Fixes (BUG):
+ - Adobe Acrobat \'Would you like to save this file?\' (#970)
+ Documentation (DOC):
+ - Notes on annotations (#982)
+ - Who uses PyPDF2
+ - intendet \xe2\x9e\x94 in robustness page (#958)
+ Maintenance (MAINT):
+ - pre-commit / requirements.txt updates (#977)
+ - Mark read_next_end_line as deprecated (#965)
+ - Export `PageObject` in PyPDF2 root (#960)
+ Testing (TST):
+ - Add MCVE of issue #416 (#980)
+ - FlateDecode.decode decodeParms (#964)
+ - Xmp module (#962)
+ - utils.paeth_predictor (#959)
+ Code Style (STY):
+ - Use more tuples and list/dict comprehensions (#976)
+
+- Update to version 2.1.0:
+ The highlight of the 2.1.0 release is the most massive improvement to the
+ text extraction capabilities of PyPDF2 since 2016 partying_faceconfetti_ball A very big thank you goes
+ to [pubpub-zz](https://github.com/pubpub-zz) who took a lot of time and
+ knowledge about the PDF format to finally get those improvements into PyPDF2.
+ Thank you hugsgreen_heart
+ In case the new function causes any issues, you can use `_extract_text_old`
+ for the old functionality. Please also open a bug ticket in that case.
+ There were several people who have attempted to bring similar improvements to
+ PyPDF2. All of those were valuable. The main reason why they didn't get merged
+ is the big amount of open PRs / issues. pubpub-zz was the most comprehensive
+ PR which also incorporated the latest changes of PyPDF2 2.0.0.
+ Thank you to [VictorCarlquist](https://github.com/VictorCarlquist) for #858 and
+ [asabramo](https://github.com/asabramo) for #464 hugs
+ New Features (ENH):
+ - Massive text extraction improvement (#924). Closed many open issues:
+ - Exceptions / missing spaces in extract_text() method (#17) man_dancing
+ - Whitespace issues in extract_text() (#42) woman_dancing
+ - pypdf2 reads the hifenated words in a new line (#246)
+ - PyPDF2 failing to read unicode character (#37)
+ - Unable to read bullets (#230)
+ - ExtractText yields nothing for apparently good PDF (#168) tada
+ - Encoding issue in extract_text() (#235)
+ - extractText() doesn't work on Chinese PDF (#252)
+ - encoding error (#260)
+ - Trouble with apostophes in names in text "O'Doul" (#384)
+ - extract_text works for some PDF files, but not the others (#437)
+ - Euro sign not being recognized by extractText (#443)
+ - Failed extracting text from French texts (#524)
+ - extract_text doesn't extract ligatures correctly (#598)
+ - reading spanish text - mark convert issue (#635)
+ - Read PDF changed from text to random symbols (#654)
+ - .extractText() reads / as 1. (#789)
+ - Update glyphlist (#947) - inspired by #464
+ - Allow adding PageRange objects (#948)
+ Bug Fixes (BUG):
+ - Delete .python-version file (#944)
+ - Compare StreamObject.decoded_self with None (#931)
+ Robustness (ROB):
+ - Fix some conversion errors on non conform PDF (#932)
+ Documentation (DOC):
+ - Elaborate on PDF text extraction difficulties (#939)
+ - Add logo (#942)
+ - rotate vs Transformation().rotate (#937)
+ - Example how to use PyPDF2 with AWS S3 (#938)
+ - How to deprecate (#930)
+ - Fix typos on robustness page (#935)
+ - Remove scripts (pdfcat) from docs (#934)
+ Developer Experience (DEV):
+ - Ignore .python-version file
+ - Mark deprecated code with no-cover (#943)
+ - Automatically create Github releases from tags (#870)
+ Testing (TST):
+ - Text extraction for non-latin alphabets (#954)
+ - Ignore PdfReadWarning in benchmark (#949)
+ - writer.remove_text (#946)
+ - Add test for Tree and _security (#945)
+ Code Style (STY):
+ - black, isort, Flake8, splitting buildCharMap (#950)
+
+- Update to version 2.0.0:
+ The 2.0.0 release of PyPDF2 includes three core changes:
+ 1. Dropping support for Python 3.5 and older.
+ 2. Introducing type annotations.
+ 3. Interface changes, mostly to have PEP8-compliant names
+ We introduced a [deprecation process](#930)
+ that hopefully helps users to avoid unexpected breaking changes.
+ Breaking Changes(DEP):
+ - PyPDF2 2.0 requires Python 3.6+. Python 2.7 and 3.5 support were dropped.
+ - PdfFileReader: The "warndest" parameter was removed
+ - PdfFileReader and PdfFileMerger no longer have the `overwriteWarnings`
+ parameter. The new behavior is `overwriteWarnings=False`.
+ - merger: OutlinesObject was removed without replacement.
+ - merger.py ➔ _merger.py: You must import PdfFileMerger from PyPDF2 directly.
+ - utils:
+ * `ConvertFunctionsToVirtualList` was removed
+ * `formatWarning` was removed
+ * `isInt(obj)`: Use `instance(obj, int)` instead
+ * `u_(s)`: Use `s` directly
+ * `chr_(c)`: Use `chr(c)` instead
+ * `barray(b)`: Use `bytearray(b)` instead
+ * `isBytes(b)`: Use `instance(b, type(bytes()))` instead
+ * `xrange_fn`: Use `range` instead
+ * `string_type`: Use `str` instead
+ * `isString(s)`: Use `instance(s, str)` instead
+ * `_basestring`: Use `str` instead
+ * All Exceptions are now in `PyPDF2.errors`:
+ - PageSizeNotDefinedError
+ - PdfReadError
+ - PdfReadWarning
+ - PyPdfError
+ -`PyPDF2.pdf` (the `pdf` module) no longer exists. The contents were moved with
+ the library. You should most likely import directly from `PyPDF2` instead.
+ The `RectangleObject` is in `PyPDF2.generic`.
+ -The `Resources`, `Scripts`, and `Tests` will no longer be part of the distribution
+ files on PyPI. This should have little to no impact on most people. The
+ `Tests` are renamed to `tests`, the `Resources` are renamed to `resources`.
+ Both are still in the git repository. The `Scripts` are now in
+ https://github.com/py-pdf/cpdf. `Sample_Code` was moved to the `docs`.
+ For a full list of deprecated functions, please see the changelog of version 1.28.0.
+ New Features (ENH):
+ - Improve space setting for text extraction (#922)
+ - Allow setting the decryption password in PdfReader.__init__ (#920)
+ - Add Page.add_transformation (#883)
+ Bug Fixes (BUG):
+ - Fix error adding transformation to page without /Contents (#908)
+ Robustness (ROB):
+ - Cope with invalid length in streams (#861)
+ Documentation (DOC):
+ - Fix style of 1.25 and 1.27 patch notes (#927)
+ - Transformation (#907)
+ Developer Experience (DEV):
+ - Create flake8 config file (#916)
+ - Use relative imports (#875)
+ Maintenance (MAINT):
+ - Use Python 3.6 language features (#849)
+ - Add wrapper function for PendingDeprecationWarnings (#928)
+ - Use new PEP8 compliant names (#884)
+ - Explicitly represent transformation matrix (#878)
+ - Inline PAGE_RANGE_HELP string (#874)
+ - Remove unnecessary generics imports (#873)
+ - Remove star imports (#865)
+ - merger.py ➔ _merger.py (#864)
+ - Type annotations for all functions/methods (#854)
+ - Add initial type support with mypy (#853)
+ Testing (TST):
+ - Regression test for xmp_metadata converter (#923)
+ - Checkout submodule sample-files for benchmark
+ - Add text extracting performance benchmark
+ - Use new PyPDF2 API in benchmark (#902)
+ - Make test suite fail for uncaught warnings (#892)
+ - Remove -OO testrun from CI (#901)
+ - Improve tests for convert_to_int (#899)
+
+- Update to version 1.28.4:
+ Bug Fixes (BUG):
+ - XmpInformation._converter_date was unusable (#921)
+ - Update to version 1.28.3:
+ Deprecations (DEP):
+ - PEP8 renaming (#905)
+ Bug Fixes (BUG):
+ - XmpInformation missing method _getText (#917)
+ - Fix PendingDeprecationWarning on _merge_page (#904)
+
+- Update to version 1.28.2:
+ Bug Fixes (BUG):
+ - PendingDeprecationWarning for getContents (#893)
+ - PendingDeprecationWarning on using PdfMerger (#891)
+ - Update to version 1.28.1:
+ Bug Fixes (BUG):
+ - Incorrectly show deprecation warnings on internal usage (#887)
+ Maintenance (MAINT):
+ - Add stacklevel=2 to deprecation warnings (#889)
+ - Remove duplicate warnings imports (#888)
+
+- Update to version 1.28.0:
+ This release adds a lot of deprecation warnings in preparation of the
+ PyPDF2 2.0.0 release. The changes are mostly using snake_case function-, method-,
+ and variable-names as well as using properties instead of getter-methods.
+ Maintenance (MAINT):
+ - Remove IronPython Fallback for zlib (#868)
+ * Make the `PyPDF2.utils` module private
+ * Rename of core classes:
+ * PdfFileReader ➔ PdfReader
+ * PdfFileWriter ➔ PdfWriter
+ * PdfFileMerger ➔ PdfMerger
+ * Use PEP8 conventions for function names and parameters
+ * If a property and a getter-method are both present, use the property
+ In many places:
+ - getObject ➔ get_object
+ - writeToStream ➔ write_to_stream
+ - readFromStream ➔ read_from_stream
+ PyPDF2.generic
+ - readObject ➔ read_object
+ - convertToInt ➔ convert_to_int
+ - DocumentInformation.getText ➔ DocumentInformation._get_text :
+ This method should typically not be used; please let me know if you need it.
+ PdfReader class:
+ - `reader.getPage(pageNumber)` ➔ `reader.pages[page_number]`
+ - `reader.getNumPages()` / `reader.numPages` ➔ `len(reader.pages)`
+ - getDocumentInfo ➔ metadata
+ - flattenedPages attribute ➔ flattened_pages
+ - resolvedObjects attribute ➔ resolved_objects
+ - xrefIndex attribute ➔ xref_index
+ - getNamedDestinations / namedDestinations attribute ➔ named_destinations
+ - getPageLayout / pageLayout ➔ page_layout attribute
+ - getPageMode / pageMode ➔ page_mode attribute
+ - getIsEncrypted / isEncrypted ➔ is_encrypted attribute
+ - getOutlines ➔ get_outlines
+ - readObjectHeader ➔ read_object_header (TODO: read vs get?)
+ - cacheGetIndirectObject ➔ cache_get_indirect_object (TODO: public vs private?)
+ - cacheIndirectObject ➔ cache_indirect_object (TODO: public vs private?)
+ - getDestinationPageNumber ➔ get_destination_page_number
+ - readNextEndLine ➔ read_next_end_line
+ - _zeroXref ➔ _zero_xref
+ - _authenticateUserPassword ➔ _authenticate_user_password
+ - _pageId2Num attribute ➔ _page_id2num
+ - _buildDestination ➔ _build_destination
+ - _buildOutline ➔ _build_outline
+ - _getPageNumberByIndirect(indirectRef) ➔ _get_page_number_by_indirect(indirect_ref)
+ - _getObjectFromStream ➔ _get_object_from_stream
+ - _decryptObject ➔ _decrypt_object
+ - _flatten(..., indirectRef) ➔ _flatten(..., indirect_ref)
+ - _buildField ➔ _build_field
+ - _checkKids ➔ _check_kids
+ - _writeField ➔ _write_field
+ - _write_field(..., fieldAttributes) ➔ _write_field(..., field_attributes)
+ - _read_xref_subsections(..., getEntry, ...) ➔ _read_xref_subsections(..., get_entry, ...)
+ PdfWriter class:
+ - `writer.getPage(pageNumber)` ➔ `writer.pages[page_number]`
+ - `writer.getNumPages()` ➔ `len(writer.pages)`
+ - addMetadata ➔ add_metadata
+ - addPage ➔ add_page
+ - addBlankPage ➔ add_blank_page
+ - addAttachment(fname, fdata) ➔ add_attachment(filename, data)
+ - insertPage ➔ insert_page
+ - insertBlankPage ➔ insert_blank_page
+ - appendPagesFromReader ➔ append_pages_from_reader
+ - updatePageFormFieldValues ➔ update_page_form_field_values
+ - cloneReaderDocumentRoot ➔ clone_reader_document_root
+ - cloneDocumentFromReader ➔ clone_document_from_reader
+ - getReference ➔ get_reference
+ - getOutlineRoot ➔ get_outline_root
+ - getNamedDestRoot ➔ get_named_dest_root
+ - addBookmarkDestination ➔ add_bookmark_destination
+ - addBookmarkDict ➔ add_bookmark_dict
+ - addBookmark ➔ add_bookmark
+ - addNamedDestinationObject ➔ add_named_destination_object
+ - addNamedDestination ➔ add_named_destination
+ - removeLinks ➔ remove_links
+ - removeImages(ignoreByteStringObject) ➔ remove_images(ignore_byte_string_object)
+ - removeText(ignoreByteStringObject) ➔ remove_text(ignore_byte_string_object)
+ - addURI ➔ add_uri
+ - addLink ➔ add_link
+ - getPage(pageNumber) ➔ get_page(page_number)
+ - getPageLayout / setPageLayout / pageLayout ➔ page_layout attribute
+ - getPageMode / setPageMode / pageMode ➔ page_mode attribute
+ - _addObject ➔ _add_object
+ - _addPage ➔ _add_page
+ - _sweepIndirectReferences ➔ _sweep_indirect_references
+ PdfMerger class
+ - `__init__` parameter: strict=True ➔ strict=False (the PdfFileMerger still has the old default)
+ - addMetadata ➔ add_metadata
+ - addNamedDestination ➔ add_named_destination
+ - setPageLayout ➔ set_page_layout
+ - setPageMode ➔ set_page_mode
+ Page class:
+ - artBox / bleedBox/ cropBox/ mediaBox / trimBox ➔ artbox / bleedbox/ cropbox/ mediabox / trimbox
+ - getWidth, getHeight ➔ width / height
+ - getLowerLeft_x / getUpperLeft_x ➔ left
+ - getUpperRight_x / getLowerRight_x ➔ right
+ - getLowerLeft_y / getLowerRight_y ➔ bottom
+ - getUpperRight_y / getUpperLeft_y ➔ top
+ - getLowerLeft / setLowerLeft ➔ lower_left property
+ - upperRight ➔ upper_right
+ - mergePage ➔ merge_page
+ - rotateClockwise / rotateCounterClockwise ➔ rotate_clockwise
+ - _mergeResources ➔ _merge_resources
+ - _contentStreamRename ➔ _content_stream_rename
+ - _pushPopGS ➔ _push_pop_gs
+ - _addTransformationMatrix ➔ _add_transformation_matrix
+ - _mergePage ➔ _merge_page
+ XmpInformation class:
+ - getElement(..., aboutUri, ...) ➔ get_element(..., about_uri, ...)
+ - getNodesInNamespace(..., aboutUri, ...) ➔ get_nodes_in_namespace(..., aboutUri, ...)
+ - _getText ➔ _get_text + utils.py: + - matrixMultiply ➔ matrix_multiply + - RC4_encrypt is moved to the security module + - Update to version 1.27.12: + Bug Fixes (BUG): + - _rebuild_xref_table expects trailer to be a dict (#857) + Documentation (DOC): + - Security Policy + +- Update to version 1.27.11: + Bug Fixes (BUG): + - Incorrectly issued xref warning/exception (#855) + +- Update to version 1.27.10: + Robustness (ROB): + - Handle missing destinations in reader (#840) + - warn-only in readStringFromStream (#837) + - Fix corruption in startxref or xref table (#788 and #830) + Documentation (DOC): + - Project Governance (#799) + - History of PyPDF2 + - PDF feature/version support (#816) + - More details on text parsing issues (#815) + Developer Experience (DEV): + - Add benchmark command to Makefile + - Ignore IronPython parts for code coverage (#826) + Maintenance (MAINT): + - Split pdf module (#836) + - Separated CCITTFax param parsing/decoding (#841) + - Update requirements files + Testing (TST): + - Use external repository for larger/more PDFs for testing (#820) + - Swap incorrect test names (#838) + - Add test for PdfFileReader and page properties (#835) + - Add tests for PyPDF2.generic (#831) + - Add tests for utils, form fields, PageRange (#827) + - Add test for ASCII85Decode (#825) + - Add test for FlateDecode (#823) + - Add test for filters.ASCIIHexDecode (#822) + Code Style (STY): + - Apply pre-commit (black, isort) + use snake_case variables (#832) + - Remove debug code (#828) + - Documentation, Variable names (#839) + +- Update to version 1.27.9: + A change I would like to highlight is the performance improvement for + large PDF files (#808) tada + New Features (ENH): + - Add papersizes (#800) + - Allow setting permission flags when encrypting (#803) + - Allow setting form field flags (#802) + Bug Fixes (BUG): + - TypeError in xmp._converter_date (#813) + - Improve spacing for text extraction (#806) + - Fix PDFDocEncoding Character Set (#809) + 
Robustness (ROB): + - Use null ID when encrypted but no ID given (#812) + - Handle recursion error (#804) + Documentation (DOC): + - CMaps (#811) + - The PDF Format + commit prefixes (#810) + - Add compression example (#792) + Developer Experience (DEV): + - Add Benchmark for Performance Testing (#781) + Maintenance (MAINT): + - Validate PDF magic byte in strict mode (#814) + - Make PdfFileMerger.addBookmark() behave life PdfFileWriters\' (#339) + - Quadratic runtime while parsing reduced to linear (#808) + Testing (TST): + - Newlines in text extraction (#807) + +- Update to version 1.27.8: + Bug Fixes (BUG): + - Use 1MB as offset for readNextEndLine (#321) + - 'PdfFileWriter' object has no attribute 'stream' (#787) + Robustness (ROB): + - Invalid float object; use 0 as fallback (#782) + Documentation (DOC): + - Robustness (#785) + - Update to version 1.27.7: + Bug Fixes (BUG): + - Import exceptions from PyPDF2.errors in PyPDF2.utils (#780) + Code Style (STY): + - Naming in 'make_changelog.py' + - Update to version 1.27.6: + Deprecations (DEP): + - Remove support for Python 2.6 and older (#776) + New Features (ENH): + - Extract document permissions (#320) + Bug Fixes (BUG): + - Clip by trimBox when merging pages, which would otherwise be ignored (#240) + - Add overwriteWarnings parameter PdfFileMerger (#243) + - IndexError for getPage() of decryped file (#359) + - Handle cases where decodeParms is an ArrayObject (#405) + - Updated PDF fields don't show up when page is written (#412) + - Set Linked Form Value (#414) + - Fix zlib -5 error for corrupt files (#603) + - Fix reading more than last1K for EOF (#642) + - Acciental import + Robustness (ROB): + - Allow extra whitespace before "obj" in readObjectHeader (#567) + Documentation (DOC): + - Link to pdftoc in Sample_Code (#628) + - Working with annotations (#764) + - Structure history + Developer Experience (DEV): + - Add issue templates (#765) + - Add tool to generate changelog + Maintenance (MAINT): + - Use 
grouped constants instead of string literals (#745) + - Add error module (#768) + - Use decorators for @staticmethod (#775) + - Split long functions (#777) + Testing (TST): + - Run tests in CI once with -OO Flags (#770) + - Filling out forms (#771) + - Add tests for Writer (#772) + - Error cases (#773) + - Check Error messages (#769) + - Regression test for issue #88 + - Regression test for issue #327 +Code Style (STY): + - Make variable naming more consistent in test + - Update to version 1.27.5: + Security (SEC): + - ContentStream_readInlineImage had potential infinite loop (#740) + Bug fixes (BUG): + - Fix merging encrypted files (#757) + - CCITTFaxDecode decodeParms can be an ArrayObject (#756) + Robustness improvements (ROBUST): + - title sometimes None (#744) + Documentation (DOC): + - Adjust short description of the package + Tests and Test setup (TST): + - Rewrite JS tests from unittest to pytest (#746) + - Increase Test coverage, mainly with filters (#756) + - Add test for inline images (#758) + Developer Experience Improvements (DEV): + - Remove unused Travis-CI configuration (#747) + - Show code coverage (#754, #755) + - Add mutmut (#760) + Miscellaneous: + - STY: Closing file handles, explicit exports, ... 
(#743) + +- Update to version 1.27.4: + Bug fixes (BUG): + - Guard formatting of __init__.__doc__ string (#738) + Packaging (PKG): + - Add more precise license field to setup (#733) + Testing (TST): + - Add test for issue #297 + Miscellaneous: + - DOC: Miscallenious ➔ Miscellaneous (Typo) + - TST: Fix CI triggering (master ➔ main) (#739) + - STY: Fix various style issues (#742) + +- Update to version 1.27.3: + - PKG: Make Tests not a subpackage (#728) + - BUG: Fix ASCII85Decode.decode assertion (#729) + - BUG: Error in Chinese character encoding (#463) + - BUG: Code duplication in Scripts/2-up.py + - ROBUST: Guard 'obj.writeToStream' with 'if obj is not None' + - ROBUST: Ignore a /Prev entry with the value 0 in the trailer + - MAINT: Remove Sample_Code (#726) + - TST: Close file handle in test_writer (#722) + - TST: Fix test_get_images (#730) + - DEV: Make tox use pytest and add more Python versions (#721) + - DOC: Many (#720, #723-725, #469) + +- Update to version 1.27.2: + - Add Scripts (including `pdfcat`), Resources, Tests, and Sample_Code back to + PyPDF2. 
It was removed by accident in 1.27.0, but might get removed with 2.0.0 + See #718 for discussion + +- Update to version 1.27.0: + Features: + - Add alpha channel support for png files in Script (#614) + Bug fixes (BUG): + - Fix formatWarning for filename without slash (#612) + - Add whitespace between words for extractText() (#569, #334) + - "invalid escape sequence" SyntaxError (#522) + - Avoid error when printing warning in pythonw (#486) + - Stream operations can be List or Dict (#665) + Documentation (DOC): + - Added Scripts/pdf-image-extractor.py + - Documentation improvements (#550, #538, #324, #426, #394) + Tests and Test setup (TST): + - Add Github Action which automatically run unit tests via pytest and + static code analysis with Flake8 (#660) + - Add several unit tests (#661, #663) + - Add .coveragerc to create coverage reports + Developer Experience Improvements (DEV): + - Pre commit: Developers can now `pre-commit install` to avoid tiny issues + like trailing whitespaces + Miscallenious: + - Add the LICENSE file to the distributed packages (#288) + - Use setuptools instead of distutils (#599) + - Improvements for the PyPI page (#644) + - Python 3 changes (#504, #366) + +------------------------------------------------------------------- +Mon Oct 21 22:55:54 UTC 2019 - Simon Lees + +- change the copyright to 2019 + +------------------------------------------------------------------- +Thu Dec 6 13:22:02 UTC 2018 - Tomáš Chvátal + +- Fix fdupes call + +------------------------------------------------------------------- +Tue Dec 4 12:52:37 UTC 2018 - Matej Cepl + +- Remove superfluous devel dependency for noarch package + +------------------------------------------------------------------- +Mon May 14 10:11:40 UTC 2018 - tchvatal@suse.com + +- Use license macro + +------------------------------------------------------------------- +Thu Apr 20 04:22:33 UTC 2017 - sflees@suse.de + +- Convert to single spec +- Update to version 1.26.0 + * NOTE: Active 
maintenance on PyPDF2 is resuming after a hiatus + * Fixed a bug where image resources were incorrectly overwritten + when merging pages + * Added dictionary for JavaScript actions to the root (louib) + * Added unit tests for the JS functionality (louib) + * Add more Python 3 compatibility when reading inline images (im2703 + and VyacheslavHashov) + * Return NullObject instead of raising error when failing to resolve + object (ctate) + * Don't output warning for non-zeroed xref table when strict=False + (BenRussert) + * Remove extraneous zeroes from output formatting (speedplane) + * Fix bug where reading an inline image would cut off prematurely in + certain cases (speedplane) +- Changes for 1.25 +BUGFIXES: + * Added Python 3 algorithm for ASCII85Decode. Fixes issue when + reading reportlab-generated files with Py 3 (jerickbixly) + * Recognize more escape sequences which would otherwise throw an + exception (manuelzs, robertsoakes) + * Fixed overflow error in generic.py. Occurred + when reading a too-large int in Python 2 (by Raja Jamwal) + * Allow access to files which were encrypted with an empty + password. Previously threw a "File has not been decrypted" + exception (Elena Williams) + * Do not attempt to decode an empty data stream. Previously + would cause an error in decode algorithms (vladir) + * Fixed some type issues specific to Py 2 or Py 3 + * Fix issue when stream data begins with whitespace (soloma83) + * Recognize abbreviated filter names (AlmightyOatmeal and + Matthew Weiss) + * Copy decryption key from PdfFileReader to PdfFileMerger. + Allows usage of PdfFileMerger with encrypted files (twolfson) + * Fixed bug which occurred when a NameObject is present at end + of a file stream. Threw a "Stream has ended unexpectedly" + exception (speedplane) +FEATURES: + * Initial work on a test suite; to be expanded in future. 
+ Tests and Resources directory added, README updated (robertsoakes) + * Added document cloning methods to PdfFileWriter: + appendPagesFromReader, cloneReaderDocumentRoot, and + cloneDocumentFromReader. See official documentation (robertsoakes) + * Added method for writing to form fields: updatePageFormFieldValues. + This will be enhanced in the future. See official documentation + (robertsoakes) + * New addAttachment method. See documentation. Support for adding + and extracting embedded files to be enhanced in the future + (moshekaplan) + * Added methods to get page number of given PageObject or + Destination: getPageNumber and getDestinationPageNumber. + See documentation (mozbugbox) + +------------------------------------------------------------------- +Mon May 11 18:00:56 UTC 2015 - benoit.monin@gmx.fr + +- update to version 1.24: + * Bugfixes for reading files in Python 3 (by Anthony Tuininga and + pqqp) + * Appropriate errors are now raised instead of infinite loops (by + naure and Cyrus Vafadari) + * Bugfix for parsing number tokens with leading spaces (by Maxim + Kamenkov) + * Don't crash on bad /Outlines reference (by eshellman) + * Conform tabs/spaces and blank lines to PEP 8 standards + * Utilize the readUntilRegex method when reading Number Objects + (by Brendan Jurd) + * More bugfixes for Python 3 and clearer exception handling + * Fixed encoding issue in merger (with eshellman) + * Created separate folder for scripts +- additional changes from version 1.23: + * Documentation now available at http://pythonhosted.org//PyPDF2 + * Bugfix in pagerange.py for when __init__.__doc__ has no value + (by Vladir Cruz) + * Fix typos in OutlinesObject().add() (by shilluc) + * Re-added a missing return statement in a utils.py method + * Corrected viewing mode names (by Jason Scheirer) + * New PdfFileWriter method: addJS() (by vfigueiro) + * New bookmark features: color, boldness, italics, and page fit + (by Joshua Arnott) + * New PdfFileReader method: getFields(). 
Used to extract field + information from PDFs with interactive forms. See documentation + for details + * Converted README file to markdown format (by Stephen Bussard) + * Several improvements to overall performance and efficiency (by + mozbugbox) + * Fixed a bug where geospatial information was not scaling along + with its page + * Fixed a type issue and a Python 3 issue in the decryption + algorithms (with Francisco Vieira and koba-ninkigumi) + * Fixed a bug causing an infinite loop in the ASCII 85 decoding + algorithm (by madmaardigan) + * Annotations (links, comment windows, etc.) are now preserved + when pages are merged together + * Used the Destination class in addLink() and addBookmark() so + that the page fit option could be properly customized +- additional changes from version 1.22: + * Added .DS_Store to .gitignore (for Mac users) (by Steve Witham) + * Removed __init__() implementation in NameObject (by Steve + Witham) + * Fixed bug (inf. loop) when merging pages in Python 3 (by commx) + * Corrected error when calculating height in scaleTo() + * Removed unnecessary code from DictionaryObject (by Georges + Dubus) + * Fixed bug where an exception was thrown upon reading a NULL + string (by speedplane) + * Allow string literals (non-unicode strings in Python 2) to be + passed to PdfFileReader + * Allow ConvertFunctionsToVirtualList to be indexed with slices + and longs (in Python 2) (by Matt Gilson) + * Major improvements and bugfixes to addLink() method (see + documentation in source code) (by Henry Keiter) + * General code clean-up and improvements (with Steve Witham and + Henry Keiter) + * Fixed bug that caused crash when comments are present at end of + dictionary +- additional changes from version 1.21: + * Fix for when /Type isn't present in the Pages dictionary (by + Rob1080) + * More tolerance for extra whitespace in Indirect Objects + * Improved Exception handling + * Fixed error in getHeight() method (by Simon Kaempflein) + * implement use of 
utils.string_type to resolve Py2-3 + compatibility issues + * Prevent exception for multiple definitions in a dictionary + (with carlosfunk) (only when strict = False) + * Fixed errors when parsing a slice using pdfcat on command line + (by Steve Witham) + * Tolerance for EOF markers within 1024 bytes of the actual end + of the file (with David Wolever) + * Added overwriteWarnings parameter to PdfFileReader constructor, + if False PyPDF2 will NOT overwrite methods from Python's + warnings.py module with a custom implementation. + * Fix NumberObject and NameObject constructors for compatibility + with PyPy (Rüdiger Jungbeck, Xavier Dupré, shezadkhan137, + Steven Witham) + * Utilize utils.Str in pdf.py and pagerange.py to resolve type + issues (by egbutter) + * Improvements in implementing StringIO for Python 2 and BytesIO + for Python 3 (by Xavier Dupré) + * Added /x00 to Whitespaces, defined utils.WHITESPACES to clarify + code (by Maxim Kamenkov) + * Bugfix for merging 3 or more resources with the same name (by + lucky-user) + * Improvements to Xref parsing algorithm (by speedplane) +- additional changes from version 1.20: + * Official Python 3+ support (with contributions from TWAC and + cgammans) Support for Python versions 2.6 and 2.7 will be + maintained + * Command line concatenation (see pdfcat in sample code) (by + Steve Witham) + * New FAQ; link included in README + * Allow more (although unnecessary) escape sequences + * Prevent exception when reading a null object in decoding + parameters + * Corrected error in reading destination types (added a slash + since they are name objects) + * Corrected TypeError in scaleTo() method + * addBookmark() method in PdfFileMerger now returns bookmark (so + nested bookmarks can be created) + * Additions to Sample Code and Sample PDFs + * changes to allow 2up script to work (see sample code) (by Dylan + McNamee) + * changes to metadata encoding (by Chris Hiestand) + * New methods for links: addLink() (by Enrico 
Lambertini) and + removeLinks() + * Bugfix to handle nested bookmarks correctly (by Jamie Lentin) + * New methods removeImages() and removeText() available for + PdfFileWriter (by Tien Haï) + * Exception handling for illegal characters in Name Objects +- remove unwanted shebang in pagerange.py +- rename README to README.md: changed upstream + +------------------------------------------------------------------- +Tue Dec 3 10:52:18 UTC 2013 - cfarrell@suse.com + +- license update: BSD-3-Clause + See LICENSE + +------------------------------------------------------------------- +Sun Nov 24 21:44:43 UTC 2013 - p.drouand@gmail.com + +- Initial release ( version 1.19 ) + diff --git a/python-PyPDF2.spec b/python-PyPDF2.spec new file mode 100644 index 0000000..f7f5a47 --- /dev/null +++ b/python-PyPDF2.spec @@ -0,0 +1,70 @@ +# +# spec file for package python-PyPDF2 +# +# Copyright (c) 2025 SUSE LLC +# +# All modifications and additions to the file contributed by third parties +# remain the property of their copyright owners, unless otherwise agreed +# upon. The license for this file, and modifications and additions to the +# file, is the same license as for the pristine package itself (unless the +# license for the pristine package is not an Open Source License, in which +# case the license is the MIT License). An "Open Source License" is a +# license that conforms to the Open Source Definition (Version 1.9) +# published by the Open Source Initiative. 
+ +# Please submit bugfixes or comments via https://bugs.opensuse.org/ +# + + +%{?sle15_python_module_pythons} +Name: python-PyPDF2 +Version: 2.11.1 +Release: 0 +Summary: PDF toolkit +License: BSD-3-Clause +Group: Development/Languages/Python +URL: https://github.com/py-pdf/PyPDF2 +Source: https://github.com/py-pdf/PyPDF2/archive/refs/tags/%{version}.tar.gz +BuildRequires: %{python_module pip} +BuildRequires: %{python_module setuptools} +BuildRequires: %{python_module wheel} +BuildRequires: fdupes +BuildRequires: python-rpm-macros +BuildArch: noarch +%python_subpackages + +%description +A Pure-Python library built as a PDF toolkit. It is capable of: + +- extracting document information (title, author, ...), +- splitting documents page by page, +- merging documents page by page, +- cropping pages, +- merging multiple pages into a single page, +- encrypting and decrypting PDF files. + +By being Pure-Python, it should run on any Python platform without any +dependencies on external libraries. It can also work entirely on StringIO +objects rather than file streams, allowing for PDF manipulation in memory. +It is therefore a useful tool for websites that manage or manipulate PDFs. + +%prep +%setup -q -n PyPDF2-%{version} +#remove unwanted shebang +sed -i '/^#!/ d' PyPDF2/pagerange.py + +%build +%pyproject_wheel + +%install +%pyproject_install +%python_expand %fdupes %{buildroot}%{$python_sitelib} +chmod a-x CHANGELOG.md LICENSE README.md + +%files %{python_files} +%license LICENSE +%doc CHANGELOG.md README.md +%{python_sitelib}/PyPDF2 +%{python_sitelib}/[Pp]y[Pp][Dd][Ff]2-%{version}*-info + +%changelog