commit b1b430c7150e9e3cbb3ca595374770af24ba466ec43f4003f1e3d7eecc2d844c Author: Matej Cepl Date: Mon Jun 3 16:18:33 2024 +0000 Sorry, didn't see that the version was hard coded in the URL OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-pypdf?expand=0&rev=1 diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..9b03811 --- /dev/null +++ b/.gitattributes @@ -0,0 +1,23 @@ +## Default LFS +*.7z filter=lfs diff=lfs merge=lfs -text +*.bsp filter=lfs diff=lfs merge=lfs -text +*.bz2 filter=lfs diff=lfs merge=lfs -text +*.gem filter=lfs diff=lfs merge=lfs -text +*.gz filter=lfs diff=lfs merge=lfs -text +*.jar filter=lfs diff=lfs merge=lfs -text +*.lz filter=lfs diff=lfs merge=lfs -text +*.lzma filter=lfs diff=lfs merge=lfs -text +*.obscpio filter=lfs diff=lfs merge=lfs -text +*.oxt filter=lfs diff=lfs merge=lfs -text +*.pdf filter=lfs diff=lfs merge=lfs -text +*.png filter=lfs diff=lfs merge=lfs -text +*.rpm filter=lfs diff=lfs merge=lfs -text +*.tbz filter=lfs diff=lfs merge=lfs -text +*.tbz2 filter=lfs diff=lfs merge=lfs -text +*.tgz filter=lfs diff=lfs merge=lfs -text +*.ttf filter=lfs diff=lfs merge=lfs -text +*.txz filter=lfs diff=lfs merge=lfs -text +*.whl filter=lfs diff=lfs merge=lfs -text +*.xz filter=lfs diff=lfs merge=lfs -text +*.zip filter=lfs diff=lfs merge=lfs -text +*.zst filter=lfs diff=lfs merge=lfs -text diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..57affb6 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +.osc diff --git a/python-pypdf-4.2.0.tar.gz b/python-pypdf-4.2.0.tar.gz new file mode 100644 index 0000000..38b4a00 --- /dev/null +++ b/python-pypdf-4.2.0.tar.gz @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4096459bdb19df0231360617f2266d8068a40b9eb202bbea9c54274a320f0c55 +size 8009612 diff --git a/python-pypdf.changes b/python-pypdf.changes new file mode 100644 index 0000000..28f289e --- /dev/null +++ b/python-pypdf.changes @@ -0,0 +1,1623 @@ 
+------------------------------------------------------------------- +Tue May 21 12:50:19 UTC 2024 - Christian Goll + +- Update to 4.2.0, which includes the upstream renaming to python-pypdf + introduced in 3.2.0. Changes are: + +- Version 4.2.0, 2024-04-07 + New Features (ENH) + * Allow multiple charsets for NameObject.read_from_stream (#2585) + * Add support for /Kids in page labels (#2562) + * Allow to update fields on many pages (#2571) + * Tolerate PDF with invalid xref pointed objects (#2335) + * Add Enforce from PDF2.0 in viewer_preferences (#2511) + * Add += and -= operators to ArrayObject (#2510) + Bug Fixes (BUG) + * Fix merge_page sometimes generating unknown operator 'QQ' (#2588) + * Fix fields update where annotations are kids of field (#2570) + * Process CMYK images without a filter correctly (#2557) + * Extract text in layout mode without finding resources (#2555) + * Prevent recursive loop in some PDF files (#2505) + Robustness (ROB) + * Tolerate "truncated" xref (#2580) + * Replace error by warning for EOD in RunLengthDecode/ASCIIHexDecode (#2334) + * Rebuild xref table if one entry is invalid (#2528) + * Robustify stream extraction (#2526) + Documentation (DOC) + * Update release process for latest changes (#2564) + * Encryption/decryption: Clone document instead of copying all pages (#2546) + * Minor improvements (#2542) + * Update annotation list (#2534) + * Update references and formatting (#2529) + * Correct threads reference, plus minor changes (#2521) + * Minor readability increases (#2515) + * Simplify PaperSize examples (#2504) + * Minor improvements (#2501) + Developer Experience (DEV) + * Remove unused dependencies (#2572) + * Remove page labels PR link from message (#2561) + * Fix changelog generator regarding whitespace and handling of "Other" group (#2492) + * Add REL to known PR prefixes (#2554) + * Release using the REL commit instead of git tag (#2500) + * Unify code between PdfReader and PdfWriter (#2497) + * Bump 
softprops/action-gh-release from 1 to 2 (#2514) + Maintenance (MAINT) + * Ressources → Resources (and internal name childs) (#2550) + * Fix typos found by codespell (#2549) + * Update Read the Docs configuration (#2538) + * Add root_object, _info and _ID to PdfReader (#2495) + Testing (TST) + * Allow loading truncated images if required (#2586) + * Fix download issues from #2562 (#2578) + * Improve test_get_contents_from_nullobject to show real use-case (#2524) + * Add missing test annotations (#2507) + - Version 4.1.0, 2024-03-03 + Generating name objects (`NameObject`) without a leading slash + is considered deprecated now. Previously, just a plain warning + would be logged, leading to possibly invalid PDF files. According + to our deprecation policy, this will log a *DeprecationWarning* + for now. + New Features (ENH) + * Add get_pages_from_field (#2494) + * Add reattach_fields function (#2480) + * Automatic access to pointed object for IndirectObject (#2464) + Bug Fixes (BUG) + * Missing error on name without leading / (#2387) + * encode_pdfdocencoding() always returns bytes (#2440) + * BI in text content identified as image tag (#2459) + Robustness (ROB) + * Missing basefont entry in type 3 font (#2469) + Documentation (DOC) + * Improve lossless compression example (#2488) + * Amend robustness documentation (#2479) + Developer Experience (DEV) + * Fix changelog for UTF-8 characters (#2462) + Maintenance (MAINT) + * Add _get_page_number_from_indirect in writer (#2493) + * Remove user assignment for feature requests (#2483) + * Remove reference to old 2.0.0 branch (#2482) + Testing (TST) + * Fix benchmark failures (#2481) + * Broken test due to expired test file URL (#2468) + * Resolve file naming conflict in test_iss1767 (#2445) +- Version 4.0.2, 2024-02-18 + Bug Fixes (BUG) + * Use NumberObject for /Border elements of annotations (#2451) +- Version 4.0.1, 2024-01-28 + Bug Fixes (BUG) + * layout mode text extraction ZeroDivisionError (#2417) + Testing (TST) + * 
Skip tests using fpdf2 if it's not installed (#2419) +- Version 4.0.0, 2024-01-19 + Deprecations (DEP) + * Drop Python 3.6 support (#2369) + * Remove deprecated code (#2367) + * Remove deprecated XMP properties (#2386) + New Features (ENH) + * Add "layout" mode for text extraction (#2388) + * Add Jupyter Notebook integration for PdfReader (#2375) + * Improve/rewrite PDF permission retrieval (#2400) + Bug Fixes (BUG) + * PdfWriter.add_uri was setting the wrong type (#2406) + * Add support for GBK2K cmaps (#2385) + Maintenance (MAINT) + * Return None instead of -1 when page is not attached (#2376) + * Complete FileSpecificationDictionaryEntries constants (#2416) + * Replace warning with logging.error (#2377) +- Version 3.17.4, 2023-12-24 + Bug Fixes (BUG) + * Handle IndirectObject as image filter (#2355) +- Version 3.17.3, 2023-12-17 + Robustness (ROB) + * Out-of-bounds issue in handle_tj (text extraction) (#2342) + Developer Experience (DEV) + * Make make_release.py easier to configure (#2348) + Maintenance (MAINT) + * Bump actions/download-artifact from 3 to 4 (#2344) +- Version 3.17.2, 2023-12-10 + Bug Fixes (BUG) + * Cope with deflated images with CMYK Black Only (#2322) + * Handle indirect objects as parameters for CCITTFaxDecode (#2307) + * check words length in _cmap type1_alternative function (#2310) + Robustness (ROB) + * Relax flate decoding for too many lookup values (#2331) + * Let _build_destination skip in case of missing /D key (#2018) + - Version 3.17.1, 2023-11-14 + Bug Fixes (BUG) + * Mediabox expansion size when applying non-right angle rotation (#2282) + Robustness (ROB) + * MissingWidth is IndirectObject (#2288) + * Initialize states array with an empty value (#2280) + - Version 3.17.0, 2023-10-29 + Security (SEC) + * Infinite recursion when using PdfWriter(clone_from=reader) (#2264) + New Features (ENH) + * Add parameter to select images to be removed (#2214) + Bug Fixes (BUG) + * Correctly handle image mode 1 with FlateDecode (#2249) + * Error 
when filling a value with parentheses #2268 (#2269) + * Handle empty root outline (#2239) +- Version 3.16.4, 2023-10-10 + Bug Fixes (BUG) + * Avoid exceeding recursion depth when retrieving image mode (#2251) +- Version 3.16.3, 2023-10-08 + Bug Fixes (BUG) + * Invalid cm/tm in visitor functions (#2206) + * Encrypt / decrypt Stream object dictionaries (#2228) + * Support nested color spaces for the /DeviceN color space (#2241) + * Images property fails if NullObject in list (#2215) + Developer Experience (DEV) + * Unify mypy options and warn redundant workarounds (#2223) +- Version 3.16.2, 2023-09-24 + Bug Fixes (BUG) + * PDF size increases because of too high float writing precision (#2213) + * Fix test_watermarking_reportlab_rendering() (#2203) +- Version 3.16.1, 2023-09-17 + ⚠️ The 'rename PdfWriter.create_viewer_preference to + PdfWriter.create_viewer_preferences (#2190)' could be a breaking change for you, + if you use it. As it was only introduced last week I'm confident enough that + nobody will be affected though. Hence only the patch update. 
+ Bug Fixes (BUG) + * Missing new line in extract_text with cm operations (#2142) + * _get_fonts not processing properly CIDFonts and annotations (#2194) + Maintenance (MAINT) + * Rename PdfWriter.create_viewer_preference to PdfWriter.create_viewer_preferences (#2190) +- Version 3.16.0, 2023-09-10 + Security (SEC) + * Infinite recursion caused by IndirectObject clone (#2156) + New Features (ENH) + * Ease access to ViewerPreferences (#2144) + Bug Fixes (BUG) + * Catch the case where w[0] is an IndirectObject instead of an int (#2154) + * Cope with indirect objects in filters and remove deprecated code (#2177) + * Accept tabs in cmaps (#2174) / cope with extra space (#2151) + * Merge pages without resources (#2150) + * getcontents() shall return None if contents is NullObject (#2161) + * Fix conversion from 1 to LA (#2175) + Robustness (ROB) + * Accept XYZ with no arguments (#2178) +- Version 3.15.5, 2023-09-03 + Bug Fixes (BUG) + * Cope with missing /I in articles (#2134) + * Fix image look-up table in EncodedStreamObject (#2128) + * remove_images not operating in sub level forms (#2133) + Robustness (ROB) + * Cope with damaged PDF (#2129) +- Version 3.15.4, 2023-08-27 + Performance Improvements (PI) + * Making pypdf as fast as pdfrw (#2086) + Maintenance (MAINT) + * Relax typing_extensions version (#2104) +- Version 3.15.3, 2023-08-26 + Bug Fixes (BUG) + * Check version of crypt provider (#2115) + * TypeError: can't concat str to bytes (#2114) + * Require flit_core >= 3.9 (#2091) +- Version 3.15.2, 2023-08-20 + Security (SEC) + * Avoid endless recursion of reading damaged PDF file (#2093) + Performance Improvements (PI) + * Reuse content stream (#2101) + Maintenance (MAINT) + * Make ParseError inherit from PyPdfError (#2097) +- Version 3.15.1, 2023-08-13 + Performance Improvements (PI) + * optimize _decode_png_prediction (#2068) + Bug Fixes (BUG) + * Fix incorrect tm_matrix in call to visitor_text (#2060) + * Writing German characters into form fields (#2047) + * 
Prevent stall when accessing image in corrupted pdf (#2081) + * append() fails when articles do not have /T (#2080) + Robustness (ROB) + * Cope with xref not followed by separator (#2083) +- Version 3.15.0, 2023-08-06 + New Features (ENH) + * Add `level` parameter to compress_content_streams (#2044) + * Process /uniHHHH for text_extract (#2043) + Bug Fixes (BUG) + * Fix AnnotationBuilder.link (#2066) + * JPX image without ColorSpace (#2062) + * Added check for field /Info when cloning reader document (#2055) + * Fix indexed/CMYK images (#2039) + Maintenance (MAINT) + * Cryptography as primary dependency (#2053) +- Version 3.14.0, 2023-07-29 + New Features (ENH) + * Accelerate image list keys generation (#2014) + * Use `cryptography` for encryption/decryption as a fallback for PyCryptodome (#2000) + * Extract LaTeX characters (#2016) + * ASCIIHexDecode.decode now returns bytes instead of str (#1994) + Bug Fixes (BUG) + * Add RunLengthDecode filter (#2012) + * Process /Separation ColorSpace (#2007) + * Handle single element ColorSpace list (#2026) + * Process lookup decoded as TextStringObjects (#2008) + Robustness (ROB) + * Cope with garbage collector during cloning (#1841) + Maintenance (MAINT) + * Cleanup of annotations (#1745) +- Version 3.13.0, 2023-07-23 + New Features (ENH) + * Add is_open in outlines in PdfReader and PdfWriter (#1960) + Bug Fixes (BUG) + * Search /DA in hierarchy fields (#2002) + * Cope with different ISO date length (#1999) + * Decode Black only/CMYK deviceN images (#1984) + * Process CMYK in deflate images (#1977) + Developer Experience (DEV) + * Add mypy to pre-commit (#2001) + * Release automation (#1991, #1985) +- Version 3.12.2, 2023-07-16 + Bug Fixes (BUG) + * Accept calRGB and calGray color_spaces (#1968) + * Process 2bits and 4bits images (#1967) + * Check for AcroForm and ensure it is not None (#1965) + Developer Experience (DEV) + * Automate the release process (#1970) +- Version 3.12.1, 2023-07-09 + Bug Fixes (BUG) + * Prevent 
updating page contents after merging page (stamping/watermarking) (#1952) + * % to be hex encoded in names (#1958) + * Inverse color in CMYK images (#1947) + * Dates conversion not working with Z00\'00\' (#1946) + * Support UTF-16-LE Strings (#1884) +- Version 3.12.0, 2023-07-02 + New Features (ENH) + * Add AES support for encrypting PDF files (#1918, #1935, #1936, #1938) + * Add page deletion feature to PdfWriter (#1843) + Bug Fixes (BUG) + * PdfReader.get_fields() attempts to delete non-existing index "/Off" (#1933) + * Remove unused objects when cloning_from (#1926) + * Add the TK.SIZE into the trailer (#1911) + * add_named_destination() maintains named destination list sort order (#1930) +- Version 3.11.1, 2023-06-25 + Bug Fixes (BUG) + * Cascaded filters in image objects (#1913) + * Append pdf with named destination using numbers for pages (#1858) + * Ignore "/B" fields only on pages in PdfWriter.append() (#1875) +- Version 3.11.0, 2023-06-23 + New Features (ENH) + * Add page_number property (#1856) + Bug Fixes (BUG) + * File expansion when updating with Page Contents (#1906) + * Missing Alternate in indexed/ICCbased colorspaces (#1896) +- Version 3.10.0, 2023-06-18 + New Features (ENH) + * Extraction of inline images (#1850) + * Add capability to replace image (#1849) + * Extend images interface by returning an ImageFile(File) class (#1848) + * Add set_data to EncodedStreamObject (#1854) + Bug Fixes (BUG) + * Fix RGB FlateEncode Images(PNG) and transparency (#1834) + * Generate static appearance for fields (#1864) +- Version 3.9.1, 2023-06-04 + Deprecations (DEP) + * Deprecate PdfMerger (#1866) + Bug Fixes (BUG) + * Ignore UTF-8 decode errors (#1865) + Robustness (ROB) + * Handle missing /Type entry in Page tree (#1859) +- Version 3.9.0, 2023-05-21 + New Features (ENH) + * Simplify metadata input (Document Information Dictionary) (#1851) + * Extend cmap compatibility to GBK_EUC_H/V (#1812) + Bug Fixes (BUG) + * Prevent infinite loop when no character follows 
after a comment (#1828) + * get_contents does not return ContentStream (#1847) + * Accept XYZ destination with zoom missing (default to zoom=0.0) (#1844) + * Cope with 1 Bit images (#1815) + Robustness (ROB) + * Handle missing /Type entry in Page tree (#1845) + Documentation (DOC) + * Expand file size explanations (#1835) + * Add comparison with pdfplumber (#1837) + * Clarify that PyPDF2 is dead (#1827) + * Add Hunter King as Contributor for #1806 + Maintenance (MAINT) + * Refactor internal Encryption class (#1821) + * Add R parameter to generate_values (#1820) + * Make encryption_key parameter of write_to_stream optional (#1819) + * Prepare for adding AES encryption support (#1818) +- Version 3.8.1, 2023-04-23 + Bug Fixes (BUG) + * Convert color space before saving (#1802) + Documentation (DOC) + * PDF/A (#1807) + * Use append instead of add_page + * Document core mechanics of pypdf (#1783) +- Version 3.8.0, 2023-04-16 + New Features (ENH) + * Add transform method to Transformation class (#1765) + * Cope with UC2 fonts in text_extraction (#1785) + Robustness (ROB) + * Invalid startxref pointing 1 char before (#1784) + Maintenance (MAINT) + * Mark code handling old parameters as deprecated (#1798) +- Version 3.7.1, 2023-04-09 + Security (SEC) + * Warn about PDF encryption security (#1755) + Robustness (ROB) + * Prevent loop in Cloning (#1770) + * Capture UnicodeDecodeError at PdfReader.pdf_header (#1768) + Documentation (DOC) + * Add .readthedocs.yaml and bump docs dependencies using `tox -e deps` (#1750, #1752) + Developer Experience (DEV) + * Make make_changelog.py idempotent + Maintenance (MAINT) + * Move generation of file identifiers to a method (#1760) + Testing (TST) + * Add xmp test (#1775) +- Version 3.7.0, 2023-03-26 + Security (SEC) + * Use Python's secrets module instead of random module (#1748) + New Features (ENH) + * Add AnnotationBuilder.highlight text markup annotation (#1740) + * Add AnnotationBuilder.popup (#1665) + * Add 
AnnotationBuilder.polyline annotation support (#1726) + * Add clone_from parameter in PdfWriter constructor (#1703) + Bug Fixes (BUG) + * 'DictionaryObject' object has no attribute 'indirect_reference' (#1729) + Robustness (ROB) + * Handle params NullObject in decode_stream_data (#1738) + Documentation (DOC) + * Project scope (#1743) + Maintenance (MAINT) + * Add AnnotationFlag (#1746) + * Add LazyDict.__str__ (#1727) +- Version 3.6.0, 2023-03-18 + New Features (ENH) + * Extend PdfWriter.append() to PageObjects (#1704) + * Support qualified names in update_page_form_field_values (#1695) + Robustness (ROB) + * Tolerate streams without length field (#1717) + * Accept DictionaryObject in /D of NamedDestination (#1720) + * Widths def in cmap calls IndirectObject (#1719) +- Version 3.5.2, 2023-03-12 + ⚠️ We discovered that compress_content_stream has to be applied to a page of + the PdfWriter. It may not be applied to a page of the PdfReader! + Bug Fixes (BUG) + * compress_content_stream not readable in Adobe Acrobat (#1698) + * Pass logging parameters correctly in set_need_appearances_writer (#1697) + * Write /Root/AcroForm in set_need_appearances_writer (#1639) + Robustness (ROB) + * Allow more whitespaces within linearized file (#1701) +- Version 3.5.1, 2023-03-05 + Robustness (ROB) + * Some attributes not copied in DictionaryObject._clone (#1635) + * Allow merging multiple time pages with annots (#1624) + Testing (TST) + * Replace pytest.mark.external by enable_socket (#1657) +- Version 3.5.0, 2023-02-26 + New Features (ENH) + * Add reader.attachments public interface (#1611, #1661) + * Add PdfWriter.remove_objects_from_page(page: PageObject, to_delete: ObjectDeletionFlag) (#1648) + * Allow free-text annotation to have transparent border/background (#1664) + Bug Fixes (BUG) + * Allow decryption with empty password for AlgV5 (#1663) + * Let PdfWriter.pages return PageObject after calling `clone_document_from_reader()` (#1613) + * Invalid font pointed during 
merge_resources (#1641) + Robustness (ROB) + * Cope with invalid objects in IndirectObject.clone (#1637) + * Improve tolerance to invalid Names/Dests (#1658) + * Decode encoded values in get_fields (#1636) + * Let PdfWriter.merge cope with missing "/Fields" (#1628) + - Version 3.4.1, 2023-02-12 + Bug Fixes (BUG) + * Switch from trimbox to cropbox when merging pages (#1622) + * Text extraction not working with one glyph to char sequence (#1620) + Robustness (ROB) + * Fix 2 cases of "object has no attribute \'indirect_reference\'" (#1616) + Testing (TST) + * Add multiple retry on get_url for external PDF downloads (#1626) +- Version 3.4.0, 2023-02-05 + NOTICE: pypdf changed the way it represents numbers parsed from PDF files. + pypdf<3.4.0 represented numbers as Decimal, pypdf>=3.4.0 represents them as + floats. Several other PDF libraries do this, as well as many PDF viewers. + We hope to fix issues with too high precision like this and get a speed boost. + In case your PDF documents rely on more than 18 decimals of precision you + should check if it still works as expected. + To clarify: This does not affect the text shown in PDF documents. It affects + numbers, e.g. when graphics are drawn on the PDF or very exact positions are + used. Typically, 5 decimals should be enough. 
+ New Features (ENH) + * Enable merging forms with overlapping names (#1553) + * Add 'over' parameter to merge_transformed_page & co (#1567) + Bug Fixes (BUG) + * Fix getter of the PageObject.rotation property with an indirect object (#1602) + * Restore merge_transformed_page & co (#1567) + * Replace decimal by float (#1563) + Robustness (ROB) + * PdfWriter.remove_images: /Contents might not be in page_ref (#1598) + Developer Experience (DEV) + * Introduce ruff (#1586, #1609) + Maintenance (MAINT) + * Remove decimal (#1608) + - Version 3.3.0, 2023-01-22 + New Features (ENH) + * Add page label support to PdfWriter (#1558) + * Accept inline images with space before EI (#1552) + * Add circle annotation support (#1556) + * Add polygon annotation support (#1557) + * Make merging pages produce a deterministic PDF (#1542, #1543) + Bug Fixes (BUG) + * Fix error in cmap extraction (#1544) + * Remove erroneous assertion check (#1564) + * Fix dictionary access of optional page label keys (#1562) + Robustness (ROB) + * Set ignore_eof=True for read_until_regex (#1521) + Documentation (DOC) + * Paper size (#1550) + Developer Experience (DEV) + * Fix broken combination of dependencies of docs.txt + * Annotate tests appropriately (#1551) + - Version 3.2.1, 2023-01-08 + Bug Fixes (BUG) + * Accept hierarchical fields (#1529) + Documentation (DOC) + * Use google style docstrings (#1534) + * Fix linked markdown documents (#1537) + Developer Experience (DEV) + * Update docs config (#1535) + - Version 3.2.0, 2022-12-31 + Performance Improvement (PI) + * Help the specializing adaptive interpreter (#1522) + New Features (ENH) + * Add support for page labels (#1519) + Bug Fixes (BUG) + * upgrade clone_document_root (#1520) + - Version 3.1.0, 2022-12-23 + Move PyPDF2 to pypdf (#1513). The name is now all lowercase, with no number in + it, for installation and for import. PyPDF2 will no longer receive updates. + The community should move back to its roots. 
+ If you were still using pyPdf or PyPDF2 < 2.0.0, I recommend reading the + migration guide: https://pypdf.readthedocs.io/en/latest/user/migration-1-to-2.html + pypdf==3.1.0 is only different from PyPDF2==3.0.0 in the package name. + Replacing "PyPDF2" by "pypdf" should be enough if you migrate from + `PyPDF2==3.0.0` to `pypdf==3.1.0`. +- Version 3.0.0, 2022-12-22 + BREAKING CHANGES ⚠️ + * Deprecate features with PyPDF2==3.0.0 (#1489) + * Refactor Fit / Zoom parameters (#1437) + New Features (ENH) + * Add Cloning (#1371) + * Allow int for indirect_reference in PdfWriter.get_object (#1490) + Documentation (DOC) + * How to read PDFs from S3 (#1509) + * Make MyST parse all links as simple hyperlinks (#1506) + * Changed 'latest' for 'stable' generated docs (#1495) + * Adjust deprecation procedure (#1487) + Maintenance (MAINT) + * Use typing.IO for file streams (#1498) +- Version 2.12.1, 2022-12-10 + Documentation (DOC) + * Deduplicate extract_text docstring (#1485) + * How to cite PyPDF2 (#1476) + Maintenance (MAINT) + Consistency changes: + * indirect_ref/ido ➔ indirect_reference, dest➔ page_destination (#1467) + * owner_pwd/user_pwd ➔ owner_password/user_password (#1483) + * position ➜ page_number in Merger.merge (#1482) + * indirect_ref ➜ indirect_reference (#1484) +- Version 2.12.0, 2022-12-10 + New Features (ENH) + * Add support to extract gray scale images (#1460) + * Add 'threads' property to PdfWriter (#1458) + * Add 'open_destination' property to PdfWriter (#1431) + * Make PdfReader.get_object accept integer arguments (#1459) + Bug Fixes (BUG) + * Scale PDF annotations (#1479) + Robustness (ROB) + * Padding issue with AES encryption (#1469) + * Accept empty object as null objects (#1477) + Documentation (DOC) + * Add module documentation the PaperSize class (#1447) + Maintenance (MAINT) + * Use 'page_number' instead of 'pagenum' (#1365) + * Add List of pages to PageRangeSpec (#1456) + Testing (TST) + * Cleanup temporary files (#1454) + * Mark 
test_tounicode_is_identity as external (#1449) + * Use Ubuntu 20.04 for running CI test suite (#1452) +- Version 2.11.2, 2022-11-20 + New Features (ENH) + * Add remove_from_tree (#1432) + * Add AnnotationBuilder.rectangle (#1388) + Bug Fixes (BUG) + * JavaScript executed twice (#1439) + * ToUnicode stores /Identity-H instead of stream (#1433) + * Declare Pillow as optional dependency (#1392) + Developer Experience (DEV) + * Modify read_string_from_stream to a benchmark (#1415) + * Improve error reporting of read_object (#1412) + * Test Python 3.11 (#1404) + * Extend Flake8 ignore list (#1410) + * Use correct pytest markers (#1407) + * Move project configuration to pyproject.toml (#1382) + - Version 2.11.1, 2022-10-09 + Bug Fixes (BUG) + * td matrix (#1373) + * Cope with cmap from #1322 (#1372) + Robustness (ROB) + * Cope with str returned from get_data in cmap (#1380) +- Version 2.11.0, 2022-09-25 + New Features (ENH) + * Addition of optional visitor-functions in extract_text() (#1252) + * Add metadata.creation_date and modification_date (#1364) + * Add PageObject.images attribute (#1330) + Bug Fixes (BUG) + * Lookup index in _xobj_to_image can be ByteStringObject (#1366) + * 'IndexError: index out of range' when using extract_text (#1361) + * Errors in transfer_rotation_to_content() (#1356) + Robustness (ROB) + * Ensure update_page_form_field_values does not fail if no fields (#1346) + - Version 2.10.9, 2022-09-18 + New Features (ENH) + * Add rotation property and transfer_rotate_to_content (#1348) + Performance Improvements (PI) + * Avoid string concatenation with large embedded base64-encoded images (#1350) + Bug Fixes (BUG) + * Format floats using their intrinsic decimal precision (#1267) + Robustness (ROB) + * Fix merge_page for pages without resources (#1349) +- Version 2.10.8, 2022-09-14 + New Features (ENH) + * Add PageObject.user_unit property (#1336) + Robustness (ROB) + * Improve NameObject reading/writing (#1345) +- Version 2.10.7, 2022-09-11 + Bug 
Fixes (BUG) + * Fix Error in transformations (#1341) + * Decode #23 in NameObject (#1342) + Testing (TST) + * Use pytest.warns() for warnings, and .raises() for exceptions (#1325) +- Version 2.10.6, 2022-09-09 + Robustness (ROB) + * Fix infinite loop due to Invalid object (#1331) + * Fix image extraction issue with superfluous whitespaces (#1327) +- Version 2.10.5, 2022-09-04 + New Features (ENH) + * Process XRefStm (#1297) + * Auto-detect RTL for text extraction (#1309) + Bug Fixes (BUG) + * Avoid scaling cropbox twice (#1314) + Robustness (ROB) + * Fix offset correction in revised PDF (#1318) + * Crop data of /U and /O in encryption dictionary to 48 bytes (#1317) + * MultiLine bfrange in cmap (#1299) + * Cope with 2 digit codes in bfchar (#1310) + * Accept '/annn' charset as ASCII code (#1316) + * Log errors during Float / NumberObject initialization (#1315) + * Cope with corrupted entries in xref table (#1300) + Documentation (DOC) + * Migration guide (PyPDF2 1.x ➔ 2.x) (#1324) + * Creating a coverage report (#1319) + * Fix AnnotationBuilder.free_text example (#1311) + * Fix usage of page.scale by replacing it with page.scale_by (#1313) + Maintenance (MAINT) + * PdfReaderProtocol (#1303) + * Throw PdfReadError if Trailer can't be read (#1298) + * Remove catching OverflowException (#1302) + - Version 2.10.4, 2022-08-28 + Robustness (ROB) + * Fix errors/warnings on no /Resources within extract_text (#1276) + * Add required line separators in ContentStream ArrayObjects (#1281) + Maintenance (MAINT) + * Use NameObject idempotency (#1290) + Testing (TST) + * Rectangle deletion (#1289) + * Add workflow tests (#1287) + * Remove files after tests ran (#1286) + Packaging (PKG) + * Add minimum version for typing_extensions requirement (#1277) +- Version 2.10.3, 2022-08-21 + Robustness (ROB) + * Decrypt returns empty bytestring (#1258) + Developer Experience (DEV) + * Modify CI to better verify built package contents (#1244) + Maintenance (MAINT) + * Remove 'mine' as 
PdfMerger always creates the stream (#1261) + * Let PdfMerger._create_stream raise NotImplemented (#1251) + * password param of _security._alg32(...) is only a string, not bytes (#1259) + * Remove unreachable code in read_block_backwards (#1250) + and sign function in _extract_text (#1262) + Testing (TST) + * Delete annotations (#1263) + * Close PdfMerger in tests (#1260) + * PdfReader.xmp_metadata workflow (#1257) + * Various PdfWriter (Layout, Bookmark deprecation) (#1249) +- Version 2.10.2, 2022-08-15 + BUG: Add PyPDF2.generic to PyPI distribution +- Version 2.10.1, 2022-08-15 + Bug Fixes (BUG) + * TreeObject.remove_child had a non-PdfObject assignment for Count (#1233, #1234) + * Fix stream truncated prematurely (#1223) + Documentation (DOC) + * Fix docstring formatting (#1228) + Maintenance (MAINT) + * Split generic.py (#1229) + Testing (TST) + * Decrypt AlgV4 with owner password (#1239) + * AlgV5.generate_values (#1238) + * TreeObject.remove_child / empty_tree (#1235, #1236) + * create_string_object (#1232) + * Free-Text annotations (#1231) + * generic._base (#1230) + * Strict get fonts (#1226) + * Increase PdfReader coverage (#1219, #1225) + * Increase PdfWriter coverage (#1237) + * 100% coverage for utils.py (#1217) + * PdfWriter exception non-binary stream (#1218) + * Don't check coverage for deprecated code (#1216) +- Version 2.10.0, 2022-08-07 + New Features (ENH) + * "with" support for PdfMerger and PdfWriter (#1193) + * Add AnnotationBuilder.text(...) 
to build text annotations (#1202) + Bug Fixes (BUG) + * Allow IndirectObjects as stream filters (#1211) + Documentation (DOC) + * Font scrambling + * Page vs Content scaling (#1208) + * Example for orientation parameter of extract_text (#1206) + * Fix AnnotationBuilder parameter formatting (#1204) + Developer Experience (DEV) + * Add flake8-print (#1203) + Maintenance (MAINT) + * Introduce WrongPasswordError / FileNotDecryptedError / EmptyFileError (#1201) +- Version 2.9.0, 2022-07-31 + New Features (ENH) + * Add ability to add hex encoded colors to outline items (#1186) + * Add support for pathlib.Path in PdfMerger.merge (#1190) + * Add link annotation (#1189) + * Add capability to filter text extraction by orientation (#1175) + Bug Fixes (BUG) + * Named Dest in PDF1.1 (#1174) + * Incomplete Graphic State save/restore (#1172) + Documentation (DOC) + * Update changelog url in package metadata (#1180) + * Mention camelot for table extraction (#1179) + * Mention pyHanko for signing PDF documents (#1178) + * We have had CMAP support for a while (#1177) + Maintenance (MAINT) + * Consistent usage of warnings / log messages (#1164) + * Consistent terminology for outline items (#1156) +- Version 2.8.1, 2022-07-25 + Bug Fixes (BUG) + * u_hash in AlgV4.compute_key (#1170) + Robustness (ROB) + * Fix loading of file from #134 (#1167) + * Cope with empty DecodeParams (#1165) + Documentation (DOC) + * Typo in merger deprecation warning message (#1166) + Maintenance (MAINT) + * Package updates; solve mypy strict remarks (#1163) + Testing (TST) + * Add test from #325 (#1169) + +------------------------------------------------------------------- +Fri Aug 25 14:08:04 UTC 2023 - ecsos + +- Add %{?sle15_python_module_pythons} + +------------------------------------------------------------------- +Thu Oct 27 21:04:36 UTC 2022 - Yogalakshmi Arunachalam + +- Update to version 2.11.1 + Bug Fixes (BUG) + * td matrix (#1373) + * Cope with cmap from #1322 (#1372) + Robustness (ROB) + * 
Cope with str returned from get_data in cmap (#1380) + Full Changelog: https://github.com/py-pdf/PyPDF2/compare/2.11.0…2.11.1 + +------------------------------------------------------------------- +Wed Oct 12 02:36:06 UTC 2022 - Yogalakshmi Arunachalam + +- Update to version 2.11.0 + * New Features (ENH) + Addition of optional visitor-functions in extract_text() (#1252) + Add metadata.creation_date and modification_date (#1364) + Add PageObject.images attribute (#1330) + * Bug Fixes (BUG) + Lookup index in _xobj_to_image can be ByteStringObject (#1366) + ‘IndexError: index out of range’ when using extract_text (#1361) + Errors in transfer_rotation_to_content() (#1356) + * Robustness (ROB) + Ensure update_page_form_field_values does not fail if no fields (#1346) + Full Changelog: https://github.com/py-pdf/PyPDF2/compare/2.10.9…2.11.0 + +------------------------------------------------------------------- +Wed Sep 7 18:19:10 UTC 2022 - Yogalakshmi Arunachalam + +- Spec changes: + Changed the source to github + Renamed CHANGELOG to CHANGELOG.md + +------------------------------------------------------------------- +Wed Sep 7 16:36:36 UTC 2022 - Yogalakshmi Arunachalam + +- Update to version 2.6.0: + New Features (ENH): + - Add color and font_format to PdfReader.outlines[i] (#1104) + - Extract Text Enhancement (whitespaces) (#1084) + Bug Fixes (BUG): + - Use `build_destination` for named destination outlines (#1128) + - Avoid a crash when a ToUnicode CMap has an empty dstString in beginbfchar (#1118) + - Prevent deduplication of PageObject (#1105) + - None-check in DictionaryObject.read_from_stream (#1113) + - Avoid IndexError in _cmap.parse_to_unicode (#1110) + Documentation (DOC): + - Explanation for git submodule + - Watermark and stamp (#1095) + Maintenance (MAINT): + - Text extraction improvements (#1126) + - Destination.color returns ArrayObject instead of tuple as fallback (#1119) + - Use add_bookmark_destination in add_bookmark (#1100) + - Use add_bookmark_destination 
in add_bookmark_dict (#1099) + Testing (TST): + - Remove xfail from test_outline_title_issue_1121 + - Add test for arab text (#1127) + - Add xfail for decryption fail (#1125) + - Add xfail test for IndexError when extracting text (#1124) + - Add MCVE showing outline title issue (#1123) + Code Style (STY): + - Apply black and isort + - Use IntFlag for permissions_flag / update_page_form_field_values (#1094) + - Simplify code (#1101) + +- Update to version 2.5.0: + New Features (ENH): + - Add PageObject._get_fonts (#1083) + - Add support for indexed color spaces / BitsPerComponent for decoding PNGs (#1067) + Performance Improvements (PI): + - Use iterative DFS in PdfWriter._sweep_indirect_references (#1072) + Bug Fixes (BUG): + - Let Page.scale also scale the crop-/trim-/bleed-/artbox (#1066) + - Column default for CCITTFaxDecode (#1079) + Robustness (ROB): + - Guard against None-value in _get_outlines (#1060) + Documentation (DOC): + - Stamps and watermarks (#1082) + - OCR vs PDF text extraction (#1081) + - Python Version support + - Formatting of CHANGELOG + Developer Experience (DEV): + - Cache downloaded files (#1070) + - Speed-up for CI (#1069) + Maintenance (MAINT): + - Set page.rotate(angle: int) (#1092) + - Issue #416 was fixed by #1015 (#1078) + Testing (TST): + - Image extraction (#1080) + - Image extraction (#1077) + Code Style (STY): + - Apply black + - Typo in Changelog + +- Update to version 2.4.2: + New Features (ENH): + - Add PdfReader.xfa attribute (#1026) + Bug Fixes (BUG): + - Wrong page inserted when PdfMerger.merge is done (#1063) + - Resolve IndirectObject when it refers to a free entry (#1054) + Developer Experience (DEV): + - Added {posargs} to tox.ini (#1055) + Maintenance (MAINT): + - Remove PyPDF2._utils.bytes_type (#1053) + Testing (TST): + - Scale page (indirect rect object) (#1057) + - Simplify pathlib PdfReader test (#1056) + - IndexError of VirtualList (#1052) + - Invalid XML in xmp information (#1051) + - No pycryptodome (#1050) + - 
Increase test coverage (#1045) + Code Style (STY): + - DOC of compress_content_streams (#1061) + - Minimize diff for #879 (#1049) + +- Update to version 2.4.1: + New Features (ENH): + - Add writer.pdf_header property (getter and setter) (#1038) + Performance Improvements (PI): + - Remove b_ call in FloatObject.write_to_stream (#1044) + - Check duplicate objects in writer._sweep_indirect_references (#207) + Documentation (DOC): + - How to suppress exceptions/warnings/log messages (#1037) + - Remove hyphen from lossless (#1041) + - Compression of content streams (#1040) + - Fix inconsistent variable names in add-watermark.md (#1039) + - File size reduction + - Add CHANGELOG to the rendered docs (#1023) + Maintenance (MAINT): + - Handle XML error when reading XmpInformation (#1030) + - Deduplicate Code / add mutmut config (#1022) + Code Style (STY): + - Use unnecessary one-line function / class attribute (#1043) + - Docstring formatting (#1033) + +- Update to version 2.4.0: + New Features (ENH): + - Support R6 decrypting (#1015) + - Add PdfReader.pdf_header (#1013) + Performance Improvements (PI): + - Remove ord_ calls (#1014) + Bug Fixes (BUG): + - Fix missing page for bookmark (#1016) + Robustness (ROB): + - Deal with invalid Destinations (#1028) + Documentation (DOC): + - get_form_text_fields does not extract dropdown data (#1029) + - Adjust PdfWriter.add_uri docstring + - Mention crypto extra_requires for installation (#1017) + Developer Experience (DEV): + - Use \n line endings everywhere (#1027) + - Adjust string formatting to be able to use mutmut (#1020) + - Update Bug report template + +- Update to version 2.3.1: + BUG: Forgot to add the internal `_codecs` subpackage. + +- Update to version 2.3.0: + The highlight of this release is improved support for file encryption + (AES-128 and AES-256, R5 only). 
See #749 for the amazing work of + @exiledkingcc. Thank you! + Deprecations (DEP): + - Rename names to be PEP8-compliant (#967) + - `PdfWriter.get_page`: the pageNumber parameter is renamed to page_number + - `PyPDF2.filters`: + * For all classes, a parameter rename: decodeParms ➔ decode_parms + * decodeStreamData ➔ decode_stream_data + - `PyPDF2.xmp`: + * XmpInformation.rdfRoot ➔ XmpInformation.rdf_root + * XmpInformation.xmp_createDate ➔ XmpInformation.xmp_create_date + * XmpInformation.xmp_creatorTool ➔ XmpInformation.xmp_creator_tool + * XmpInformation.xmp_metadataDate ➔ XmpInformation.xmp_metadata_date + * XmpInformation.xmp_modifyDate ➔ XmpInformation.xmp_modify_date + * XmpInformation.xmpMetadata ➔ XmpInformation.xmp_metadata + * XmpInformation.xmpmm_documentId ➔ XmpInformation.xmpmm_document_id + * XmpInformation.xmpmm_instanceId ➔ XmpInformation.xmpmm_instance_id + - `PyPDF2.generic`: + * readHexStringFromStream ➔ read_hex_string_from_stream + * initializeFromDictionary ➔ initialize_from_dictionary + * createStringObject ➔ create_string_object + * TreeObject.hasChildren ➔ TreeObject.has_children + * TreeObject.emptyTree ➔ TreeObject.empty_tree + New Features (ENH): + - Add decrypt support for V5 and AES-128, AES-256 (R5 only) (#749) + Robustness (ROB): + - Fix corrupted (wrongly) linear PDF (#1008) + Maintenance (MAINT): + - Move PDF_Samples folder into resources + - Fix typos (#1007) + Testing (TST): + - Improve encryption/decryption test (#1009) + - Add merger test cases with real PDFs (#1006) + - Add mutmut config + Code Style (STY): + - Put pure data mappings in separate files (#1005) + - Make encryption module private, apply pre-commit (#1010) + +- Update to version 2.2.1: + Performance Improvements (PI): + - Remove b_ calls (#992, #986) + - Apply improvements to _utils suggested by perflint (#993) + Robustness (ROB): + - 'utf-16-be' codec can't decode (...) 
(#995) + Documentation (DOC): + - Remove reference to Scripts (#987) + Developer Experience (DEV): + - Fix type annotations for add_bookmarks (#1000) + Testing (TST): + - Add test for PdfMerger (#1001) + - Add tests for XMP information (#996) + - reader.get_fields / zlib issue / LZW decode issue (#1004) + - reader.get_fields with report generation (#1002) + - Improve test coverage by extracting texts (#998) + Code Style (STY): + - Apply fixes suggested by pylint (#999) + +- Update to version 2.2.0: + The 2.2.0 release improves text extraction again via (#969): + * Improvements around /Encoding / /ToUnicode + * Extraction of CMaps improved + * Fallback for missing font definitions + * Support for /Identity-H and /Identity-V: utf-16-be + * Support for /GB-EUC-H / /GB-EUC-V / /GBpc-EUC-H / /GBpc-EUC-V (beta release for evaluation) + * Arabic (for evaluation) + * Whitespace extraction improvements + These changes should mainly improve the text extraction for non-ASCII alphabets, + e.g. Russian / Chinese / Japanese / Korean / Arabic. 
+ +- Update to version 2.1.1: + New Features (ENH): + - Add support for pathlib as input for PdfReader (#979) + Performance Improvements (PI): + - Optimize read_next_end_line (#646) + Bug Fixes (BUG): + - Adobe Acrobat 'Would you like to save this file?' (#970) + Documentation (DOC): + - Notes on annotations (#982) + - Who uses PyPDF2 + - intendet ➔ in robustness page (#958) + Maintenance (MAINT): + - pre-commit / requirements.txt updates (#977) + - Mark read_next_end_line as deprecated (#965) + - Export `PageObject` in PyPDF2 root (#960) + Testing (TST): + - Add MCVE of issue #416 (#980) + - FlateDecode.decode decodeParms (#964) + - Xmp module (#962) + - utils.paeth_predictor (#959) + Code Style (STY): + - Use more tuples and list/dict comprehensions (#976) + +- Update to version 2.1.0: + The highlight of the 2.1.0 release is the most massive improvement to the + text extraction capabilities of PyPDF2 since 2016. A very big thank you goes + to [pubpub-zz](https://github.com/pubpub-zz) who took a lot of time and + knowledge about the PDF format to finally get those improvements into PyPDF2. + Thank you! + In case the new function causes any issues, you can use `_extract_text_old` + for the old functionality. Please also open a bug ticket in that case. + There were several people who have attempted to bring similar improvements to + PyPDF2. All of those were valuable. The main reason why they didn't get merged + is the large number of open PRs / issues. pubpub-zz's was the most comprehensive + PR, which also incorporated the latest changes of PyPDF2 2.0.0. + Thank you to [VictorCarlquist](https://github.com/VictorCarlquist) for #858 and + [asabramo](https://github.com/asabramo) for #464. + New Features (ENH): + - Massive text extraction improvement (#924). 
Closed many open issues: + - Exceptions / missing spaces in extract_text() method (#17) + - Whitespace issues in extract_text() (#42) + - pypdf2 reads the hifenated words in a new line (#246) + - PyPDF2 failing to read unicode character (#37) + - Unable to read bullets (#230) + - ExtractText yields nothing for apparently good PDF (#168) + - Encoding issue in extract_text() (#235) + - extractText() doesn't work on Chinese PDF (#252) + - encoding error (#260) + - Trouble with apostophes in names in text "O'Doul" (#384) + - extract_text works for some PDF files, but not the others (#437) + - Euro sign not being recognized by extractText (#443) + - Failed extracting text from French texts (#524) + - extract_text doesn't extract ligatures correctly (#598) + - reading spanish text - mark convert issue (#635) + - Read PDF changed from text to random symbols (#654) + - .extractText() reads / as 1. (#789) + - Update glyphlist (#947) - inspired by #464 + - Allow adding PageRange objects (#948) + Bug Fixes (BUG): + - Delete .python-version file (#944) + - Compare StreamObject.decoded_self with None (#931) + Robustness (ROB): + - Fix some conversion errors on non conform PDF (#932) + Documentation (DOC): + - Elaborate on PDF text extraction difficulties (#939) + - Add logo (#942) + - rotate vs Transformation().rotate (#937) + - Example how to use PyPDF2 with AWS S3 (#938) + - How to deprecate (#930) + - Fix typos on robustness page (#935) + - Remove scripts (pdfcat) from docs (#934) + Developer Experience (DEV): + - Ignore .python-version file + - Mark deprecated code with no-cover (#943) + - Automatically create Github releases from tags (#870) + Testing (TST): + - Text extraction for non-latin alphabets (#954) + - Ignore PdfReadWarning in benchmark (#949) + - writer.remove_text (#946) + - Add test for Tree and _security (#945) + Code Style (STY): + - black, isort, Flake8, splitting buildCharMap (#950) + +- Update to version 2.0.0: + The 2.0.0 
release of PyPDF2 includes three core changes: + 1. Dropping support for Python 3.5 and older. + 2. Introducing type annotations. + 3. Interface changes, mostly to have PEP8-compliant names + We introduced a [deprecation process](#930) + that hopefully helps users to avoid unexpected breaking changes. + Breaking Changes (DEP): + - PyPDF2 2.0 requires Python 3.6+. Python 2.7 and 3.5 support were dropped. + - PdfFileReader: The "warndest" parameter was removed + - PdfFileReader and PdfFileMerger no longer have the `overwriteWarnings` + parameter. The new behavior is `overwriteWarnings=False`. + - merger: OutlinesObject was removed without replacement. + - merger.py ➔ _merger.py: You must import PdfFileMerger from PyPDF2 directly. + - utils: + * `ConvertFunctionsToVirtualList` was removed + * `formatWarning` was removed + * `isInt(obj)`: Use `isinstance(obj, int)` instead + * `u_(s)`: Use `s` directly + * `chr_(c)`: Use `chr(c)` instead + * `barray(b)`: Use `bytearray(b)` instead + * `isBytes(b)`: Use `isinstance(b, type(bytes()))` instead + * `xrange_fn`: Use `range` instead + * `string_type`: Use `str` instead + * `isString(s)`: Use `isinstance(s, str)` instead + * `_basestring`: Use `str` instead + * All Exceptions are now in `PyPDF2.errors`: + - PageSizeNotDefinedError + - PdfReadError + - PdfReadWarning + - PyPdfError + - `PyPDF2.pdf` (the `pdf` module) no longer exists. The contents were moved within + the library. You should most likely import directly from `PyPDF2` instead. + The `RectangleObject` is in `PyPDF2.generic`. + - The `Resources`, `Scripts`, and `Tests` will no longer be part of the distribution + files on PyPI. This should have little to no impact on most people. The + `Tests` are renamed to `tests`, the `Resources` are renamed to `resources`. + Both are still in the git repository. The `Scripts` are now in + https://github.com/py-pdf/cpdf. `Sample_Code` was moved to the `docs`. 
+ For a full list of deprecated functions, please see the changelog of version 1.28.0. + New Features (ENH): + - Improve space setting for text extraction (#922) + - Allow setting the decryption password in PdfReader.__init__ (#920) + - Add Page.add_transformation (#883) + Bug Fixes (BUG): + - Fix error adding transformation to page without /Contents (#908) + Robustness (ROB): + - Cope with invalid length in streams (#861) + Documentation (DOC): + - Fix style of 1.25 and 1.27 patch notes (#927) + - Transformation (#907) + Developer Experience (DEV): + - Create flake8 config file (#916) + - Use relative imports (#875) + Maintenance (MAINT): + - Use Python 3.6 language features (#849) + - Add wrapper function for PendingDeprecationWarnings (#928) + - Use new PEP8 compliant names (#884) + - Explicitly represent transformation matrix (#878) + - Inline PAGE_RANGE_HELP string (#874) + - Remove unnecessary generics imports (#873) + - Remove star imports (#865) + - merger.py ➔ _merger.py (#864) + - Type annotations for all functions/methods (#854) + - Add initial type support with mypy (#853) + Testing (TST): + - Regression test for xmp_metadata converter (#923) + - Checkout submodule sample-files for benchmark + - Add text extracting performance benchmark + - Use new PyPDF2 API in benchmark (#902) + - Make test suite fail for uncaught warnings (#892) + - Remove -OO testrun from CI (#901) + - Improve tests for convert_to_int (#899) + +- Update to version 1.28.4: + Bug Fixes (BUG): + - XmpInformation._converter_date was unusable (#921) + - Update to version 1.28.3: + Deprecations (DEP): + - PEP8 renaming (#905) + Bug Fixes (BUG): + - XmpInformation missing method _getText (#917) + - Fix PendingDeprecationWarning on _merge_page (#904) + +- Update to version 1.28.2: + Bug Fixes (BUG): + - PendingDeprecationWarning for getContents (#893) + - PendingDeprecationWarning on using PdfMerger (#891) + - Update to version 1.28.1: + Bug Fixes (BUG): + - Incorrectly show deprecation 
warnings on internal usage (#887) + Maintenance (MAINT): + - Add stacklevel=2 to deprecation warnings (#889) + - Remove duplicate warnings imports (#888) + +- Update to version 1.28.0: + This release adds a lot of deprecation warnings in preparation of the + PyPDF2 2.0.0 release. The changes are mostly using snake_case function-, method-, + and variable-names as well as using properties instead of getter-methods. + Maintenance (MAINT): + - Remove IronPython Fallback for zlib (#868) + * Make the `PyPDF2.utils` module private + * Rename of core classes: + * PdfFileReader ➔ PdfReader + * PdfFileWriter ➔ PdfWriter + * PdfFileMerger ➔ PdfMerger + * Use PEP8 conventions for function names and parameters + * If a property and a getter-method are both present, use the property + In many places: + - getObject ➔ get_object + - writeToStream ➔ write_to_stream + - readFromStream ➔ read_from_stream + PyPDF2.generic + - readObject ➔ read_object + - convertToInt ➔ convert_to_int + - DocumentInformation.getText ➔ DocumentInformation._get_text : + This method should typically not be used; please let me know if you need it. + PdfReader class: + - `reader.getPage(pageNumber)` ➔ `reader.pages[page_number]` + - `reader.getNumPages()` / `reader.numPages` ➔ `len(reader.pages)` + - getDocumentInfo ➔ metadata + - flattenedPages attribute ➔ flattened_pages + - resolvedObjects attribute ➔ resolved_objects + - xrefIndex attribute ➔ xref_index + - getNamedDestinations / namedDestinations attribute ➔ named_destinations + - getPageLayout / pageLayout ➔ page_layout attribute + - getPageMode / pageMode ➔ page_mode attribute + - getIsEncrypted / isEncrypted ➔ is_encrypted attribute + - getOutlines ➔ get_outlines + - readObjectHeader ➔ read_object_header (TODO: read vs get?) + - cacheGetIndirectObject ➔ cache_get_indirect_object (TODO: public vs private?) + - cacheIndirectObject ➔ cache_indirect_object (TODO: public vs private?) 
+ - getDestinationPageNumber ➔ get_destination_page_number + - readNextEndLine ➔ read_next_end_line + - _zeroXref ➔ _zero_xref + - _authenticateUserPassword ➔ _authenticate_user_password + - _pageId2Num attribute ➔ _page_id2num + - _buildDestination ➔ _build_destination + - _buildOutline ➔ _build_outline + - _getPageNumberByIndirect(indirectRef) ➔ _get_page_number_by_indirect(indirect_ref) + - _getObjectFromStream ➔ _get_object_from_stream + - _decryptObject ➔ _decrypt_object + - _flatten(..., indirectRef) ➔ _flatten(..., indirect_ref) + - _buildField ➔ _build_field + - _checkKids ➔ _check_kids + - _writeField ➔ _write_field + - _write_field(..., fieldAttributes) ➔ _write_field(..., field_attributes) + - _read_xref_subsections(..., getEntry, ...) ➔ _read_xref_subsections(..., get_entry, ...) + PdfWriter class: + - `writer.getPage(pageNumber)` ➔ `writer.pages[page_number]` + - `writer.getNumPages()` ➔ `len(writer.pages)` + - addMetadata ➔ add_metadata + - addPage ➔ add_page + - addBlankPage ➔ add_blank_page + - addAttachment(fname, fdata) ➔ add_attachment(filename, data) + - insertPage ➔ insert_page + - insertBlankPage ➔ insert_blank_page + - appendPagesFromReader ➔ append_pages_from_reader + - updatePageFormFieldValues ➔ update_page_form_field_values + - cloneReaderDocumentRoot ➔ clone_reader_document_root + - cloneDocumentFromReader ➔ clone_document_from_reader + - getReference ➔ get_reference + - getOutlineRoot ➔ get_outline_root + - getNamedDestRoot ➔ get_named_dest_root + - addBookmarkDestination ➔ add_bookmark_destination + - addBookmarkDict ➔ add_bookmark_dict + - addBookmark ➔ add_bookmark + - addNamedDestinationObject ➔ add_named_destination_object + - addNamedDestination ➔ add_named_destination + - removeLinks ➔ remove_links + - removeImages(ignoreByteStringObject) ➔ remove_images(ignore_byte_string_object) + - removeText(ignoreByteStringObject) ➔ remove_text(ignore_byte_string_object) + - addURI ➔ add_uri + - addLink ➔ add_link + - getPage(pageNumber) ➔ 
get_page(page_number) + - getPageLayout / setPageLayout / pageLayout ➔ page_layout attribute + - getPageMode / setPageMode / pageMode ➔ page_mode attribute + - _addObject ➔ _add_object + - _addPage ➔ _add_page + - _sweepIndirectReferences ➔ _sweep_indirect_references + PdfMerger class: + - `__init__` parameter: strict=True ➔ strict=False (the PdfFileMerger still has the old default) + - addMetadata ➔ add_metadata + - addNamedDestination ➔ add_named_destination + - setPageLayout ➔ set_page_layout + - setPageMode ➔ set_page_mode + Page class: + - artBox / bleedBox/ cropBox/ mediaBox / trimBox ➔ artbox / bleedbox/ cropbox/ mediabox / trimbox + - getWidth, getHeight ➔ width / height + - getLowerLeft_x / getUpperLeft_x ➔ left + - getUpperRight_x / getLowerRight_x ➔ right + - getLowerLeft_y / getLowerRight_y ➔ bottom + - getUpperRight_y / getUpperLeft_y ➔ top + - getLowerLeft / setLowerLeft ➔ lower_left property + - upperRight ➔ upper_right + - mergePage ➔ merge_page + - rotateClockwise / rotateCounterClockwise ➔ rotate_clockwise + - _mergeResources ➔ _merge_resources + - _contentStreamRename ➔ _content_stream_rename + - _pushPopGS ➔ _push_pop_gs + - _addTransformationMatrix ➔ _add_transformation_matrix + - _mergePage ➔ _merge_page + XmpInformation class: + - getElement(..., aboutUri, ...) ➔ get_element(..., about_uri, ...) + - getNodesInNamespace(..., aboutUri, ...) ➔ get_nodes_in_namespace(..., about_uri, ...) 
+ - _getText ➔ _get_text + utils.py: + - matrixMultiply ➔ matrix_multiply + - RC4_encrypt is moved to the security module + - Update to version 1.27.12: + Bug Fixes (BUG): + - _rebuild_xref_table expects trailer to be a dict (#857) + Documentation (DOC): + - Security Policy + +- Update to version 1.27.11: + Bug Fixes (BUG): + - Incorrectly issued xref warning/exception (#855) + +- Update to version 1.27.10: + Robustness (ROB): + - Handle missing destinations in reader (#840) + - warn-only in readStringFromStream (#837) + - Fix corruption in startxref or xref table (#788 and #830) + Documentation (DOC): + - Project Governance (#799) + - History of PyPDF2 + - PDF feature/version support (#816) + - More details on text parsing issues (#815) + Developer Experience (DEV): + - Add benchmark command to Makefile + - Ignore IronPython parts for code coverage (#826) + Maintenance (MAINT): + - Split pdf module (#836) + - Separated CCITTFax param parsing/decoding (#841) + - Update requirements files + Testing (TST): + - Use external repository for larger/more PDFs for testing (#820) + - Swap incorrect test names (#838) + - Add test for PdfFileReader and page properties (#835) + - Add tests for PyPDF2.generic (#831) + - Add tests for utils, form fields, PageRange (#827) + - Add test for ASCII85Decode (#825) + - Add test for FlateDecode (#823) + - Add test for filters.ASCIIHexDecode (#822) + Code Style (STY): + - Apply pre-commit (black, isort) + use snake_case variables (#832) + - Remove debug code (#828) + - Documentation, Variable names (#839) + +- Update to version 1.27.9: + A change I would like to highlight is the performance improvement for + large PDF files (#808) tada + New Features (ENH): + - Add papersizes (#800) + - Allow setting permission flags when encrypting (#803) + - Allow setting form field flags (#802) + Bug Fixes (BUG): + - TypeError in xmp._converter_date (#813) + - Improve spacing for text extraction (#806) + - Fix PDFDocEncoding Character Set (#809) + 
Robustness (ROB): + - Use null ID when encrypted but no ID given (#812) + - Handle recursion error (#804) + Documentation (DOC): + - CMaps (#811) + - The PDF Format + commit prefixes (#810) + - Add compression example (#792) + Developer Experience (DEV): + - Add Benchmark for Performance Testing (#781) + Maintenance (MAINT): + - Validate PDF magic byte in strict mode (#814) + - Make PdfFileMerger.addBookmark() behave like PdfFileWriter's (#339) + - Quadratic runtime while parsing reduced to linear (#808) + Testing (TST): + - Newlines in text extraction (#807) + +- Update to version 1.27.8: + Bug Fixes (BUG): + - Use 1MB as offset for readNextEndLine (#321) + - 'PdfFileWriter' object has no attribute 'stream' (#787) + Robustness (ROB): + - Invalid float object; use 0 as fallback (#782) + Documentation (DOC): + - Robustness (#785) + - Update to version 1.27.7: + Bug Fixes (BUG): + - Import exceptions from PyPDF2.errors in PyPDF2.utils (#780) + Code Style (STY): + - Naming in 'make_changelog.py' + - Update to version 1.27.6: + Deprecations (DEP): + - Remove support for Python 2.6 and older (#776) + New Features (ENH): + - Extract document permissions (#320) + Bug Fixes (BUG): + - Clip by trimBox when merging pages, which would otherwise be ignored (#240) + - Add overwriteWarnings parameter PdfFileMerger (#243) + - IndexError for getPage() of decrypted file (#359) + - Handle cases where decodeParms is an ArrayObject (#405) + - Updated PDF fields don't show up when page is written (#412) + - Set Linked Form Value (#414) + - Fix zlib -5 error for corrupt files (#603) + - Fix reading more than last 1K for EOF (#642) + - Accidental import + Robustness (ROB): + - Allow extra whitespace before "obj" in readObjectHeader (#567) + Documentation (DOC): + - Link to pdftoc in Sample_Code (#628) + - Working with annotations (#764) + - Structure history + Developer Experience (DEV): + - Add issue templates (#765) + - Add tool to generate changelog + Maintenance (MAINT): + - Use 
grouped constants instead of string literals (#745) + - Add error module (#768) + - Use decorators for @staticmethod (#775) + - Split long functions (#777) + Testing (TST): + - Run tests in CI once with -OO Flags (#770) + - Filling out forms (#771) + - Add tests for Writer (#772) + - Error cases (#773) + - Check Error messages (#769) + - Regression test for issue #88 + - Regression test for issue #327 +Code Style (STY): + - Make variable naming more consistent in test + - Update to version 1.27.5: + Security (SEC): + - ContentStream_readInlineImage had potential infinite loop (#740) + Bug fixes (BUG): + - Fix merging encrypted files (#757) + - CCITTFaxDecode decodeParms can be an ArrayObject (#756) + Robustness improvements (ROBUST): + - title sometimes None (#744) + Documentation (DOC): + - Adjust short description of the package + Tests and Test setup (TST): + - Rewrite JS tests from unittest to pytest (#746) + - Increase Test coverage, mainly with filters (#756) + - Add test for inline images (#758) + Developer Experience Improvements (DEV): + - Remove unused Travis-CI configuration (#747) + - Show code coverage (#754, #755) + - Add mutmut (#760) + Miscellaneous: + - STY: Closing file handles, explicit exports, ... 
(#743) + +- Update to version 1.27.4: + Bug fixes (BUG): + - Guard formatting of __init__.__doc__ string (#738) + Packaging (PKG): + - Add more precise license field to setup (#733) + Testing (TST): + - Add test for issue #297 + Miscellaneous: + - DOC: Miscallenious ➔ Miscellaneous (Typo) + - TST: Fix CI triggering (master ➔ main) (#739) + - STY: Fix various style issues (#742) + +- Update to version 1.27.3: + - PKG: Make Tests not a subpackage (#728) + - BUG: Fix ASCII85Decode.decode assertion (#729) + - BUG: Error in Chinese character encoding (#463) + - BUG: Code duplication in Scripts/2-up.py + - ROBUST: Guard 'obj.writeToStream' with 'if obj is not None' + - ROBUST: Ignore a /Prev entry with the value 0 in the trailer + - MAINT: Remove Sample_Code (#726) + - TST: Close file handle in test_writer (#722) + - TST: Fix test_get_images (#730) + - DEV: Make tox use pytest and add more Python versions (#721) + - DOC: Many (#720, #723-725, #469) + +- Update to version 1.27.2: + - Add Scripts (including `pdfcat`), Resources, Tests, and Sample_Code back to + PyPDF2. 
They were removed by accident in 1.27.0, but might get removed with 2.0.0 + See #718 for discussion + +- Update to version 1.27.0: + Features: + - Add alpha channel support for png files in Script (#614) + Bug fixes (BUG): + - Fix formatWarning for filename without slash (#612) + - Add whitespace between words for extractText() (#569, #334) + - "invalid escape sequence" SyntaxError (#522) + - Avoid error when printing warning in pythonw (#486) + - Stream operations can be List or Dict (#665) + Documentation (DOC): + - Added Scripts/pdf-image-extractor.py + - Documentation improvements (#550, #538, #324, #426, #394) + Tests and Test setup (TST): + - Add Github Action which automatically runs unit tests via pytest and + static code analysis with Flake8 (#660) + - Add several unit tests (#661, #663) + - Add .coveragerc to create coverage reports + Developer Experience Improvements (DEV): + - Pre commit: Developers can now `pre-commit install` to avoid tiny issues + like trailing whitespaces + Miscellaneous: + - Add the LICENSE file to the distributed packages (#288) + - Use setuptools instead of distutils (#599) + - Improvements for the PyPI page (#644) + - Python 3 changes (#504, #366) + +------------------------------------------------------------------- +Mon Oct 21 22:55:54 UTC 2019 - Simon Lees + +- change the copyright to 2019 + +------------------------------------------------------------------- +Thu Dec 6 13:22:02 UTC 2018 - Tomáš Chvátal + +- Fix fdupes call + +------------------------------------------------------------------- +Tue Dec 4 12:52:37 UTC 2018 - Matej Cepl + +- Remove superfluous devel dependency for noarch package + +------------------------------------------------------------------- +Mon May 14 10:11:40 UTC 2018 - tchvatal@suse.com + +- Use license macro + +------------------------------------------------------------------- +Thu Apr 20 04:22:33 UTC 2017 - sflees@suse.de + +- Convert to single spec +- Update to version 1.26.0 + * NOTE: Active 
maintenance on PyPDF2 is resuming after a hiatus + * Fixed a bug where image resources were incorrectly overwritten + when merging pages + * Added dictionary for JavaScript actions to the root (louib) + * Added unit tests for the JS functionality (louib) + * Add more Python 3 compatibility when reading inline images (im2703 + and VyacheslavHashov) + * Return NullObject instead of raising error when failing to resolve + object (ctate) + * Don't output warning for non-zeroed xref table when strict=False + (BenRussert) + * Remove extraneous zeroes from output formatting (speedplane) + * Fix bug where reading an inline image would cut off prematurely in + certain cases (speedplane) +- Changes for 1.25 +BUGFIXES: + * Added Python 3 algorithm for ASCII85Decode. Fixes issue when + reading reportlab-generated files with Py 3 (jerickbixly) + * Recognize more escape sequences which would otherwise throw an + exception (manuelzs, robertsoakes) + * Fixed overflow error in generic.py. Occurred + when reading a too-large int in Python 2 (by Raja Jamwal) + * Allow access to files which were encrypted with an empty + password. Previously threw a "File has not been decrypted" + exception (Elena Williams) + * Do not attempt to decode an empty data stream. Previously + would cause an error in decode algorithms (vladir) + * Fixed some type issues specific to Py 2 or Py 3 + * Fix issue when stream data begins with whitespace (soloma83) + * Recognize abbreviated filter names (AlmightyOatmeal and + Matthew Weiss) + * Copy decryption key from PdfFileReader to PdfFileMerger. + Allows usage of PdfFileMerger with encrypted files (twolfson) + * Fixed bug which occurred when a NameObject is present at end + of a file stream. Threw a "Stream has ended unexpectedly" + exception (speedplane) +FEATURES: + * Initial work on a test suite; to be expanded in future. 
+    Tests and Resources directory added, README updated (robertsoakes)
+  * Added document cloning methods to PdfFileWriter:
+    appendPagesFromReader, cloneReaderDocumentRoot, and
+    cloneDocumentFromReader. See official documentation (robertsoakes)
+  * Added method for writing to form fields: updatePageFormFieldValues.
+    This will be enhanced in the future. See official documentation
+    (robertsoakes)
+  * New addAttachment method. See documentation. Support for adding
+    and extracting embedded files to be enhanced in the future
+    (moshekaplan)
+  * Added methods to get page number of given PageObject or
+    Destination: getPageNumber and getDestinationPageNumber.
+    See documentation (mozbugbox)
+
+-------------------------------------------------------------------
+Mon May 11 18:00:56 UTC 2015 - benoit.monin@gmx.fr
+
+- update to version 1.24:
+  * Bugfixes for reading files in Python 3 (by Anthony Tuininga and
+    pqqp)
+  * Appropriate errors are now raised instead of infinite loops (by
+    naure and Cyrus Vafadari)
+  * Bugfix for parsing number tokens with leading spaces (by Maxim
+    Kamenkov)
+  * Don't crash on bad /Outlines reference (by eshellman)
+  * Conform tabs/spaces and blank lines to PEP 8 standards
+  * Utilize the readUntilRegex method when reading Number Objects
+    (by Brendan Jurd)
+  * More bugfixes for Python 3 and clearer exception handling
+  * Fixed encoding issue in merger (with eshellman)
+  * Created separate folder for scripts
+- additional changes from version 1.23:
+  * Documentation now available at http://pythonhosted.org//PyPDF2
+  * Bugfix in pagerange.py for when __init__.__doc__ has no value
+    (by Vladir Cruz)
+  * Fix typos in OutlinesObject().add() (by shilluc)
+  * Re-added a missing return statement in a utils.py method
+  * Corrected viewing mode names (by Jason Scheirer)
+  * New PdfFileWriter method: addJS() (by vfigueiro)
+  * New bookmark features: color, boldness, italics, and page fit
+    (by Joshua Arnott)
+  * New PdfFileReader method: getFields(). Used to extract field
+    information from PDFs with interactive forms. See documentation
+    for details
+  * Converted README file to markdown format (by Stephen Bussard)
+  * Several improvements to overall performance and efficiency (by
+    mozbugbox)
+  * Fixed a bug where geospatial information was not scaling along
+    with its page
+  * Fixed a type issue and a Python 3 issue in the decryption
+    algorithms (with Francisco Vieira and koba-ninkigumi)
+  * Fixed a bug causing an infinite loop in the ASCII 85 decoding
+    algorithm (by madmaardigan)
+  * Annotations (links, comment windows, etc.) are now preserved
+    when pages are merged together
+  * Used the Destination class in addLink() and addBookmark() so
+    that the page fit option could be properly customized
+- additional changes from version 1.22:
+  * Added .DS_Store to .gitignore (for Mac users) (by Steve Witham)
+  * Removed __init__() implementation in NameObject (by Steve
+    Witham)
+  * Fixed bug (inf. loop) when merging pages in Python 3 (by commx)
+  * Corrected error when calculating height in scaleTo()
+  * Removed unnecessary code from DictionaryObject (by Georges
+    Dubus)
+  * Fixed bug where an exception was thrown upon reading a NULL
+    string (by speedplane)
+  * Allow string literals (non-unicode strings in Python 2) to be
+    passed to PdfFileReader
+  * Allow ConvertFunctionsToVirtualList to be indexed with slices
+    and longs (in Python 2) (by Matt Gilson)
+  * Major improvements and bugfixes to addLink() method (see
+    documentation in source code) (by Henry Keiter)
+  * General code clean-up and improvements (with Steve Witham and
+    Henry Keiter)
+  * Fixed bug that caused crash when comments are present at end of
+    dictionary
+- additional changes from version 1.21:
+  * Fix for when /Type isn't present in the Pages dictionary (by
+    Rob1080)
+  * More tolerance for extra whitespace in Indirect Objects
+  * Improved Exception handling
+  * Fixed error in getHeight() method (by Simon Kaempflein)
+  * implement use of utils.string_type to resolve Py2-3
+    compatibility issues
+  * Prevent exception for multiple definitions in a dictionary
+    (with carlosfunk) (only when strict = False)
+  * Fixed errors when parsing a slice using pdfcat on command line
+    (by Steve Witham)
+  * Tolerance for EOF markers within 1024 bytes of the actual end
+    of the file (with David Wolever)
+  * Added overwriteWarnings parameter to PdfFileReader constructor,
+    if False PyPDF2 will NOT overwrite methods from Python's
+    warnings.py module with a custom implementation.
+  * Fix NumberObject and NameObject constructors for compatibility
+    with PyPy (Rüdiger Jungbeck, Xavier Dupré, shezadkhan137,
+    Steven Witham)
+  * Utilize utils.Str in pdf.py and pagerange.py to resolve type
+    issues (by egbutter)
+  * Improvements in implementing StringIO for Python 2 and BytesIO
+    for Python 3 (by Xavier Dupré)
+  * Added /x00 to Whitespaces, defined utils.WHITESPACES to clarify
+    code (by Maxim Kamenkov)
+  * Bugfix for merging 3 or more resources with the same name (by
+    lucky-user)
+  * Improvements to Xref parsing algorithm (by speedplane)
+- additional changes from version 1.20:
+  * Official Python 3+ support (with contributions from TWAC and
+    cgammans) Support for Python versions 2.6 and 2.7 will be
+    maintained
+  * Command line concatenation (see pdfcat in sample code) (by
+    Steve Witham)
+  * New FAQ; link included in README
+  * Allow more (although unnecessary) escape sequences
+  * Prevent exception when reading a null object in decoding
+    parameters
+  * Corrected error in reading destination types (added a slash
+    since they are name objects)
+  * Corrected TypeError in scaleTo() method
+  * addBookmark() method in PdfFileMerger now returns bookmark (so
+    nested bookmarks can be created)
+  * Additions to Sample Code and Sample PDFs
+  * changes to allow 2up script to work (see sample code) (by Dylan
+    McNamee)
+  * changes to metadata encoding (by Chris Hiestand)
+  * New methods for links: addLink() (by Enrico Lambertini) and
+    removeLinks()
+  * Bugfix to handle nested bookmarks correctly (by Jamie Lentin)
+  * New methods removeImages() and removeText() available for
+    PdfFileWriter (by Tien Haï)
+  * Exception handling for illegal characters in Name Objects
+- remove unwanted shebang in pagerange.py
+- rename README to README.md: changed upstream
+
+-------------------------------------------------------------------
+Tue Dec 3 10:52:18 UTC 2013 - cfarrell@suse.com
+
+- license update: BSD-3-Clause
+  See LICENSE
+
+-------------------------------------------------------------------
+Sun Nov 24 21:44:43 UTC 2013 - p.drouand@gmail.com
+
+- Initial release ( version 1.19 )
+
diff --git a/python-pypdf.spec b/python-pypdf.spec
new file mode 100644
index 0000000..ec8d045
--- /dev/null
+++ b/python-pypdf.spec
@@ -0,0 +1,69 @@
+#
+# spec file for package python-pypdf
+#
+# Copyright (c) 2024 SUSE LLC
+#
+# All modifications and additions to the file contributed by third parties
+# remain the property of their copyright owners, unless otherwise agreed
+# upon. The license for this file, and modifications and additions to the
+# file, is the same license as for the pristine package itself (unless the
+# license for the pristine package is not an Open Source License, in which
+# case the license is the MIT License). An "Open Source License" is a
+# license that conforms to the Open Source Definition (Version 1.9)
+# published by the Open Source Initiative.
+
+# Please submit bugfixes or comments via https://bugs.opensuse.org/
+#
+
+
+Name:           python-pypdf
+Version:        4.2.0
+Release:        0
+Summary:        PDF toolkit
+License:        BSD-3-Clause
+URL:            https://github.com/py-pdf/pypdf
+Source0:        https://github.com/py-pdf/pypdf/archive/refs/tags/%{version}.tar.gz#/%{name}-%{version}.tar.gz
+BuildRequires:  %{python_module flit}
+BuildRequires:  %{python_module pip}
+BuildRequires:  %{python_module setuptools}
+BuildRequires:  fdupes
+BuildArch:      noarch
+Provides:       python3-PyPDF2 = %version-%release
+Obsoletes:      python3-PyPDF2 < %version-%release
+
+%python_subpackages
+
+%description
+A Pure-Python library built as a PDF toolkit. It is capable of:
+
+- extracting document information (title, author, ...),
+- splitting documents page by page,
+- merging documents page by page,
+- cropping pages,
+- merging multiple pages into a single page,
+- encrypting and decrypting PDF files.
+
+By being Pure-Python, it should run on any Python platform without any
+dependencies on external libraries. It can also work entirely on StringIO
+objects rather than file streams, allowing for PDF manipulation in memory.
+It is therefore a useful tool for websites that manage or manipulate PDFs.
+
+%prep
+%autosetup -n pypdf-%{version}
+
+%build
+%pyproject_wheel
+
+%install
+%pyproject_install
+%fdupes %{buildroot}%{python_sitelib}/pypdf
+
+# no checks possible as large pdf downloaded from the internet are necessary
+
+%files %{python_files}
+%license LICENSE
+%doc CHANGELOG.md
+%{python_sitelib}/pypdf
+%{python_sitelib}/pypdf-%{version}*-info
+
+%changelog