diff --git a/PyMuPDF-1.18.9.tar.gz b/PyMuPDF-1.18.9.tar.gz new file mode 100644 index 0000000..f2f4196 --- /dev/null +++ b/PyMuPDF-1.18.9.tar.gz @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cb6ba6c5038ce590a088b9bf320c6e9ce714c1fa304181ece8b551d8589a8b21 +size 308451 diff --git a/PyMuPDF-1.19.6.tar.gz b/PyMuPDF-1.19.6.tar.gz deleted file mode 100644 index d8c102f..0000000 --- a/PyMuPDF-1.19.6.tar.gz +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:ef3d13e27f1585d776f6a2597f113aabd28d36b648b983a72850b21c5399ab08 -size 2270248 diff --git a/python-PyMuPDF.changes b/python-PyMuPDF.changes index 52f282e..ad9f670 100644 --- a/python-PyMuPDF.changes +++ b/python-PyMuPDF.changes @@ -1,365 +1,3 @@ -------------------------------------------------------------------- -Sun Mar 6 12:27:52 UTC 2022 - Hsiu-Ming Chang - -- Update to v1.19.6 - * Fixed #1620. The TextPage created by Page.get_textpage() will - now be freed correctly (removed memory leak). - * Fixed #1601. Document open errors should now be more concise - and easier to interpret. In the course of this, two - PyMuPDF-specific Python exceptions have been added: - EmptyFileError – raised when trying to create a Document - (fitz.open()) from an empty file or zero-length memory. - FileDataError – raised when MuPDF encounters irrecoverable - document structure issues. - * Added Page.load_widget() given a PDF field’s xref. - * Added Dictionary pdfcolor which provide the about 500 colors - defined as PDF color values with the lower case color name as - key. - * Added algebra functionality to the Quad class. These objects - can now also be added and subtracted among themselves, and be - multiplied by numbers and matrices. - * Added new constants defining the default text extraction flags - for more comfortable handling. Their naming convention is like - TEXTFLAGS_WORDS for page.get_text("words"). See Text Extraction - Flags Defaults. - * Changed Page.annots() and Page.widgets() to detect and prevent - reloading the page (illegally) inside the iterator loops via - Document.reload_page(). Doing this brings down the interpretor. - Documented clean ways to do annotation and widget mass updates - within properly designed loops. - * Changed several internal utility functions to become - standalone (“SWIG inline”) as opposed to be part of the Tools - class. This, among other things, increases the performance of - geometry object creation. - * Changed Document.update_stream() to always accept stream - updates - whether or not the dictionary object behind the xref - already is a stream. Thus the former new parameter is now - ignored and will be removed in v1.20.0. - -------------------------------------------------------------------- -Sun Feb 6 14:02:23 UTC 2022 - Hsiu-Ming Chang - -- Update to v1.19.5 - * Fixed #1518. A limited “fix”: in some cases, rectangles and - quadrupels were not correctly encoded to support re-drawing by - Shape. - * Fixed #1521. This had the same ultimate reason behind issue - #1510. - * Fixed #1513. Some Optional Content functions did not support - non-ASCII characters. - * Fixed #1510. Support more soft-mask image subtypes. - * Fixed #1507. Immunize against items in the outlines chain, - that are "null" objects. - * Fixed re-opened #1417. (“too many open files”). This was due - to insufficient calls to MuPDF’s fz_drop_document(). This also - fixes #1550. - * Fixed several undocumented issues in relation to incorrectly - setting the text span origin point_like. - * Fixed undocumented error computing the character bbox in - method Page.get_texttrace() when text is flipped (as opposed to - just rotated). - * Added items to the dictionary returned by image_properties(): - orientation and transform report the natural image orientation - (EXIF data). - * Added method Document.xref_copy(). It will make a given target - PDF object an exact copy of a source object. - -------------------------------------------------------------------- -Mon Jan 10 12:52:19 UTC 2022 - Hsiu-Ming Chang - -- Update to v1.19.4 - * Fixed #1505. Immunize against circular outline items. - * Fixed #1484. Correct CropBox coordinates are now returned in - all situations. - * Fixed #1479. - * Fixed #1474. TextPage objects are now properly deleted again. - * Added Page methods and attributes for PDF /ArtBox, /BleedBox, - /TrimBox. - * Added global attribute TESSDATA_PREFIX for easy checking of OCR - support. - * Changed Document.xref_set_key() such that dictionary keys will - physically be removed if set to value "null". - * Changed Document.extract_font() to optionally return a - dictionary (instead of a tuple). - -------------------------------------------------------------------- -Fri Dec 17 13:03:20 UTC 2021 - Hsiu-Ming Chang - -- Update to v1.19.3 - * Fixed #1351. Reverted code that introduced the memory growth - in v1.18.15. - * Fixed #1417. Developped circumvention for growth of open file - handles using Document.insert_pdf(). - * Fixed #1418. Developped circumvention for memory growth using - Document.insert_pdf(). - * Fixed #1430. Developped circumvention for mass pixmap - generations of document pages. - * Fixed #1433. Solves a bbox error for some Type 3 font in - PyMuPDF text processing. - * Added Pixmap.color_topusage() to determine the share of the - most frequently used color. Solves #1397. - * Added Pixmap.warp() which makes a new pixmap from a given - arbitrary convex quad inside the pixmap. - * Added Annot.irt_xref and Annot.set_irt_xref() to inquire or - set the /IRT (“In Responde To”) property of an annotation. - Implements #1450. - * Added Rect.torect() and IRect.torect() which compute a matrix - that transforms to a given other rectangle. - * Changed Pixmap.color_count() to also return the count of each - color. - * Changed Page.get_texttrace() to also return correct span and - character bboxes if span["dir"] != (1, 0). - -------------------------------------------------------------------- -Mon Nov 22 10:33:01 UTC 2021 - Hsiu-Ming Chang - -- Update to v1.19.2 - * Fixed #1388. Fixed intermittent memory corruption when insert or - updating annotations. - * Fixed #1375. Inconsistencies between line numbers as returned - by the “words” and the “dict” options of `Page.get_text()` have - been corrected. - * Fixed #1364. The check for being a "rawdict" span in - `recover_span_quad()` now works correctly. - * Fixed #1342. Corrected the check for rectangle infiniteness in - `Page.show_pdf_page()`. - * Changed `Page.get_drawings()`, `Page.get_cdrawings()` to return - an indicator on the area orientation covered by a rectangle. This - implements #1355. Also, the recognition rate for rectangles and - quads has been significantly improved. - * Changed all text search and extraction methods to set the new - flags option TEXT_MEDIABOX_CLIP to ON by default. That bit causes - the automatic suppression of all characters that are completely - outside a page’s mediabox (in as far as that notion is supported - for a document type). This eliminates the need for using - clip=page.rect or similar for omitting text outside the visible - area. - * Added parameter "dpi" to `Page.get_pixmap()` and - `Annot.get_pixmap()`. When given, parameter "matrix" is ignored, - and a Pixmap with the desired dots per inch is created. - * Added attributes `Pixmap.is_monochrome` and `Pixmap.is_unicolor` - allowing fast checks of pixmap properties. Addresses #1397. - * Added method `Pixmap.color_count()` to determine the unique - colors in the pixmap. - * Added boolean parameter "compress" to PDF document method - `Document.update_stream()`. Addresses / enables solution for - #1408. -- from v1.19.1 - * Fixed #1328. “words” text extraction again returns correct (x0, - y0) coordinates. - * Changed `Page.get_textpage_ocr()`: it now supports parameter - dpi to control OCR quality. It is also possible to choose whether - the full page should be OCRed or only the images displayed by the - page. - * Changed `Page.get_drawings()` and `Page.get_cdrawings()` to - automatically convert colors to RGB color tuples. Implements - #1332. Similar change was applied to `Page.get_texttrace()`. - * Changed `Page.get_text()` to support a parameter sort. If set - to True the output is conveniently sorted. -- from v1.19.0 - * Supports MuPDF 1.19.* - * Changed terminology and meaning of important geometry concepts: - Rectangles are now characterized as finite, valid or empty, while - the definitions of these terms have also changed. Rectangles - specifically are now thought of being “open”: not all corners - and sides are considered part of the retangle. Please do read - the Rect section for details. - * Added new parameter “no_new_id” to `Document.save()` / - `Document.tobytes()` methods. Use it to suppress updating the - second item of the document /ID which in PDF indicates that the - original file has been updated. If the PDF has no /ID at all yet, - then no new one will be created either. - * Added a journalling facility for PDF updates. This allows logging - changes, undoing or redoing them, or saving the journal for later - use. Refer to `Document.journal_enable()` and friends. - * Added new Pixmap methods `Pixmap.pdfocr_save()` and - `Pixmap.pdfocr_tobytes()`, which generate a 1-page PDF containing - the pixmap as PNG image with OCR text layer. - * Added `Page.get_textpage_ocr()` which executes optical character - recognition for the page, then extracts the results and stores - them together with “normal” page content in a TextPage. Use or - reuse this object in subsequent text extractions and text - searches to avoid multiple efforts. The existing text search - and text extraction methods have been extended to support a - separately created textpage – see next item. - * Added a new parameter textpage to text extraction and text search - methods. This allows reuse of a previously created TextPage and - thus achieves significant runtime benefits – which is especially - important for the new OCR features. But “normal” text extractions - can definitely also benefit. - * Added `Page.get_texttrace()`, a technical method delivering - low-level text character properties. It was present before as a - private method, but the author felt it now is mature enough to be - officially available. It specifically includes a “sequence - number” which indicates the page appearance build operation that - painted the text. - * Added `Page.get_bboxlog()` which delivers the list of - rectangles of page objects like text, images or drawings. Its - significance lies in its sequence: rectangles intersecting areas - with a lower index are covering or hiding them. - * Changed methods `Page.get_drawings()` and - `Page.get_cdrawings()` to include a “sequence number” indicating - the page appearance build operation that created the drawing. - * Fixed #1311. Field values in comboboxes should now be handled - correctly. - * Fixed #1290. Error was caused by incorrect rectangle emptiness - check, which is fixed due to new geometry logic of this version. - * Fixed #1286. Text alignment for redact annotations is working - again. - * Fixed #1287. Infinite loop issue for non-Windows systems when - applying some redactions has been resolved. - * Fixed #1284. Text layout destruction after applying redactions in - some cases has been resolved. -- from v1.18.19 - * Fixed issue #1266. Failure to set `Pixmap.samples` in important - cases, was hotfixed in a new version 1.18.19. -- from v1.18.18 - * Fixed issue #1257. Removing the read-only flag from PDF fields - is now possible. - * Fixed issue #1252. Now correctly specifying the zoom value for - PDF link annotations. - * Fixed issue #1244. Now correctly computing the transform matrix - in `Page.get_image__bbox()`. - * Fixed issue #1241. Prevent returning artifact characters in - `Page.get_textbox()`, which happened in certain constellations. - * Fixed issue #1234. Avoid creating infinite rectangles in corner - cases – `Page.get_drawings()`, `Page.get_cdrawings()`. - * Added test data and test scripts to the source PyPI source - distribution. -- from v1.18.17 - * Fixed issue #1199. Using a non-existing page number in - `Document.get_page_images()` and friends will no longer lead to - segfaults. - * Changed `Page.get_drawings()` to now differentiate between - “stroke”, “fill” and combined paths. Paths containing more than - one rectangle (i.e. “re” items) are now supported. Extracting - “clipped” paths is now available as an option. - * Added `Page.get_cdrawings()`, performance-optimized version of - `Page.get_drawings()`. - * Added `Pixmap.samples_mv`, memoryview of a pixmap’s pixel area. - Does not copy and thus always accesses the current state of that - area. - * Added `Pixmap.samples_ptr`, Python “pointer” to a pixmap’s pixel - area. Allows much faster creation (factor 800+) of Qt images. -- from v1.18.16 - * Fixed issue #1184. Existing PDF widget fonts in a PDF are now - accepted (i.e. not forcedly changed to a Base-14 font). - * Fixed issue #1154. Text search hits should now be correct when - clip is specified. - * Fixed issue #1152. - * Fixed issue #1146. - * Added `Link.flags` and `Link.set_flags()` to the Link class. - Implements enhancement requests #1187. - * Added option to simulate `TextWriter.fill_textbox() output for - predicting the number of lines, that a given text would occupy in - the textbox. - * Added text output support as subcommand gettext to the fitz CLI - module. Most importantly, original physical text layout - reproduction is now supported. -- from v1.18.15 - * Fixed issue #1088. Removing an annotation’s fill color should now - work again both ways, using the fill_color=[] argument in - `Annot.update()` as well as fill=[] in `Annot.set_colors()`. - * Fixed issue #1081. `Document.subset_fonts()`: fixed an error - which created wrong character widths for some fonts. - * Fixed issue #1078. `Page.get_text()` and other methods related to - text extraction: changed the default value of the TextPage flags - parameter. All whitespace and ligatures are now preserved. - * Fixed issue #1085. The old snake_cased alias of - `fitz.detTextlength` is now defined correctly. - * Changed `Document.subset_fonts()` will now correctly prefix font - subsets with an appropriate six letter uppercase tag, complying - with the PDF specification. - * Added new method `Widget.button_states()` which returns the - possible values that a button-type field can have when being set - to “on” or “off”. - * Added support of text with Small Capital letters to the Font and - TextWriter classes. This is reflected by an additional bool - parameter small_caps in various of their methods. -- from v1.18.14 - * Finished implementing new, “snake_cased” names for methods and - properties, that were “camelCased” and awkward in many aspects. - At the end of this documentation, there is section Deprecated - Names with more background and a mapping of old to new names. - * Fixed issue #1053. `Page.insert_image()`: when given, include - image mask in the hash computation. - * Fixed issue #1043. Added `Pixmap.getPNGdata` to the aliases of - `Pixmap.tobytes()`. - * Fixed an internal error when computing the envelopping - rectangle of drawn paths as returned by `Page.get_drawings()`. - * Fixed an internal error occasionally causing loops when - outputting text via `TextWriter.fill_textbox()`. - * Added `Font.char_lengths()`, which returns a tuple of character - widths of a string. - * Added more ways to specify pages in `Document.delete_pages()`. - Now a sequence (list, tuple or range) can be specified, and the - Python del statement can be used. In the latter case, Python - slices are also accepted. - * Changed `Document.del_toc_item()`, which disables a single item - of the TOC: previously, the title text was removed. Instead, now - the complete item will be shown grayed-out by supporting viewers. -- from v1.18.13 - * Fixed issue #1014 - * Fixed an internal memory leak when computing image bboxes – - `Page.get_image_bbox()`. - * Added support for low-level access and modification of the PDF - trailer. Applies to `Document.xref_get_keys()`, - `Document.xref_get_key(), and Document.xref_set_key()`. - * Added documentation for maintaining private entries in PDF - metadata. - * Added documentation for handling transparent image insertions, - `Page.insert_image()`. - * Added `Page.get_image_rects()`, an improved version of - `Page.get_image_bbox()`. - * Changed `Document.delete_pages()` to support various ways of - specifying pages to delete. - * Changed `Page.insert_image()` to also accept the xref of an - existing image in the file. This allows “copying” images between - pages, and extremely fast mutiple insertions. - * Changed `Page.insert_image()` to also accept the integer - parameter alpha. To be used for performance improvements. - * Changed `Pixmap.set_alpha()` to support new parameters for - pre-multiplying colors with their alpha values and setting a - specific color to fully transparent (e.g. white). - * Changed `Document.embfile_add()` to automatically set creation - and modification date-time. Correspondingly, - `Document.embfile_upd()` automatically maintains modification - date-time (/ModDate PDF key), and `Document.embfile_info()` - correspondingly reports these data. In addition, the embedded - file’s associated “collection item” is included via its xref. - This supports the development of PDF portfolio applications. - -------------------------------------------------------------------- -Sat Apr 10 12:56:40 UTC 2021 - John Vandenberg - -- Update to v1.18.11 - * Improved layout of source distribution material. - * Stabilized Linux distribution detection for generating PyMuPDF - from sources. - * Page.get_xobjects delivers the result of Document.get_page_xobjects. - * Page.get_image_info delivers meta information for all images shown - on the page. - * Tools.mupdf_display_warnings allows setting on / off the display - of MuPDF-generated warnings. The default is off. - * Document.ez_save convenience alias of :meth:`Document.save` - with some different defaults. - * Image extractions of document pages now also contain the image's - **transformation matrix**. This concerns `Page.get_image_bbox` - and the DICT, JSON, RAWDICT, and RAWJSON variants of `Page.get_text`. -- from v1.18.10 - * Added old aliases for `DisplayList.get_pixmap` and - `DisplayList.get_textpage`. - * Stabilized removal of JavaScript objects with `Document.scrub`. - * Removed a loop in the reworked `TextWriter.fill_textbox`. - * `Document.xref_get_keys` and `Document.xref_get_key` to also allow - accessing the PDF trailer dictionary. This can be done by using - `-1` as the xref number argument. - * Added a number of functions for reconstructing the quads for text - lines, spans and characters extracted by `Page.get_text` options - "dict" and "rawdict". - * Added `Tools.unset_quad_corrections` to suppress character quad - corrections (occasionally required for erroneous fonts). - ------------------------------------------------------------------- Sat Feb 27 00:04:25 UTC 2021 - John Vandenberg @@ -386,8 +24,8 @@ Sat Feb 27 00:04:25 UTC 2021 - John Vandenberg of the `warn` parameter to no longer print a warning message in overflow situations. * Added a utility function `recover_quad`, which computes the - quadrilateral of a span. This function can be used for correctly - marking text extracted with the "dict" or "rawdict" + quadrilateral of a span. This function can be used when + quadrilaterals for text extracted with the "dict" or "rawdict" options of `Page.get_text`. ------------------------------------------------------------------- @@ -424,7 +62,7 @@ Mon Feb 8 06:24:36 UTC 2021 - John Vandenberg * Added :meth:`Document.has_annots and Document.has_links to check whether these object types are present anywhere in a PDF. * Added expert low-level functions to simplify inquiry and - modification of PDF object sources: + modification of PDF object sources: + Document.xref_get_keys lists the keys of object `xref` + Document.xref_get_key returns type and content of a key + Document.xref_set_key modifies the key's value @@ -607,6 +245,15 @@ Mon Feb 8 06:24:36 UTC 2021 - John Vandenberg now automatically set from the respective Pixmap.xres and Pixmap.yres values +------------------------------------------------------------------- +Sat Dec 12 13:56:56 UTC 2020 - Matej Cepl + +- update to 1.18.4: + - Improved PDF Optional Content support + - Started overhaul of method and attribute naming + - Introduced support of Popup annotations + - Implemented other bug fixes. + ------------------------------------------------------------------- Wed Sep 23 12:34:51 UTC 2020 - Dirk Mueller @@ -639,7 +286,7 @@ Fri Mar 27 09:27:34 UTC 2020 - Marketa Calabkova * Added method which returns a list of Form XObjects of the page. * Added advanced graphics features to control the anti-aliasing values * Added :meth:`Document.scrub` which removes potentially sensitive data from a PDF. - * Changed text marker annotations to accept parameters beyond just + * Changed text marker annotations to accept parameters beyond just quadrilaterals such that now text lines between two given points can be marked. * Added :meth:`Annot.setBlendMode` to set the annotation's blend mode. @@ -654,7 +301,7 @@ Tue Feb 25 12:22:02 UTC 2020 - Yunhe Guo Wed Jan 15 11:54:42 UTC 2020 - Marketa Calabkova - update to 1.16.10 - * PyMuPDF can also be used as a module in the commandline using + * PyMuPDF can also be used as a module in the commandline using "python -m fitz" * Support for Python 3.4 has been dropped. @@ -665,7 +312,7 @@ Wed Oct 2 11:25:50 UTC 2019 - Yunhe Guo * significant performance improvements for dict / rawdict text extraction * Page.getText() now support text extraction for "blocks" and - "words" + "words" ------------------------------------------------------------------- Tue Sep 17 21:26:39 UTC 2019 - Yunhe Guo diff --git a/python-PyMuPDF.spec b/python-PyMuPDF.spec index 71296a6..ebb482a 100644 --- a/python-PyMuPDF.spec +++ b/python-PyMuPDF.spec @@ -1,7 +1,7 @@ # -# spec file for package python-PyMuPDF +# spec file # -# Copyright (c) 2021 SUSE LLC +# Copyright (c) 2022 SUSE LLC # # All modifications and additions to the file contributed by third parties # remain the property of their copyright owners, unless otherwise agreed @@ -17,11 +17,11 @@ %{?!python_module:%define python_module() python-%{**} python3-%{**}} -# Python 3 only syntax +# Python 2 build fails always %define skip_python2 1 %define pypi_name PyMuPDF Name: python-%{pypi_name} -Version: 1.19.6 +Version: 1.18.9 Release: 0 Summary: Python binding for MuPDF, a PDF and XPS viewer License: AGPL-3.0-only @@ -29,17 +29,16 @@ Group: Development/Libraries/Python URL: https://github.com/pymupdf/PyMuPDF Source: https://files.pythonhosted.org/packages/source/P/PyMuPDF/PyMuPDF-%{version}.tar.gz BuildRequires: %{python_module devel} -BuildRequires: %{python_module distro} BuildRequires: %{python_module setuptools} -BuildRequires: dos2unix BuildRequires: fdupes -BuildRequires: gcc-c++ +BuildRequires: gcc BuildRequires: jbig2dec-devel -BuildRequires: mupdf-devel-static < 1.20.0 -BuildRequires: mupdf-devel-static >= 1.19.0 +BuildRequires: mupdf-devel-static < 1.19.0 +BuildRequires: mupdf-devel-static >= 1.18.0 BuildRequires: openSUSE-release BuildRequires: pkgconfig BuildRequires: python-rpm-macros +BuildRequires: swig BuildRequires: pkgconfig(freetype2) BuildRequires: pkgconfig(gumbo) BuildRequires: pkgconfig(harfbuzz) @@ -56,24 +55,24 @@ book formats. PyMuPDF can also access files with extensions *.pdf, *.xps, *.oxps, *.epub, *.cbz or *.fb2 from Python scripts. %prep -%setup -q -n %{pypi_name}-%{version} -dos2unix README.md changes.txt +%autosetup -p1 -n %{pypi_name}-%{version} %build +export CFLAGS="%{optflags} -I/usr/include/freetype2" %python_build %install %python_install +rm %{buildroot}%{_prefix}/{COPYING,README.md,changes.rst} %python_expand %fdupes %{buildroot}%{$python_sitearch} %check -# https://github.com/pymupdf/PyMuPDF/issues/1002 requests a better test sequence cd /tmp %python_expand PYTHONPATH=%{buildroot}%{$python_sitearch} $python -c 'import fitz' %files %{python_files} %license COPYING -%doc README.md changes.txt +%doc README.md changes.rst %{python_sitearch}/* %changelog