diff --git a/PyMuPDF-1.18.9.tar.gz b/PyMuPDF-1.18.9.tar.gz deleted file mode 100644 index f2f4196..0000000 --- a/PyMuPDF-1.18.9.tar.gz +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:cb6ba6c5038ce590a088b9bf320c6e9ce714c1fa304181ece8b551d8589a8b21 -size 308451 diff --git a/PyMuPDF-1.19.6.tar.gz b/PyMuPDF-1.19.6.tar.gz new file mode 100644 index 0000000..d8c102f --- /dev/null +++ b/PyMuPDF-1.19.6.tar.gz @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef3d13e27f1585d776f6a2597f113aabd28d36b648b983a72850b21c5399ab08 +size 2270248 diff --git a/python-PyMuPDF.changes b/python-PyMuPDF.changes index ad9f670..ad5b7b5 100644 --- a/python-PyMuPDF.changes +++ b/python-PyMuPDF.changes @@ -1,4 +1,366 @@ ------------------------------------------------------------------- +Sun Mar 6 12:27:52 UTC 2022 - Hsiu-Ming Chang + +- Update to v1.19.6 + * Fixed #1620. The TextPage created by Page.get_textpage() will + now be freed correctly (removed memory leak). + * Fixed #1601. Document open errors should now be more concise + and easier to interpret. In the course of this, two + PyMuPDF-specific Python exceptions have been added: + EmptyFileError – raised when trying to create a Document + (fitz.open()) from an empty file or zero-length memory. + FileDataError – raised when MuPDF encounters irrecoverable + document structure issues. + * Added Page.load_widget() given a PDF field’s xref. + * Added dictionary pdfcolor, which provides about 500 colors + defined as PDF color values with the lower case color name as + key. + * Added algebra functionality to the Quad class. These objects + can now also be added and subtracted among themselves, and be + multiplied by numbers and matrices. + * Added new constants defining the default text extraction flags + for more comfortable handling. Their naming convention is like + TEXTFLAGS_WORDS for page.get_text("words"). See Text Extraction + Flags Defaults. 
+ * Changed Page.annots() and Page.widgets() to detect and prevent + reloading the page (illegally) inside the iterator loops via + Document.reload_page(). Doing this brings down the interpreter. + Documented clean ways to do annotation and widget mass updates + within properly designed loops. + * Changed several internal utility functions to become + standalone (“SWIG inline”) as opposed to being part of the Tools + class. This, among other things, increases the performance of + geometry object creation. + * Changed Document.update_stream() to always accept stream + updates - whether or not the dictionary object behind the xref + already is a stream. Thus the former new parameter is now + ignored and will be removed in v1.20.0. + +------------------------------------------------------------------- +Sun Feb 6 14:02:23 UTC 2022 - Hsiu-Ming Chang + +- Update to v1.19.5 + * Fixed #1518. A limited “fix”: in some cases, rectangles and + quadruples were not correctly encoded to support re-drawing by + Shape. + * Fixed #1521. This had the same root cause as issue + #1510. + * Fixed #1513. Some Optional Content functions did not support + non-ASCII characters. + * Fixed #1510. Support more soft-mask image subtypes. + * Fixed #1507. Immunize against items in the outlines chain + that are "null" objects. + * Fixed re-opened #1417. (“too many open files”). This was due + to insufficient calls to MuPDF’s fz_drop_document(). This also + fixes #1550. + * Fixed several undocumented issues in relation to incorrectly + setting the text span origin point_like. + * Fixed undocumented error computing the character bbox in + method Page.get_texttrace() when text is flipped (as opposed to + just rotated). + * Added items to the dictionary returned by image_properties(): + orientation and transform report the natural image orientation + (EXIF data). + * Added method Document.xref_copy(). It will make a given target + PDF object an exact copy of a source object. 
+ +------------------------------------------------------------------- +Mon Jan 10 12:52:19 UTC 2022 - Hsiu-Ming Chang + +- Update to v1.19.4 + * Fixed #1505. Immunize against circular outline items. + * Fixed #1484. Correct CropBox coordinates are now returned in + all situations. + * Fixed #1479. + * Fixed #1474. TextPage objects are now properly deleted again. + * Added Page methods and attributes for PDF /ArtBox, /BleedBox, + /TrimBox. + * Added global attribute TESSDATA_PREFIX for easy checking of OCR + support. + * Changed Document.xref_set_key() such that dictionary keys will + physically be removed if set to value "null". + * Changed Document.extract_font() to optionally return a + dictionary (instead of a tuple). + +------------------------------------------------------------------- +Fri Dec 17 13:03:20 UTC 2021 - Hsiu-Ming Chang + +- Update to v1.19.3 + * Fixed #1351. Reverted code that introduced the memory growth + in v1.18.15. + * Fixed #1417. Developed circumvention for growth of open file + handles using Document.insert_pdf(). + * Fixed #1418. Developed circumvention for memory growth using + Document.insert_pdf(). + * Fixed #1430. Developed circumvention for mass pixmap + generations of document pages. + * Fixed #1433. Solves a bbox error for some Type 3 fonts in + PyMuPDF text processing. + * Added Pixmap.color_topusage() to determine the share of the + most frequently used color. Solves #1397. + * Added Pixmap.warp() which makes a new pixmap from a given + arbitrary convex quad inside the pixmap. + * Added Annot.irt_xref and Annot.set_irt_xref() to inquire or + set the /IRT (“In Response To”) property of an annotation. + Implements #1450. + * Added Rect.torect() and IRect.torect() which compute a matrix + that transforms to a given other rectangle. + * Changed Pixmap.color_count() to also return the count of each + color. + * Changed Page.get_texttrace() to also return correct span and + character bboxes if span["dir"] != (1, 0). 
+ +------------------------------------------------------------------- +Mon Nov 22 10:33:01 UTC 2021 - Hsiu-Ming Chang + +- Update to v1.19.2 + * Fixed #1388. Fixed intermittent memory corruption when inserting or + updating annotations. + * Fixed #1375. Inconsistencies between line numbers as returned + by the “words” and the “dict” options of `Page.get_text()` have + been corrected. + * Fixed #1364. The check for being a "rawdict" span in + `recover_span_quad()` now works correctly. + * Fixed #1342. Corrected the check for rectangle infiniteness in + `Page.show_pdf_page()`. + * Changed `Page.get_drawings()`, `Page.get_cdrawings()` to return + an indicator on the area orientation covered by a rectangle. This + implements #1355. Also, the recognition rate for rectangles and + quads has been significantly improved. + * Changed all text search and extraction methods to set the new + flags option TEXT_MEDIABOX_CLIP to ON by default. That bit causes + the automatic suppression of all characters that are completely + outside a page’s mediabox (in as far as that notion is supported + for a document type). This eliminates the need for using + clip=page.rect or similar for omitting text outside the visible + area. + * Added parameter "dpi" to `Page.get_pixmap()` and + `Annot.get_pixmap()`. When given, parameter "matrix" is ignored, + and a Pixmap with the desired dots per inch is created. + * Added attributes `Pixmap.is_monochrome` and `Pixmap.is_unicolor` + allowing fast checks of pixmap properties. Addresses #1397. + * Added method `Pixmap.color_count()` to determine the unique + colors in the pixmap. + * Added boolean parameter "compress" to PDF document method + `Document.update_stream()`. Addresses / enables solution for + #1408. +- from v1.19.1 + * Fixed #1328. “words” text extraction again returns correct (x0, + y0) coordinates. + * Changed `Page.get_textpage_ocr()`: it now supports parameter + dpi to control OCR quality. 
It is also possible to choose whether + the full page should be OCRed or only the images displayed by the + page. + * Changed `Page.get_drawings()` and `Page.get_cdrawings()` to + automatically convert colors to RGB color tuples. Implements + #1332. Similar change was applied to `Page.get_texttrace()`. + * Changed `Page.get_text()` to support a parameter sort. If set + to True the output is conveniently sorted. +- from v1.19.0 + * Supports MuPDF 1.19.* + * Changed terminology and meaning of important geometry concepts: + Rectangles are now characterized as finite, valid or empty, while + the definitions of these terms have also changed. Rectangles + specifically are now thought of as being “open”: not all corners + and sides are considered part of the rectangle. Please do read + the Rect section for details. + * Added new parameter “no_new_id” to `Document.save()` / + `Document.tobytes()` methods. Use it to suppress updating the + second item of the document /ID which in PDF indicates that the + original file has been updated. If the PDF has no /ID at all yet, + then no new one will be created either. + * Added a journalling facility for PDF updates. This allows logging + changes, undoing or redoing them, or saving the journal for later + use. Refer to `Document.journal_enable()` and friends. + * Added new Pixmap methods `Pixmap.pdfocr_save()` and + `Pixmap.pdfocr_tobytes()`, which generate a 1-page PDF containing + the pixmap as PNG image with OCR text layer. + * Added `Page.get_textpage_ocr()` which executes optical character + recognition for the page, then extracts the results and stores + them together with “normal” page content in a TextPage. Use or + reuse this object in subsequent text extractions and text + searches to avoid multiple efforts. The existing text search + and text extraction methods have been extended to support a + separately created textpage – see next item. + * Added a new parameter textpage to text extraction and text search + methods. 
This allows reuse of a previously created TextPage and + thus achieves significant runtime benefits – which is especially + important for the new OCR features. But “normal” text extractions + can definitely also benefit. + * Added `Page.get_texttrace()`, a technical method delivering + low-level text character properties. It was present before as a + private method, but the author felt it now is mature enough to be + officially available. It specifically includes a “sequence + number” which indicates the page appearance build operation that + painted the text. + * Added `Page.get_bboxlog()` which delivers the list of + rectangles of page objects like text, images or drawings. Its + significance lies in its sequence: rectangles intersecting areas + with a lower index are covering or hiding them. + * Changed methods `Page.get_drawings()` and + `Page.get_cdrawings()` to include a “sequence number” indicating + the page appearance build operation that created the drawing. + * Fixed #1311. Field values in comboboxes should now be handled + correctly. + * Fixed #1290. Error was caused by incorrect rectangle emptiness + check, which is fixed due to new geometry logic of this version. + * Fixed #1286. Text alignment for redact annotations is working + again. + * Fixed #1287. Infinite loop issue for non-Windows systems when + applying some redactions has been resolved. + * Fixed #1284. Text layout destruction after applying redactions in + some cases has been resolved. +- from v1.18.19 + * Fixed issue #1266. Failure to set `Pixmap.samples` in important + cases was hotfixed in a new version 1.18.19. +- from v1.18.18 + * Fixed issue #1257. Removing the read-only flag from PDF fields + is now possible. + * Fixed issue #1252. Now correctly specifying the zoom value for + PDF link annotations. + * Fixed issue #1244. Now correctly computing the transform matrix + in `Page.get_image_bbox()`. + * Fixed issue #1241. 
Prevent returning artifact characters in + `Page.get_textbox()`, which happened in certain constellations. + * Fixed issue #1234. Avoid creating infinite rectangles in corner + cases – `Page.get_drawings()`, `Page.get_cdrawings()`. + * Added test data and test scripts to the PyPI source + distribution. +- from v1.18.17 + * Fixed issue #1199. Using a non-existing page number in + `Document.get_page_images()` and friends will no longer lead to + segfaults. + * Changed `Page.get_drawings()` to now differentiate between + “stroke”, “fill” and combined paths. Paths containing more than + one rectangle (i.e. “re” items) are now supported. Extracting + “clipped” paths is now available as an option. + * Added `Page.get_cdrawings()`, performance-optimized version of + `Page.get_drawings()`. + * Added `Pixmap.samples_mv`, memoryview of a pixmap’s pixel area. + Does not copy and thus always accesses the current state of that + area. + * Added `Pixmap.samples_ptr`, Python “pointer” to a pixmap’s pixel + area. Allows much faster creation (factor 800+) of Qt images. +- from v1.18.16 + * Fixed issue #1184. Existing PDF widget fonts in a PDF are now + accepted (i.e. not forcedly changed to a Base-14 font). + * Fixed issue #1154. Text search hits should now be correct when + clip is specified. + * Fixed issue #1152. + * Fixed issue #1146. + * Added `Link.flags` and `Link.set_flags()` to the Link class. + Implements enhancement request #1187. + * Added option to simulate `TextWriter.fill_textbox()` output for + predicting the number of lines that a given text would occupy in + the textbox. + * Added text output support as subcommand gettext to the fitz CLI + module. Most importantly, original physical text layout + reproduction is now supported. +- from v1.18.15 + * Fixed issue #1088. Removing an annotation’s fill color should now + work again both ways, using the fill_color=[] argument in + `Annot.update()` as well as fill=[] in `Annot.set_colors()`. + * Fixed issue #1081. 
`Document.subset_fonts()`: fixed an error + which created wrong character widths for some fonts. + * Fixed issue #1078. `Page.get_text()` and other methods related to + text extraction: changed the default value of the TextPage flags + parameter. All whitespace and ligatures are now preserved. + * Fixed issue #1085. The old snake_cased alias of + `fitz.detTextlength` is now defined correctly. + * Changed `Document.subset_fonts()` to correctly prefix font + subsets with an appropriate six letter uppercase tag, complying + with the PDF specification. + * Added new method `Widget.button_states()` which returns the + possible values that a button-type field can have when being set + to “on” or “off”. + * Added support of text with Small Capital letters to the Font and + TextWriter classes. This is reflected by an additional bool + parameter small_caps in various of their methods. +- from v1.18.14 + * Finished implementing new, “snake_cased” names for methods and + properties, that were “camelCased” and awkward in many aspects. + At the end of this documentation, there is a section Deprecated + Names with more background and a mapping of old to new names. + * Fixed issue #1053. `Page.insert_image()`: when given, include + image mask in the hash computation. + * Fixed issue #1043. Added `Pixmap.getPNGdata` to the aliases of + `Pixmap.tobytes()`. + * Fixed an internal error when computing the enveloping + rectangle of drawn paths as returned by `Page.get_drawings()`. + * Fixed an internal error occasionally causing loops when + outputting text via `TextWriter.fill_textbox()`. + * Added `Font.char_lengths()`, which returns a tuple of character + widths of a string. + * Added more ways to specify pages in `Document.delete_pages()`. + Now a sequence (list, tuple or range) can be specified, and the + Python del statement can be used. In the latter case, Python + slices are also accepted. 
+ * Changed `Document.del_toc_item()`, which disables a single item + of the TOC: previously, the title text was removed. Instead, now + the complete item will be shown grayed-out by supporting viewers. +- from v1.18.13 + * Fixed issue #1014 + * Fixed an internal memory leak when computing image bboxes – + `Page.get_image_bbox()`. + * Added support for low-level access and modification of the PDF + trailer. Applies to `Document.xref_get_keys()`, + `Document.xref_get_key()`, and `Document.xref_set_key()`. + * Added documentation for maintaining private entries in PDF + metadata. + * Added documentation for handling transparent image insertions, + `Page.insert_image()`. + * Added `Page.get_image_rects()`, an improved version of + `Page.get_image_bbox()`. + * Changed `Document.delete_pages()` to support various ways of + specifying pages to delete. + * Changed `Page.insert_image()` to also accept the xref of an + existing image in the file. This allows “copying” images between + pages, and extremely fast multiple insertions. + * Changed `Page.insert_image()` to also accept the integer + parameter alpha. To be used for performance improvements. + * Changed `Pixmap.set_alpha()` to support new parameters for + pre-multiplying colors with their alpha values and setting a + specific color to fully transparent (e.g. white). + * Changed `Document.embfile_add()` to automatically set creation + and modification date-time. Correspondingly, + `Document.embfile_upd()` automatically maintains modification + date-time (/ModDate PDF key), and `Document.embfile_info()` + correspondingly reports these data. In addition, the embedded + file’s associated “collection item” is included via its xref. + This supports the development of PDF portfolio applications. + +------------------------------------------------------------------- +Sat Apr 10 12:56:40 UTC 2021 - John Vandenberg + +- Update to v1.18.11 + * Improved layout of source distribution material. 
+ * Stabilized Linux distribution detection for generating PyMuPDF + from sources. + * Page.get_xobjects delivers the result of Document.get_page_xobjects. + * Page.get_image_info delivers meta information for all images shown + on the page. + * Tools.mupdf_display_warnings allows setting on / off the display + of MuPDF-generated warnings. The default is off. + * Document.ez_save is a convenience alias of `Document.save` + with some different defaults. + * Image extractions of document pages now also contain the image's + **transformation matrix**. This concerns `Page.get_image_bbox` + and the DICT, JSON, RAWDICT, and RAWJSON variants of `Page.get_text`. +- from v1.18.10 + * Added old aliases for `DisplayList.get_pixmap` and + `DisplayList.get_textpage`. + * Stabilized removal of JavaScript objects with `Document.scrub`. + * Removed a loop in the reworked `TextWriter.fill_textbox`. + * `Document.xref_get_keys` and `Document.xref_get_key` now also allow + accessing the PDF trailer dictionary. This can be done by using + `-1` as the xref number argument. + * Added a number of functions for reconstructing the quads for text + lines, spans and characters extracted by `Page.get_text` options + "dict" and "rawdict". + * Added `Tools.unset_quad_corrections` to suppress character quad + corrections (occasionally required for erroneous fonts). + +-------------------------------------------------------------------- Sat Feb 27 00:04:25 UTC 2021 - John Vandenberg - Revised License to be AGPL-3.0-only diff --git a/python-PyMuPDF.spec b/python-PyMuPDF.spec index ebb482a..56eea8d 100644 --- a/python-PyMuPDF.spec +++ b/python-PyMuPDF.spec @@ -21,7 +21,7 @@ %define skip_python2 1 %define pypi_name PyMuPDF Name: python-%{pypi_name} -Version: 1.18.9 +Version: 1.19.6 Release: 0 Summary: Python binding for MuPDF, a PDF and XPS viewer License: AGPL-3.0-only