- Update to v1.19.6

* Fixed #1620. The TextPage created by Page.get_textpage() will now be freed correctly (removed memory leak).
* Fixed #1601. Document open errors should now be more concise and easier to interpret. In the course of this, two PyMuPDF-specific Python exceptions have been added:
  EmptyFileError – raised when trying to create a Document (fitz.open()) from an empty file or zero-length memory.
  FileDataError – raised when MuPDF encounters irrecoverable document structure issues.
* Added Page.load_widget() given a PDF field’s xref.
* Added dictionary pdfcolor, which provides about 500 colors defined as PDF color values, with the lowercase color name as key.
* Added algebra functionality to the Quad class. These objects can now also be added and subtracted among themselves, and be multiplied by numbers and matrices.
* Added new constants defining the default text extraction flags for more comfortable handling. Their naming convention is like TEXTFLAGS_WORDS for page.get_text("words"). See Text Extraction Flags Defaults.
* Changed Page.annots() and Page.widgets() to detect and prevent reloading the page (illegally) inside the iterator loops via Document.reload_page(). Doing this brings down the interpreter. Documented clean ways to do annotation and widget mass updates within properly designed loops.
* Changed several internal utility functions to become standalone (“SWIG inline”) as opposed to being part of the Tools class. This, among other things, increases the performance of geometry object creation.

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-PyMuPDF?expand=0&rev=41
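
The note on Page.annots() and Document.reload_page() can be illustrated with a minimal sketch. It assumes the usual pattern of collecting annotation xrefs first and only then updating, so that no iterator is active when the page is reloaded. The helper name update_all_annots and the callback change are hypothetical and not part of PyMuPDF; the only calls used on the passed-in objects are Page.annots(), Annot.xref, Page.load_annot(), Annot.update() and Document.reload_page().

```python
def update_all_annots(doc, page, change):
    """Apply `change` to every annotation on `page`.

    Hypothetical helper: `doc` and `page` are expected to behave like a
    PyMuPDF Document and Page; `change` is any callable taking an annotation.
    """
    # Snapshot the xrefs first, so the Page.annots() iterator is exhausted
    # before any annotation is modified or the page is reloaded.
    xrefs = [annot.xref for annot in page.annots()]

    for xref in xrefs:
        annot = page.load_annot(xref)  # fetch a fresh object by xref
        change(annot)                  # caller-supplied modification
        annot.update()                 # regenerate the annotation appearance
        # Reloading is safe here because no iterator loop is active.
        page = doc.reload_page(page)

    return page
```

With real PyMuPDF objects this might be called as page = update_all_annots(doc, page, lambda a: a.set_opacity(0.5)); the returned page should be used afterwards, since the earlier page object is replaced by the reload.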
2022-05-10 00:11:10 +00:00
parent 80bbe9de57
commit 7b05a7d1ac
4 changed files with 366 additions and 4 deletions
--- a/PyMuPDF-1.18.9.tar.gz
+++ b/PyMuPDF-1.18.9.tar.gz
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:cb6ba6c5038ce590a088b9bf320c6e9ce714c1fa304181ece8b551d8589a8b21
-size 308451
--- a/PyMuPDF-1.19.6.tar.gz
+++ b/PyMuPDF-1.19.6.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef3d13e27f1585d776f6a2597f113aabd28d36b648b983a72850b21c5399ab08
+size 2270248
--- a/python-PyMuPDF.changes
+++ b/python-PyMuPDF.changes
@@ -1,4 +1,366 @@
 -------------------------------------------------------------------
+Sun Mar  6 12:27:52 UTC 2022 - Hsiu-Ming Chang <cges30901@gmail.com>
+
+- Update to v1.19.6
+  * Fixed #1620. The TextPage created by Page.get_textpage() will
+    now be freed correctly (removed memory leak).
+  * Fixed #1601. Document open errors should now be more concise
+    and easier to interpret. In the course of this, two
+    PyMuPDF-specific Python exceptions have been added:
+    EmptyFileError – raised when trying to create a Document
+    (fitz.open()) from an empty file or zero-length memory.
+    FileDataError – raised when MuPDF encounters irrecoverable
+    document structure issues.
+  * Added Page.load_widget() given a PDF field’s xref.
+  * Added dictionary pdfcolor, which provides about 500 colors
+    defined as PDF color values, with the lowercase color name as
+    key.
+  * Added algebra functionality to the Quad class. These objects
+    can now also be added and subtracted among themselves, and be
+    multiplied by numbers and matrices.
+  * Added new constants defining the default text extraction flags
+    for more comfortable handling. Their naming convention is like
+    TEXTFLAGS_WORDS for page.get_text("words"). See Text Extraction
+    Flags Defaults.
+  * Changed Page.annots() and Page.widgets() to detect and prevent
+    reloading the page (illegally) inside the iterator loops via
+    Document.reload_page(). Doing this brings down the interpreter.
+    Documented clean ways to do annotation and widget mass updates
+    within properly designed loops.
+  * Changed several internal utility functions to become
+    standalone (“SWIG inline”) as opposed to being part of the Tools
+    class. This, among other things, increases the performance of
+    geometry object creation.
+  * Changed Document.update_stream() to always accept stream
+    updates - whether or not the dictionary object behind the xref
+    already is a stream. Thus the former "new" parameter is now
+    ignored and will be removed in v1.20.0.
+
+-------------------------------------------------------------------
+Sun Feb  6 14:02:23 UTC 2022 - Hsiu-Ming Chang <cges30901@gmail.com>
+
+- Update to v1.19.5
+  * Fixed #1518. A limited “fix”: in some cases, rectangles and
+    quadruples were not correctly encoded to support re-drawing by
+    Shape.
+  * Fixed #1521. This had the same root cause as issue
+    #1510.
+  * Fixed #1513. Some Optional Content functions did not support
+    non-ASCII characters.
+  * Fixed #1510. Support more soft-mask image subtypes.
+  * Fixed #1507. Immunize against items in the outlines chain
+    that are "null" objects.
+  * Fixed re-opened #1417. (“too many open files”). This was due
+    to insufficient calls to MuPDF’s fz_drop_document(). This also
+    fixes #1550.
+  * Fixed several undocumented issues in relation to incorrectly
+    setting the text span origin point_like.
+  * Fixed undocumented error computing the character bbox in
+    method Page.get_texttrace() when text is flipped (as opposed to
+    just rotated).
+  * Added items to the dictionary returned by image_properties():
+    orientation and transform report the natural image orientation
+    (EXIF data).
+  * Added method Document.xref_copy(). It will make a given target
+    PDF object an exact copy of a source object.
+
+-------------------------------------------------------------------
+Mon Jan 10 12:52:19 UTC 2022 - Hsiu-Ming Chang <cges30901@gmail.com>
+
+- Update to v1.19.4
+  * Fixed #1505. Immunize against circular outline items.
+  * Fixed #1484. Correct CropBox coordinates are now returned in
+    all situations.
+  * Fixed #1479.
+  * Fixed #1474. TextPage objects are now properly deleted again.
+  * Added Page methods and attributes for PDF /ArtBox, /BleedBox,
+    /TrimBox.
+  * Added global attribute TESSDATA_PREFIX for easy checking of OCR
+    support.
+  * Changed Document.xref_set_key() such that dictionary keys will
+    physically be removed if set to value "null".
+  * Changed Document.extract_font() to optionally return a
+    dictionary (instead of a tuple).
+
+-------------------------------------------------------------------
+Fri Dec 17 13:03:20 UTC 2021 - Hsiu-Ming Chang <cges30901@gmail.com>
+
+- Update to v1.19.3
+  * Fixed #1351. Reverted code that introduced the memory growth
+    in v1.18.15.
+  * Fixed #1417. Developed a workaround for growth of open file
+    handles using Document.insert_pdf().
+  * Fixed #1418. Developed a workaround for memory growth using
+    Document.insert_pdf().
+  * Fixed #1430. Developed a workaround for mass pixmap
+    generations of document pages.
+  * Fixed #1433. Solves a bbox error for some Type 3 fonts in
+    PyMuPDF text processing.
+  * Added Pixmap.color_topusage() to determine the share of the
+    most frequently used color. Solves #1397.
+  * Added Pixmap.warp() which makes a new pixmap from a given
+    arbitrary convex quad inside the pixmap.
+  * Added Annot.irt_xref and Annot.set_irt_xref() to inquire or
+    set the /IRT (“In Response To”) property of an annotation.
+    Implements #1450.
+  * Added Rect.torect() and IRect.torect() which compute a matrix
+    that transforms to a given other rectangle.
+  * Changed Pixmap.color_count() to also return the count of each
+    color.
+  * Changed Page.get_texttrace() to also return correct span and
+    character bboxes if span["dir"] != (1, 0).
+
+-------------------------------------------------------------------
+Mon Nov 22 10:33:01 UTC 2021 - Hsiu-Ming Chang <cges30901@gmail.com>
+
+- Update to v1.19.2
+  * Fixed #1388. Fixed intermittent memory corruption when inserting
+    or updating annotations.
+  * Fixed #1375. Inconsistencies between line numbers as returned
+    by the “words” and the “dict” options of `Page.get_text()` have
+    been corrected.
+  * Fixed #1364. The check for being a "rawdict" span in
+    `recover_span_quad()` now works correctly.
+  * Fixed #1342. Corrected the check for rectangle infiniteness in
+    `Page.show_pdf_page()`.
+  * Changed `Page.get_drawings()`, `Page.get_cdrawings()` to return
+    an indicator on the area orientation covered by a rectangle. This
+    implements #1355. Also, the recognition rate for rectangles and
+    quads has been significantly improved.
+  * Changed all text search and extraction methods to set the new
+    flags option TEXT_MEDIABOX_CLIP to ON by default. That bit causes
+    the automatic suppression of all characters that are completely
+    outside a page’s mediabox (in as far as that notion is supported
+    for a document type). This eliminates the need for using
+    clip=page.rect or similar for omitting text outside the visible
+    area.
+  * Added parameter "dpi" to `Page.get_pixmap()` and
+    `Annot.get_pixmap()`. When given, parameter "matrix" is ignored,
+    and a Pixmap with the desired dots per inch is created.
+  * Added attributes `Pixmap.is_monochrome` and `Pixmap.is_unicolor`
+    allowing fast checks of pixmap properties. Addresses #1397.
+  * Added method `Pixmap.color_count()` to determine the unique
+    colors in the pixmap.
+  * Added boolean parameter "compress" to PDF document method
+    `Document.update_stream()`. Addresses / enables solution for
+    #1408.
+- from v1.19.1
+  * Fixed #1328. “words” text extraction again returns correct (x0,
+    y0) coordinates.
+  * Changed `Page.get_textpage_ocr()`: it now supports parameter
+    dpi to control OCR quality. It is also possible to choose whether
+    the full page should be OCRed or only the images displayed by the
+    page.
+  * Changed `Page.get_drawings()` and `Page.get_cdrawings()` to
+    automatically convert colors to RGB color tuples. Implements
+    #1332. Similar change was applied to `Page.get_texttrace()`.
+  * Changed `Page.get_text()` to support a parameter sort. If set
+    to True the output is conveniently sorted.
+- from v1.19.0
+  * Supports MuPDF 1.19.*
+  * Changed terminology and meaning of important geometry concepts:
+    Rectangles are now characterized as finite, valid or empty, while
+    the definitions of these terms have also changed. Rectangles
+    specifically are now thought of as being “open”: not all corners
+    and sides are considered part of the rectangle. Please do read
+    the Rect section for details.
+  * Added new parameter “no_new_id” to `Document.save()` /
+    `Document.tobytes()` methods. Use it to suppress updating the
+    second item of the document /ID which in PDF indicates that the
+    original file has been updated. If the PDF has no /ID at all yet,
+    then no new one will be created either.
+  * Added a journalling facility for PDF updates. This allows logging
+    changes, undoing or redoing them, or saving the journal for later
+    use. Refer to `Document.journal_enable()` and friends.
+  * Added new Pixmap methods `Pixmap.pdfocr_save()` and
+    `Pixmap.pdfocr_tobytes()`, which generate a 1-page PDF containing
+    the pixmap as PNG image with OCR text layer.
+  * Added `Page.get_textpage_ocr()` which executes optical character
+    recognition for the page, then extracts the results and stores
+    them together with “normal” page content in a TextPage. Use or
+    reuse this object in subsequent text extractions and text
+    searches to avoid multiple efforts. The existing text search
+    and text extraction methods have been extended to support a
+    separately created textpage – see next item.
+  * Added a new parameter textpage to text extraction and text search
+    methods. This allows reuse of a previously created TextPage and
+    thus achieves significant runtime benefits – which is especially
+    important for the new OCR features. But “normal” text extractions
+    can definitely also benefit.
+  * Added `Page.get_texttrace()`, a technical method delivering
+    low-level text character properties. It was present before as a
+    private method, but the author felt it now is mature enough to be
+    officially available. It specifically includes a “sequence
+    number” which indicates the page appearance build operation that
+    painted the text.
+  * Added `Page.get_bboxlog()` which delivers the list of
+    rectangles of page objects like text, images or drawings. Its
+    significance lies in its sequence: rectangles intersecting areas
+    with a lower index are covering or hiding them.
+  * Changed methods `Page.get_drawings()` and
+    `Page.get_cdrawings()` to include a “sequence number” indicating
+    the page appearance build operation that created the drawing.
+  * Fixed #1311. Field values in comboboxes should now be handled
+    correctly.
+  * Fixed #1290. Error was caused by incorrect rectangle emptiness
+    check, which is fixed due to new geometry logic of this version.
+  * Fixed #1286. Text alignment for redact annotations is working
+    again.
+  * Fixed #1287. Infinite loop issue for non-Windows systems when
+    applying some redactions has been resolved.
+  * Fixed #1284. Text layout destruction after applying redactions in
+    some cases has been resolved.
+- from v1.18.19
+  * Fixed issue #1266. Failure to set `Pixmap.samples` in important
+    cases, was hotfixed in a new version 1.18.19.
+- from v1.18.18
+  * Fixed issue #1257. Removing the read-only flag from PDF fields
+    is now possible.
+  * Fixed issue #1252. Now correctly specifying the zoom value for
+    PDF link annotations.
+  * Fixed issue #1244. Now correctly computing the transform matrix
+    in `Page.get_image_bbox()`.
+  * Fixed issue #1241. Prevent returning artifact characters in
+    `Page.get_textbox()`, which happened in certain constellations.
+  * Fixed issue #1234. Avoid creating infinite rectangles in corner
+    cases – `Page.get_drawings()`, `Page.get_cdrawings()`.
+  * Added test data and test scripts to the PyPI source
+    distribution.
+- from v1.18.17
+  * Fixed issue #1199. Using a non-existing page number in
+    `Document.get_page_images()` and friends will no longer lead to
+    segfaults.
+  * Changed `Page.get_drawings()` to now differentiate between
+    “stroke”, “fill” and combined paths. Paths containing more than
+    one rectangle (i.e. “re” items) are now supported. Extracting
+    “clipped” paths is now available as an option.
+  * Added `Page.get_cdrawings()`, performance-optimized version of
+    `Page.get_drawings()`.
+  * Added `Pixmap.samples_mv`, memoryview of a pixmap’s pixel area.
+    Does not copy and thus always accesses the current state of that
+    area.
+  * Added `Pixmap.samples_ptr`, Python “pointer” to a pixmap’s pixel
+    area. Allows much faster creation (factor 800+) of Qt images.
+- from v1.18.16
+  * Fixed issue #1184. Existing PDF widget fonts in a PDF are now
+    accepted (i.e. not forcibly changed to a Base-14 font).
+  * Fixed issue #1154. Text search hits should now be correct when
+    clip is specified.
+  * Fixed issue #1152.
+  * Fixed issue #1146.
+  * Added `Link.flags` and `Link.set_flags()` to the Link class.
+    Implements enhancement request #1187.
+  * Added option to simulate `TextWriter.fill_textbox()` output for
+    predicting the number of lines that a given text would occupy in
+    the textbox.
+  * Added text output support as subcommand gettext to the fitz CLI
+    module. Most importantly, original physical text layout
+    reproduction is now supported.
+- from v1.18.15
+  * Fixed issue #1088. Removing an annotation’s fill color should now
+    work again both ways, using the fill_color=[] argument in
+    `Annot.update()` as well as fill=[] in `Annot.set_colors()`.
+  * Fixed issue #1081. `Document.subset_fonts()`: fixed an error
+    which created wrong character widths for some fonts.
+  * Fixed issue #1078. `Page.get_text()` and other methods related to
+    text extraction: changed the default value of the TextPage flags
+    parameter. All whitespace and ligatures are now preserved.
+  * Fixed issue #1085. The old snake_cased alias of
+    `fitz.detTextlength` is now defined correctly.
+  * Changed `Document.subset_fonts()` to correctly prefix font
+    subsets with an appropriate six letter uppercase tag, complying
+    with the PDF specification.
+  * Added new method `Widget.button_states()` which returns the
+    possible values that a button-type field can have when being set
+    to “on” or “off”.
+  * Added support of text with Small Capital letters to the Font and
+    TextWriter classes. This is reflected by an additional bool
+    parameter small_caps in various of their methods.
+- from v1.18.14
+  * Finished implementing new, “snake_cased” names for methods and
+    properties that were “camelCased” and awkward in many aspects.
+    At the end of this documentation, there is a section Deprecated
+    Names with more background and a mapping of old to new names.
+  * Fixed issue #1053. `Page.insert_image()`: when given, include
+    image mask in the hash computation.
+  * Fixed issue #1043. Added `Pixmap.getPNGdata` to the aliases of
+    `Pixmap.tobytes()`.
+  * Fixed an internal error when computing the enveloping
+    rectangle of drawn paths as returned by `Page.get_drawings()`.
+  * Fixed an internal error occasionally causing loops when
+    outputting text via `TextWriter.fill_textbox()`.
+  * Added `Font.char_lengths()`, which returns a tuple of character
+    widths of a string.
+  * Added more ways to specify pages in `Document.delete_pages()`.
+    Now a sequence (list, tuple or range) can be specified, and the
+    Python del statement can be used. In the latter case, Python
+    slices are also accepted.
+  * Changed `Document.del_toc_item()`, which disables a single item
+    of the TOC: previously, the title text was removed. Instead, now
+    the complete item will be shown grayed-out by supporting viewers.
+- from v1.18.13
+  * Fixed issue #1014
+  * Fixed an internal memory leak when computing image bboxes –
+    `Page.get_image_bbox()`.
+  * Added support for low-level access and modification of the PDF
+    trailer. Applies to `Document.xref_get_keys()`,
+    `Document.xref_get_key()`, and `Document.xref_set_key()`.
+  * Added documentation for maintaining private entries in PDF
+    metadata.
+  * Added documentation for handling transparent image insertions,
+    `Page.insert_image()`.
+  * Added `Page.get_image_rects()`, an improved version of
+    `Page.get_image_bbox()`.
+  * Changed `Document.delete_pages()` to support various ways of
+    specifying pages to delete.
+  * Changed `Page.insert_image()` to also accept the xref of an
+    existing image in the file. This allows “copying” images between
+    pages, and extremely fast multiple insertions.
+  * Changed `Page.insert_image()` to also accept the integer
+    parameter alpha. To be used for performance improvements.
+  * Changed `Pixmap.set_alpha()` to support new parameters for
+    pre-multiplying colors with their alpha values and setting a
+    specific color to fully transparent (e.g. white).
+  * Changed `Document.embfile_add()` to automatically set creation
+    and modification date-time. Correspondingly,
+    `Document.embfile_upd()` automatically maintains modification
+    date-time (/ModDate PDF key), and `Document.embfile_info()`
+    correspondingly reports these data. In addition, the embedded
+    file’s associated “collection item” is included via its xref.
+    This supports the development of PDF portfolio applications.
+
+-------------------------------------------------------------------
+Sat Apr 10 12:56:40 UTC 2021 - John Vandenberg <jayvdb@gmail.com>
+
+- Update to v1.18.11
+  * Improved layout of source distribution material.
+  * Stabilized Linux distribution detection for generating PyMuPDF
+    from sources.
+  * Page.get_xobjects delivers the result of Document.get_page_xobjects.
+  * Page.get_image_info delivers meta information for all images shown
+    on the page.
+  * Tools.mupdf_display_warnings allows setting on / off the display
+    of MuPDF-generated warnings. The default is off.
+  * Document.ez_save is a convenience alias of `Document.save`
+    with some different defaults.
+  * Image extractions of document pages now also contain the image's
+    **transformation matrix**. This concerns `Page.get_image_bbox`
+    and the DICT, JSON, RAWDICT, and RAWJSON variants of `Page.get_text`.
+- from v1.18.10
+  * Added old aliases for `DisplayList.get_pixmap` and
+    `DisplayList.get_textpage`.
+  * Stabilized removal of JavaScript objects with `Document.scrub`.
+  * Removed a loop in the reworked `TextWriter.fill_textbox`.
+  * `Document.xref_get_keys` and `Document.xref_get_key` now also allow
+    accessing the PDF trailer dictionary. This can be done by using
+    `-1` as the xref number argument.
+  * Added a number of functions for reconstructing the quads for text
+    lines, spans and characters extracted by `Page.get_text` options
+    "dict" and "rawdict".
+  * Added `Tools.unset_quad_corrections` to suppress character quad
+    corrections (occasionally required for erroneous fonts).
+
+--------------------------------------------------------------------
 Sat Feb 27 00:04:25 UTC 2021 - John Vandenberg <jayvdb@gmail.com>

 - Revised License to be AGPL-3.0-only
--- a/python-PyMuPDF.spec
+++ b/python-PyMuPDF.spec
@@ -21,7 +21,7 @@
 %define skip_python2 1
 %define pypi_name PyMuPDF
 Name:           python-%{pypi_name}
-Version:        1.18.9
+Version:        1.19.6
 Release:        0
 Summary:        Python binding for MuPDF, a PDF and XPS viewer
 License:        AGPL-3.0-only