Accepting request 925848 from home:mnhauke

- Update to version 2.0.7
  * Addition: bento Add support for Kazakh (Cyrillic) language detection
  * Improvement: sparkle Further improve inferring the language from a given code page (single-byte).
  * Removed: fire Remove redundant logging entry about detected language(s).
  * Improvement: zap Refactoring for potential performance improvements in loops.
  * Improvement: sparkles Various detection improvements (MD+CD).
  * Bugfix: bug Fix a minor inconsistency between Python 3.5 and other versions regarding language detection.
- Update to version 2.0.6
  * Bugfix: bug Unforeseen regression with the loss of the backward-compatibility with some older minor versions of Python 3.5.x.
  * Bugfix: bug Fix CLI crash when using --minimal output in certain cases.
  * Improvement: sparkles Minor improvement to the detection efficiency (less than 1%).
- Update to version 2.0.5
  * Improvement: sparkles The BC-support with v1.x was improved, the old staticmethods are restored.
  * Remove: fire The project no longer raises a warning on tiny content given for detection; it is simply logged as a warning instead.
  * Improvement: sparkles The Unicode detection is slightly improved, see #93
  * Bugfix: bug In some rare cases, the chunks extractor could cut in the middle of a multi-byte character and could mislead the mess detection.

OBS-URL: https://build.opensuse.org/request/show/925848
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=18
2021-10-26 20:41:42 +00:00
commit fd5f5dc1f2
parent d5d2d1f9e5
4 changed files with 148 additions and 9 deletions
--- a/charset_normalizer-1.3.9.tar.gz
+++ b/charset_normalizer-1.3.9.tar.gz
@@ -1,3 +0,0 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:54425d9436c1cff46dfbb6b6598ac0a4c2d7b003d4787ab7daaf64528e458ed8
 size 347681
--- a/charset_normalizer-2.0.7.tar.gz
+++ b/charset_normalizer-2.0.7.tar.gz
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:6473e80f73f5918254953073798a367f120cc5717e70c917359e155901c0e2d0
 size 369094
--- a/python-charset-normalizer.changes
+++ b/python-charset-normalizer.changes
@@ -1,3 +1,144 @@
 -------------------------------------------------------------------
 Sun Oct 17 14:01:59 UTC 2021 - Martin Hauke <mardnh@gmx.de>
 - Update to version 2.0.7
  * Addition: bento Add support for Kazakh (Cyrillic) language
    detection
  * Improvement: sparkle Further improve inferring the language
    from a given code page (single-byte).
  * Removed: fire Remove redundant logging entry about detected
    language(s).
  * Improvement: zap Refactoring for potential performance
    improvements in loops.
  * Improvement: sparkles Various detection improvements (MD+CD).
  * Bugfix: bug Fix a minor inconsistency between Python 3.5 and
    other versions regarding language detection.
 - Update to version 2.0.6
  * Bugfix: bug Unforeseen regression with the loss of the
    backward-compatibility with some older minor versions of
    Python 3.5.x.
  * Bugfix: bug Fix CLI crash when using --minimal output in
    certain cases.
  * Improvement: sparkles Minor improvement to the detection
    efficiency (less than 1%).
 - Update to version 2.0.5
  * Improvement: sparkles The BC-support with v1.x was improved,
    the old staticmethods are restored.
  * Remove: fire The project no longer raises a warning on tiny
    content given for detection; it is simply logged as a
    warning instead.
  * Improvement: sparkles The Unicode detection is slightly
    improved, see #93
  * Bugfix: bug In some rare cases, the chunks extractor could cut
    in the middle of a multi-byte character and could mislead the
    mess detection.
  * Bugfix: bug Some rare 'space' characters could trip up the
    UnprintablePlugin/Mess detection.
  * Improvement: art Add syntax sugar __bool__ for results
    CharsetMatches list-container.
 - Update to version 2.0.4
  * Improvement: sparkle Adjust the MD to lower the sensitivity,
    thus improving the global detection reliability.
  * Improvement: sparkle Allow fallback on specified encoding
    if any.
  * Bugfix: bug The CLI no longer raises an unexpected exception
    when no encoding has been found.
  * Bugfix: bug Fix accessing the 'alphabets' property when the
    payload contains surrogate characters.
  * Bugfix: bug pencil2 The logger could mislead (explain=True) on
    detected languages and the impact of one MBCS match (in #72)
  * Bugfix: bug Submatch factoring could be wrong in rare edge
    cases (in #72)
  * Bugfix: bug Multiple files given to the CLI were ignored when
    publishing results to STDOUT. (After the first path) (in #72)
  * Internal: art Fix line endings from CRLF to LF for certain
    files.
 - Update to version 2.0.3
  * Improvement: sparkles Part of the detection mechanism has been
    improved to be less sensitive, resulting in more accurate
    detection results. Especially ASCII. #63 Fix #62
  * Improvement: sparkles According to the community's wishes, the
    detection will fall back on ASCII or UTF-8 in a last-resort
    case.
 - Update to version 2.0.2
  * Bugfix: bug Empty/Too small JSON payload misdetection fixed.
  * Improvement: sparkler Don't inject unicodedata2 into sys.modules
 - Update to version 2.0.1
  * Bugfix: bug Make it work where there isn't a filesystem
    available, dropping assets frequencies.json.
  * Improvement: sparkles You may now use aliases in cp_isolation
    and cp_exclusion arguments.
  * Bugfix: bug Using explain=False permanently disabled the verbose
    output in the current runtime #47
  * Bugfix: bug One log entry (language target preemptive) was not
    shown in logs when using explain=True #47
  * Bugfix: bug Fix undesired exception (ValueError) on getitem of
    instance CharsetMatches #52
  * Improvement: wrench Public function normalize default args
    values were not aligned with from_bytes #53
 - Update to version 2.0.0
  * Performance: zap 4 to 5 times faster than the previous 1.4.0
    release.
  * Performance: zap At least 2x faster than Chardet.
  * Performance: zap Emphasis has been put on UTF-8 detection,
    which should be nearly instantaneous.
  * Improvement: back The backward compatibility with Chardet has
    been greatly improved. The legacy detect function returns an
    identical charset name whenever possible.
  * Improvement: sparkle The detection mechanism has been slightly
    improved, now Turkish content is detected correctly (most of
    the time)
  * Code: art The program has been rewritten to ease the
    readability and maintainability. (+Using static typing)
  * Tests: heavy_check_mark New workflows are now in place to
    verify the following aspects: Performance,
    Backward-Compatibility with Chardet, and Detection Coverage,
    in addition to the current tests. (+CodeQL)
  * Dependency: heavy_minus_sign This package no longer requires
    anything when used with Python 3.5 (Dropped cached_property)
  * Docs: pencil2 Performance claims, the contributing guide, and
    the issue template have been updated.
  * Improvement: sparkle Add --version argument to CLI
  * Bugfix: bug The CLI output used the relative path of the
    file(s). Should be absolute.
  * Deprecation: red_circle Methods coherence_non_latin, w_counter,
    chaos_secondary_pass of the class CharsetMatch are now
    deprecated and scheduled for removal in v3.0
  * Improvement: sparkle If no language was detected in content,
    try to infer it using the encoding name/alphabets used.
  * Removal: fire Removed support for these languages: Catalan,
    Esperanto, Kazakh, Basque, Volapük, Azeri, Galician, Nynorsk,
    Macedonian, and Serbocroatian.
  * Improvement: sparkle utf_7 detection has been reinstated.
  * Removal: fire The exception hook on UnicodeDecodeError has
    been removed.
 - Update to version 1.4.1
  * Improvement: art Logger configuration/usage no longer
    conflict with others #44
 - Update to version 1.4.0
  * Dependency: heavy_minus_sign Using standard logging instead
    of using the package loguru.
  * Dependency: heavy_minus_sign Dropping nose test framework in
    favor of the maintained pytest.
  * Dependency: heavy_minus_sign Choose to not use dragonmapper
    package to help with gibberish Chinese/CJK text.
  * Dependency: wrench heavy_minus_sign Require cached_property
    only for Python 3.5 due to constraint. Dropping for every
    other interpreter version.
  * Bugfix: bug BOM marker in a CharsetNormalizerMatch instance
    could be False in rare cases even if obviously present. Due
    to the sub-match factoring process.
  * Improvement: sparkler Return ASCII if given sequences fit.
  * Performance: zap Huge improvement on large payloads.
  * Change: fire Stop support for UTF-7 that does not contain a
    SIG. (Contributions are welcome to improve that point)
  * Feature: sparkler CLI now produces JSON consumable output.
  * Dependency: Dropping PrettyTable, replaced with pure JSON
    output.
  * Bugfix: bug Not searching properly for the BOM when trying
    utf32/16 parent codec.
  * Other: zap Improving the package final size by compressing
    frequencies.json.
 -------------------------------------------------------------------
 Thu May 20 09:46:56 UTC 2021 - pgajdos@suse.com
--- a/python-charset-normalizer.spec
+++ b/python-charset-normalizer.spec
@@ -19,14 +19,13 @@
 %{?!python_module:%define python_module() python-%{**} python3-%{**}}
 %define skip_python2 1
 Name:           python-charset-normalizer
-Version:        1.3.9
+Version:        2.0.7
 Release:        0
 Summary:        Python Universal Charset detector
 License:        MIT
 URL:            https://github.com/ousret/charset_normalizer
-Source:         https://files.pythonhosted.org/packages/source/c/charset_normalizer/charset_normalizer-%{version}.tar.gz
+Source:         https://github.com/Ousret/charset_normalizer/archive/refs/tags/%{version}.tar.gz#/charset_normalizer-%{version}.tar.gz
 BuildRequires:  %{python_module setuptools}
 BuildRequires:  dos2unix
 BuildRequires:  fdupes
 BuildRequires:  python-rpm-macros
 Requires:       python-PrettyTable
@@ -45,6 +44,7 @@ BuildRequires:  %{python_module PrettyTable}
 BuildRequires:  %{python_module cached-property >= 1.5}
 BuildRequires:  %{python_module dragonmapper >= 0.2}
 BuildRequires:  %{python_module loguru >= 0.5}
 BuildRequires:  %{python_module pytest-cov}
 BuildRequires:  %{python_module pytest}
 BuildRequires:  %{python_module zhon}
 # /SECTION
@@ -55,8 +55,6 @@ Python Universal Charset detector.
 %prep
 %setup -q -n charset_normalizer-%{version}
 dos2unix README.md
 chmod a-x charset_normalizer/assets/frequencies.json
 %build
 %python_build
@@ -79,6 +77,6 @@ chmod a-x charset_normalizer/assets/frequencies.json
 %doc README.md
 %license LICENSE
 %python_alternative %{_bindir}/normalizer
-%{python_sitelib}/*
+%{python_sitelib}/charset_normalizer*
 %changelog
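
Note on the 2.0.5 changelog entry about the chunks extractor cutting in the middle of a multi-byte character: the sketch below illustrates the general problem using only the Python standard library. It is not charset_normalizer's implementation; the helper names `naive_chunks` and `decode_chunks_safely` are hypothetical and exist only for this illustration. It shows that slicing encoded bytes at fixed offsets can split a character, and that an incremental decoder is one way to carry the incomplete tail over to the next chunk.

```python
import codecs


def naive_chunks(payload: bytes, size: int) -> list:
    """Slice bytes at fixed offsets, ignoring character boundaries."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def decode_chunks_safely(payload: bytes, size: int, encoding: str = "utf-8") -> list:
    """Decode fixed-size slices with an incremental decoder.

    The decoder buffers an incomplete trailing byte sequence and
    completes it when the next slice arrives, so no character is lost.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = [decoder.decode(chunk) for chunk in naive_chunks(payload, size)]
    # Flush the decoder; this raises if the payload ends mid-character.
    pieces.append(decoder.decode(b"", final=True))
    return pieces


if __name__ == "__main__":
    data = "naïve café".encode("utf-8")
    first = naive_chunks(data, 3)[0]
    try:
        first.decode("utf-8")
    except UnicodeDecodeError as exc:
        # The first slice ends on the lead byte of a two-byte character.
        print("naive slice failed:", exc.reason)
    print("".join(decode_chunks_safely(data, 3)))
```

A detector that scores each chunk independently would see the split slice as undecodable or noisy, which is consistent with the changelog's remark that such cuts could mislead the mess detection.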