From 7e521f133b4d1017a9671bda909f703ebcaded2bc9a64160392ca697ab62f15b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mark=C3=A9ta=20Machov=C3=A1?=
Date: Thu, 9 Jan 2025 12:21:13 +0000
Subject: [PATCH] - Update to 3.4.1

  * Project metadata is now stored in `pyproject.toml` instead of
    `setup.cfg`, with setuptools as the build backend.
  * Enforce delayed annotation loading for simpler and more consistent
    types in the project.
  * Optional mypyc compilation upgraded to version 1.14 for Python >= 3.8
  * Added pre-commit configuration.
  * Added noxfile.
  * Removed `build-requirements.txt` now that `pyproject.toml` provides
    the native build configuration.
  * Removed `bin/integration.py` and `bin/serve.py` in favor of downstream
    integration tests (see noxfile).
  * Removed `setup.cfg` in favor of `pyproject.toml` metadata configuration.
  * Removed unused `utils.range_scan` function.
  * Fixed: converting content to Unicode bytes could insert `utf_8` instead
    of the preferred `utf-8`. (#572)
  * Fixed: deprecation warning "'count' is passed as positional argument"
    when converting to Unicode bytes on Python 3.13+
- Drop the sed command removing code coverage flags from pytest

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=49
---
 .gitattributes                    |  23 ++
 .gitignore                        |   1 +
 charset_normalizer-3.3.2.tar.gz   |   3 +
 charset_normalizer-3.4.0.tar.gz   |   3 +
 charset_normalizer-3.4.1.tar.gz   |   3 +
 python-charset-normalizer.changes | 440 ++++++++++++++++++++++++++++++
 python-charset-normalizer.spec    |  72 +++++
 7 files changed, 545 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 .gitignore
 create mode 100644 charset_normalizer-3.3.2.tar.gz
 create mode 100644 charset_normalizer-3.4.0.tar.gz
 create mode 100644 charset_normalizer-3.4.1.tar.gz
 create mode 100644 python-charset-normalizer.changes
 create mode 100644 python-charset-normalizer.spec

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..9b03811
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,23 @@
+## Default LFS
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.bsp filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.gem filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.jar filter=lfs diff=lfs merge=lfs -text
+*.lz filter=lfs diff=lfs merge=lfs -text
+*.lzma filter=lfs diff=lfs merge=lfs -text
+*.obscpio filter=lfs diff=lfs merge=lfs -text
+*.oxt filter=lfs diff=lfs merge=lfs -text
+*.pdf filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.rpm filter=lfs diff=lfs merge=lfs -text
+*.tbz filter=lfs diff=lfs merge=lfs -text
+*.tbz2 filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.ttf filter=lfs diff=lfs merge=lfs -text
+*.txz filter=lfs diff=lfs merge=lfs -text
+*.whl filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..57affb6
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+.osc
diff --git a/charset_normalizer-3.3.2.tar.gz b/charset_normalizer-3.3.2.tar.gz
new file mode 100644
index 0000000..5bb58d5
--- /dev/null
+++ b/charset_normalizer-3.3.2.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9948e5c17831916ef192cf3f26c744d539eb6f4e9e3b02eea649552c52b10d91
+size 95792
diff --git a/charset_normalizer-3.4.0.tar.gz b/charset_normalizer-3.4.0.tar.gz
new file mode 100644
index 0000000..a7e431a
--- /dev/null
+++ b/charset_normalizer-3.4.0.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e8dfa606a7c3f4d4540e7f648a602338323ee0f5e458043accf59b4a0493783f
+size 97414
diff --git a/charset_normalizer-3.4.1.tar.gz b/charset_normalizer-3.4.1.tar.gz
new file mode 100644
index 0000000..b1bccf9
--- /dev/null
+++ b/charset_normalizer-3.4.1.tar.gz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5b91677d4c790eead798f4ed3e5f546ed80d6fe8cf1d8120939960d593371855
+size 97345
diff --git a/python-charset-normalizer.changes b/python-charset-normalizer.changes
new file mode 100644
index 0000000..4d09ab4
--- /dev/null
+++ b/python-charset-normalizer.changes
@@ -0,0 +1,440 @@
+-------------------------------------------------------------------
+Thu Jan 9 09:05:30 UTC 2025 - John Paul Adrian Glaubitz
+
+- Update to 3.4.1
+  * Project metadata is now stored in `pyproject.toml` instead of
+    `setup.cfg`, with setuptools as the build backend.
+  * Enforce delayed annotation loading for simpler and more consistent
+    types in the project.
+  * Optional mypyc compilation upgraded to version 1.14 for Python >= 3.8
+  * Added pre-commit configuration.
+  * Added noxfile.
+  * Removed `build-requirements.txt` now that `pyproject.toml` provides
+    the native build configuration.
+  * Removed `bin/integration.py` and `bin/serve.py` in favor of downstream
+    integration tests (see noxfile).
+  * Removed `setup.cfg` in favor of `pyproject.toml` metadata configuration.
+  * Removed unused `utils.range_scan` function.
+  * Fixed: converting content to Unicode bytes could insert `utf_8` instead
+    of the preferred `utf-8`. (#572)
+  * Fixed: deprecation warning "'count' is passed as positional argument"
+    when converting to Unicode bytes on Python 3.13+
+- Drop the sed command removing code coverage flags from pytest
+
+-------------------------------------------------------------------
+Mon Oct 28 16:37:52 UTC 2024 - Dirk Müller
+
+- switch to PEP517 build
+
+-------------------------------------------------------------------
+Tue Oct 22 16:00:12 UTC 2024 - Dirk Müller
+
+- update to 3.4.0:
+  * Argument `--no-preemptive` in the CLI to prevent the detector
+    from searching for hints.
+  * Support for Python 3.13
+  * Relax the TypeError exception thrown when trying to compare a
+    CharsetMatch with anything other than a CharsetMatch.
+  * Improved the general reliability of the detector based on
+    user feedback. (#520) (#509) (#498) (#407)
+  * Declared charset in content (preemptive detection) not
+    changed when converting to utf-8 bytes.
+
+-------------------------------------------------------------------
+Sat Nov 25 14:12:18 UTC 2023 - Dirk Müller
+
+- update to 3.3.2:
+  * Unintentional memory usage regression when using large
+    payloads that match several encodings (#376)
+  * Regression on some detection cases showcased in the
+    documentation (#371)
+  * Noise (md) probe that identifies malformed Arabic
+    representations due to the presence of letters in isolated
+    form
+  * Optional mypyc compilation upgraded to version 1.6.1 for
+    Python >= 3.8
+  * Improved the general detection reliability based on reports
+    from the community
+
+-------------------------------------------------------------------
+Mon Oct 2 09:07:47 UTC 2023 - Dirk Müller
+
+- update to 3.3.0:
+  * Allow executing the CLI (e.g.
normalizer) through `python -m + charset_normalizer.cli` or `python -m charset_normalizer` + * Support for 9 forgotten encoding that are supported by Python + but unlisted in `encoding.aliases` as they have no alias + * Optional mypyc compilation upgraded to version 1.5.1 for + Python >= 3.7 + * Unable to properly sort CharsetMatch when both chaos/noise + and coherence were close due to an unreachable condition in + \_\_lt\_\_ (#350) + +------------------------------------------------------------------- +Tue Jul 11 13:22:52 UTC 2023 - Dirk Müller + +- update to 3.2.0: + * Typehint for function `from_path` no longer enforce + `PathLike` as its first argument + * Minor improvement over the global detection reliability + * Introduce function `is_binary` that relies on main + capabilities, and optimized to detect binaries + * Propagate `enable_fallback` argument throughout `from_bytes`, + `from_path`, and `from_fp` that allow a deeper control over + the detection (default True) + * Edge case detection failure where a file would contain 'very- + long' camel cased word (Issue #289) + +------------------------------------------------------------------- +Fri Apr 21 12:31:23 UTC 2023 - Dirk Müller + +- add sle15_python_module_pythons (jsc#PED-68) + +------------------------------------------------------------------- +Sun Mar 26 20:04:17 UTC 2023 - Dirk Müller + +- update to 3.1.0: + * Argument `should_rename_legacy` for legacy function `detect` + and disregard any new arguments without errors (PR #262) + * Removed Support for Python 3.6 (PR #260) + * Optional speedup provided by mypy/c 1.0.1 + +------------------------------------------------------------------- +Sat Dec 3 04:13:46 UTC 2022 - Yogalakshmi Arunachalam + +- Update to 3.0.1 + Fixed + Multi-bytes cutter/chunk generator did not always cut correctly (PR #233) + Changed + Speedup provided by mypy/c 0.990 on Python >= 3.7 + +------------------------------------------------------------------- +Thu Oct 27 22:18:02 UTC 2022 - Yogalakshmi Arunachalam + +- Update to 3.0.0 + Added + * Extend the capability of explain=True when cp_isolation contains at most two entries (min one), will log in details of the Mess-detector results + Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES + Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio + normalizer --version now specify if current version provide extra speedup (meaning mypyc compilation whl) + * Changed + Build with static metadata using 'build' frontend + Make the language detection stricter + Optional: Module md.py can be compiled using Mypyc to provide an extra speedup up to 4x faster than v2.1 + * Fixed + CLI with opt --normalize fail when using full path for files + TooManyAccentuatedPlugin induce false positive on the mess detection when too few alpha character have been fed to it + Sphinx warnings when generating the documentation + * Removed + Coherence detector no longer return 'Simple English' instead return 'English' + Coherence detector no longer return 'Classical Chinese' instead return 'Chinese' + Breaking: Method first() and best() from CharsetMatch + UTF-7 will no longer appear as "detected" without a recognized SIG/mark (is unreliable/conflict with ASCII) + Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches + Breaking: Top-level function normalize + Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter 
from CharsetMatch + Support for the backport unicodedata2 + +------------------------------------------------------------------- +Sat Sep 17 15:46:10 UTC 2022 - Dirk Müller + +- update to 2.1.1: + * Function `normalize` scheduled for removal in 3.0 + * Removed useless call to decode in fn is_unprintable (#206) + +------------------------------------------------------------------- +Thu Aug 18 15:34:14 UTC 2022 - Ben Greiner + +- Clean requirements: We don't need anything + +------------------------------------------------------------------- +Tue Jul 19 11:38:48 UTC 2022 - Dirk Müller + +- update to 2.1.0: + * Output the Unicode table version when running the CLI with `--version` + * Re-use decoded buffer for single byte character sets + * Fixing some performance bottlenecks + * Workaround potential bug in cpython with Zero Width No-Break Space located + * in Arabic Presentation Forms-B, Unicode 1.1 not acknowledged as space + * CLI default threshold aligned with the API threshold from + * Support for Python 3.5 (PR #192) + * Use of backport unicodedata from `unicodedata2` as Python is quickly + catching up, scheduled for removal in 3.0 + +------------------------------------------------------------------- +Tue Feb 15 08:42:30 UTC 2022 - Dirk Müller + +- update to 2.0.12: + * ASCII miss-detection on rare cases (PR #170) + * Explicit support for Python 3.11 (PR #164) + * The logging behavior have been completely reviewed, now using only TRACE + and DEBUG levels + +------------------------------------------------------------------- +Mon Jan 10 23:01:54 UTC 2022 - Dirk Müller + +- update to 2.0.10: + * Fallback match entries might lead to UnicodeDecodeError for large bytes + sequence + * Skipping the language-detection (CD) on ASCII + +------------------------------------------------------------------- +Mon Dec 6 20:08:41 UTC 2021 - Dirk Müller + +- update to 2.0.9: + * Moderating the logging impact (since 2.0.8) for specific + environments + * Wrong logging level applied when setting kwarg `explain` to True + +------------------------------------------------------------------- +Mon Nov 29 11:14:37 UTC 2021 - Dirk Müller + +- update to 2.0.8: + * Improvement over Vietnamese detection + * MD improvement on trailing data and long foreign (non-pure latin) + * Efficiency improvements in cd/alphabet_languages + * call sum() without an intermediary list following PEP 289 recommendations + * Code style as refactored by Sourcery-AI + * Minor adjustment on the MD around european words + * Remove and replace SRTs from assets / tests + * Initialize the library logger with a `NullHandler` by default + * Setting kwarg `explain` to True will add provisionally + * Fix large (misleading) sequence giving UnicodeDecodeError + * Avoid using too insignificant chunk + * Add and expose function `set_logging_handler` to configure a specific + StreamHandler + +------------------------------------------------------------------- +Fri Nov 26 11:35:25 UTC 2021 - Dirk Müller + +- require lower-case name instead of breaking build + +------------------------------------------------------------------- +Thu Nov 25 22:26:52 UTC 2021 - Matej Cepl + +- Use lower-case name of prettytable package + +------------------------------------------------------------------- +Sun Oct 17 14:01:59 UTC 2021 - Martin Hauke + +- Update to version 2.0.7 + * Addition: bento Add support for Kazakh (Cyrillic) language + detection + * Improvement: sparkle Further improve inferring the language + from a given code page (single-byte). 
+ * Removed: fire Remove redundant logging entry about detected + language(s). + * Improvement: zap Refactoring for potential performance + improvements in loops. + * Improvement: sparkles Various detection improvement (MD+CD). + * Bugfix: bug Fix a minor inconsistency between Python 3.5 and + other versions regarding language detection. +- Update to version 2.0.6 + * Bugfix: bug Unforeseen regression with the loss of the + backward-compatibility with some older minor of Python 3.5.x. + * Bugfix: bug Fix CLI crash when using --minimal output in + certain cases. + * Improvement: sparkles Minor improvement to the detection + efficiency (less than 1%). +- Update to version 2.0.5 + * Improvement: sparkles The BC-support with v1.x was improved, + the old staticmethods are restored. + * Remove: fire The project no longer raise warning on tiny + content given for detection, will be simply logged as warning + instead. + * Improvement: sparkles The Unicode detection is slightly + improved, see #93 + * Bugfix: bug In some rare case, the chunks extractor could cut + in the middle of a multi-byte character and could mislead the + mess detection. + * Bugfix: bug Some rare 'space' characters could trip up the + UnprintablePlugin/Mess detection. + * Improvement: art Add syntax sugar __bool__ for results + CharsetMatches list-container. +- Update to version 2.0.4 + * Improvement: sparkle Adjust the MD to lower the sensitivity, + thus improving the global detection reliability. + * Improvement: sparkle Allow fallback on specified encoding + if any. + * Bugfix: bug The CLI no longer raise an unexpected exception + when no encoding has been found. + * Bugfix: bug Fix accessing the 'alphabets' property when the + payload contains surrogate characters. + * Bugfix: bug pencil2 The logger could mislead (explain=True) on + detected languages and the impact of one MBCS match (in #72) + * Bugfix: bug Submatch factoring could be wrong in rare edge + cases (in #72) + * Bugfix: bug Multiple files given to the CLI were ignored when + publishing results to STDOUT. (After the first path) (in #72) + * Internal: art Fix line endings from CRLF to LF for certain + files. +- Update to version 2.0.3 + * Improvement: sparkles Part of the detection mechanism has been + improved to be less sensitive, resulting in more accurate + detection results. Especially ASCII. #63 Fix #62 + * Improvement: sparklesAccording to the community wishes, the + detection will fall back on ASCII or UTF-8 in a last-resort + case. +- Update to version 2.0.2 + * Bugfix: bug Empty/Too small JSON payload miss-detection fixed. + * Improvement: sparkler Don't inject unicodedata2 into sys.modules +- Update to version 2.0.1 + * Bugfix: bug Make it work where there isn't a filesystem + available, dropping assets frequencies.json. + * Improvement: sparkles You may now use aliases in cp_isolation + and cp_exclusion arguments. + * Bugfix: bug Using explain=False permanently disable the verbose + output in the current runtime #47 + * Bugfix: bug One log entry (language target preemptive) was not + show in logs when using explain=True #47 + * Bugfix: bug Fix undesired exception (ValueError) on getitem of + instance CharsetMatches #52 + * Improvement: wrench Public function normalize default args + values were not aligned with from_bytes #53 +- Update to version 2.0.0 + * Performance: zap 4x to 5 times faster than the previous 1.4.0 + release. + * Performance: zap At least 2x faster than Chardet. 
+ * Performance: zap Accent has been made on UTF-8 detection, + should perform rather instantaneous. + * Improvement: back The backward compatibility with Chardet has + been greatly improved. The legacy detect function returns an + identical charset name whenever possible. + * Improvement: sparkle The detection mechanism has been slightly + improved, now Turkish content is detected correctly (most of + the time) + * Code: art The program has been rewritten to ease the + readability and maintainability. (+Using static typing) + * Tests: heavy_check_mark New workflows are now in place to + verify the following aspects: Performance, Backward- + Compatibility with Chardet, and Detection Coverage in addition# + to currents tests. (+CodeQL) + * Dependency: heavy_minus_sign This package no longer require + anything when used with Python 3.5 (Dropped cached_property) + * Docs: pencil2 Performance claims have been updated, the guide + to contributing, and the issue template. + * Improvement: sparkle Add --version argument to CLI + * Bugfix: bug The CLI output used the relative path of the + file(s). Should be absolute. + * Deprecation: red_circle Methods coherence_non_latin, w_counter, + chaos_secondary_pass of the class CharsetMatch are now + deprecated and scheduled for removal in v3.0 + * Improvement: sparkle If no language was detected in content, + trying to infer it using the encoding name/alphabets used. + * Removal: fire Removed support for these languages: Catalan, + Esperanto, Kazakh, Baque, Volapük, Azeri, Galician, Nynorsk, + Macedonian, and Serbocroatian. + * Improvement: sparkle utf_7 detection has been reinstated. + * Removal: fire The exception hook on UnicodeDecodeError has + been removed. +- Update to version 1.4.1 + * Improvement: art Logger configuration/usage no longer + conflict with others #44 +- Update to version 1.4.0 + * Dependency: heavy_minus_sign Using standard logging instead + of using the package loguru. + * Dependency: heavy_minus_sign Dropping nose test framework in + favor of the maintained pytest. + * Dependency: heavy_minus_sign Choose to not use dragonmapper + package to help with gibberish Chinese/CJK text. + * Dependency: wrench heavy_minus_sign Require cached_property + only for Python 3.5 due to constraint. Dropping for every + other interpreter version. + * Bugfix: bug BOM marker in a CharsetNormalizerMatch instance + could be False in rare cases even if obviously present. Due + to the sub-match factoring process. + * Improvement: sparkler Return ASCII if given sequences fit. + * Performance: zap Huge improvement over the larges payload. + * Change: fire Stop support for UTF-7 that does not contain a + SIG. (Contributions are welcome to improve that point) + * Feature: sparkler CLI now produces JSON consumable output. + * Dependency: Dropping PrettyTable, replaced with pure JSON + output. + * Bugfix: bug Not searching properly for the BOM when trying + utf32/16 parent codec. + * Other: zap Improving the package final size by compressing + frequencies.json. + +------------------------------------------------------------------- +Thu May 20 09:46:56 UTC 2021 - pgajdos@suse.com + +- version update to 1.3.9 + * Bugfix: bug In some very rare cases, you may end up getting encode/decode errors due to a bad bytes payload #40 + * Bugfix: bug Empty given payload for detection may cause an exception if trying to access the alphabets property. #39 + * Bugfix: bug The legacy detect function should return UTF-8-SIG if sig is present in the payload. 
#38 + +------------------------------------------------------------------- +Tue Feb 9 00:47:34 UTC 2021 - John Vandenberg + +- Switch to PyPI source +- Add Suggests: python-unicodedata2 +- Remove executable bit from charset_normalizer/assets/frequencies.json +- Update to v1.3.6 + * Allow prettytable 2.0 +- from v1.3.5 + * Dependencies refactor and add support for py 3.9 and 3.10 + * Fix version parsing + +------------------------------------------------------------------- +Mon May 25 10:59:12 UTC 2020 - Petr Gajdos + +- %python3_only -> %python_alternative + +------------------------------------------------------------------- +Mon Jan 27 09:09:27 UTC 2020 - Marketa Calabkova + +- Update to 1.3.4 + * Improvement/Bugfix : False positive when searching for successive upper, lower char. (ProbeChaos) + * Improvement : Noticeable better detection for jp + * Bugfix : Passing zero-length bytes to from_bytes + * Improvement : Expose version in package + * Bugfix : Division by zero + * Improvement : Prefers unicode (utf-8) when detected + * Apparently dropped Python2 silently + +------------------------------------------------------------------- +Fri Oct 4 08:52:51 UTC 2019 - Marketa Calabkova + +- Update to 1.3.0 + * Backport unicodedata for v12 impl into python if available + * Add aliases to CharsetNormalizerMatches class + * Add feature preemptive behaviour, looking for encoding declaration + * Add method to determine if specific encoding is multi byte + * Add has_submatch property on a match + * Add percent_chaos and percent_coherence + * Coherence ratio based on mean instead of sum of best results + * Using loguru for trace/debug <3 + * from_byte method improved + +------------------------------------------------------------------- +Thu Sep 26 10:35:51 UTC 2019 - Tomáš Chvátal + +- Update to 1.1.1: + * from_bytes parameters steps and chunk_size were not adapted to sequence len if provided values were not fitted to content + * Sequence having lenght bellow 10 chars was not checked + * Legacy detect method inspired by chardet was not returning + * Various more test updates + +------------------------------------------------------------------- +Fri Sep 13 11:05:06 UTC 2019 - Tomáš Chvátal + +- Update to 0.3: + * Improvement on detection + * Performance loss to expect + * Added --threshold option to CLI + * Bugfix on UTF 7 support + * Legacy detect(byte_str) method + * BOM support (Unicode mostly) + * Chaos prober improved on small text + * Language detection has been reviewed to give better result + * Bugfix on jp detection, every jp text was considered chaotic + +------------------------------------------------------------------- +Fri Aug 30 00:46:27 UTC 2019 - Tomáš Chvátal + +- Fix the tarball to really be the one published by upstream + +------------------------------------------------------------------- +Tue Aug 28 06:29:02 PM UTC 2019 - John Vandenberg + +- Initial spec for v0.1.8 diff --git a/python-charset-normalizer.spec b/python-charset-normalizer.spec new file mode 100644 index 0000000..de319cc --- /dev/null +++ b/python-charset-normalizer.spec @@ -0,0 +1,72 @@ +# +# spec file for package python-charset-normalizer +# +# Copyright (c) 2025 SUSE LLC +# +# All modifications and additions to the file contributed by third parties +# remain the property of their copyright owners, unless otherwise agreed +# upon. 
The license for this file, and modifications and additions to the +# file, is the same license as for the pristine package itself (unless the +# license for the pristine package is not an Open Source License, in which +# case the license is the MIT License). An "Open Source License" is a +# license that conforms to the Open Source Definition (Version 1.9) +# published by the Open Source Initiative. + +# Please submit bugfixes or comments via https://bugs.opensuse.org/ +# + + +%{?sle15_python_module_pythons} +Name: python-charset-normalizer +Version: 3.4.1 +Release: 0 +Summary: Python Universal Charset detector +License: MIT +URL: https://github.com/ousret/charset_normalizer +Source: https://github.com/Ousret/charset_normalizer/archive/refs/tags/%{version}.tar.gz#/charset_normalizer-%{version}.tar.gz +BuildRequires: %{python_module base >= 3.7} +BuildRequires: %{python_module pip} +BuildRequires: %{python_module setuptools} +BuildRequires: %{python_module wheel} +BuildRequires: fdupes +BuildRequires: python-rpm-macros +Requires(post): update-alternatives +Requires(postun): update-alternatives +Suggests: python-unicodedata2 +BuildArch: noarch +# SECTION test requirements +BuildRequires: %{python_module pytest} +# /SECTION +%python_subpackages + +%description +Python Universal Charset detector. + +%prep +%setup -q -n charset_normalizer-%{version} + +%build +%pyproject_wheel + +%install +%pyproject_install +%python_clone -a %{buildroot}%{_bindir}/normalizer +%python_expand %fdupes %{buildroot}%{$python_sitelib} + +%check +%pytest + +%post +%python_install_alternative normalizer + +%postun +%python_uninstall_alternative normalizer + +%files %{python_files} +%doc README.md +%license LICENSE +%python_alternative %{_bindir}/normalizer +%{python_sitelib}/charset_normalizer +%{python_sitelib}/charset_normalizer-%{version}*-info + +%changelog
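For a quick downstream smoke test of the packaged module, complementing the `%pytest` run in `%check`, the detection API referenced throughout the changelog can be exercised directly. The following is a minimal sketch, assuming the built `python3-charset-normalizer` (or any charset-normalizer >= 3.2, which introduced `is_binary` and the `enable_fallback` keyword) is importable; the sample payloads are illustrative only:

```python
# Minimal sketch of the detection API (charset-normalizer >= 3.2 assumed).
from charset_normalizer import from_bytes, is_binary

payload = "Договор подписан обеими сторонами.".encode("cp1251")

# from_bytes() returns a CharsetMatches container; best() yields the most
# plausible CharsetMatch, or None if every candidate failed the mess checks.
best = from_bytes(payload, enable_fallback=True, language_threshold=0.1).best()
if best is not None:
    print(best.encoding)  # guessed codec, e.g. "cp1251" (may vary for short input)
    print(str(best))      # payload decoded with the guessed codec

# is_binary() (added in 3.2.0) reports whether a payload looks non-textual.
print(is_binary(b"\x00\xff\x00\xff\x00\xff"))  # likely True
print(is_binary(payload))                      # likely False
```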
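The changelog also mentions the Chardet-compatible legacy helper `detect()` and its `should_rename_legacy` argument (added in 3.1.0), alongside the `normalizer` console script that the spec registers through update-alternatives (since 3.3.0 it can also be invoked as `python -m charset_normalizer` or `python -m charset_normalizer.cli`). A hedged sketch of the legacy API, under the same assumption that the package is installed:

```python
# Chardet-style legacy API: returns a dict with "encoding", "language",
# and "confidence" keys. should_rename_legacy=True (3.1.0+) maps results
# onto Chardet's legacy encoding names where they differ.
from charset_normalizer import detect

result = detect(
    "Überraschenderweise läuft alles wie geplant.".encode("latin-1"),
    should_rename_legacy=True,
)
print(result["encoding"], result["language"], result["confidence"])
```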