* Improvement of Vietnamese detection
* MD improvement on trailing data and long foreign (non-pure Latin) data
* Efficiency improvements in cd/alphabet_languages
* Call sum() without an intermediary list, following PEP 289 recommendations
* Code style as refactored by Sourcery-AI
* Minor adjustment on the MD around European words
* Remove and replace SRTs from assets / tests
* Initialize the library logger with a `NullHandler` by default
* Setting kwarg `explain` to True will provisionally add a specific stream handler
* Fix large (misleading) sequence giving UnicodeDecodeError
* Avoid using chunks that are too insignificant
* Add and expose function `set_logging_handler` to configure a specific
StreamHandler
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=21
- Update to version 2.0.7
* Addition: Add support for Kazakh (Cyrillic) language
detection
* Improvement: Further improve inferring the language
from a given code page (single-byte).
* Removed: Remove redundant logging entry about detected
language(s).
* Improvement: Refactoring for potential performance
improvements in loops.
* Improvement: Various detection improvements (MD+CD).
* Bugfix: Fix a minor inconsistency between Python 3.5 and
other versions regarding language detection.
- Update to version 2.0.6
* Bugfix: Fix an unforeseen regression that broke
backward compatibility with some older releases of Python 3.5.x.
* Bugfix: Fix CLI crash when using --minimal output in
certain cases.
* Improvement: Minor improvement to the detection
efficiency (less than 1%).
- Update to version 2.0.5
* Improvement: The BC-support with v1.x was improved,
the old staticmethods are restored.
* Removed: The project no longer raises a warning on tiny
content given for detection; it is simply logged as a warning
instead.
* Improvement: The Unicode detection is slightly
improved, see #93
* Bugfix: In some rare cases, the chunks extractor could cut
in the middle of a multi-byte character and could mislead the
mess detection.
OBS-URL: https://build.opensuse.org/request/show/925848
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=18
- Update to 1.3.0
* Backport unicodedata for v12 impl into python if available
* Add aliases to CharsetNormalizerMatches class
* Add preemptive behaviour feature, looking for an encoding declaration
* Add method to determine if a specific encoding is multi-byte
* Add has_submatch property on a match
* Add percent_chaos and percent_coherence
* Coherence ratio based on mean instead of sum of best results
* Using loguru for trace/debug <3
* from_byte method improved
OBS-URL: https://build.opensuse.org/request/show/734946
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=8
* Improvement on detection
* Performance loss is to be expected
* Added --threshold option to CLI
* Bugfix on UTF-7 support
* Legacy detect(byte_str) method
* BOM support (Unicode mostly)
* Chaos prober improved on small text
* Language detection has been reviewed to give better results
* Bugfix on Japanese detection: every Japanese text was considered chaotic
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=5