16 Commits

Author SHA256 Message Date
259f5f1afe - update to 2.0.12:
  * ASCII misdetection in rare cases (PR #170)
  * Explicit support for Python 3.11 (PR #164)
  * The logging behavior has been completely reviewed, now using only TRACE
    and DEBUG levels

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=24
2022-02-15 08:43:43 +00:00
c739862e1a - update to 2.0.10:
  * Fallback match entries might lead to UnicodeDecodeError for large byte
    sequences
  * Skipping the language-detection (CD) on ASCII

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=23
2022-01-10 23:04:22 +00:00
53a1bfb655 - update to 2.0.9:
  * Moderating the logging impact (since 2.0.8) for specific
    environments
  * Wrong logging level applied when setting kwarg `explain` to True

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=22
2021-12-06 20:09:48 +00:00
4e6d945d9a - update to 2.0.8:
  * Improvement over Vietnamese detection
  * MD improvement on trailing data and long foreign (non-pure Latin)
  * Efficiency improvements in cd/alphabet_languages
  * call sum() without an intermediary list following PEP 289 recommendations
  * Code style as refactored by Sourcery-AI
  * Minor adjustment on the MD around European words
  * Remove and replace SRTs from assets / tests
  * Initialize the library logger with a `NullHandler` by default
  * Setting kwarg `explain` to True will add provisionally
  * Fix large (misleading) sequence giving UnicodeDecodeError
  * Avoid using chunks that are too insignificant
  * Add and expose function `set_logging_handler` to configure a specific
    StreamHandler

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=21
2021-11-29 11:18:31 +00:00
380896adbc - require lower-case name instead of breaking build
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=20
2021-11-26 11:35:38 +00:00
515e72fd80 - Use lower-case name of prettytable package
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=19
2021-11-25 22:27:00 +00:00
fd5f5dc1f2 Accepting request 925848 from home:mnhauke
- Update to version 2.0.7
  * Addition: Add support for Kazakh (Cyrillic) language
    detection
  * Improvement: Further improve inferring the language
    from a given code page (single-byte).
  * Removed: Remove redundant logging entry about detected
    language(s).
  * Improvement: Refactoring for potential performance
    improvements in loops.
  * Improvement: Various detection improvements (MD+CD).
  * Bugfix: Fix a minor inconsistency between Python 3.5 and
    other versions regarding language detection.
- Update to version 2.0.6
  * Bugfix: Unforeseen regression with the loss of
    backward compatibility with some older minor versions of
    Python 3.5.x.
  * Bugfix: Fix CLI crash when using --minimal output in
    certain cases.
  * Improvement: Minor improvement to the detection
    efficiency (less than 1%).
- Update to version 2.0.5
  * Improvement: The BC-support with v1.x was improved,
    the old staticmethods are restored.
  * Remove: The project no longer raises a warning on tiny
    content given for detection; it is simply logged as a
    warning instead.
  * Improvement: The Unicode detection is slightly
    improved, see #93
  * Bugfix: In some rare cases, the chunks extractor could cut
    in the middle of a multi-byte character and could mislead the
    mess detection.

OBS-URL: https://build.opensuse.org/request/show/925848
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=18
2021-10-26 20:41:42 +00:00
Markéta Machová
d5d2d1f9e5 Accepting request 894588 from home:pgajdos:python
- version update to 1.3.9
  * Bugfix: In some very rare cases, you may end up getting encode/decode errors due to a bad bytes payload #40
  * Bugfix: Empty given payload for detection may cause an exception if trying to access the alphabets property. #39
  * Bugfix: The legacy detect function should return UTF-8-SIG if sig is present in the payload. #38

OBS-URL: https://build.opensuse.org/request/show/894588
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=16
2021-05-20 09:54:40 +00:00
4ec9d5c90b Accepting request 870710 from home:jayvdb:branches:devel:languages:python
- Switch to PyPI source
- Add Suggests: python-unicodedata2
- Remove executable bit from charset_normalizer/assets/frequencies.json
- Update to v1.3.6

OBS-URL: https://build.opensuse.org/request/show/870710
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=14
2021-02-10 08:09:39 +00:00
Tomáš Chvátal
204ac0c668 Accepting request 808744 from home:pgajdos:python
submit

OBS-URL: https://build.opensuse.org/request/show/808744
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=12
2020-05-25 13:36:05 +00:00
Tomáš Chvátal
611d9d38c7 Accepting request 767602 from home:mcalabkova:branches:devel:languages:python
- Update to 1.3.4
  * Improvement/Bugfix : False positive when searching for successive upper, lower char. (ProbeChaos)
  * Improvement : Noticeably better detection for jp
  * Bugfix : Passing zero-length bytes to from_bytes
  * Improvement : Expose version in package
  * Bugfix : Division by zero
  * Improvement : Prefers unicode (utf-8) when detected
  * Apparently dropped Python2 silently

OBS-URL: https://build.opensuse.org/request/show/767602
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=10
2020-01-28 08:11:06 +00:00
Tomáš Chvátal
631de8d368 Accepting request 734946 from home:mcalabkova:branches:devel:languages:python
- Update to 1.3.0
  * Backport unicodedata for v12 impl into python if available
  * Add aliases to CharsetNormalizerMatches class
  * Add feature preemptive behaviour, looking for encoding declaration
  * Add method to determine if specific encoding is multi byte
  * Add has_submatch property on a match
  * Add percent_chaos and percent_coherence
  * Coherence ratio based on mean instead of sum of best results
  * Using loguru for trace/debug <3
  * from_byte method improved

OBS-URL: https://build.opensuse.org/request/show/734946
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=8
2019-10-04 09:50:53 +00:00
Tomáš Chvátal
c385f2c788 - Update to 1.1.1:
  * from_bytes parameters steps and chunk_size were not adapted to sequence len if provided values were not fitted to content
  * Sequence having length below 10 chars was not checked
  * Legacy detect method inspired by chardet was not returning
  * Various more test updates

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=6
2019-09-26 10:38:40 +00:00
Tomáš Chvátal
5cb7342274 - Update to 0.3:
  * Improvement on detection
  * Performance loss to expect
  * Added --threshold option to CLI
  * Bugfix on UTF-7 support
  * Legacy detect(byte_str) method
  * BOM support (Unicode mostly)
  * Chaos prober improved on small text
  * Language detection has been reviewed to give better result
  * Bugfix on jp detection, every jp text was considered chaotic

OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=5
2019-09-13 11:07:21 +00:00
Tomáš Chvátal
4b06d6e2e5
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=2
2019-08-30 00:46:24 +00:00
Tomáš Chvátal
2365d5732b Accepting request 726939 from home:jayvdb:py-new
A very new & impressive (and the only) alternative to chardet which has stagnated lately

OBS-URL: https://build.opensuse.org/request/show/726939
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-charset-normalizer?expand=0&rev=1
2019-08-29 10:43:06 +00:00