From 12bdfe54de9058fa29234e4fa938aaa0ee936c3cf935b5fca7c24fbfa5ac5878 Mon Sep 17 00:00:00 2001
From: Kyrill Detinov
Date: Wed, 31 Aug 2016 14:58:43 +0000
Subject: [PATCH] Accepting request 423851 from home:X0F:HSF

Here's the maximum of what I can do with the log. Had to extract it from git tag history.

OBS-URL: https://build.opensuse.org/request/show/423851
OBS-URL: https://build.opensuse.org/package/show/M17N/uchardet?expand=0&rev=3
---
 uchardet-0.0.1.tar.gz |  3 --
 uchardet-0.0.6.tar.xz |  3 ++
 uchardet.changes      | 66 +++++++++++++++++++++++++++++++++++++++++++
 uchardet.spec         |  6 ++--
 4 files changed, 72 insertions(+), 6 deletions(-)
 delete mode 100644 uchardet-0.0.1.tar.gz
 create mode 100644 uchardet-0.0.6.tar.xz

diff --git a/uchardet-0.0.1.tar.gz b/uchardet-0.0.1.tar.gz
deleted file mode 100644
index bd4d116..0000000
--- a/uchardet-0.0.1.tar.gz
+++ /dev/null
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e238c212350e07ebbe1961f8f128faaa40f71b70d37b63ffa2fe12c664269ee6
-size 179207
diff --git a/uchardet-0.0.6.tar.xz b/uchardet-0.0.6.tar.xz
new file mode 100644
index 0000000..4a33c1b
--- /dev/null
+++ b/uchardet-0.0.6.tar.xz
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8351328cdfbcb2432e63938721dd781eb8c11ebc56e3a89d0f84576b96002c61
+size 169192
diff --git a/uchardet.changes b/uchardet.changes
index f9c8fda..7800fd8 100644
--- a/uchardet.changes
+++ b/uchardet.changes
@@ -1,3 +1,69 @@
+-------------------------------------------------------------------
+Tue Aug 30 14:05:21 UTC 2016 - virtuousfox@gmail.com
+
+- Update to version 0.0.6:
+- Improve ASCII and ISO-8859-1 detection.
+- Improve language models: Greek, Hungarian.
+- New supports:
+  * Arabic - ISO-8859-6 and Windows-1256.
+  * Danish - Windows-1252, ISO-8859-1 and ISO-8859-15.
+  * Spanish - ISO-8859-1, ISO-8859-15 and Windows-1252.
+  * Vietnamese - VISCII and Windows-1258.
+- Improve single-byte encoding detection algorithm by giving more weight
+  to "probable" sequences (less frequent than "positive" sequence, yet
+  not "negative").
+- `uchardet` command line tool improved:
+  * exits with non-zero return values on error.
+- CMake build improved with more options:
+  * Binary can be installed to non-default dir.
+  * Allow building static-only builds.
+  * Allow not building the command line tool.
+  * Add static lib destination.
+- Changes from 0.0.4 to 0.0.5:
+- Revert UTF-16 and UTF-32 label change:
+  it was an error to specify endianness for texts with BOM.
+  The Unicode standard explicitly warns against it, and it actually
+  even (partially) break conversions.
+- Added supports:
+  - French: Windows-1252.
+  - German: ISO-8859-1, Windows-1252
+  - Esperanto: ISO-8859-3
+  - Turkish: ISO-8859-3 and ISO-8859-9
+  - Thai: ISO-8859-11 (and TIS-620 model rebuilt).
+- Single Byte charset detection algorithm improved:
+  detection of control characters lowers confidence.
+- Changes from 0.0.3 to 0.0.4:
+- Add support of ISO-8859-1 and ISO-8859-15 for French.
+- Re-enable Hungarian language models (ISO-8859-2 and Windows-1250)
+  which used to conflict with other charsets (should be better now).
+- Differentiate ASCII detection and detection failure.
+- Improve single-byte charset detection confidence algorithm (fixes for
+  instance Windows-1251 Russian text detection).
+- "UTF-16" is now outputted with endianness information (UTF-16LE/BE).
+- Add UTF-32 BOM detection.
+- Discard single byte charsets upon illegal codepoint detection.
+- Internal redesign of single-byte charmaps with more semantics, and
+  variable sample size length (different languages have different sizes
+  of grapheme lists).
+- A lot more test files (33 successful unit tests should be successful
+  with `make test`).
+- Adding python scripts to generate language models from Wikipedia data
+  in a single command.
+- Changes from 0.0.2 to 0.0.3:
+- A quick release after 0.0.2 mostly to fix a bad crash on the command
+  line tool when charset detection failed (or detected ASCII).
+- The build now includes more test files for various language/encoding
+  and a `make test` target for unit testing (20 encoding detection tests
+  should be successful upon running it).
+- The build has a new BUILD_STATIC option, by default set to ON,
+  allowing to disable static library building if not needed.
+- All encoding names are iconv-compatible, enabling developers to
+  directly feed the result of uchardet_get_charset() into libiconv.
+- Compilation warnings fixed.
+- Changes from 0.0.1 to 0.0.2:
+- Version 0.0.2 mostly fixes various bugs and allow querying charsets
+  for multiple files in the same command with uchardet command line tool.
+
 -------------------------------------------------------------------
 Mon Oct 14 10:06:31 UTC 2013 - lazy.kent@opensuse.org
 
diff --git a/uchardet.spec b/uchardet.spec
index 5f2cc1e..2283b4d 100644
--- a/uchardet.spec
+++ b/uchardet.spec
@@ -18,13 +18,13 @@
 
 %define major 0
 Name: uchardet
-Version: 0.0.1
+Version: 0.0.6
 Release: 0
 License: MPL-1.1 or GPL-2.0+ or LGPL-2.1+
 Summary: Universal Charset Detection Library
-Url: https://code.google.com/p/uchardet/
+Url: https://www.freedesktop.org/wiki/Software/uchardet/
 Group: Productivity/Text/Utilities
-Source0: https://%{name}.googlecode.com/files/%{name}-%{version}.tar.gz
+Source0: https://www.freedesktop.org/software/%{name}releases/%{name}-%{version}.tar.xz
 Source1: baselibs.conf
 BuildRequires: cmake
 BuildRequires: gcc-c++