Accepting request 212347 from home:benoit_monin:pelican

I would like to use devel:languages:python as the devel project for python-Unidecode.

OBS-URL: https://build.opensuse.org/request/show/212347
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python/python-Unidecode?expand=0&rev=2
2013-12-27 14:53:04 +00:00
parent 5d5b1fdb47
commit 2d1acd35d7
4 changed files with 52 additions and 7 deletions
--- a/python-Unidecode.spec
+++ b/python-Unidecode.spec
@@ -16,15 +16,15 @@


 Name:           python-Unidecode
-Version:        0.04.12
+Version:        0.04.14
 Release:        0
 License:        GPL-2.0+
 Summary:        ASCII transliterations of Unicode text
 Url:            https://pypi.python.org/pypi/Unidecode
 Group:          Development/Languages/Python
 Source:         http://pypi.python.org/packages/source/U/Unidecode/Unidecode-%{version}.tar.gz
-BuildRequires:  fdupes
 BuildRequires:  python-devel
+BuildRequires:  fdupes
 BuildRoot:      %{_tmppath}/%{name}-%{version}-build
 %if 0%{?suse_version} && 0%{?suse_version} <= 1110
 %{!?python_sitelib: %global python_sitelib %(python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()")}
@@ -41,11 +41,40 @@ human-readable Unicode strings that should still be somewhat intelligible
 (a popular example of this is when making an URL slug from an article
 title). 

+In most of these examples you could represent Unicode characters as
+"???" or "\\15BA\\15A0\\1610", to mention two extreme cases. But that's
+nearly useless to someone who actually wants to read what the text says.
+
+What Unidecode provides is a middle road: function unidecode() takes
+Unicode data and tries to represent it in ASCII characters (i.e., the
+universally displayable characters between 0x00 and 0x7F), where the
+compromises taken when mapping between two character sets are chosen to be
+near what a human with a US keyboard would choose.
+
+The quality of resulting ASCII representation varies. For languages of
+western origin it should be between perfect and good. On the other hand
+transliteration (i.e., conveying, in Roman letters, the pronunciation
+expressed by the text in some other writing system) of languages like
+Chinese, Japanese or Korean is a very complex issue and this library does
+not even attempt to address it. It draws the line at context-free
+character-by-character mapping. So a good rule of thumb is that the further
+the script you are transliterating is from Latin alphabet, the worse the
+transliteration will be.
+
+Note that this module generally produces better results than simply
+stripping accents from characters (which can be done in Python with
+built-in functions). It is based on hand-tuned character mappings that for
+example also contain ASCII approximations for symbols and non-Latin
+alphabets.
+
+This is a Python port of Text::Unidecode Perl module by
+Sean M. Burke <sburke@cpan.org>.
+
 %prep
 %setup -q -n Unidecode-%{version}

 %build
-python setup.py build
+CFLAGS="%{optflags}" python setup.py build

 %install
 python setup.py install --prefix=%{_prefix} --root=%{buildroot}
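
For reference, the new %description text contrasts Unidecode with "simply stripping accents from characters", which it says can be done with Python built-ins. A minimal sketch of that baseline follows, using only the standard `unicodedata` module. The helper names `strip_accents` and `to_ascii_lossy` are hypothetical and are not part of Unidecode or of this package.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after compatibility decomposition (NFKD).

    Hypothetical helper for illustration; not part of Unidecode.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep every code point that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii_lossy(text):
    """Strip accents, then drop whatever is still outside ASCII."""
    return strip_accents(text).encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for sample in ("café", "Ærøskøbing", "北京"):
        print(repr(sample), "->", repr(to_ascii_lossy(sample)))
```

This baseline works for letters that decompose into a base letter plus a combining mark, but it silently drops letters with no decomposition (such as "Æ" or "ø") and all non-Latin scripts. That gap is what the description says Unidecode's hand-tuned character mappings are meant to cover.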