forked from pool/perl-Text-Unidecode
- updated to 1.22
* RELEASE 1.22. (The dev release works, so this is a version bump.)
* See notes for 2014-07-25, because this is the first public release with significant changes since 2001!

2014-07-25 Sean M. Burke sburke@cpan.org
* !DEVELOPER RELEASE!
* !Release 1.20_01!
* Many bugfixes. Thanks especially to Tomaž Šolc!
* Yet more *.t files added for improved sanity checking.
* Shuffling around the internals of Unidecode.pm
* Putting in some vacuous 0x__.pm files where previously there would just be a load failure

OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-Text-Unidecode?expand=0&rev=11
parent e22a75bb72
commit 43b83c895f
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:baaecfee090e18e2c0fdcd8d76a961befdb934fca12b6cdcb7dd7e04b5510ce9
size 122457
3
Text-Unidecode-1.22.tar.gz
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dd76f01c8b1e865bbb02a6c719eb21adb76451ea8f49281bc051319e34faddbc
size 129557
@@ -1,3 +1,20 @@
-------------------------------------------------------------------
Wed Dec 3 10:00:20 UTC 2014 - coolo@suse.com

- updated to 1.22
* RELEASE 1.22. (The dev release works, so this is a version bump.)
* See notes for 2014-07-25, because this is the first public release
with significant changes since 2001!

2014-07-25 Sean M. Burke sburke@cpan.org
* !DEVELOPER RELEASE!
* !Release 1.20_01!
* Many bugfixes. Thanks especially to Tomaž Šolc!
* Yet more *.t files added for improved sanity checking.
* Shuffling around the internals of Unidecode.pm
* Putting in some vacuous 0x__.pm files where
previously there would just be a load failure

-------------------------------------------------------------------
Thu Aug 7 09:04:25 UTC 2014 - dmitry_r@opensuse.org
@@ -16,96 +16,62 @@
#


%define real_name Text-Unidecode
Name: perl-Text-Unidecode
Version: 1.01
Version: 1.22
Release: 0
Summary: US-ASCII transliterations of Unicode text
License: Artistic-1.0
%define cpan_name Text-Unidecode
Summary: Provide plain ASCII transliterations of Unicode text
License: Artistic-1.0 or GPL-1.0+
Group: Development/Libraries/Perl
Url: http://search.cpan.org/perldoc?Text::Unidecode
Source: http://www.cpan.org/authors/id/S/SB/SBURKE/%{real_name}-%{version}.tar.gz
Url: http://search.cpan.org/dist/Text-Unidecode/
Source: http://www.cpan.org/authors/id/S/SB/SBURKE/%{cpan_name}-%{version}.tar.gz
BuildArch: noarch
BuildRoot: %{_tmppath}/%{name}-%{version}-build
BuildRequires: perl
BuildRequires: perl-macros
BuildRoot: %{_tmppath}/%{name}-%{version}-build
%{perl_requires}

%description
It often happens that you have non-Roman text data in Unicode, but you can't
display it -- usually because you're trying to show it to a user via an
application that doesn't support Unicode, or because the fonts you need aren't
accessible. You could represent the Unicode characters as "???????" or
"\15BA\15A0\1610...", but that's nearly useless to the user who actually wants
to read what the text says.
It often happens that you have non-Roman text data in Unicode, but you
can't display it-- usually because you're trying to show it to a user via
an application that doesn't support Unicode, or because the fonts you need
aren't accessible. You could represent the Unicode characters as "???????"
or "\15BA\15A0\1610...", but that's nearly useless to the user who actually
wants to read what the text says.

What Text::Unidecode provides is a function, unidecode(...) that takes Unicode
data and tries to represent it in US-ASCII characters (i.e., the universally
displayable characters between 0x00 and 0x7F). The representation is almost
always an attempt at transliteration -- i.e., conveying, in Roman letters, the
pronunciation expressed by the text in some other writing system. (See the
example in the synopsis.)
What Text::Unidecode provides is a function, 'unidecode(...)' that takes
Unicode data and tries to represent it in US-ASCII characters (i.e., the
universally displayable characters between 0x00 and 0x7F). The
representation is almost always an attempt at _transliteration_-- i.e.,
conveying, in Roman letters, the pronunciation expressed by the text in
some other writing system. (See the example in the synopsis.)

Unidecode's ability to transliterate is limited by two factors:
NOTE:

* The amount and quality of data in the original
To make sure your perldoc/Pod viewing setup for viewing this page is
working: The six-letter word "résumé" should look like "resume" with an "/"
accent on each "e".

So if you have Hebrew data that has no vowel points in it, then Unidecode
cannot guess what vowels should appear in a pronounciation. S f y hv n vwls n
th npt, y wn't gt ny vwls n th tpt. (This is a specific application of the
general principle of "Garbage In, Garbage Out".)

* Basic limitations in the Unidecode design

Writing a real and clever transliteration algorithm for any single
language usually requires a lot of time, and at least a passable knowledge of
the language involved. But Unicode text can convey more languages than I could
possibly learn (much less create a transliterator for) in the entire rest of my
lifetime. So I put a cap on how intelligent Unidecode could be, by insisting
that it support only context-insensitive transliteration. That means missing
the finer details of any given writing system, while still hopefully being
useful.

Unidecode, in other words, is quick and dirty. Sometimes the output is not so
dirty at all: Russian and Greek seem to work passably; and while Thaana
(Divehi, AKA Maldivian) is a definitely non-Western writing system, setting up
a mapping from it to Roman letters seems to work pretty well. But sometimes the
output is very dirty: Unidecode does quite badly on Japanese and Thai.

If you want a smarter transliteration for a particular language than Unidecode
provides, then you should look for (or write) a transliteration algorithm
specific to that language, and apply it instead of (or at least before)
applying Unidecode.

In other words, Unidecode's approach is broad (knowing about dozens of writing
systems), but shallow (not being meticulous about any of them).

Author:
-------
Sean M. Burke sburke@cpan.org
For further tests, and help if that doesn't work, see below, the /A POD
ENCODING TEST manpage.

%prep
%setup -q -n %{real_name}-%{version}
%setup -q -n %{cpan_name}-%{version}

%build
perl Makefile.PL
make %{?_smp_mflags}
%{__perl} Makefile.PL INSTALLDIRS=vendor
%{__make} %{?_smp_mflags}

%check
make test
%{__make} test

%install
%perl_make_install
%perl_process_packlist
%perl_gen_filelist

%files
%defattr(-, root, root)
%doc ChangeLog README MANIFEST TODO.txt
%doc %{_mandir}/man?/*
%dir %{perl_vendorarch}/auto/Text
%dir %{perl_vendorarch}/auto/Text/Unidecode
%dir %{perl_vendorlib}/Text
%dir %{perl_vendorlib}/Text/Unidecode
%{perl_vendorlib}/Text/Unidecode/*.pm
%{perl_vendorlib}/Text/Unidecode.pm
%files -f %{name}.files
%defattr(-,root,root,755)
%doc ChangeLog LICENSE README TODO.txt

%changelog