forked from pool/perl-Text-Unidecode
- updated to 1.22
* RELEASE 1.22. (The dev release works, so this is a version bump.)
* See notes for 2014-07-25, because this is the first public release
  with significant changes since 2001!

2014-07-25 Sean M. Burke sburke@cpan.org
* !DEVELOPER RELEASE!
* !Release 1.20_01!
* Many bugfixes. Thanks especially to Tomaž Šolc!
* Yet more *.t files added for improved sanity checking.
* Shuffling around the internals of Unidecode.pm
* Putting in some vacuous 0x__.pm files where
  previously there would just be a load failure

OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-Text-Unidecode?expand=0&rev=11
parent e22a75bb72
commit 43b83c895f
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:baaecfee090e18e2c0fdcd8d76a961befdb934fca12b6cdcb7dd7e04b5510ce9
-size 122457
3 Text-Unidecode-1.22.tar.gz Normal file
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dd76f01c8b1e865bbb02a6c719eb21adb76451ea8f49281bc051319e34faddbc
+size 129557
@@ -1,3 +1,20 @@
+-------------------------------------------------------------------
+Wed Dec 3 10:00:20 UTC 2014 - coolo@suse.com
+
+- updated to 1.22
+* RELEASE 1.22. (The dev release works, so this is a version bump.)
+* See notes for 2014-07-25, because this is the first public release
+with significant changes since 2001!
+
+2014-07-25 Sean M. Burke sburke@cpan.org
+* !DEVELOPER RELEASE!
+* !Release 1.20_01!
+* Many bugfixes. Thanks especially to Tomaž Šolc!
+* Yet more *.t files added for improved sanity checking.
+* Shuffling around the internals of Unidecode.pm
+* Putting in some vacuous 0x__.pm files where
+previously there would just be a load failure
+
 -------------------------------------------------------------------
 Thu Aug 7 09:04:25 UTC 2014 - dmitry_r@opensuse.org
 
@@ -16,96 +16,62 @@
 #
 
 
-%define real_name Text-Unidecode
 Name: perl-Text-Unidecode
-Version: 1.01
+Version: 1.22
 Release: 0
-Summary: US-ASCII transliterations of Unicode text
-License: Artistic-1.0
+%define cpan_name Text-Unidecode
+Summary: Provide plain ASCII transliterations of Unicode text
+License: Artistic-1.0 or GPL-1.0+
 Group: Development/Libraries/Perl
-Url: http://search.cpan.org/perldoc?Text::Unidecode
-Source: http://www.cpan.org/authors/id/S/SB/SBURKE/%{real_name}-%{version}.tar.gz
+Url: http://search.cpan.org/dist/Text-Unidecode/
+Source: http://www.cpan.org/authors/id/S/SB/SBURKE/%{cpan_name}-%{version}.tar.gz
+BuildArch: noarch
+BuildRoot: %{_tmppath}/%{name}-%{version}-build
 BuildRequires: perl
 BuildRequires: perl-macros
-BuildRoot: %{_tmppath}/%{name}-%{version}-build
 %{perl_requires}
 
 %description
-It often happens that you have non-Roman text data in Unicode, but you can't
-display it -- usually because you're trying to show it to a user via an
-application that doesn't support Unicode, or because the fonts you need aren't
-accessible. You could represent the Unicode characters as "???????" or
-"\15BA\15A0\1610...", but that's nearly useless to the user who actually wants
-to read what the text says.
+It often happens that you have non-Roman text data in Unicode, but you
+can't display it-- usually because you're trying to show it to a user via
+an application that doesn't support Unicode, or because the fonts you need
+aren't accessible. You could represent the Unicode characters as "???????"
+or "\15BA\15A0\1610...", but that's nearly useless to the user who actually
+wants to read what the text says.
 
-What Text::Unidecode provides is a function, unidecode(...) that takes Unicode
-data and tries to represent it in US-ASCII characters (i.e., the universally
-displayable characters between 0x00 and 0x7F). The representation is almost
-always an attempt at transliteration -- i.e., conveying, in Roman letters, the
-pronunciation expressed by the text in some other writing system. (See the
-example in the synopsis.)
+What Text::Unidecode provides is a function, 'unidecode(...)' that takes
+Unicode data and tries to represent it in US-ASCII characters (i.e., the
+universally displayable characters between 0x00 and 0x7F). The
+representation is almost always an attempt at _transliteration_-- i.e.,
+conveying, in Roman letters, the pronunciation expressed by the text in
+some other writing system. (See the example in the synopsis.)
 
-Unidecode's ability to transliterate is limited by two factors:
+NOTE:
 
-* The amount and quality of data in the original
+To make sure your perldoc/Pod viewing setup for viewing this page is
+working: The six-letter word "résumé" should look like "resume" with an "/"
+accent on each "e".
 
-So if you have Hebrew data that has no vowel points in it, then Unidecode
-cannot guess what vowels should appear in a pronounciation. S f y hv n vwls n
-th npt, y wn't gt ny vwls n th tpt. (This is a specific application of the
-general principle of "Garbage In, Garbage Out".)
-
-* Basic limitations in the Unidecode design
-
-Writing a real and clever transliteration algorithm for any single
-language usually requires a lot of time, and at least a passable knowledge of
-the language involved. But Unicode text can convey more languages than I could
-possibly learn (much less create a transliterator for) in the entire rest of my
-lifetime. So I put a cap on how intelligent Unidecode could be, by insisting
-that it support only context-insensitive transliteration. That means missing
-the finer details of any given writing system, while still hopefully being
-useful.
-
-Unidecode, in other words, is quick and dirty. Sometimes the output is not so
-dirty at all: Russian and Greek seem to work passably; and while Thaana
-(Divehi, AKA Maldivian) is a definitely non-Western writing system, setting up
-a mapping from it to Roman letters seems to work pretty well. But sometimes the
-output is very dirty: Unidecode does quite badly on Japanese and Thai.
-
-If you want a smarter transliteration for a particular language than Unidecode
-provides, then you should look for (or write) a transliteration algorithm
-specific to that language, and apply it instead of (or at least before)
-applying Unidecode.
-
-In other words, Unidecode's approach is broad (knowing about dozens of writing
-systems), but shallow (not being meticulous about any of them).
-
-Author:
--------
-Sean M. Burke sburke@cpan.org
+For further tests, and help if that doesn't work, see below, the /A POD
+ENCODING TEST manpage.
 
 %prep
-%setup -q -n %{real_name}-%{version}
+%setup -q -n %{cpan_name}-%{version}
 
 %build
-perl Makefile.PL
-make %{?_smp_mflags}
+%{__perl} Makefile.PL INSTALLDIRS=vendor
+%{__make} %{?_smp_mflags}
 
 %check
-make test
+%{__make} test
 
 %install
 %perl_make_install
 %perl_process_packlist
+%perl_gen_filelist
 
-%files
-%defattr(-, root, root)
-%doc ChangeLog README MANIFEST TODO.txt
-%doc %{_mandir}/man?/*
-%dir %{perl_vendorarch}/auto/Text
-%dir %{perl_vendorarch}/auto/Text/Unidecode
-%dir %{perl_vendorlib}/Text
-%dir %{perl_vendorlib}/Text/Unidecode
-%{perl_vendorlib}/Text/Unidecode/*.pm
-%{perl_vendorlib}/Text/Unidecode.pm
+%files -f %{name}.files
+%defattr(-,root,root,755)
+%doc ChangeLog LICENSE README TODO.txt
 
 %changelog