forked from pool/perl-Text-Unidecode
initial version
OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-Text-Unidecode?expand=0&rev=1
commit
ed01d948f2
23
.gitattributes
vendored
Normal file
@@ -0,0 +1,23 @@
## Default LFS
*.7z filter=lfs diff=lfs merge=lfs -text
*.bsp filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.gem filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.jar filter=lfs diff=lfs merge=lfs -text
*.lz filter=lfs diff=lfs merge=lfs -text
*.lzma filter=lfs diff=lfs merge=lfs -text
*.obscpio filter=lfs diff=lfs merge=lfs -text
*.oxt filter=lfs diff=lfs merge=lfs -text
*.pdf filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.rpm filter=lfs diff=lfs merge=lfs -text
*.tbz filter=lfs diff=lfs merge=lfs -text
*.tbz2 filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.ttf filter=lfs diff=lfs merge=lfs -text
*.txz filter=lfs diff=lfs merge=lfs -text
*.whl filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
1
.gitignore
vendored
Normal file
@@ -0,0 +1 @@
.osc
3
Text-Unidecode-0.04.tar.bz2
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a23f0bb769d8507495bd06b269e8c1b50e4d55854447509a5d586880bb3886ae
size 79278
5
perl-Text-Unidecode.changes
Normal file
@@ -0,0 +1,5 @@
-------------------------------------------------------------------
Wed Mar 11 13:33:23 CET 2009 - lars@linux-schulserver.de

- initial version 0.04

104
perl-Text-Unidecode.spec
Normal file
@@ -0,0 +1,104 @@
#
# spec file for package perl-Text-Unidecode
#

# norootforbuild

Name: perl-Text-Unidecode
%define real_name Text-Unidecode
Summary: US-ASCII transliterations of Unicode text
Url: http://search.cpan.org/perldoc?Text::Unidecode
Group: Development/Libraries/Perl
License: Artistic License
Version: 0.04
Release: 1
Vendor: openSUSE-Education
Source: %{real_name}-%{version}.tar.bz2
Requires: perl = %{perl_version}
BuildRoot: %{_tmppath}/%{name}-%{version}-build

%description
It often happens that you have non-Roman text data in Unicode, but you can't
display it -- usually because you're trying to show it to a user via an
application that doesn't support Unicode, or because the fonts you need aren't
accessible. You could represent the Unicode characters as "???????" or
"\15BA\15A0\1610...", but that's nearly useless to the user who actually wants
to read what the text says.

What Text::Unidecode provides is a function, unidecode(...) that takes Unicode
data and tries to represent it in US-ASCII characters (i.e., the universally
displayable characters between 0x00 and 0x7F). The representation is almost
always an attempt at transliteration -- i.e., conveying, in Roman letters, the
pronunciation expressed by the text in some other writing system. (See the
example in the synopsis.)

Unidecode's ability to transliterate is limited by two factors:

* The amount and quality of data in the original

So if you have Hebrew data that has no vowel points in it, then Unidecode
cannot guess what vowels should appear in a pronunciation. S f y hv n vwls n
th npt, y wn't gt ny vwls n th tpt. (This is a specific application of the
general principle of "Garbage In, Garbage Out".)

* Basic limitations in the Unidecode design

Writing a real and clever transliteration algorithm for any single
language usually requires a lot of time, and at least a passable knowledge of
the language involved. But Unicode text can convey more languages than I could
possibly learn (much less create a transliterator for) in the entire rest of my
lifetime. So I put a cap on how intelligent Unidecode could be, by insisting
that it support only context-insensitive transliteration. That means missing
the finer details of any given writing system, while still hopefully being
useful.

Unidecode, in other words, is quick and dirty. Sometimes the output is not so
dirty at all: Russian and Greek seem to work passably; and while Thaana
(Divehi, AKA Maldivian) is a definitely non-Western writing system, setting up
a mapping from it to Roman letters seems to work pretty well. But sometimes the
output is very dirty: Unidecode does quite badly on Japanese and Thai.

If you want a smarter transliteration for a particular language than Unidecode
provides, then you should look for (or write) a transliteration algorithm
specific to that language, and apply it instead of (or at least before)
applying Unidecode.

In other words, Unidecode's approach is broad (knowing about dozens of writing
systems), but shallow (not being meticulous about any of them).

Author:
-------
Sean M. Burke sburke@cpan.org


%prep
%setup -n %{real_name}-%{version}

%build
perl Makefile.PL
make %{?jobs:-j%jobs}

%check
make test

%install
%perl_make_install
%perl_process_packlist

%clean
rm -rf %{buildroot}

%files
%defattr(-, root, root)
%doc ChangeLog README MANIFEST TODO.txt
%doc %{_mandir}/man?/*
%dir %{perl_vendorarch}/auto/Text
%dir %{perl_vendorarch}/auto/Text/Unidecode
%dir %{perl_vendorlib}/Text
%dir %{perl_vendorlib}/Text/Unidecode
%{perl_vendorarch}/auto/Text/Unidecode/.packlist
%{perl_vendorlib}/Text/Unidecode/*.pm
%{perl_vendorlib}/Text/Unidecode.pm
/var/adm/perl-modules/%{name}

%changelog