initial version
OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-Text-Unidecode?expand=0&rev=1
commit
ed01d948f2
23
.gitattributes
vendored
Normal file
@ -0,0 +1,23 @@
## Default LFS
*.7z filter=lfs diff=lfs merge=lfs -text
*.bsp filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.gem filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.jar filter=lfs diff=lfs merge=lfs -text
*.lz filter=lfs diff=lfs merge=lfs -text
*.lzma filter=lfs diff=lfs merge=lfs -text
*.obscpio filter=lfs diff=lfs merge=lfs -text
*.oxt filter=lfs diff=lfs merge=lfs -text
*.pdf filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.rpm filter=lfs diff=lfs merge=lfs -text
*.tbz filter=lfs diff=lfs merge=lfs -text
*.tbz2 filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.ttf filter=lfs diff=lfs merge=lfs -text
*.txz filter=lfs diff=lfs merge=lfs -text
*.whl filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
1
.gitignore
vendored
Normal file
@ -0,0 +1 @@
.osc
3
Text-Unidecode-0.04.tar.bz2
Normal file
@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a23f0bb769d8507495bd06b269e8c1b50e4d55854447509a5d586880bb3886ae
size 79278
5
perl-Text-Unidecode.changes
Normal file
@ -0,0 +1,5 @@
-------------------------------------------------------------------
Wed Mar 11 13:33:23 CET 2009 - lars@linux-schulserver.de

- initial version 0.04
104
perl-Text-Unidecode.spec
Normal file
@ -0,0 +1,104 @@
#
# spec file for package perl-Text-Unidecode
#

# norootforbuild

Name: perl-Text-Unidecode
%define real_name Text-Unidecode
Summary: US-ASCII transliterations of Unicode text
Url: http://search.cpan.org/perldoc?Text::Unidecode
Group: Development/Libraries/Perl
License: Artistic License
Version: 0.04
Release: 1
Vendor: openSUSE-Education
Source: %{real_name}-%{version}.tar.bz2
Requires: perl = %{perl_version}
BuildRoot: %{_tmppath}/%{name}-%{version}-build

%description
It often happens that you have non-Roman text data in Unicode, but you can't
display it -- usually because you're trying to show it to a user via an
application that doesn't support Unicode, or because the fonts you need aren't
accessible. You could represent the Unicode characters as "???????" or
"\15BA\15A0\1610...", but that's nearly useless to the user who actually wants
to read what the text says.

What Text::Unidecode provides is a function, unidecode(...) that takes Unicode
data and tries to represent it in US-ASCII characters (i.e., the universally
displayable characters between 0x00 and 0x7F). The representation is almost
always an attempt at transliteration -- i.e., conveying, in Roman letters, the
pronunciation expressed by the text in some other writing system. (See the
example in the synopsis.)

Unidecode's ability to transliterate is limited by two factors:

* The amount and quality of data in the original

So if you have Hebrew data that has no vowel points in it, then Unidecode
cannot guess what vowels should appear in a pronunciation. S f y hv n vwls n
th npt, y wn't gt ny vwls n th tpt. (This is a specific application of the
general principle of "Garbage In, Garbage Out".)

* Basic limitations in the Unidecode design

Writing a real and clever transliteration algorithm for any single
language usually requires a lot of time, and at least a passable knowledge of
the language involved. But Unicode text can convey more languages than I could
possibly learn (much less create a transliterator for) in the entire rest of my
lifetime. So I put a cap on how intelligent Unidecode could be, by insisting
that it support only context-insensitive transliteration. That means missing
the finer details of any given writing system, while still hopefully being
useful.

Unidecode, in other words, is quick and dirty. Sometimes the output is not so
dirty at all: Russian and Greek seem to work passably; and while Thaana
(Divehi, AKA Maldivian) is a definitely non-Western writing system, setting up
a mapping from it to Roman letters seems to work pretty well. But sometimes the
output is very dirty: Unidecode does quite badly on Japanese and Thai.

If you want a smarter transliteration for a particular language than Unidecode
provides, then you should look for (or write) a transliteration algorithm
specific to that language, and apply it instead of (or at least before)
applying Unidecode.

In other words, Unidecode's approach is broad (knowing about dozens of writing
systems), but shallow (not being meticulous about any of them).

Author:
-------
Sean M. Burke sburke@cpan.org


%prep
%setup -n %{real_name}-%{version}

%build
perl Makefile.PL
make %{?jobs:-j%jobs}

%check
make test

%install
%perl_make_install
%perl_process_packlist

%clean
rm -rf %{buildroot}

%files
%defattr(-, root, root)
%doc ChangeLog README MANIFEST TODO.txt
%doc %{_mandir}/man?/*
%dir %{perl_vendorarch}/auto/Text
%dir %{perl_vendorarch}/auto/Text/Unidecode
%dir %{perl_vendorlib}/Text
%dir %{perl_vendorlib}/Text/Unidecode
%{perl_vendorarch}/auto/Text/Unidecode/.packlist
%{perl_vendorlib}/Text/Unidecode/*.pm
%{perl_vendorlib}/Text/Unidecode.pm
/var/adm/perl-modules/%{name}

%changelog