- updated to 2.11

OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-HTML-TableExtract?expand=0&rev=24
2011-12-20 13:40:27 +00:00
parent 3d4e564a24
commit 80be1dc906
2 changed files with 13 additions and 43 deletions
--- a/perl-HTML-TableExtract.spec
+++ b/perl-HTML-TableExtract.spec
@@ -20,18 +20,21 @@ Name:           perl-HTML-TableExtract
 Version:        2.11
 Release:        0
 %define cpan_name HTML-TableExtract
-Summary:        For extracting the content contained in tables within an HTML document
+Summary:        Perl module for extracting the content contained in tables within an HTM[cut]
 License:        GPL-1.0+ or Artistic-1.0
 Group:          Development/Libraries/Perl
 Url:            http://search.cpan.org/dist/HTML-TableExtract/
-Source:         http://www.cpan.org/authors/id/M/MS/MSISK/HTML-TableExtract-%{version}.tar.gz
-Patch0:         %{cpan_name}-2.10-HTML.patch
+Source:         http://www.cpan.org/authors/id/M/MS/MSISK/%{cpan_name}-%{version}.tar.gz
+Patch0:         HTML-TableExtract-2.10-HTML.patch
 BuildArch:      noarch
 BuildRoot:      %{_tmppath}/%{name}-%{version}-build
 BuildRequires:  perl
 BuildRequires:  perl-macros
 BuildRequires:  perl(HTML::ElementTable) >= 1.16
 BuildRequires:  perl(HTML::Parser)
+#BuildRequires: perl(HTML::Entities)
+#BuildRequires: perl(HTML::TableExtract)
+#BuildRequires: perl(testload)
 Requires:       perl(HTML::ElementTable) >= 1.16
 Requires:       perl(HTML::Parser)
 %{perl_requires}
@@ -94,45 +97,10 @@ When extracting only text from tables, the text is decoded with
 HTML::Entities by default; this can be disabled by setting the _decode_
 parameter to 0.

-Extraction Modes
-    The default mode of extraction for HTML::TableExtract is raw text or
-    HTML. In this mode, embedded tables are completely decoupled from one
-    another. In this case, HTML::TableExtract is a subclass of
-    HTML::Parser:
-
-      use HTML::TableExtract;
-
-    Alternativevly, tables can be extracted as HTML::ElementTable
-    structures, which are in turn embedded in an HTML::Element tree
-    representing the entire HTML document. Embedded tables are not
-    decoupled from one another since this tree structure must be
-    manitained. In this case, HTML::TableExtract is a subclass of
-    HTML::TreeBuilder (itself a subclass of HTML:::Parser):
-
-      use HTML::TableExtract qw(tree);
-
-    In either case, the basic interface for HTML::TableExtract and the
-    resulting table objects remains the same -- all that changes is what
-    you can do with the resulting data.
-
-    HTML::TableExtract is a subclass of HTML::Parser, and as such inherits
-    all of its basic methods such as 'parse()' and 'parse_file()'. During
-    scans, 'start()', 'end()', and 'text()' are utilized. Feel free to
-    override them, but if you do not eventually invoke them in the SUPER
-    class with some content, results are not guaranteed.
-
-Advice
-    The main point of this module was to provide a flexible method of
-    extracting tabular information from HTML documents without relying to
-    heavily on the document layout. For that reason, I suggest using
-    _Headers_ whenever possible -- that way, you are anchoring your
-    extraction on what the document is trying to communicate rather than
-    some feature of the HTML comprising the document (other than the fact
-    that the data is contained in a table).
-
 %prep
 %setup -q -n %{cpan_name}-%{version}
 %patch0 -p1
+find . -type f -print0 | xargs -0 chmod 644

 %build
 %{__perl} Makefile.PL INSTALLDIRS=vendor
@@ -146,11 +114,8 @@ Advice
 %perl_process_packlist
 %perl_gen_filelist

-%clean
-%{__rm} -rf %{buildroot}
-
 %files -f %{name}.files
-%defattr(644,root,root,755)
+%defattr(-,root,root,755)
 %doc Changes README

 %changelog