perl-Code-DRY/perl-Code-DRY.spec

#
# spec file for package perl-Code-DRY
#
# Copyright (c) 2016 SUSE LINUX GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via http://bugs.opensuse.org/
#


Name:           perl-Code-DRY
Version:        0.03
Release:        0
%define cpan_name Code-DRY
Summary:        Cut-and-Paste-Detector for Perl code
License:        GPL-1.0+ or Artistic-1.0
Group:          Development/Libraries/Perl
Url:            http://search.cpan.org/dist/Code-DRY/
Source0:        http://www.cpan.org/authors/id/H/HE/HEXCODER/%{cpan_name}-%{version}.tar.gz
BuildRoot:      %{_tmppath}/%{name}-%{version}-build
BuildRequires:  perl
BuildRequires:  perl-macros
%{perl_requires}

%description
The module's main purpose is to report repeated text fragments (typically
Perl code) that could be considered for isolation and/or abstraction in
order to reduce multiple copies of the same code (aka cut and paste code).

Code duplicates may occur in the same line, file or directory.

The ad hoc approach to compare every item against every other item leads to
computing times growing exponentially with the amount of code, which is not
useful for anything but the smallest code bases.

So a efficient data structure is needed.

This module can create the suffix array and the longest common prefix array
for a string of 8-bit characters. These data structures can be used to
search for repetitions of substrings in O(n) time.

The current strategy is to concatenate code from all files into one string
and then use the suffix array and its companion, the longest-common-prefix
(lcp) array on this string.

Example:
    Instead of real Perl code I use the string 'mississippi' for
    simplicity. A *suffix* is a partial string of an input string, which
    ends at the end of the input string. A *prefix* is a partial string of
    an input string, which starts at the start of the input string. The
    *suffix array* of a string is a list of offsets (each one for a
    suffix), which is sorted lexicographically by suffix:

        #  offset suffix
        ================
        0  10:    i
        1   7:    ippi
        2   4:    issippi
        3   1:    ississippi
        4   0:    mississippi
        5   9:    pi
        6   8:    ppi
        7   6:    sippi
        8   3:    sissippi
        9   5:    ssippi
        10  2:    ssissippi

    The other structure needed is the *longest common prefix array* (lcp).
    It contains the maximal length of the prefixes for this entry shared
    with the previous entry from the suffix array. For this example it
    looks like this:

        #  offset lcp  (common prefixes shown in ())
        =====================
        0  10:    0    ()
        1   7:    1    (i)
        2   4:    1    (i)
        3   1:    4    (issi) overlap!
        3         3    (iss)  corrected non overlapping prefixes
        4   0:    0    ()
        5   9:    0    ()
        6   8:    1    (p)
        7   6:    0    ()
        8   3:    2    (si)
        9   5:    1    (s)
        10  2:    3    (ssi)

    The standard lcp array may contain overlapping prefixes, but for our
    purposes we need only non overlapping prefixes lengths. The same
    overlap may occur for prefixes that extend from the end of one source
    file to the start of the next file when we use concatenated content of
    source files. The limiting with respect to internal overlaps and file
    crossing prefix lengths is done by two respective functions afterwards.

    If we sort the so obtained lcp values in descending order we get

        #  offset lcp  (prefix shown in ())
        ===================================
        3   1:    3    (iss) now corrected to non overlapping prefixes
        10  2:    3    (ssi)
        8   3:    2    (si)
        1   7:    1    (i)
        2   4:    1    (i)
        6   8:    1    (p)
        9   5:    1    (s)
        0  10:    0    ()
        4   0:    0    ()
        5   9:    0    ()
        7   6:    0    ()

    The first entry shows the longest repetition in the given string. Not
    all entries are of interest since smaller copies are contained in the
    longest match. After removing all 'shadowed' repetitions, the next
    entry can be reported. Finally the lcp values are too small to be of
    any interest.

    Currently this is experimental code.

    The most appropriate mailing list on which to discuss this module would
    be perl-qa. See http://lists.perl.org/list/perl-qa.html.

%prep
%setup -q -n %{cpan_name}-%{version}
find . -type f ! -name \*.pl -print0 | xargs -0 chmod 644

%build
%{__perl} Makefile.PL INSTALLDIRS=vendor OPTIMIZE="%{optflags}"
%{__make} %{?_smp_mflags}

%check
%{__make} test

%install
%perl_make_install
%perl_process_packlist
%perl_gen_filelist

%files -f %{name}.files
%defattr(-,root,root,755)
%doc Changes README

%changelog
Accepting request 452390 from home:okurz:branches:devel:languages:perl:CPAN-C Initial submission of perl(Code::DRY) OBS-URL: https://build.opensuse.org/request/show/452390 OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-Code-DRY?expand=0&rev=1 2017-01-25 11:46:52 +00:00			`#`
			`# spec file for package perl-Code-DRY`
			`#`
			`# Copyright (c) 2016 SUSE LINUX GmbH, Nuernberg, Germany.`
			`#`
			`# All modifications and additions to the file contributed by third parties`
			`# remain the property of their copyright owners, unless otherwise agreed`
			`# upon. The license for this file, and modifications and additions to the`
			`# file, is the same license as for the pristine package itself (unless the`
			`# license for the pristine package is not an Open Source License, in which`
			`# case the license is the MIT License). An "Open Source License" is a`
			`# license that conforms to the Open Source Definition (Version 1.9)`
			`# published by the Open Source Initiative.`

			`# Please submit bugfixes or comments via http://bugs.opensuse.org/`
			`#`


			`Name: perl-Code-DRY`
			`Version: 0.03`
			`Release: 0`
			`%define cpan_name Code-DRY`
			`Summary: Cut-and-Paste-Detector for Perl code`
			`License: GPL-1.0+ or Artistic-1.0`
			`Group: Development/Libraries/Perl`
			`Url: http://search.cpan.org/dist/Code-DRY/`
			`Source0: http://www.cpan.org/authors/id/H/HE/HEXCODER/%{cpan_name}-%{version}.tar.gz`
			`BuildRoot: %{_tmppath}/%{name}-%{version}-build`
			`BuildRequires: perl`
			`BuildRequires: perl-macros`
			`%{perl_requires}`

			`%description`
			`The module's main purpose is to report repeated text fragments (typically`
			`Perl code) that could be considered for isolation and/or abstraction in`
			`order to reduce multiple copies of the same code (aka cut and paste code).`

			`Code duplicates may occur in the same line, file or directory.`

			`The ad hoc approach to compare every item against every other item leads to`
			`computing times growing exponentially with the amount of code, which is not`
			`useful for anything but the smallest code bases.`

			`So a efficient data structure is needed.`

			`This module can create the suffix array and the longest common prefix array`
			`for a string of 8-bit characters. These data structures can be used to`
			`search for repetitions of substrings in O(n) time.`

			`The current strategy is to concatenate code from all files into one string`
			`and then use the suffix array and its companion, the longest-common-prefix`
			`(lcp) array on this string.`

			`Example:`
			`Instead of real Perl code I use the string 'mississippi' for`
			`simplicity. A suffix is a partial string of an input string, which`
			`ends at the end of the input string. A prefix is a partial string of`
			`an input string, which starts at the start of the input string. The`
			`suffix array of a string is a list of offsets (each one for a`
			`suffix), which is sorted lexicographically by suffix:`

			`# offset suffix`
			`================`
			`0 10: i`
			`1 7: ippi`
			`2 4: issippi`
			`3 1: ississippi`
			`4 0: mississippi`
			`5 9: pi`
			`6 8: ppi`
			`7 6: sippi`
			`8 3: sissippi`
			`9 5: ssippi`
			`10 2: ssissippi`

			`The other structure needed is the longest common prefix array (lcp).`
			`It contains the maximal length of the prefixes for this entry shared`
			`with the previous entry from the suffix array. For this example it`
			`looks like this:`

			`# offset lcp (common prefixes shown in ())`
			`=====================`
			`0 10: 0 ()`
			`1 7: 1 (i)`
			`2 4: 1 (i)`
			`3 1: 4 (issi) overlap!`
			`3 3 (iss) corrected non overlapping prefixes`
			`4 0: 0 ()`
			`5 9: 0 ()`
			`6 8: 1 (p)`
			`7 6: 0 ()`
			`8 3: 2 (si)`
			`9 5: 1 (s)`
			`10 2: 3 (ssi)`

			`The standard lcp array may contain overlapping prefixes, but for our`
			`purposes we need only non overlapping prefixes lengths. The same`
			`overlap may occur for prefixes that extend from the end of one source`
			`file to the start of the next file when we use concatenated content of`
			`source files. The limiting with respect to internal overlaps and file`
			`crossing prefix lengths is done by two respective functions afterwards.`

			`If we sort the so obtained lcp values in descending order we get`

			`# offset lcp (prefix shown in ())`
			`===================================`
			`3 1: 3 (iss) now corrected to non overlapping prefixes`
			`10 2: 3 (ssi)`
			`8 3: 2 (si)`
			`1 7: 1 (i)`
			`2 4: 1 (i)`
			`6 8: 1 (p)`
			`9 5: 1 (s)`
			`0 10: 0 ()`
			`4 0: 0 ()`
			`5 9: 0 ()`
			`7 6: 0 ()`

			`The first entry shows the longest repetition in the given string. Not`
			`all entries are of interest since smaller copies are contained in the`
			`longest match. After removing all 'shadowed' repetitions, the next`
			`entry can be reported. Finally the lcp values are too small to be of`
			`any interest.`

			`Currently this is experimental code.`

			`The most appropriate mailing list on which to discuss this module would`
			`be perl-qa. See http://lists.perl.org/list/perl-qa.html.`

			`%prep`
			`%setup -q -n %{cpan_name}-%{version}`
			`find . -type f ! -name \*.pl -print0 \| xargs -0 chmod 644`

			`%build`
			`%{__perl} Makefile.PL INSTALLDIRS=vendor OPTIMIZE="%{optflags}"`
			`%{__make} %{?_smp_mflags}`

			`%check`
			`%{__make} test`

			`%install`
			`%perl_make_install`
			`%perl_process_packlist`
			`%perl_gen_filelist`

			`%files -f %{name}.files`
			`%defattr(-,root,root,755)`
			`%doc Changes README`

			`%changelog`