#
# spec file for package perl-Code-DRY
#
# Copyright (c) 2016 SUSE LINUX GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via http://bugs.opensuse.org/
#


Name:           perl-Code-DRY
Version:        0.03
Release:        0
%define cpan_name Code-DRY
Summary:        Cut-and-Paste-Detector for Perl code
License:        GPL-1.0+ or Artistic-1.0
Group:          Development/Libraries/Perl
Url:            http://search.cpan.org/dist/Code-DRY/
Source0:        http://www.cpan.org/authors/id/H/HE/HEXCODER/%{cpan_name}-%{version}.tar.gz
BuildRoot:      %{_tmppath}/%{name}-%{version}-build
BuildRequires:  perl
BuildRequires:  perl-macros
%{perl_requires}

%description
The module's main purpose is to report repeated text fragments (typically
Perl code) that could be considered for isolation and/or abstraction in
order to reduce multiple copies of the same code (aka cut and paste code).

Code duplicates may occur in the same line, file or directory.

The ad hoc approach to compare every item against every other item leads to
computing times growing at least quadratically with the amount of code,
which is not useful for anything but the smallest code bases.

So an efficient data structure is needed.

This module can create the suffix array and the longest common prefix array
for a string of 8-bit characters. These data structures can be used to
search for repetitions of substrings in O(n) time.

The current strategy is to concatenate code from all files into one string
and then use the suffix array and its companion, the longest-common-prefix
(lcp) array on this string.

Example:
Instead of real Perl code I use the string 'mississippi' for
simplicity. A *suffix* is a partial string of an input string, which
ends at the end of the input string. A *prefix* is a partial string of
an input string, which starts at the start of the input string. The
*suffix array* of a string is a list of offsets (each one for a
suffix), which is sorted lexicographically by suffix:

     #  offset  suffix
    ================
     0    10:   i
     1     7:   ippi
     2     4:   issippi
     3     1:   ississippi
     4     0:   mississippi
     5     9:   pi
     6     8:   ppi
     7     6:   sippi
     8     3:   sissippi
     9     5:   ssippi
    10     2:   ssissippi
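
The table above can be reproduced with a minimal Python sketch. This is not the module's own implementation (which is not shown here); the function name `suffix_array` is hypothetical, and the naive sort is chosen for readability, not for speed.

```python
def suffix_array(text):
    """Return the suffix offsets of text, sorted lexicographically by suffix.

    Naive construction for illustration only: sorting whole slices is much
    slower than the specialised algorithms a real implementation would use.
    """
    return sorted(range(len(text)), key=lambda offset: text[offset:])


if __name__ == "__main__":
    text = "mississippi"
    # Print rank, offset and suffix, in the same column order as the table.
    for rank, offset in enumerate(suffix_array(text)):
        print(rank, offset, text[offset:])
```

Each printed row corresponds to one row of the table: the rank in sorted order, the offset into the string, and the suffix starting at that offset.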

The other structure needed is the *longest common prefix array* (lcp).
For each entry, it contains the length of the longest prefix shared
with the previous entry of the suffix array. For this example it
looks like this:

     #  offset  lcp  (common prefixes shown in ())
    =====================
     0    10:   0    ()
     1     7:   1    (i)
     2     4:   1    (i)
     3     1:   4    (issi) overlap!
     3          3    (iss)  corrected non-overlapping prefixes
     4     0:   0    ()
     5     9:   0    ()
     6     8:   1    (p)
     7     6:   0    ()
     8     3:   2    (si)
     9     5:   1    (s)
    10     2:   3    (ssi)

The standard lcp array may contain overlapping prefixes, but for our
purposes we need only non-overlapping prefix lengths. The same kind of
overlap may occur for prefixes that extend from the end of one source
file to the start of the next file when we use the concatenated content
of source files. Prefix lengths are limited afterwards, with respect to
internal overlaps and to file crossing, by two separate functions.
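
A minimal Python sketch of the lcp array and of the overlap correction follows. It assumes that an overlap is removed by limiting each lcp value to the distance between the two suffix offsets being compared; this reproduces the corrected value 3 in the table, but it is only an assumption about how the module's own function works. The names `suffix_array`, `lcp_array` and `clip_overlaps` are hypothetical, and the limit for prefixes that cross file boundaries is not covered.

```python
def suffix_array(text):
    """Suffix offsets sorted lexicographically by suffix (naive version)."""
    return sorted(range(len(text)), key=lambda offset: text[offset:])


def lcp_array(text, sa):
    """lcp[i] is the length of the longest common prefix of the suffixes
    at sa[i - 1] and sa[i]; lcp[0] is 0 because there is no previous entry."""
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b = text[sa[i - 1]:], text[sa[i]:]
        length = 0
        while length < min(len(a), len(b)) and a[length] == b[length]:
            length += 1
        lcp[i] = length
    return lcp


def clip_overlaps(sa, lcp):
    """Assumed correction: two copies starting d characters apart can share
    at most d characters without overlapping, so limit lcp[i] to d."""
    clipped = [0] * len(sa)
    for i in range(1, len(sa)):
        clipped[i] = min(lcp[i], abs(sa[i] - sa[i - 1]))
    return clipped


if __name__ == "__main__":
    text = "mississippi"
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    print(lcp)
    print(clip_overlaps(sa, lcp))
```

For 'mississippi' only the entry for offset 1 changes: 'issi' at offsets 1 and 4 overlaps by one character, so its length is reduced from 4 to 3.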

If we sort the lcp values obtained this way in descending order, we get:

     #  offset  lcp  (prefix shown in ())
    ===================================
     3     1:   3    (iss) now corrected to non-overlapping prefixes
    10     2:   3    (ssi)
     8     3:   2    (si)
     1     7:   1    (i)
     2     4:   1    (i)
     6     8:   1    (p)
     9     5:   1    (s)
     0    10:   0    ()
     4     0:   0    ()
     5     9:   0    ()
     7     6:   0    ()

The first entry shows the longest repetition in the given string. Not
all entries are of interest since smaller copies are contained in the
longest match. After removing all 'shadowed' repetitions, the next
entry can be reported. Eventually the lcp values become too small to be
of any interest.
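
The ordering in the last table can be sketched in Python as a stable sort by descending lcp value. The suffix array and the corrected lcp values are copied from the tables above; the function name `rank_by_lcp` is hypothetical, and the removal of 'shadowed' repetitions is not attempted here because the text does not spell it out.

```python
# Values for 'mississippi', copied from the tables above.
SA = [10, 7, 4, 1, 0, 9, 8, 6, 3, 5, 2]
LCP = [0, 1, 1, 3, 0, 0, 1, 0, 2, 1, 3]  # already limited to non-overlapping lengths


def rank_by_lcp(sa, lcp):
    """Return (rank, offset, lcp) triples, longest repetition first.

    sorted() is stable, also with reverse=True, so entries with equal
    lcp values keep their suffix array order.
    """
    order = sorted(range(len(sa)), key=lambda rank: lcp[rank], reverse=True)
    return [(rank, sa[rank], lcp[rank]) for rank in order]


if __name__ == "__main__":
    for rank, offset, length in rank_by_lcp(SA, LCP):
        print(rank, offset, length)
```

A reporting step would walk this list from the top and stop once the lcp values fall below whatever minimum length is considered interesting.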

Currently this is experimental code.

The most appropriate mailing list on which to discuss this module would
be perl-qa. See http://lists.perl.org/list/perl-qa.html.

%prep
%setup -q -n %{cpan_name}-%{version}
find . -type f ! -name \*.pl -print0 | xargs -0 chmod 644

%build
%{__perl} Makefile.PL INSTALLDIRS=vendor OPTIMIZE="%{optflags}"
%{__make} %{?_smp_mflags}

%check
%{__make} test

%install
%perl_make_install
%perl_process_packlist
%perl_gen_filelist

%files -f %{name}.files
%defattr(-,root,root,755)
%doc Changes README

%changelog