forked from pool/perl-Code-DRY
Accepting request 578020 from home:jengelh:branches:devel:languages:perl
- Compact description. OBS-URL: https://build.opensuse.org/request/show/578020 OBS-URL: https://build.opensuse.org/package/show/devel:languages:perl/perl-Code-DRY?expand=0&rev=2
This commit is contained in:
@@ -1,3 +1,8 @@
|
|||||||
|
-------------------------------------------------------------------
|
||||||
|
Mon Feb 19 11:33:42 UTC 2018 - jengelh@inai.de
|
||||||
|
|
||||||
|
- Compact description.
|
||||||
|
|
||||||
-------------------------------------------------------------------
|
-------------------------------------------------------------------
|
||||||
Wed Jan 25 11:37:54 UTC 2017 - okurz@suse.com
|
Wed Jan 25 11:37:54 UTC 2017 - okurz@suse.com
|
||||||
|
|
||||||
|
@@ -1,7 +1,7 @@
|
|||||||
#
|
#
|
||||||
# spec file for package perl-Code-DRY
|
# spec file for package perl-Code-DRY
|
||||||
#
|
#
|
||||||
# Copyright (c) 2016 SUSE LINUX GmbH, Nuernberg, Germany.
|
# Copyright (c) 2018 SUSE LINUX GmbH, Nuernberg, Germany.
|
||||||
#
|
#
|
||||||
# All modifications and additions to the file contributed by third parties
|
# All modifications and additions to the file contributed by third parties
|
||||||
# remain the property of their copyright owners, unless otherwise agreed
|
# remain the property of their copyright owners, unless otherwise agreed
|
||||||
@@ -31,101 +31,20 @@ BuildRequires: perl-macros
|
|||||||
%{perl_requires}
|
%{perl_requires}
|
||||||
|
|
||||||
%description
|
%description
|
||||||
The module's main purpose is to report repeated text fragments (typically
|
The module reports repeated text fragments (typically Perl code) that
|
||||||
Perl code) that could be considered for isolation and/or abstraction in
|
could be considered for isolation and/or abstraction in order to
|
||||||
order to reduce multiple copies of the same code (aka cut and paste code).
|
reduce multiple copies of the same code (a.k.a. cut and paste code).
|
||||||
|
|
||||||
Code duplicates may occur in the same line, file or directory.
|
Code duplicates may occur in the same line, file or directory.
|
||||||
|
|
||||||
The ad hoc approach to compare every item against every other item leads to
|
Rather than the exponential-time ad-hoc approach to compare every
|
||||||
computing times growing exponentially with the amount of code, which is not
|
item against every other item, this module can create the suffix
|
||||||
useful for anything but the smallest code bases.
|
array and the longest common prefix array for a string of 8-bit
|
||||||
|
characters. These data structures can be used to search for
|
||||||
So a efficient data structure is needed.
|
repetitions of substrings in O(n) time. The current strategy is to
|
||||||
|
concatenate code from all files into one string and then use the
|
||||||
This module can create the suffix array and the longest common prefix array
|
suffix array and its companion, the longest-common-prefix (lcp) array
|
||||||
for a string of 8-bit characters. These data structures can be used to
|
on this string.
|
||||||
search for repetitions of substrings in O(n) time.
|
|
||||||
|
|
||||||
The current strategy is to concatenate code from all files into one string
|
|
||||||
and then use the suffix array and its companion, the longest-common-prefix
|
|
||||||
(lcp) array on this string.
|
|
||||||
|
|
||||||
Example:
|
|
||||||
Instead of real Perl code I use the string 'mississippi' for
|
|
||||||
simplicity. A *suffix* is a partial string of an input string, which
|
|
||||||
ends at the end of the input string. A *prefix* is a partial string of
|
|
||||||
an input string, which starts at the start of the input string. The
|
|
||||||
*suffix array* of a string is a list of offsets (each one for a
|
|
||||||
suffix), which is sorted lexicographically by suffix:
|
|
||||||
|
|
||||||
# offset suffix
|
|
||||||
================
|
|
||||||
0 10: i
|
|
||||||
1 7: ippi
|
|
||||||
2 4: issippi
|
|
||||||
3 1: ississippi
|
|
||||||
4 0: mississippi
|
|
||||||
5 9: pi
|
|
||||||
6 8: ppi
|
|
||||||
7 6: sippi
|
|
||||||
8 3: sissippi
|
|
||||||
9 5: ssippi
|
|
||||||
10 2: ssissippi
|
|
||||||
|
|
||||||
The other structure needed is the *longest common prefix array* (lcp).
|
|
||||||
It contains the maximal length of the prefixes for this entry shared
|
|
||||||
with the previous entry from the suffix array. For this example it
|
|
||||||
looks like this:
|
|
||||||
|
|
||||||
# offset lcp (common prefixes shown in ())
|
|
||||||
=====================
|
|
||||||
0 10: 0 ()
|
|
||||||
1 7: 1 (i)
|
|
||||||
2 4: 1 (i)
|
|
||||||
3 1: 4 (issi) overlap!
|
|
||||||
3 3 (iss) corrected non overlapping prefixes
|
|
||||||
4 0: 0 ()
|
|
||||||
5 9: 0 ()
|
|
||||||
6 8: 1 (p)
|
|
||||||
7 6: 0 ()
|
|
||||||
8 3: 2 (si)
|
|
||||||
9 5: 1 (s)
|
|
||||||
10 2: 3 (ssi)
|
|
||||||
|
|
||||||
The standard lcp array may contain overlapping prefixes, but for our
|
|
||||||
purposes we need only non overlapping prefixes lengths. The same
|
|
||||||
overlap may occur for prefixes that extend from the end of one source
|
|
||||||
file to the start of the next file when we use concatenated content of
|
|
||||||
source files. The limiting with respect to internal overlaps and file
|
|
||||||
crossing prefix lengths is done by two respective functions afterwards.
|
|
||||||
|
|
||||||
If we sort the so obtained lcp values in descending order we get
|
|
||||||
|
|
||||||
# offset lcp (prefix shown in ())
|
|
||||||
===================================
|
|
||||||
3 1: 3 (iss) now corrected to non overlapping prefixes
|
|
||||||
10 2: 3 (ssi)
|
|
||||||
8 3: 2 (si)
|
|
||||||
1 7: 1 (i)
|
|
||||||
2 4: 1 (i)
|
|
||||||
6 8: 1 (p)
|
|
||||||
9 5: 1 (s)
|
|
||||||
0 10: 0 ()
|
|
||||||
4 0: 0 ()
|
|
||||||
5 9: 0 ()
|
|
||||||
7 6: 0 ()
|
|
||||||
|
|
||||||
The first entry shows the longest repetition in the given string. Not
|
|
||||||
all entries are of interest since smaller copies are contained in the
|
|
||||||
longest match. After removing all 'shadowed' repetitions, the next
|
|
||||||
entry can be reported. Finally the lcp values are too small to be of
|
|
||||||
any interest.
|
|
||||||
|
|
||||||
Currently this is experimental code.
|
|
||||||
|
|
||||||
The most appropriate mailing list on which to discuss this module would
|
|
||||||
be perl-qa. See http://lists.perl.org/list/perl-qa.html.
|
|
||||||
|
|
||||||
%prep
|
%prep
|
||||||
%setup -q -n %{cpan_name}-%{version}
|
%setup -q -n %{cpan_name}-%{version}
|
||||||
|
Reference in New Issue
Block a user