diff --git a/perl-Code-DRY.changes b/perl-Code-DRY.changes index c4d0808..eb63340 100644 --- a/perl-Code-DRY.changes +++ b/perl-Code-DRY.changes @@ -1,3 +1,8 @@ +------------------------------------------------------------------- +Mon Feb 19 11:33:42 UTC 2018 - jengelh@inai.de + +- Compact description. + ------------------------------------------------------------------- Wed Jan 25 11:37:54 UTC 2017 - okurz@suse.com diff --git a/perl-Code-DRY.spec b/perl-Code-DRY.spec index 3c62d55..73fd019 100644 --- a/perl-Code-DRY.spec +++ b/perl-Code-DRY.spec @@ -1,7 +1,7 @@ # # spec file for package perl-Code-DRY # -# Copyright (c) 2016 SUSE LINUX GmbH, Nuernberg, Germany. +# Copyright (c) 2018 SUSE LINUX GmbH, Nuernberg, Germany. # # All modifications and additions to the file contributed by third parties # remain the property of their copyright owners, unless otherwise agreed @@ -31,101 +31,20 @@ BuildRequires: perl-macros %{perl_requires} %description -The module's main purpose is to report repeated text fragments (typically -Perl code) that could be considered for isolation and/or abstraction in -order to reduce multiple copies of the same code (aka cut and paste code). +The module reports repeated text fragments (typically Perl code) that +could be considered for isolation and/or abstraction in order to +reduce multiple copies of the same code (a.k.a. cut and paste code). Code duplicates may occur in the same line, file or directory. -The ad hoc approach to compare every item against every other item leads to -computing times growing exponentially with the amount of code, which is not -useful for anything but the smallest code bases. - -So a efficient data structure is needed. - -This module can create the suffix array and the longest common prefix array -for a string of 8-bit characters. These data structures can be used to -search for repetitions of substrings in O(n) time. - -The current strategy is to concatenate code from all files into one string -and then use the suffix array and its companion, the longest-common-prefix -(lcp) array on this string. - -Example: - Instead of real Perl code I use the string 'mississippi' for - simplicity. A *suffix* is a partial string of an input string, which - ends at the end of the input string. A *prefix* is a partial string of - an input string, which starts at the start of the input string. The - *suffix array* of a string is a list of offsets (each one for a - suffix), which is sorted lexicographically by suffix: - - # offset suffix - ================ - 0 10: i - 1 7: ippi - 2 4: issippi - 3 1: ississippi - 4 0: mississippi - 5 9: pi - 6 8: ppi - 7 6: sippi - 8 3: sissippi - 9 5: ssippi - 10 2: ssissippi - - The other structure needed is the *longest common prefix array* (lcp). - It contains the maximal length of the prefixes for this entry shared - with the previous entry from the suffix array. For this example it - looks like this: - - # offset lcp (common prefixes shown in ()) - ===================== - 0 10: 0 () - 1 7: 1 (i) - 2 4: 1 (i) - 3 1: 4 (issi) overlap! - 3 3 (iss) corrected non overlapping prefixes - 4 0: 0 () - 5 9: 0 () - 6 8: 1 (p) - 7 6: 0 () - 8 3: 2 (si) - 9 5: 1 (s) - 10 2: 3 (ssi) - - The standard lcp array may contain overlapping prefixes, but for our - purposes we need only non overlapping prefixes lengths. The same - overlap may occur for prefixes that extend from the end of one source - file to the start of the next file when we use concatenated content of - source files. The limiting with respect to internal overlaps and file - crossing prefix lengths is done by two respective functions afterwards. - - If we sort the so obtained lcp values in descending order we get - - # offset lcp (prefix shown in ()) - =================================== - 3 1: 3 (iss) now corrected to non overlapping prefixes - 10 2: 3 (ssi) - 8 3: 2 (si) - 1 7: 1 (i) - 2 4: 1 (i) - 6 8: 1 (p) - 9 5: 1 (s) - 0 10: 0 () - 4 0: 0 () - 5 9: 0 () - 7 6: 0 () - - The first entry shows the longest repetition in the given string. Not - all entries are of interest since smaller copies are contained in the - longest match. After removing all 'shadowed' repetitions, the next - entry can be reported. Finally the lcp values are too small to be of - any interest. - - Currently this is experimental code. - - The most appropriate mailing list on which to discuss this module would - be perl-qa. See http://lists.perl.org/list/perl-qa.html. +Rather than the exponential-time ad-hoc approach to compare every +item against every other item, this module can create the suffix +array and the longest common prefix array for a string of 8-bit +characters. These data structures can be used to search for +repetitions of substrings in O(n) time. The current strategy is to +concatenate code from all files into one string and then use the +suffix array and its companion, the longest-common-prefix (lcp) array +on this string. %prep %setup -q -n %{cpan_name}-%{version}