- Update to version 5.4.0
Features:
* The new clean-whitespace verb resolves #190 from @aborruso.
Along with the new functions strip, lstrip, rstrip,
collapse_whitespace, and clean_whitespace, there is now both
coarse-grained and fine-grained control over whitespace
within field names and/or values. See the linked-to
documentation for examples.
* The new altkv verb resolves #184 which was originally opened
via an email request. This supports mapping value-lists such
as a,b,c,d to alternating key-value pairs such as a=b,c=d.
* The new fill-down verb resolves #189 by @aborruso. See the
linked-to documentation for examples.
* The uniq verb now has a -a option, which resolves #168 from
@sjackman.
* The new regextract and regextract_or_else functions resolve
#183 by @aborruso.
* The new ssub function arises from #171 by @dohse, as a
simplified way to avoid escaping characters which are special
to regular-expression parsers.
* There are new localtime functions in response to #170 by
@sitaramc: localdate, localtime2sec, sec2localdate,
sec2localtime, strftime_local, and strptime_local. Note,
however, that as discussed on #170, these do not undo one
another in all circumstances. This is a non-issue for timezones
which do not observe DST; otherwise, please use them with
caution.
* Travis builds at
https://travis-ci.org/johnkerl/miller/builds now run on OSX as
well as Linux.
* An Ubuntu 17 build issue was fixed by @singalen on #164.
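As a rough illustration of the altkv mapping described above (a,b,c,d
becoming a=b,c=d), here is a minimal Python sketch. It is not Miller's
implementation; the function name altkv is borrowed from the verb for
readability only, and rejecting odd-length input is an assumption made
here, since the entry above does not say how that case is handled.

```python
def altkv(values):
    """Pair up a flat list as alternating keys and values."""
    if len(values) % 2 != 0:
        # Assumption for this sketch only: odd-length input is rejected.
        # The changelog entry does not describe the verb's behavior here.
        raise ValueError("expected an even number of values")
    # Even positions become keys, odd positions become their values.
    return dict(zip(values[0::2], values[1::2]))


record = altkv("a,b,c,d".split(","))
print(",".join(f"{k}={v}" for k, v in record.items()))
```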
Documentation:
* The put/filter documentation was clarified after being
reported as confusing by @NikosAlexandris on #169.
* The new FAQ entry
http://johnkerl.org/miller-releases/miller-head/doc/faq.html#How_to_rectangularize_after_joins_with_unpaired?
resolves #193 by @aborruso.
* The new cookbook entry
http://johnkerl.org/miller/doc/cookbook.html#Options_for_dealing_with_duplicate_rows
arises from #168 from @sjackman.
* The unsparsify documentation had some words missing as
reported by @tst2005 on #194.
* There was a typo in the cookbook page
http://johnkerl.org/miller/doc/cookbook.html#Full_field_renames_and_reassigns
as fixed by @tst2005 in #192.
Bugfixes:
* There was a memory leak for TSV-format files only as
reported by @treynr on #181.
* Dollar signs in regular expressions were not being escaped
properly as reported by @dohse on #171.
OBS-URL: https://build.opensuse.org/request/show/642014
OBS-URL: https://build.opensuse.org/package/show/utilities/miller?expand=0&rev=13
- Updated license
- Update to 5.2.2
* This bugfix release delivers a fix for #147 where a memory
allocation failed beyond 4GB.
- Update to version 5.2.1
* Fixes (gh#johnkerl/miller#142) build segfault on non-x86
architectures
- Update to version 5.2.0
This release contains mostly feature requests.
Features:
* The stats1 verb now lets you use regular expressions to
specify which field names to compute statistics on, and/or which
to group by. Full details are in the stats1 documentation.
* The min and max DSL functions, and the min/max/percentile
aggregators for the stats1 and merge-fields verbs, now support
numeric as well as string field values. (For mixed string/numeric
fields, numbers compare before strings.) This means in particular
that order statistics -- min, max, and non-interpolated
percentiles -- as well as mode, antimode, and count are now
possible on string-only fields. (Of course, any operations
requiring arithmetic on values, such as computing sums, averages,
or interpolated percentiles, yield an error on string-valued
input.)
* There is a new DSL function mapexcept which returns a copy of
the argument with specified key(s), if any, unset. The motivating
use-case is to split records to multiple filenames depending on
a particular field value, which is omitted from the output:
mlr --from f.dat put 'tee > "/tmp/data-".$a, mapexcept($*, "a")'
Likewise, mapselect returns a copy of the argument with only
specified key(s), if any, set. This resolves #137.
* A new -u option for count-distinct allows unlashed counts for
multiple field names. For example, with -f a,b and without -u,
count-distinct computes counts for distinct pairs of a and b field
values. With -f a,b and with -u, it computes counts for distinct a
field values and counts for distinct b field values separately.
* If you build from source, you can now do ./configure without
first doing autoreconf -fiv. This resolves #131.
* The UTF-8 BOM sequence 0xef 0xbb 0xbf is now automatically
skipped at the start of CSV files. (The same is already done for
JSON files.) This resolves #138.
* For put and filter with -S, program literals such as the 6 in
$x = 6 were being parsed as strings. This was not sensible, since
the -S option for put and filter is intended to suppress numeric
conversion of record data, not program literals. Such literals
are now parsed as numbers; to get the string 6 one may use
$x = "6".
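The difference between lashed and unlashed counts for count-distinct,
described above, may be easier to see in a small Python sketch. This
only models the counting semantics as stated in the entry; the sample
records and the helper names count_lashed and count_unlashed are
hypothetical and are not part of Miller.

```python
from collections import Counter

# Hypothetical sample records with fields a and b.
records = [
    {"a": "x", "b": "1"},
    {"a": "x", "b": "2"},
    {"a": "y", "b": "1"},
    {"a": "x", "b": "1"},
]


def count_lashed(records, fields):
    """Without -u: count distinct combinations of the named fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def count_unlashed(records, fields):
    """With -u: count distinct values of each named field separately."""
    return {f: Counter(r[f] for r in records) for f in fields}


print(count_lashed(records, ["a", "b"]))
print(count_unlashed(records, ["a", "b"]))
```

In the lashed case the pair (x, 1) is counted as one unit; in the
unlashed case the a values and the b values are tallied independently.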
Documentation:
* A new cookbook example shows how to compute differences
between successive queries, e.g. to find out what changed in
time-varying data when you run and rerun a SQL query.
* Another new cookbook example shows how to compute
interquartile ranges.
* A third new cookbook example shows how to compute weighted
means.
Bugfixes:
* CRLF line-endings were not being correctly autodetected when
I/O formats were specified using --c2j et al.
* Integer division by zero was causing a fatal runtime
exception, rather than computing inf or nan as in the
floating-point case.
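To illustrate the integer division-by-zero fix above, here is a
minimal Python sketch of returning inf or nan instead of raising an
error. The helper name int_divide is hypothetical, the sign handling
follows the usual IEEE 754 floating-point convention, and the plain
division used for nonzero denominators is an assumption; none of this
is taken from Miller's source.

```python
import math


def int_divide(numerator, denominator):
    """Divide two integers, mirroring floating-point results for x/0."""
    if denominator != 0:
        # Assumption: ordinary division for the nonzero case.
        return numerator / denominator
    if numerator == 0:
        # 0/0 has no meaningful value, so yield not-a-number.
        return math.nan
    # Nonzero over zero yields infinity with the numerator's sign.
    return math.inf if numerator > 0 else -math.inf


print(int_divide(6, 0), int_divide(-6, 0), int_divide(0, 0))
```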
OBS-URL: https://build.opensuse.org/request/show/518550
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/miller?expand=0&rev=3