-------------------------------------------------------------------
Wed Aug 19 20:26:12 UTC 2020 - aloisio@gmx.com

- Update to version 5.9.0
  * You can now save common defaults in a ~/.mlrrc. For example,
    if you normally process CSV files, you can say that in your
    ~/.mlrrc and you can leave off the --csv flag from your mlr
    commands.

-------------------------------------------------------------------
Tue Aug 4 06:57:04 UTC 2020 - aloisio@gmx.com

- Update to version 5.8.0
  Features:
  * The new count verb is a keystroke-saver for stats1 -a count
    -f {some field name}.
  * --jsonx and --ojsonx are keystroke-savers for --json
    --jvstack and --ojson --jvstack, which is to say, multi-line
    pretty-printed JSON format.
  * The new -s name=value feature for mlr put and mlr filter
    gives you simpler access to environment variables in your
    Miller script, as requested in #315.
  Bugfixes:
  * mlr format-values is no longer SEGVing on CSV/TSV input.
    This was reported on #330.
  * #313 fixes a corner case when field names within
    command-line arguments have embedded newlines.
  * Line/column indicators for JSON-formatting error messages
    are now correct (previously they were showing up as 0).
  * end {print NF} no longer SEGVs. This was reported in #330.
  * Several broken doc links were fixed up as reported on #329.
- Drop miller-5.3.0-gcc43.patch (no longer necessary)
- Spec cleanup

-------------------------------------------------------------------
Tue Mar 17 06:56:30 UTC 2020 - aloisio@gmx.com

- Update to version 5.7.0
  Features:
  * The new remove-empty-columns and skip-trivial-records are
    keystroke-savers for things which would otherwise require DSL
    syntax, as tracked in #274.
  Bugfixes:
  * A bug regarding optional regex-pattern groups was fixed in
    #277.
  * As of #294 you can now specify --implicit-csv-header for the
    join-file in mlr join.
  * A bug with spaces in XTAB-file values was fixed on #296.
  * A bug with missing final newline for XTAB-formatted files
    using MMAP files was fixed on #301.

- Drop group tag

-------------------------------------------------------------------
Sun Sep 22 04:53:59 UTC 2019 - Luigi Baldoni <aloisio@gmx.com>

- Update to version 5.6.2
  * #271 fixes a corner-case bug with more than 100 CSV/TSV files
    with headers of varying lengths.

-------------------------------------------------------------------
Fri Sep 13 05:54:23 UTC 2019 - Luigi Baldoni <aloisio@gmx.com>

- Update to version 5.6.0
  Features:
  * The new system DSL function allows you to run arbitrary
    shell commands and store their output in field values. Some
    example usages are documented here. This is in response to
    issues #246 and #209.
  * There is now support for ASV and USV file formats. This is
    in response to issue #245.
  * The new format-values verb allows you to apply numerical
    formatting across all record values. This is in response to
    issue #252.
  Documentation:
  * The new DKVP I/O in Python sample code now works for Python
    2 as well as Python 3.
  * There is a new cookbook entry on doing multiple joins. This
    is in response to issue #235.
  Bugfixes:
  * The toupper, tolower, and capitalize DSL functions are now
    UTF-8 aware, thanks to @sheredom's marvelous
    https://github.com/sheredom/utf8.h. The internationalization
    page has also been expanded. This is in response to issue #254.
  * #250 fixes a bug using in-place mode in conjunction with
    verbs (such as rename or sort) which take field-name lists as
    arguments.
  * #253 fixes a bug in the label verb when one or more names are
    common between old and new.
  * #251 fixes a corner-case bug when (a) input is CSV; (b) the
    last field ends with a comma and no newline; (c) input is from
    standard input and/or --no-mmap is supplied.

-------------------------------------------------------------------
Sun Sep 1 06:34:42 UTC 2019 - Luigi Baldoni <aloisio@gmx.com>

- Update to version 5.5.0
  * Positional indexing and other data-cleaning features
  Features:
  * The new positional-indexing feature resolves #236 from
    @aborruso. You can now get the name of the 3rd field of each
    record via $[[3]], and its value via $[[[3]]]. These are both
    usable on either the left-hand or right-hand side of assignment
    statements, so you can more easily do things like renaming
    fields programmatically within the DSL.
  * There is a new capitalize DSL function, complementing the
    already-existing toupper. This stems from #236.
  * There is a new skip-trivial-records verb, resolving #197.
    Similarly, there is a new remove-empty-columns verb, resolving
    #206. Both are useful for data-cleaning use-cases.
  * Another pair is #181 and #256. While Miller uses mmap
    internally (and invisibly) to get approximately a 20%
    performance boost over not using it, this can cause
    out-of-memory issues with reading either large files, or too
    many small ones. Now, Miller automatically avoids mmap in these
    cases. You can still use --mmap or --no-mmap if you want manual
    control of this.
  * There is a new --ivar option for the nest verb which
    complements the already-existing --evar. This is from #260
    thanks to @jgreely.
  * There is a new keystroke-saving urandrange DSL function:
    urandrange(low, high) is the same as low + (high - low) *
    urand().
  * There is a new -v option for the cat verb which writes a
    low-level record-structure dump to standard error.
  * There is a new -N option for mlr which is a keystroke-saver
    for --implicit-csv-header --headerless-csv-output.
  Documentation:
  * The new FAQ entry
    http://johnkerl.org/miller/doc/faq.html#How_to_escape_'%3F'_in_regexes%3F
    resolves #203.
  * The new FAQ entry
    http://johnkerl.org/miller/doc/faq.html#How_can_I_filter_by_date%3F
    resolves #208.
  * #244 fixes a documentation issue while highlighting the need
    for #241.
  Bugfixes:
  * There was a SEGV using nest within then-chains, fixed in
    response to #220.
  * Quotes and backslashes weren't being escaped in JSON output
    with --jvquoteall; reported on #222.

-------------------------------------------------------------------
Mon Oct 15 07:23:57 UTC 2018 - Luigi Baldoni <aloisio@gmx.com>

- Update to version 5.4.0
  Features:
  * The new clean-whitespace verb resolves #190 from @aborruso.
    Along with the new functions strip, lstrip, rstrip,
    collapse_whitespace, and clean_whitespace, there is now both
    coarse-grained and fine-grained control over whitespace
    within field names and/or values. See the linked-to
    documentation for examples.
  * The new altkv verb resolves #184 which was originally opened
    via an email request. This supports mapping value-lists such
    as a,b,c,d to alternating key-value pairs such as a=b,c=d.
  * The new fill-down verb resolves #189 by @aborruso. See the
    linked-to documentation for examples.
  * The uniq verb now has a -a option, which resolves #168 from
    @sjackman.
  * The new regextract and regextract_or_else functions resolve
    #183 by @aborruso.
  * The new ssub function arises from #171 by @dohse, as a
    simplified way to avoid escaping characters which are special
    to regular-expression parsers.
  * There are new localtime functions in response to #170 by
    @sitaramc. However, note that as discussed on #170 these do
    not undo one another in all circumstances. This is a
    non-issue for timezones which do not do DST. Otherwise, please
    use with disclaimers: localdate, localtime2sec, sec2localdate,
    sec2localtime, strftime_local, and strptime_local.
  * Travis builds at
    https://travis-ci.org/johnkerl/miller/builds now run on OSX as
    well as Linux.
  * An Ubuntu 17 build issue was fixed by @singalen on #164.
  Documentation:
  * put/filter documentation was confusing as reported by
    @NikosAlexandris on #169.
  * The new FAQ entry
    http://johnkerl.org/miller-releases/miller-head/doc/faq.html#How_to_rectangularize_after_joins_with_unpaired?
    resolves #193 by @aborruso.
  * The new cookbook entry
    http://johnkerl.org/miller/doc/cookbook.html#Options_for_dealing_with_duplicate_rows
    arises from #168 from @sjackman.
  * The unsparsify documentation had some words missing as
    reported by @tst2005 on #194.
  * There was a typo in the cookbook page
    http://johnkerl.org/miller/doc/cookbook.html#Full_field_renames_and_reassigns
    as fixed by @tst2005 in #192.
  Bugfixes:
  * There was a memory leak for TSV-format files only as
    reported by @treynr on #181.
  * Dollar signs in regular expressions were not being escaped
    properly as reported by @dohse on #171.

-------------------------------------------------------------------
Sun Jan 7 07:56:34 UTC 2018 - aloisio@gmx.com

- Update to version 5.3.0 (see draft-release-notes.md for a
  changelog)
- Added miller-5.3.0-gcc43.patch

-------------------------------------------------------------------
Thu Aug 24 15:18:41 UTC 2017 - aloisio@gmx.com

- Updated license

-------------------------------------------------------------------
Thu Jul 20 09:29:23 UTC 2017 - aloisio@gmx.com

- Update to 5.2.2
  * This bugfix release delivers a fix for #147 where a memory
    allocation failed beyond 4GB.

-------------------------------------------------------------------
Tue Jun 20 06:34:53 UTC 2017 - aloisio@gmx.com

- Update to version 5.2.1
  * Fixes (gh#johnkerl/miller#142) build segfault on non-x86
    architectures

-------------------------------------------------------------------
Tue Jun 13 08:06:28 UTC 2017 - aloisio@gmx.com

- Update to version 5.2.0
  This release contains mostly feature requests.
  Features:
  * The stats1 verb now lets you use regular expressions to
    specify which field names to compute statistics on, and/or which
    to group by. Full details are here.
  * The min and max DSL functions, and the min/max/percentile
    aggregators for the stats1 and merge-fields verbs, now support
    numeric as well as string field values. (For mixed string/numeric
    fields, numbers compare before strings.) This means in particular
    that order statistics -- min, max, and non-interpolated
    percentiles -- as well as mode, antimode, and count are now
    possible on string-only fields. (Of course, any operations
    requiring arithmetic on values, such as computing sums, averages,
    or interpolated percentiles, yield an error on string-valued
    input.)
  * There is a new DSL function mapexcept which returns a copy of
    the argument with specified key(s), if any, unset. The motivating
    use-case is to split records to multiple filenames depending on
    a particular field value, which is omitted from the output:
      mlr --from f.dat put 'tee > "/tmp/data-".$a, mapexcept($*, "a")'
    Likewise, mapselect returns a copy of the argument with only
    specified key(s), if any, set. This resolves #137.
  * A new -u option for count-distinct allows unlashed counts for
    multiple field names. For example, with -f a,b and without -u,
    count-distinct computes counts for distinct pairs of a and b field
    values. With -f a,b and with -u, it computes counts for distinct a
    field values and counts for distinct b field values separately.
  * If you build from source, you can now do ./configure without
    first doing autoreconf -fiv. This resolves #131.
  * The UTF-8 BOM sequence 0xef 0xbb 0xbf is now automatically
    ignored from the start of CSV files. (The same is already done for
    JSON files.) This resolves #138.
  * For put and filter with -S, program literals such as the 6 in
    $x = 6 were being parsed as strings. This is not sensible, since
    the -S option for put and filter is intended to suppress numeric
    conversion of record data, not program literals. To get string 6
    one may use $x = "6".
  Documentation:
  * A new cookbook example shows how to compute differences
    between successive queries, e.g. to find out what changed in
    time-varying data when you run and rerun a SQL query.
  * Another new cookbook example shows how to compute
    interquartile ranges.
  * A third new cookbook example shows how to compute weighted
    means.
  Bugfixes:
  * CRLF line-endings were not being correctly autodetected when
    I/O formats were specified using --c2j et al.
  * Integer division by zero was causing a fatal runtime
    exception, rather than computing inf or nan as in the
    floating-point case.

-------------------------------------------------------------------
Sat Apr 15 07:48:57 UTC 2017 - aloisio@gmx.com

- Update to 5.1.0 (see changelog at
  https://github.com/johnkerl/miller/releases/tag/v5.1.0)

-------------------------------------------------------------------
Sun Mar 12 21:04:27 UTC 2017 - aloisio@gmx.com

- Update to version 5.0.1
  Minor bugfixes:
  * As described in #132, mlr nest was incorrectly splitting
    fields with multi-character separators.
  * The XTAB-format reader, when using multi-character IPS,
    was incorrectly splitting key-value pairs, but only when
    reading from standard input (e.g. on a pipe or less-than
    redirect).

-------------------------------------------------------------------
Tue Feb 28 10:28:05 UTC 2017 - aloisio@gmx.com

- Initial package (v5.0.0)