Despite all the efforts, there still seems to be a lot of noise in the
performance measurement. Especially, the first iterations seem to run
faster. Maybe that is because the kernel didn't yet determine that the
process is CPU bound and is less likely to schedule it out Or maybe it's
because burning the cycles heats up the CPU and it gets throttled after
a while. It's unclear why, and it's even unclear whether this really
happens. But from my observations, it seems to do.
Hence, more warm up.
- the first time we enter the test, ensure that we keep the CPU busy for
at 2 seconds. This additional warm up (WARM_UP_ALWAYS_SEC) is
global, and not per test.
- for each test, ignore the first 5% of the runs. It seems those tend to
run faster, thus skewing the results.
- if the user specifies a "--factor", the warm up operations are the
same and independent from external factors (such as time
measurements).
Note that this matters the most, when you want to run the executable
twice in a row and compare the results.
By default, the test estimates a run factor for each test. This means,
if you run performance under `perf`, the results are not comparable,
as the run time depends on the estimated factor.
Add an option, to set a fixed factor.
Of course, there is only one factor argument for all tests. Quite
possibly, you would want to run each test individually with a factor
appropriate for the test. On the other hand, all tests should be tuned
so that the same factor gives a similar test duration. So this may not
be a concern, or the tests should be adjusted. In any case, the option
is most useful when running only one test explicitly.
You can get a suitable factor by running the test once with "--verbose".
Another use case is if you run the benchmark under valgrind. Valgrind
slows down the run so much, that the estimated factor would be quite
off. As a result, the chosen code paths are different from the real run.
By setting the factor, the timing measurements don't affect the executed
code.
The default output is annoyingly verbose. You see
Running test simple-construction
simple-construction: Millions of constructed objects per second: 33.498
Running test simple-construction1
simple-construction1: Millions of constructed objects per second: 142.493
Running test complex-construction
complex-construction: Millions of constructed objects per second: 14.304
Running test complex-construction1
...
where the "Running test" lines just clutter the output. In fact so much
so, that my terminal fills up and I don't see the output of all tests in
one page. The "Running test" line is not so useful, because I mostly
care about the test result, and that line already contains the test
name.
Add an option to silence this.
Previously, the result lines are not unique, for example
Running test simple-construction
Millions of constructed objects per second: 27.629
Running test simple-construction1
Millions of constructed objects per second: 151.879
...
That is undesirable, because we might want to parse the test results
with a script, and that's easier when the line is unique.
Change to:
Running test simple-construction
simple-construction: Millions of constructed objects per second: 27.629
Running test simple-construction1
simple-construction1: Millions of constructed objects per second: 151.879
...
This doesn’t change the tests’ behaviour, but moves them to a slightly
more logical location.
They are still not installed or run by default.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #1434