Interested in benchmarking and profiling your code?
My new blog post walks you through it, from high-level benchmarking to digging deeper with profiling tools.
It's quite high level, so I avoided explaining the various lower-level macros there, but I will cover them in another post if there is interest.
Let me know your thoughts, and enjoy reading!
19 Likes
sijo
Very nice! A few remarks:
- You might want to rename this topic: it sounds like you're asking for help profiling your code… maybe "A new tutorial on benchmarking and profiling" or similar?
- Regarding the first flamegraph screenshots and this part:

  > My screenshot does not show the full width, but when you run it you can see that the `filter!` function takes more than 98% of the time. Which means it is the function we want to optimize.

  It's a bit confusing to have a screenshot that doesn't illustrate the point (on the screenshot it looks like `filter!` spends almost all its time in `!=` and `<=`). Maybe use a screenshot showing the full width of `filter!`, and if the text is unreadable then you could add the current screenshot as a "magnifier"?
- The last runtime plot, which "looks quite funny" as you say, can make the reader skeptical that the third solution is really doing what it should do… Maybe a good opportunity to show that a logarithmic scale can be useful?
EDIT: just to clarify, for the first remark I meant the title here on Discourse.
4 Likes
Thanks for your thoughts. Will add those!
In this code

```julia
# convert from nanoseconds to seconds
push!(ys, mean(t).time / 10^9)
```
Why are you using `mean`? It's inconsistent with the behaviour of `@btime`, which uses the minimum. And it behaves worse than `median`, which is my second choice for such estimates.
I find `min` a bit strange, but it depends on what you want to measure, I guess. Is `mean` wrong?

I chose `mean` to get the average running time of the function. One could add error bars around it in the plot.
Well, the consensus is that the minimum is the most adequate metric for measuring actual code performance, because the time of a code execution is always "time of the code itself + some random nonnegative noise from the operating system". Since the second term is always nonnegative, taking the minimum gives you the closest estimate of the real execution time. The median is slightly worse, and the mean is the worst of them all, since it is easily skewed: imagine that in 10 runs you get 9 measurements of 1 ms and one of 10 s. The mean would be about 1 s, which is definitely not representative of the actual execution time.
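The arithmetic of that example is easy to check with the `Statistics` stdlib (the 1 ms / 10 s numbers are the ones from above):

```julia
using Statistics

# Nine runs at 1 ms plus one 10 s outlier, all in seconds
runs = vcat(fill(0.001, 9), 10.0)

mean(runs)     # ≈ 1.0009 s — dominated by the single outlier
median(runs)   # 0.001 s
minimum(runs)  # 0.001 s
```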
But anyway, whether you agree with that or not, it's inconsistent to use and compare `@btime` and `mean` to profile the same code. It should be either one or the other.
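A minimal sketch of sticking to the minimum everywhere, so the plotted numbers match what `@btime` reports (assuming BenchmarkTools.jl is available; `work` and the array size are placeholders, not from the post):

```julia
using BenchmarkTools

# Placeholder workload standing in for the function being benchmarked
work(v) = sum(abs2, v)

t = @benchmark work(v) setup = (v = rand(1000))

# minimum(t) matches the estimator @btime uses; convert nanoseconds to seconds
best_seconds = minimum(t).time / 10^9
```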
7 Likes
Thanks for the clarification @Skoffer. Will make the changes accordingly.
1 Like
lmiq
Once I benchmarked some code with my laptop in the freezer. It was clearly faster. I will test that again and compare the minimum, median, and average times obtained relative to room temperature, hoping to show that thermodynamic noise, not only operating-system noise, enters into the equation.
5 Likes
Surely my statement is a simplification. Another reason for "negative operating system time" can be frequency-governor management (I hope this term is correct): the operating system can change the CPU frequency on demand, so it is possible that during a benchmark the frequency goes up and the overall execution time decreases. So yes, this formula is a simplification.
jzr
Or, possibly, garbage collection, which occurs in bursts. If some GC is inevitable, it is reasonable to use the median too, because it gives you a more realistic timing for practical purposes.
These are just informative statistics, there isn’t a single best one. That said, if you have to pick one, then in general minimum is a reasonable choice.
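A toy illustration of how a GC burst moves the different estimators (the 1 ms / 21 ms numbers are made up for the sketch):

```julia
using Statistics

# 95 steady runs of 1 ms, plus 5 runs that hit a hypothetical 20 ms GC pause
all_runs = vcat(fill(1.0, 95), fill(21.0, 5))   # milliseconds

minimum(all_runs)  # 1.0 — ignores the GC bursts entirely
median(all_runs)   # 1.0 — robust to the occasional burst
mean(all_runs)     # 2.0 — folds in the amortized GC cost
```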
5 Likes