This post is a follow-up to Java Performance Puzzle in which a small code change leads to a counter-intuitive performance gain.
In the previous post, the benchmark was based on the code from my CSV parser with all the logic to deal with unescaping, the cell aggregation etc …
I managed to reduce the code to a simpler naive CSV parser. see SimplifiedBenchmark.java
We now have a tight loop with 2 conditions in which with extract the
char, accumulate it in an array and pass the array
Blackhole when encountering
That still exhibit the weird behavior, the holder make it go faster.
But the code is still quite complex to reason with. Does the loop matters? Is it just the if conditions.
isolate if condition
So next I wrote a version that isolates the if condition in a non-inlinable method. I also inline the char copy to make it easier for analysis.
But we still have the effect of having some code that should only execute conditionally being executed for each char.
Where do we go from there? we can see what happens but why would that be the case.
Everything is an illusion
The JIT can make a lot of optimisation, code that doesn’t need to be executed will be strip away, loops get unrolled, condition order changed depending on profiling stats. The ASM generated may look very different from your bytecode as long as the outcome is the same as the original code. That allows for a lot of optimisation to happens.
The JIT looks at the code in the form of an Abstract Syntax Tree and applies a list of transformation before generating the final machine code. And fortunately, there is a way to get an insight into what is happening (ht Jean-Philippe Bempel) with Ideal Graph Visualizer.
Unfortunately to be able to use that you need to build a debug version of OpenJDK.
To build OpenJDK you will need a Linux box, on MacOsX it relies on an old version of Xcode that is not supported with Sierra.
Also, beware to not have a
j in the name the path of the directories you are checking it out - avoid opendjdk -.
you will need JDK 1.7 to build it, it is also needed to launch ideal graph visualizer.
Make sure that you have a gcc 4.3 to 4.9 installed as it will fail with gcc 5 for ubuntu.
if it fails to check the os_versions because you are running a 4.x kernel append 4% to the SUPPORTED_OS_VERSION in hotspot/make/linux/Makefile
You might need to install a few libraries… but once make all succeed you have your vm in build/
Ideal Graph Visualizer
Now we need to start Ideal Graph Visualizer
and run the benchmark with the debug JVM
the JVM might not be that stable and fail sometimes it should eventually run the benchmark successfully and you should see the nodes information being pushed to Ideal Graph Visualizer.
you can also export the data to a file using
Let’s start by looking at the graph for the one with the holder.
If we isolate the
sub in the graph we can see that it depends on the
Same for the
sub in the second condition
That stays true after the
RemoveUseless in the
see the following graphs.
But in the direct call we can see the 2
subs in the
Before RemoveUseless stage
there is no dependency on the
And at the
AfterParsing stage, there is only one
The allocated array depends on the
Sub and on the
If, but the
Sub does not depend on the
Which can explain why it is pushed outside the
What does that mean?
sub does not depend on the
if in the direct call in the
RemoveUseless stage and does not have any side effect
the code get consolidate in one place and is generated outside the if.
The holder stops that by creating a dependency on the
if. Why does it create a dependency on the
I don’t know yet.
When is it a problem?
If you have a 2 or more test that contains duplicated code with no side effect, that code might end up being executed outside the if.
Depending on the condition ratio and how many time the
if is called, the impact might start to dominate as it does in my
CSV parser. Most of the time the condition are false and the code inside should not be executed but some of it is.
I would really like to understand why
SubI does not depend on
if but when adding an indirection it does.
How does the RemoveUseless work? should the code with no side effect still be marked as depending on the
The only place to go from there is to dig in the hotspot code … and experiment in will
hopefully be part 3.