As I wrote about here yesterday, I am taking my exploration of Kotlin to the next level by looking at performance metrics using the Computer Language Benchmarks Game. As of right now I’ve completed my first two steps: getting the benchmark suite building and running Kotlin code, and doing a straight port of the suite (follow along at the Gitlab project). This was really 90% IntelliJ auto-conversion followed by me fixing compilation errors (and submitting back to JetBrains a nasty conversion bug that came up in one test). So now on to the results! Well, actually, not so fast on that one…
I do have initial results from running these test cases against their Java-based counterparts. Since this is literally Java code transliterated into Kotlin, it is perhaps the purest one-for-one run-off of the two languages. At the same time it’s not exactly fair either, is it? Am I using any language constructs that may give Kotlin a performance edge? Or worse, is the transliterated code doing funky stuff in the quest to get it “just compiling” that is actually bad practice? I can’t say for sure either way, so I’m not going to jump to conclusions without doing a deeper dive into the results and the tests themselves. That said, this straight transliteration does provide some information that I’m willing to explore qualitatively. I have twelve test cases ported over: nbody, regexdna, binarytrees, chameneosredux, fannkuchredux, fasta, knucleotide, mandelbrot, pidigits, regexredux, revcomp, and spectralnorm. While I am a huge fan of the succinctness of Kotlin code and some of its really cool language features, like extension functions, how did Kotlin do compared to Java on these initial transliterated versions of the benchmarks? It’s complicated…
The first complication is that most of these benchmarks run in a very short period of time, with most runs on the order of 100 ms or less. Why is that a problem? From what I can tell, the lower limit of startup time for a Kotlin app, through the kotlin command line tool anyway, is about 30–40 ms slower than for the corresponding Java app. Do I care if it takes the time of a single frame of a movie to finish bootstrapping the application? Not usually, but when the whole test run is only about twice that long, the startup cost heavily skews the numbers. The second complication is that I’m not running these tests on a dedicated machine but in a VM, one of many running on my system. Since I don’t care about absolute performance and I’m not crushing all the cores in the VM, much less all the cores on the host machine, it should all average out somewhat, but I’m not going to claim final results quite yet because of that. The third complication is that while the transliteration is fine for getting things up and running, I’m really testing transliterated Java, not pure Kotlin. In the end they may not look too dissimilar, but because I can’t say that definitively right now, I’m not going to make any sweeping generalizations.
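To get a feel for that launcher overhead, here is a rough sketch of how I could time it from the shell. Everything here is a placeholder, not part of the benchmark project: it assumes a trivial `hello.kt` with a `main()` compiled via `kotlinc hello.kt -include-runtime -d hello.jar`, and it uses GNU `date`’s millisecond formatting, so it’s Linux-specific. A real measurement would warm the file cache and take the median of many runs; a single sample is shown for brevity.

```shell
# Millisecond timestamp (GNU date; %3N = nanoseconds truncated to 3 digits).
ms_now() { date +%s%3N; }

# Time a single invocation of any command, echoing elapsed milliseconds.
time_cmd() {
  local start end
  start=$(ms_now)
  "$@" > /dev/null 2>&1
  end=$(ms_now)
  echo $(( end - start ))
}

# hello.jar / HelloKt are hypothetical names for a compiled hello-world.
kotlin_ms=$(time_cmd kotlin -classpath hello.jar HelloKt)
java_ms=$(time_cmd java -classpath hello.jar HelloKt)
echo "startup delta: $(( kotlin_ms - java_ms )) ms"
```

The delta between the two runs is a crude proxy for the extra bootstrapping the kotlin wrapper script adds on top of a plain JVM launch.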
I guess if Kotlin were running 2x faster than Java in all of the benchmarks I wouldn’t have put any of those caveats in, but it’s not. The bottom line is this: in terms of raw numbers, Kotlin was slower than Java in 10 of the 12 tests by a wide margin, but with that startup penalty involved I didn’t think that was fair. So if I look only at the tests with runs longer than 1 second, Kotlin is far slower in one case, just slightly slower in two more, and basically even but slightly faster in one. If I crudely subtract the extra bootstrapping time from the fast tests, the numbers get better, and potentially legitimately so. I still have one test case where Kotlin is slower by a wide margin, several more where Kotlin is just slightly slower or slightly faster, and then one where it is much faster. You can see the qualitative results here:
| Test Name | Raw Absolute | Raw Long Term Only | Startup Adjusted |
| --- | --- | --- | --- |
| binarytrees | Much Slower | Much Slower | Much Slower |
| chameneosredux | Much Slower | N/A | About Even (Slower) |
| fannkuchredux | Much Slower | N/A | Slightly Slower |
| fasta | Much Slower | N/A | About Even (Faster) |
| mandelbrot | Much Slower | N/A | About Even (Slower) |
| nbody | Much Slower | N/A | About Even (Slower) |
| pidigits | Slower | Slightly Slower | Slightly Slower |
| pidigitslastonly | Slower | Slightly Slower | Slightly Slower |
| revcomp | Slightly Slower | N/A | Much Faster |
| spectralnorm | Even | About Even (Faster) | Slightly Faster |
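The “Startup Adjusted” column comes from the crude subtraction described above; a minimal sketch of that adjustment, assuming my rough 35 ms estimate for the extra Kotlin bootstrap cost (a placeholder value, not a measured constant):

```shell
# Assumed extra bootstrap cost of the kotlin launcher, in milliseconds.
STARTUP_MS=35

# Subtract the assumed bootstrap cost from a raw Kotlin run time (ms).
adjust() { echo $(( $1 - STARTUP_MS )); }

adjust 110   # a 110 ms raw run becomes 75 ms of "real" work
```

On a sub-second benchmark this shifts the comparison noticeably, which is exactly why several “Much Slower” raw results land in the “About Even” bucket once adjusted.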
Next steps are:
- I have to run longer-duration benchmarks to get the bootstrapping noise out of the measurements.
- I want to quantify the relative startup times of the two systems, both by invoking through the kotlin command line tool and by invoking the JVM directly through java.
- I need to run this in a more pristine environment to get more solid metrics.
- Kotlin has its own benchmark tools for comparing it against Java. I want to run those in the same environment to see if I can characterize where the potential slowdowns in my own versions are.
- I need to look at the transliterated code and see if there are tweaks that could make it more idiomatic Kotlin.
Ultimately, if all these numbers hold up, what we have in terms of Kotlin vs. Java runtime performance is something on par to slightly slower, with some edge cases where it can be a lot slower or faster. However, it’s way, way too early for me to say that. Even if it turns out that on average Kotlin is slightly slower, language selection is not just about runtime performance; otherwise we’d all be coding in straight C anyway, so…