Thursday, June 16, 2016

Analyzing Scala performance using javap

Target audience: Beginner
Estimated reading time: 15'


As mentioned in a previous post, iterators in Scala, such as foreach, for & while loop have different performance characteristics. A recurrent question between developers is the relative performance of Scala and Java regarding for and while.

The "for comprehension" closure in Scala is indeed a syntactic sugar wraps around a sequence of flatMap and map methods. Monad in Practice. It is expected to be significantly slower and should not be used as an iterator.
We select the higher method foreach to implement the for iterator in Scala.

Note: For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted

Evaluation while
The simple evaluation consists of processing a large number of iterations of a very simple statement which execution does not interfere with the actual performance of the iterator. The code is also is easy to understand. The code for the while loop is described below.

def testRun(numIterations: Int) : Unit = {
  val startTime = System.currentTimeMillis
  var sum: Long = 0
  var i: Int = 0

  while(i < numIterations) {
    var j = 0
    while(j  < 1000) {
      sum += 1
      j += 1
     i += 1

The test is executed on a 4 cores Intel i7 CPU with java 1.7.02 64-bit and Scala 2.10.2. The chart below, compares the duration of the execution of the for iterator for Java and Scala.

The performance of Scala and Java for executing the while loop are very similar.

Evaluation for
The second test consists of comparing the relative performance of Java for loop and Scala foreach higher order method.

def testRun(numIterations: Int) : Unit = {
  var sum: Long = 0
  (0 until numIterations.foreach( sum += _)

Analysis using Javap
The first step is to analyze the number of byte-code instructions generated by the Scala compiler. We use the Java Class File Disassembler,  javap, to print out the actual instructions processed by the Java virtual machine.
            javap -c -verbose xxx.class
The sequence of instructions generated for the execution of the while loops are displayed below.

0:   lconst_0
1:   lstore_1
2:   iconst_0
3:   istore_3
4:   iload_3
5:   sipush  1000
8:   if_icmpge       42
11:  iconst_0
12:  istore  4
14:  iload   4
16:  sipush  1000
19:  if_icmpge       35
22:  lload_1
23:  lconst_1
24:  ladd
25:  lstore_1
26:  iload   4
28:  iconst_1
29:  iadd
30:  istore  4
32:  goto    14
35:  iload_3
36:  iconst_1
37:  iadd
38:  istore_3
39:  goto    4
42:  return

Dissambling the Java class for the code with the for iterators produces the following print out:

0:   new     #12; //class scala/runtime/LongRef
3:   dup
4:   lconst_0
5:   invokespecial   #16; //Method scala/runtime/LongRef."<init>":(J)V
8:   astore_1
9:   getstatic       #22; //Field scala/runtime/RichInt$.MODULE$:Lscala/runtime/RichInt$;
12:  getstatic       #27; //Field scala/Predef$.MODULE$:Lscala/Predef$;
15:  iconst_0
16:  invokevirtual   #31; //Method scala/Predef$.intWrapper:(I)I
19:  sipush  1000
22:  invokevirtual   #35; //Method scala/runtime/RichInt$.until$extension0:
                            (II Lscala/collection/immutable/Range;
25:  new             #37; //class JavaScala$$anonfun$testRun$1
28:  dup
29:  aload_0
30:  aload_1
31:  invokespecial   #40; //Method JavaScala$$anonfun$testRun$1."<init>
34:  invokevirtual   #46; //Method scala/collection/immutable/Range.foreach$mVc$sp:
37:  return

Although the number of instructions for the for loop is smaller than the number of instructions for while, most of those instructions are function calls:
- conversion of counter to long
- static conversion of int to RichInt to wrap Java Integer:
@inline implicit def intWrapper(x: Int)= new runtime.RichInt(x)
- ultimately the foreach method is invoked to execute the loop.

Interestingly enough, the foreach method for collection is implemented using the while loop not the for loop. Scala compiler plug-in such as ScalaCL to optimize the execution of iteration on Scala collections, Arrays, Lists,... have been introduced to get around this issue. The reader can also take comfort in using the Java Class File Dissambler to understand how a method or piece of code translate into efficient, or in our case, inefficient byte-code. Quite a few methods in Scala such as foldLeft, reduceLeft uses tail recursion to traverse collections. It would be interesting to compare the relative performance of those methods with alternatives using iterators.. stay tune.