Friday, January 31, 2014

Performance Evaluation of Scala Maps

The Scala programming language API provides developers with a set of short immutable maps and sets of predefined size: Map1 through Map4 and Set1 through Set4. I thought it would be worthwhile to evaluate the performance of those short, dedicated collections compared to their generic counterparts.
A second test consists of comparing HashMap with TreeMap for sorting a hash table by key.

Note: For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers and imports is omitted.

Benchmark Short Maps
The benchmark consists of very simple method calls on immutable Map and Map4 instances as well as on a mutable HashMap, as defined in the code snippet below.

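import scala.collection.immutable.Map.Map4
import scala.collection.mutable.HashMap

// Map4 takes its four keys and values as alternating constructor arguments
val map4 = new Map4("0", 0, "1", 1, "2", 2, "3", 3)
// Generic immutable Map built through the companion factory
val map0 = Map[String, Int](
  "0" -> 0, "1" -> 1, "2" -> 2, "3" -> 3)
// Mutable hash map holding the same entries
val hMap = HashMap[String, Int](
  "0" -> 0, "1" -> 1, "2" -> 2, "3" -> 3)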

We evaluate the time needed to execute 10 million iterations of the map, get and sum methods as follows:

aMap.map { kv => kv._2 + 2 }
aMap.get("2")
aMap.values.sum
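The measurement loop itself is not shown in the post. A minimal sketch of how such a timing could be set up follows; the object name `ShortMapBench` and the helpers `time` and `run` are hypothetical, and a simple wall-clock loop like this one ignores JIT warm-up, so its numbers are only indicative.

```scala
import scala.collection.mutable

object ShortMapBench {
  // Hypothetical helper: evaluates `op` `iterations` times and returns elapsed seconds.
  def time(iterations: Int)(op: => Any): Double = {
    var last: Any = null // hold the result so each evaluation is actually performed
    val start = System.nanoTime()
    var i = 0
    while (i < iterations) { last = op; i += 1 }
    (System.nanoTime() - start) / 1e9
  }

  // The same four entries as in the post, in an immutable and a mutable map.
  val small: Map[String, Int] =
    Map("0" -> 0, "1" -> 1, "2" -> 2, "3" -> 3)
  val hashed: mutable.HashMap[String, Int] =
    mutable.HashMap("0" -> 0, "1" -> 1, "2" -> 2, "3" -> 3)

  // Times the three operations on one map; returns (operation name, seconds) pairs.
  def run(aMap: scala.collection.Map[String, Int], iterations: Int): Seq[(String, Double)] =
    Seq(
      "map" -> time(iterations)(aMap.map { kv => kv._2 + 2 }),
      "get" -> time(iterations)(aMap.get("2")),
      "sum" -> time(iterations)(aMap.values.sum)
    )
}
```

Calling `ShortMapBench.run(ShortMapBench.small, 10000000)` and `ShortMapBench.run(ShortMapBench.hashed, 10000000)` would then give one timing per operation for each map.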

The results of the performance test are shown in the table below. As expected, the "short" collection is faster than the generic immutable Map and the mutable HashMap for get and foldLeft, although the improvement is small. For map, however, the mutable HashMap is by far the fastest of the three.

Method     immutable.Map.Map4   immutable.Map   mutable.HashMap
get        0.835 s              0.879 s         1.091 s
map        7.462 s              7.566 s         1.106 s
foldLeft   3.444 s              3.621 s         4.782 s

Sorting tables: TreeMap vs. sortBy
The second test consists of sorting a very large list or array of (String, Float) tuples by using maps. There are a few options for sorting such a table, among them:
  • Creating, populating and sorting a scala.collection.mutable.HashMap
  • Creating and populating a scala.collection.immutable.TreeMap
The test is set up by creating pseudo-random keys, each built by concatenating the name of a city with a unique id. The values in the map are completely random.
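The key scheme described above can be sketched in isolation. The city names, the object name `KeyGen` and the fixed seed are assumptions made for illustration; the post does not list the actual cities used.

```scala
import scala.util.Random

object KeyGen {
  // Assumed sample data: the original list of cities is not given in the post.
  val cities: Vector[String] = Vector("Paris", "Tokyo", "Lima", "Cairo")

  // Builds the n-th entry: "<random city>_<n>" paired with a random Float in [0, 1).
  def entry(n: Int, rnd: Random): (String, Float) =
    (s"${cities(rnd.nextInt(cities.size))}_$n", rnd.nextFloat())

  // Generates sz entries; the seed is fixed only to make runs repeatable.
  def entries(sz: Int, seed: Long = 42L): Vector[(String, Float)] = {
    val rnd = new Random(seed)
    Vector.tabulate(sz)(n => entry(n, rnd))
  }
}
```

Because the id `n` is appended to every key, the keys stay unique even when the same city is drawn more than once.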

// HashMap populated first, then sorted by key
val hMap = Range(0, sz).foldLeft(new mutable.HashMap[String, Float])(
  (h, n) =>
    h += ((s"${cities(Random.nextInt(cities.size))}_$n",
          Random.nextFloat()))
)
val sortedByKey = hMap.toSeq.sortBy(_._1)

// TreeMap, which keeps its entries ordered by key as they are added
val treeMap = Range(0, sz).foldLeft(new immutable.TreeMap[String, Float])(
  (h, n) =>
    h + ((s"${cities(Random.nextInt(cities.size))}_$n",
          Random.nextFloat()))
)
val sorted = treeMap.toSeq
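Before comparing timings, it may be worth checking that both approaches produce the same ordering. The following is a small sketch under that assumption; the object and method names are hypothetical and the input is any sequence of entries with unique keys.

```scala
import scala.collection.immutable.TreeMap
import scala.collection.mutable

object SortCheck {
  // Approach 1: load a mutable HashMap, then sort its entries by key.
  def sortedViaHashMap(data: Seq[(String, Float)]): Seq[(String, Float)] = {
    val h = mutable.HashMap.empty[String, Float]
    data.foreach { kv => h += kv }
    h.toSeq.sortBy(_._1)
  }

  // Approach 2: load an immutable TreeMap, which iterates in key order.
  def sortedViaTreeMap(data: Seq[(String, Float)]): Seq[(String, Float)] =
    data.foldLeft(TreeMap.empty[String, Float])((m, kv) => m + kv).toSeq
}
```

Both methods rely on the default `Ordering[String]`, so for unique keys they should return the same sequence of entries.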

The test is run with maps whose size varies between 10,000,00 and 90,000,00 entries.

Sorting (String, Float) tuples using a TreeMap is roughly 40% faster than populating a HashMap and sorting it by key.
