Wednesday, October 30, 2013

Type Erasure, Manifest, Specialization in Scala

This post introduces two mechanisms that address type erasure in scala. Manifests and specialization for primitive types.

Overview
Scala and Java programming languages use type erasure to compile away generics. The type parameters [U <: T]  are removed and replaced by their upper bound T or Any. The process involves boxing and un-boxing the primitives types if they are used in the code as type parameters, degrading performance. This post describes two approaches to work around type erasure: Manifest and type specialization.

Note: For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted


Description
Let consider a class ListCompare that compare lists of parametric type U bounded by the type Item (line 3).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
case class Item()
 
class ListCompare[U <: Item](
  xs: List[U]
)(implicit f: U => Ordered[U]) {

  def compare(xso: List[U]): Boolean = xso match {
    case str: List[String] => 
      if(xs.size == xso.size) 
        xs.zip(xso).exists( x=> x._1.compareTo(x._2) != 0) 
      else false
    case n: List[Int] => 
      if(xs.size == xso.size) 
        xs.zip(xso).exists(x => x._1 != x._2) 
      else false
    case _ => false
  }
}

The class has to have an implicit conversion U => Ordered[T] (line 5) to support the comparison of strings. The code above will generate the following warning message: "non-variable type argument String in type pattern List[String] is unchecked since it is eliminated by erasure". The warning message is generated by the compiler because the two parameterized types, List[String] (line 8) and List[Int] (line 12) may not be available to the JVM at the time of the execution of the pattern matching.

One solution is to use a Manifest. A Manifest[T] is an opaque descriptor for type T. It allows access to the erasure of the type as a class instance. The most common usage of manifest is related to the creation of native Arrays if the class is not known at compile time as illustrated in the code snippet below.

def myArray[T] = new Array[T](0)
def myArray[T](implicit m: Manifest[T]) = new Array[T](0)
def myArray[T: Manifest] = new Array[T](0)

The first line of code won't compile.
The second function will maintain the erasure by passing the manifest as an implicit parameter.
The last function defined the manifest as a context bound.

Let's re-implement the compare method with a Manifest specified as an implicit parameter.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
class ListCompare[U <: Item](
  xs: List[U]
)(implicit f: U => Ordered[U]) {

  def compare(
    xso: List[U]
   )(implicit u: Manifest[List[U]]): Boolean = {
    if( u <:< manifest[List[String]] ) 
      if( xs.size == xso.size)
        xs.zip(xso)
          .exists( x=> x._1.compareTo(x._2) != 0) 
      else false
     
    else if(u <:< manifest[List[Int]])  
      if( xs.size == xso.size) 
        xs.zip(xso).exists(x => x._1!=x._2) 
      else false
     
    else false
  }
}

The code will compile and execute as expected. It relies on the <:< type operator, which is a old variant of <: (lines 8 & 14). In any case, the code is far from being elegant and quite difficult for an inexperience Scala developer to grasp. There should be a better way.

One better option is to generate a separate class for the primitive type, Int and String using the @specialized annotation. The annotation @specialized (line 1) forces the compiler to generate byte code for each primitive listed as specialized type. For instance the instruction @specialized(Int, Double) T generates two extra, efficient classes for the primitive types Int and Double. The original ListComp class and its method compare are re-written using the annotation as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
class ListComp[@specialized Int, String, U <: Item](
   xs:List[U]
)(implicit f: U =>Ordered[U]) {
   
  def compare(xso: List[U]): Boolean =
    if(xs.size == xso.size) 
      xs.zip(xso)
        .exists( x => x._1.compareTo(x._2)!=0)   
    else 
      false
}

The code above will not throw a warning or error. However there is not such a thing as a free lunch, as the compiler generates extra byte code for the methods associated to the specialized primitive types. The objective in this case is to trade a higher memory consumption for performance improvement.

References
Programming in Scala M. Odesky, L.Spoon, B. Venners - Artima 2008
Scala for the Impatient - Cay Horstman Addison-Wesley 2012

No comments:

Post a Comment