## Tuesday, June 25, 2013

### Performance of Scala Iterators

Objective
The Scala programming language provides software developers with several options to iterate through the elements of a collection:

• for,while loops and foreach ( x => f(x)) higher order function.
• map[Y] { f: X => Y) : Collection[Y] that creates a new collection by applying a function f to each element of the collection
• foldLeft[Y] (y : Y)(f : (Y,X)=>Y) : Y) (resp. foldRight[Y] (y : Y)(f : (X,Y)=>Y) : Y) that applies a binary operator to each element of this collection starting left to right (resp. right to left)

This post attempts to quantify the overhead of the most commonly used iterative methods in Scala and demonstrate the effectiveness of the higher order methods map and foldLeft

Scala loops for summation
The test runs are executed on a 'plain vanilla' dual core i3 2.1 Ghz running Linux CentOs 6.0. The first test consists of comparing compare the performance of the different options to traverse an array of Float with size varies from 2,000,000 to 40,000,000 elements then apply an operation += z to each of its members. The options are
  foreach   (line 6)
for loop  (line 9)
while loop (lines 14 - 16)
foldLeft   (line 19)


  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 val rGen = new scala.util.Random(System.currentTimeMillis) var data = Array.fill(size)(rGen.nextFloat) var sum = 0.0 // Higher order method data.foreach(sum += _) // for loop for( x <- data) sum += x // while loop var k = 0 val len = data.size while( k < len) { sum += data(k) k += 1 } // fold sum = data.foldLeft(0.0)((x, z) => x + z) 

The test is repeated 25 times in order to reduce variance and noise generated by the garbage collector. The first 5 iterations are discarded to avoid the overhead of the initialization of the JVM. The mean value of the execution time for each method is computed for different size of an array from 2,000,000 to 40,000,000 floating point values (type Float). The results of the test are plotted in the graph below. The unit of time on the Y-coordinate is milliseconds.
The for, while and foreach expression have very similar performance.
The foldLeft is significantly faster (ratio 1:6)

Data transformation
The second test consists of comparing the performance of
• foreach: "fills-up" iteratively a mutable array of type ArrayBuffer (line 3)
• foreach: creates and updates a copy of the original array (immutable approach)(lines 7 & 8)
• map: transform the original array into an array of square values (line 11)

  1 2 3 4 5 6 7 8 9 10 11  // foreach with mutable array buffer val newData = new mutable.ArrayBuffer[Float] data.foreach( (x: Float) => newData.append(x *x)) val result = newData.toArray // foreach with update of immmutable array val pData = Array.fill(sz)(0.0) data.foreach( pData.update(i, _) ) // map val nData = data.map((x:Float) => x*x) 

Let's run the same test (with the same setup defined in the previous section).

The test shows that the method dedicated to convert an array to other array by applying a natural transformation, map is by far the most efficient.
The methods dedicated to a specific task such as foldLeft for summation and map for data transformation are far more effective that the "plain vanilla" loop constructors. The tests are conducted with Scala 2.10.2

Important Notes:
The syntax or construct for has a very different meaning in Scsla as in C or Java. It is actually a wrapper or syntactic sugar layer around the monadic chain of flatMap and map transformation as follows

  for (
a <- f(x)  // flatMap
b <- g(a)  // flatMap
c <- h(b)  // map
) yield { }

Therefore, it should be avoided as an iterative counter and reserved to be used as for-comprehensive construct.

A more elaborate and time consuming benchmark would consist of running multiple tests using several boolean (< !=..) and numeric (+, *, x => sin(x) ..) operators and computes the normalize mean and variance.

References
Scala for the Impatient - C. Horstman - Addison-Wesley 2012
github.com/prnicolas

## Monday, June 10, 2013

### Runge-Kutta ODE solver in Scala

This post describes the implementation of the different Runge-Kutta method to solve differential equations in Scala.
The objective is to leverage the functional programming components of the Scala programming language to create a generic solver of ordinary differential equations (ODE) using Runge-Kutta family of approximation algorithms.

Overview
Most of ordinary differential equations cannot be solved analytically. In this case, a numeric approximation to the solution is often good enough to solve an engineering problem. Oddly enough most of commonly used algorithm to compute such an approximation have been establish a century ago. Let's consider the differential equation $\frac{\mathrm{d}y }{\mathrm{d} x} = f(x,y)$ The family of explicit Runge-Kutta numerical approximation methods is defined as $y_{n+1} = y_{n} + \sum_{i=0}^{s<n}b_{i}k_{i}\\where\,\,k_{j}=h.f(x_{n} + c_{j}h, y_{n} + \sum_{s=1}^{j-1} a_{s,s-1}k_{s-1} )\\with\,\,h=x_{n+1}-x_{n}\,\,and\,\,\Delta = \frac{dy}{dx}+ \sum_{s=1}^{j-1} a_{s,s-1}k_{s-1}$ k(j) is the increment based on the slope at the midpoint of the interval [x(n),x(n+1)] using delta. The Euler method defined as $y_{n+1} = y_{n} + hf(t_{n},y_{n})$ and 4th order Runge-Kutta $y_{n+1} = y_{n} + \frac{h}{6}(k_{1} + 2 k_{2}+2 k_{3}+ k_{4})\,\,\,;h = x_{n+1}-x_{n}\\k_{1}=f(x_{n},y_{n})\\k_{2}=f(x_{n}+\frac{h}{2}, y_{n}+\frac{hk_{1}}{2})\\k_{3}=f(x_{n}+\frac{h}{2},y_{n}+\frac{hk_{2}}{2})\\k_{4}=f(x_{n}+h,y_{n}+hk_{3})$

The implementation relies on the functional aspect of the Scala language and should be flexible enough to support any new future algorithm. The generic Runge-Kutta coefficients a(i), b(i) and c(i) are represented as a matrix: $\begin{vmatrix} c_{2}\,\,a_{21}\,\,0.0\,\,0.0\,\,...\,\,0.0\,\,0.0\\ c_{3}\,\,a_{31}\,\,a_{32}\,\,0.0\,\,...\,\,0.0\,\,0.0 \\ c_{4}\,\,a_{41}\,\,a_{42}\,\,a_{43}\,\,...\,\,0.0\,\,0.0\\ \\ c_{i}\,\,a_{i1}\,\,a_{i2}\,\,a_{i3}\,\,...\,\,\,\,...\,\,a_{ii-1}\\*\,\,b_{1}\,\,b_{2}\,\,b_{3}\,\,...\,\,...\,\,b_{i-1}\,\,b_{i} \end{vmatrix}$ In order to illustrate the flexibility of this implementation using Scala, I encapsulate the matrix of coefficients of the Euler, 3th order Runge-Kutta, 4th order Runge-Kutta and Felhberg methods using enumeration and case classes.

Note: For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted

Coefficients": Enumerators and case classes
Java developers, are familiar with enumerators as a data structure to list values without the need to instantiate an iterable collection.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 trait RungeKuttaCoefs { type COEFS = Array[Array[Double]] val EULER = Array( Array[Double](0.0, 1.0) ) // Coefficients for Runge-Kutta of order 3 val RK3 = Array( Array[Double](0.0, 0.0, 1/3, 0.0, 0,0), Array[Double](0.5, 0.5, 0.0, 2/3, 0.0), Array[Double](1.0, 0.0, -1.0, 0.0, 1/3) ) // Coefficients for Runge-Kutta of order 4 val RK4 = Array( Array[Double](0.0, 0.0, 1/6, 0.0, 0,0, 0.0), Array[Double](0.5, 0.5, 0.0, 1/3, 0.0, 0.0 ), Array[Double](0.5, 0.0, 0.5, 0.0, 1/3, 0.0), Array[Double](1.0, 0.0, 0.0, 1.0, 0.0, 1/6) ) // Coefficients for Runge-Kutta/Felberg of order 5 val FELBERG = Array( Array[Double](0.0, 0.0, 25/216, 0.0, 0.0, 0.0, 0.0, 0.0), Array[Double](0.25, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ), Array[Double](3/8, 3/32, 0.0, 0.0, 1408/2565, 0.0, 0.0, 0.0), Array[Double](12/13,1932/2197,-7200/2197, 7296/2197, 0.0,2197/4101,0.0,0.0), Array[Double](1.0, 439/216, -8.0, 3680/513, -845/4104, 0.0, -1/5, 0.0), Array[Double](0.5, -8/27, 2.0, -3544/2565, 1859/4104, -11/40, 0.0, 0.0) ) val rk = List[COEFS](EULER, RK3, RK4, FELBERG) } object RungeKuttaForms extends Enumeration with RungeKuttaCoefs{ type RungeKuttaForms = Value val Euler, Rk3, Rk4, Fehlberg = Value @inline final def getRk(value: Value): COEFS = rk(value.id) } 

The first step is to encapsulate the coefficients of the various versions of the Runge-Kutta formuler (lines 4 & 5), Runge-Kutta order 3 (lines 9 - 12), Runge-Kutta order 4 (lines 16 - 20) and Runge-Kutta-Felberg order 5 (lines 24 - 30) in an Scala enumerator.

The enumerator is at best not elegant. As you browse throught the code snippet above, it is clear that the design to wrap the matrices of coefficients with the enumerator is quite cumbersome. There is a better way: pattern matching Case classes could be used instead of the singleton enumeration. Setters or getters can optionally be added as in the example below.
The validation of the arguments of methods, exceptions and auxiliary method or variables are omitted for the sake of simplicity.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 trait RKMethods { type COEFS = Array[Array[Double]] def getRk(i: Int, j: Int): COEFS object class Euler extends RKMethods{ Array(Array[Double](0.0, 1.0)) override def getRk(i: Int, j: Int): COEFS {} } object class RK3 extends RKMethods { Array( Array[Double](0.0, 0.0, 1/3, 0.0, 0,0), Array[Double](0.5, 0.5, 0.0, 2/3, 0.0), Array[Double](1.0, 0.0, -1.0, 0.0, 1/3) ) override def getRk(i: Int, j: Int): COEFS {} } .... 

The first step is to encapsulate the coefficients of the various versions of the Runge-Kutta formuler (lines 4 & 5), Runge-Kutta order 3 (lines 9 - 12), Runge-Kutta order 4 (lines 16 - 20) and Runge-Kutta-Felberg order 5 (lines 24 - 30) in an Scala enumerator.

In this second approach, the values of the enumerator are replaced y Euler object (line 5), RK3 - Runge-Kutta order 3 (line 10).

Integration
The main class RungeKutta implements all the components necessary to resolve the ordinary differential equation. This simplified implementation relies on adjustable step for the integration.

  1 2 3 4 5 6 7 8 9 10 11 12 class RungeKutta( rungeKutta: RungeKuttaForm, adjustStep: (Double, AdjustParameters) => (Double), adjustParameters: AdjustParameters) { final class StepIntegration(val coefs: Array[Array[Double]]) {} def solve( xBegin: Double, xEnd: Double, derivative: (Double, Double) => Double): Double } 

The class RungeKutta has three arguments
• rungeKutta form or type of the Runge-Kutta formula (line 2)
• adjustStep Metric function to adjust the integration step, dx (line 3)<.li>
• adjustParameters Parameters used to compute the derivative (line 4)

The computation of the parameters to adjust the integration step, in the code snippet below, is rather simple. A more elaborate implementation would include several alternative formulas implemented as sealed case class

 1 2 3 4 5 6 7 8 case class AdjustParameters( maxDerivativeValue: Double = 0.01, minDerivativeValue: Double = 0.00001, gamma: Double = 1.0) { lazy val dx0 = 2.0*gamma/(maxDerivativeValue + minDerivativeValue) } 

The sum of the previous Ks value is computed through an inner loop. The outer loop computes all the values for k using the Runge-Kutta matrix coefficients for this particular method. The integration step is implemented as a tail recursion (lines 14 - 22). but an iterative methods using foldLeft can also be used. The tail recursion may not be as effective in this case because it is implemented as a closure: the method has to close on ks.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 final class StepIntegration(val coefs : Array[Array[Double]] ) { // Main routine def compute( x: Double, y: Double, dx: Double, derivative: (Double, Double) => Double): Double = { val ks = new Array[Double](coefs.length) // Tail recursion closure @scala.annotation.tailrec def compute(i: Int, k: Double, sum: Double): Double= { ks(i) = k val sumKs= (0 until i)./:(0.0)((s, j) => s + ks(j)*coefs(i)(j+1)) val newK = derivative(x + coefs(i)(0)*dx, y + sumKs*dx) if( i >= coefs.size) sum + newK*coefs(i)(i+2) else compute(i+1, newK, sum + newK*coefs(i)(i+2)) } dx*compute(0, 0.0, 0.0) } 

The next method implements the generic solver that iterates through the entire integration interval. As a matter of fact the solver is indeed implemented as a tail recursion (lines 8 - 20).
The accuracy of the solver depends on the value of the increment value dx as computed on line 17. We need to weight the accuracy provided by infinitesimal increment with its computation cost.  Ideally an adaptive algorithm that compute the value dx according the value dy/dx or delta would provide a good compromise between accuracy and cost. The recursion ends when the recursion value x reaches the end of the integration interval (line 14)..

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 def solve( xBegin: Double, xEnd: Double, derivative: (Double, Double) => Double): Double ={ val rungeKutta = new StepIntegration(rungeKuttaForm) @scala.annotation.tailrec def solve( x: Double, y: Double, dx: Double, sum: Double): Double = { val z = rungeKutta.compute(x, y, dx, derivative) if( x >= xEnd) sum + z else { val dx = adjustStep(z - y, adjustParameters) solve(x + dx, z, dx, sum+z) } } solve(xBegin, 0.0, adjustParameters.initial, 0.0) } 

The invocation of the solver is very straight forward and can be verified against the analytical solution.
The first step is to define the function that adjusts the integration step (lines 1, 10). This implementation uses the default adjust paramters (line 19) in the initialization of the solver (lines 16 - 19).

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 val adjustingStep = (diff: Double, adjustParams: AdjustParameters) => { val dx = Math.abs(diff)*adjustParams.dx0/adjustParams.gamma if( dx < adjustParams.minDerivativeValue) adjustParams.minDerivativeValue else if ( dx > adjustParams.maxDerivativeValue) adjustParams.maxDerivativeValue else dx } final val x0 = 0.0 final val xEnd = 2.0 val solver = new RungeKutta( Rk4, adjustingStep, AdjustParameters()) solver.solve( x0, xEnd, (x: Double, y: Double) => Math.exp(-x*x)) 

The family of explicit Runge-Kutta methods provides a good starting point to resolve ordinary differential equations. The current implementation could and possibly should be extended to support adaptive dx managed by a control loop using a reinforcement learning algorithm of a Kalman filter of just simple exponential moving average.

References

The Numerical Solution of Differential-Algebraic Systems by Runge-Kutta Methods - E. Hairer, C Lubich, M. Roche - Springer - 1989
Programming in Scala - M. Odersky, L. Spoon, B. Venners - Artima Press 2008
https://github.com/prnicolas

## Wednesday, June 5, 2013

Overview
I assume that the reader is either familiar with  the theory behind Categories, Functors & Monads. If not, one of my  older posts, Monads & Functors: Theory , should provide some understanding behind those concepts.
In the previous post we introduced a Monad as a structure or triple M = <T,eta,mu> on a category X consists of
- A map: applicative functor from category X to category Y)   T : X->Y
- A unit: natural transformation  eta: 1 -> T
- A join: multiplication or natural transformation mu: T*T -> T

Note: For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted

Let's implement these monadic operators in Scala for some collections.

trait Monad[M[_]]  {
def map[X,Y](f: X=>Y): M[X] => M[Y]
def unit[X](x: X): M[X]
def join[X](mu: M[M[X]]): M[X]
}


The map method implements the natural transformation, phi. The unit method create a target category from an element (i.e. Double -> List[Double]). The join method enforces the mu natural transformation.

Let's use the List<Int> structured introduced in the post related to the theory of Monads ( Monads & Functors: Theory).

val monadList = new Monad[List] {
override def map[X,Y](f: X=>Y): List[X] => List[Y]=
(xs: List[X]) => xs.map(f)
override def unit[X](x: X): List[X] = x :: Nil
override def join[X](xs: List[List[X]]): List[X] = xs.flatten
}


The class MonadList is a wrapper around the List Monad. Therefore it is easy to implement all those 3 methods using the method of scala.collection.immutable.List class:
• map: build a new list by applying the function f to all elements of the orginal list: x -> x*x => List(.. x ..) -> List( .. x*x ...)
• :: nill: create a single element list
•  flatten: Converts this list of lists into a List formed by concatenating the element of all the contained lists.
Let's consider X, Y be the category (or type) Int. The Monad can be applied to list of Integers

val xs = monadList.map((n: Int) => n * n)
xs(List(4, 11, 6)).foreach( println )

val xss : List[List[Int]] = List( List(3,5,6), List(11,34,12,66))


In the example above, the execution of the first foreach method will print '16, 121, 36' while the second foreach invocation generate the sequence '3,5,6,11,34,12,66'.
The same methodology is applied to immutable sequences by implementing the generic Monad trait.

import scala.collection.immutable.Seq

override def map[X,Y](f: X=>Y): Seq[X] => Seq[Y] =
(_x: Seq[X]) => _x.map(f)
override def unit[X](x: X): Seq[X] = Seq[X](x)
override def join[X](__x: Seq[Seq[X]]): Seq[X] = __x.flatten
}


The implementation of the monad for immutable sequence is very similar to the monad for immutable lists: the map method relies on the Seq.map method and the join method flattens a 2-dimensional sequence into a single sequence

flatMap
The Scala standard libraries uses monads for collections, options and exceptions handling. The standard library uses a slightly different but equivalent methods to implement the three basic functionality of a monad.
• flatMap uses the transformation f: T -> M[T] instead of the "flattening" join

trait _Monad[M[_]] {
def map[T, U](m: M[T])(f: T =>U): M[U] =
flatMap(m)((t: T) => apply(f(t)))
def apply[T](t: T): M[T]
def flatMap[T, U](m: M[T])(f: T =>M[U]): M[U]
}


Let's use the Monad template above, to create a monad for time series. A time series of type TS is defined as a sequence of indexed observations (Obs. An observation has an index (or sequence ordering) and a value of type T.
The monad can be defined as an implicit class.

case class Obs[T](val t: Int, val features: T)
case class TS[T](val data: List[Obs[T]])

def apply(t: T): TS[T] = TS[T](List[Obs[T]](Obs[T](0, t)))
def map[U](f: T => U): TS[U] =
TS[U](ts.data.map(obs => Obs[U](obs.t, f(obs.features))))
def flatMap[U](f: T =>TS[U]): TS[U] =
TS[U]( (ts.data.map(obs => f(obs.features).data)).flatten)
}


The monad is ready for transforming time series by applying the implicit conversion of a time series of type TS to its monadic representation.

val obsList = List.tabulate(10)(new Obs(_, Random.nextDouble))
val ts = new TS[Double](obsList)

val newTs = ts.map( _*2.0)


For-Comprehension
Like many other functional languages, Scala embellish the syntax (sugar coated) . The Scala language combines join and unit methods to produce the Monad external shape method map and flatMap method as defined as
• def map(f: A => B): M[B]
• def flatMap(f: A => M[B]): M[B]
• map applies a natural transformation of the content structure
• flatMap composes the Monad with an operation f to generate another Monad instance of the same type.
The syntax to implement the following sequence of operations of concatenation of 3 arrays can be expressed using the methods map -> flatMap associated with the Scala collections (List, Array, Map...)

val sum2 = array1 flatMap { x =>
array2 flatMap { y =>
array3 map { z => x+y+z }
}
}


or using the For-Yield idiom, which is easier to write and understand.

val sum : Array[Int] = for {
x <- array1
y <- array2
z <- array3
} yield x+y+


References