## Sunday, August 16, 2015

### Fisher-Yates shuffle in Scala

Overview
The generation of a sequence of randomly ordered elements from an existing sequence is a common problem. How can you be sure if the new sequence of items is actually randomly generated.

Fisher-Yates
Let's look at the implementation of Fisher-Yates

def fisherYates[T](t: mutable.Seq[T]): mutable.Seq[T] = {
require(t.size > 0, "Undefined argument")

setSeed(System.currentTimeMillis)

(0 until t.size).foreach (n => {
val randomIdx = n + nextInt(t.size-n)
val tmp = t(randomIdx)
t.update(randomIdx, t(n))
t(n) = tmp
})
t
}


The Fisher-Yates algorithm executes as an iteration with the following two steps:
• Select a random item in the remaining sequence
• Swap the randomly selected item with the current item

Fast Fisher-Yates implementation
The performance of the Fisher-Yates algorithm can be improved by implementing the swapping of the randomly selected item with the current time in-place using bit manipulation operators.

def fisherYates2(seq: mutable.Seq[Int]): IndexedSeq[Int] = {
require(seq.size > 0, "Undefined argument")

setSeed(System.currentTimeMillis)
(0 until seq.size).map(i => {
var randomIdx: Int = i + nextInt(seq.size-i)
seq(randomIdx) ^= seq(i)
seq(i) = seq(randomIdx) ^ seq(i)
seq(randomIdx) ^= (seq(i))
seq(i)
})
}


The following code snippet tests the two implementations of the Fisher-Yates shuffling algorithm.

val input = Array[Char]('a', 't', '4', '&', '9', '2', 'z', 'y', 'r', '8', '*', '?')
println(fisherYates[Char](input).mkString(","))
// t,&,*,2,a,r,4,z,?,y,8,9
println(fisherYates2(Array.tabulate(34)(n => n)).mkString(","))
// 9,24,11,29,1,31,28,20,6,18,23,0,13,17,22,0,27,
// 4,16,7,25,14,30,32,5,12,8,19,21,0,0,33,26,0

Scala standard library provides developers with a shuffling method: Random.shuffle.

Reference

## Thursday, July 9, 2015

### Fibonacci galore

Overview
There are many ways to skin a cat and implement the Fibonacci recursion. This post illustrates the power of tail recursion in Scala, relative to alternative solution such as recursion without tail elimination and iterations.

Comparative performance analysis
The fibonacci formula is defined as
f(n) = f(n-1) + f(n-2)
f(0) = 0
f(1) = 1
Let's look at text book recursive implementation.

def fibonacci(n: Int): Long =  {
if(n == 0) 0
else if(n == 1) 1
else fibonacci1(n-1) + fibonacci1(n-2)
}


The duration of execution is recorded for values between 2 and 50, run 10,000 times.

Let's look at one example of implementation of the Fibonacci formula using the Scala tail recursion.

def fibonacci(n: Int): Long = {
@tailrec
def _fibonacci(i: Int, next: Long, sum: Long): Long = {
if(i ==0) sum
else fibonacci(i-1, sum+ next, next)
}
fibonacci(n, 1, 0)
}


The performance of the tail recursion is measure against a plain-vanilla iterative execution. The duration of execution is recorded for values between 5 and 200, run 100,000 times.

def fibonacci(n: Int): Long = {
var s1 = 0
var s2 = 1
(n-1 to 0 by -1).foreach(_ => {
val tmp = s2 + s1
s1 = s2
s2 = tmp
})
s1
}


The following chart illustrates that the performance of the implementation of Fibonacci using the tail recursion and the iterative execution is almost identical.

For good measure, I thought it would be interesting to measure the performance of an iterative approach using a fold.

def fibonacci(n: Int): Long =
(n-1 to 0 by -1)./:((0, 1)){ case ((s1, s2), n) => {
val tmp = s2 + s1
(s2, tmp)
}}._1


Note that the fold can be implemented as a closure using a recursive formula for the function argument.

Summary
The Scala tail recursion is indeed as effective as the iterative implementation at processing the Fibonacci formula. The performance of the recursion without tail elimination is extremely poor. Aggregators or reducers such as fold, reduce and scan add some overhead to the computation although the performance is still decent.

References
Tail-recursion Basics in Scala Old-Fashioned Software Development Blog - M. Malone 2008
Programming in Scala M. Odersky, L. Spoon, B. Venners - Artima 2010

## Friday, June 19, 2015

### From view to context bound

Overview
There have been some discussion of deprecating view bound in Scala 2.12. Binding a parameterized type T to a predefined and known type using view (i.e. implicit function conversion) is quite convenient. I have used (or even abuse) this construct in writing machine leaning-based applications.
There are few options for replacing view bound class parameter types with any kind of ad-hoc polymophism techniques. I chose to convert my legacy code by binding the class parameter type to a context instead of a view.

Converting class parameter type binding
First, let's look at binding the parameterized type, T of a class Classifier to the primitive type Double. The binding is actually implemented by defining an implicit conversion function f: T => Double as described in the following code snippet.

def sqr(x: Double): Double = x*x

class Classifier[T <: AnyVal](
xt: Vector[Array[T]],
yt: Vector[Double])
(implicit f: T => Double) {

def loss(weights: Array[T]): Double = {
xt.zip(yt).map{ case(x, y) =>
y - x.zip(weights).map{ case(_x, w)
=> _x*w}.sum }.map(sqr(_)).sum
}


In this example, the observed features and the model parameters weights have the same type: array of element T. The input time series xt is a vector of observations and the labels (or expected outcome) yt is a vector of single variable observation of type Double. The implicit conversion apply to the features and weights.
Scala provides developers with shorter alternative notation as follows:

class Classifier[T <% Double](
xt: Vector[Array[T]],
yt: Vector[Double]) { }


Replacing the binding of the class parameters for many classes would be tedious and error prone. One solution is to create a template or pattern that can be applied to several section of the code base.
The process of binding the type T of a class parameter to another known parameter consists of defining a value (or context) Feature for the type T. The context binding is an implicit value conversion T => Feature[T].

trait Feature[T]{ def apply(x: T): Array[Double] }

class Classifier[T : Feature](
xt: Vector[T],
yt: Vector[Double]) {

def loss(weights: T): Double = {
xt.zip(yt).map{case(x, y) => {
val mapper: Feature[T] = implicitly[Feature[T]]
y - mapper(x).zip(mapper(weights)).map{ case(x, w) =>x*w}.sum
}}.map(sqr(_)).sum
}
}


The Feature trait defines the context for the conversion which is actually implemented by the apply method. The view binding implicit function T => Double defined in the first implementation of the Classifier class is replaced by the mapping (or conversion) method: Feature.apply: T => Array[Double]
The mapping function mapper is implicitly declared using the implicitly.

The notation T: Feature specifies the implicit value, feature: It is equivalent to the more descriptive declaration of the Classifier class:

class Classifier[T](
xt: Vector[T],
yt: Vector[Double])(implicit feature: Feature[T])


A class parameterized type T cannot be bound to a primitive type P using a context: the developer has to define either an implicit view (T => P) or a context (or wrapper) W for the type T => W[P]

References
"Scala for the Impatient" - Chap 12: Implicits: C Horstman - Addison-Wesley - 2012
What are Scala context and view bounds?

## Wednesday, April 15, 2015

### Recursive Viterbi algorithm for hidden Markov model

Overview
A significant number of algorithms used in machine learning relies on dynamic programming and hidden Markov model (HMM) is no exception. Scala tail elimination represents an excellent alternative to the traditional iterative implementation of the 3 canonical forms of the HMM.

Introduction to HMM
Markov processes, and more specifically HMM, are commonly used in speech recognition, language translation, and text classification, document tagging, data compression and decoding.
A HMM algorithm uses 3 key components

• A set of observations
• A sequence of hidden states
• A model that maximizes the joint probability of the observations and hidden states, known as the Lambda model
HMM are usually visualized as lattice of states with transition and observations.
There are 3 use cases (or canonical forms) of the HMM.
• Evaluation: Evaluate the probability of a given sequence of observations, given a model
• Training: Identify (or learn) a model given a set of observations
• Decoding Estimate the state sequence with the highest probability to generate a given as set of observations and a model
The last use case, Decoding, is implemented numerically using the Viterbi algorithm.

Viterbi recursive implementation
Given a sequence of states {qt} and sequence of observations {oj}, the probability δt(i) for any sequence to have the highest probability path for the first T observations is defined for the state Si.
• Delta: sequence to have the highest probability path for the first i observations is defined for a specific test δt(i)
• Qstar: the optimum sequence q* of states Q0:T-1
The state of the Viterbi/HMM computation regarding the state transition and emission matrices is defined in the HMMState class

final protected class HMMState(val lambda: HMMLambda, val maxIters: Int) {

// Matrix of elements (t, i) that defines the highest probability of
// a single path of t observations reaching state S(i)
val delta = Matrix[Double](lambda.getT, lambda.getN)

// Auxiliary matrix of indices that maximize the probability of a
//given sequence of states
val psi = Matrix[Int](lambda.getT, lambda.getN)

// Singleton to compute the sequence Q* of states with the highest
// probability given a sequence of observations.
object QStar {
private val qStar = Array.fill(lambda.getT)(0)

// Update Q* the optimum sequence of state using backtracking..
def update(t: Int, index: Int): Unit
...
}
}


The algorithm is conveniently illustrated by the following diagram.

First, Let's create the key member and method for the Lambda model for HMM. The model is defined as a tuple of the transition probability matrix A, emission probability matrix B and the initial probability π

final protected class HMMLambda(
val A: Matrix[Double],
val B: Matrix[Double],
var pi: Array[Double],
val numObs: Int) {

// Number of observations
@inline def getT: Int = numObs

// Number of states for a sequence of observations
@inline def getN: Int = A.nRows

// Number of unique symbols (problem dimension) u
@inline def getM: Int = B.nCols

// Compute a new estimate of the log of the conditional
// probabilities for a given iteration.
def estimate(state: HMMState, obs: Array[Int]): Unit

// Normalize the state transition matrix A, emission matrix B
// and the initial probabilities pi.
def normalize: Unit
}


Then let's define the basic components for implementing the Viterbi algorithm. The class ViterbiPath encapsulates the member variables critical to the recursion.
The Viterbi algorithm is fully defined with

• lambda: Lambda model defined above
• state: State of the computation
• obsIndx: Index of observed states

class ViterbiPath(lambda: HMMLambda, state: HMMState, obsIndx: Array[Int]) {
val maxDelta = recurse(lambda.config._T, 0)
…
}


The recursive method, recurse that implements the formula or steps defined earlier. The method relies on the tail recursion. Tail recursion or tail elimination algorithm avoid the creation of a new stack frame for each invocation, improving the performance of the entire recursion.
@scala.annotation.tailrec
private def recurse(t: Int): Double = {

// Initialization of the delta value, return -1.0 in case of error
if( t == 0)
initial

// then for the subsequent observations ...
else {
// Update the maximum delta value and its state index for the observation t
Range(0, lambda.getN).foreach( updateMaxDelta(t, _) )

// If we reached the last observation... exit by backtracing the
// computation of the
if( t ==  obs.size-1) {
val idxMaxDelta = Range(0, lambda.getN).map(i =>
(i, state.delta(t, i))).maxBy(_._2)

// Update the Q* value with the index that maximize the delta.A
state.QStar.update(t+1, idxMaxDelta._1)
idxMaxDelta._2
}
else
recurse(t+1)
}
}


Once initialized (1), the maximum value of delta, maxDelta, is computed recursively by applying the formula at each state, s using the Scala maxBy method. (2). Next, the index of the column of the transition matrix A corresponding to the maximum of delta is computed (3). The last step is to update the matrix psi (4) (resp. delta (5)). Once the step t reaches the maximum number of observation labels (6), the optimum sequence of state q* is computed (7).

References
Scala for machine learning Packt Publishing 2014
Pattern Recognition and Machine Learning; Chap 13 Sequential Data/Hidden Markov Models C. Bishop Springer 2009

## Sunday, March 29, 2015

### Contravariant functors for vector fields

Overview
Most of Scala developers have some experience with the core tenets of functional programming: monads, functors and applicatives. Those concepts are not specific to Scala or even functional programming at large. There are elements of a field in Mathematics known as topology or algebraic topology.
Differential geometry or differential topology makes heavy use of tensors that leverage covariant and contravariant functors.
This post introduces the concepts of

• Contravariant functors applied to co-vectors and differential forms
• Projection of higher kind

Vector fields 101
Let's consider a 3 dimension Euclidean space with basis vector {ei} and a vector field V (f1, f2, f3) [Note: we follow Einstein tensor indices convention]

The vector field at the point P(x,y,z) as the tuple (f1(x,y,z), f2(x,y,z), f3(x,y,z)).
The vector over a field of k dimension field can be formally. mathematically defined as
$f: \boldsymbol{x} \,\, \epsilon \,\,\, \mathbb{R}^{k} \mapsto \mathbb{R} \\ f(\mathbf{x})=\sum_{i=1}^{n}{f^{i}}(\mathbf{x}).\mathbf{e}^{i}$ Example: $f(x,y,z) = 2x+z^{3}\boldsymbol{\mathbf{\overrightarrow{i}}} + xy+e^{-y}-z^{2}\boldsymbol{\mathbf{\overrightarrow{j}}} + \frac{x^{3}}{y}\boldsymbol{\mathbf{\overrightarrow{k}}}$
Now, let's consider the same vector V with a second reference (origin O' and basis vector e'i

$f(\mathbf{x})=\sum_{i=1}^{n}{f'_{i}}(\mathbf{x}).\mathbf{e'}_{i}$
The transformation matrix Sij convert the coordinates value functions fi and f'i. The tuple f =(fi) or more accurately defined as (fi) is the co-vector field for the vector field V
$S_{ij}: \begin{Vmatrix} f^{1} \\ f^{2} \\ f^{3} \end{Vmatrix} \mapsto \begin{Vmatrix} f'^{1} \\ f'^{2} \\ f'^{3} \end{Vmatrix}$
The scalar product of the co-vector f' and vector v(f) defined as is defined as
$< f',v> = \sum f'_{i}.f^{i}$
Given the scalar product we can define the co-vector field f' as a linear map
$\alpha (v) = < f',v> (1)$
Covariant functors
I assume the reader has basic understanding of Functor and Monads. Here is short overview:
A category C is composed of object x and morphism f defined as
$C= \{ {x_{i}, f \in C | f: x_{i} \mapsto x_{j}} \}$ A functor F is a map between two categories C and D that preserves the mapping.
$x\in C \Rightarrow F(x)\in D \\ x, y\in C \,\,\, F: x \mapsto y => F(x)\mapsto F(y)$
Let's look at the definition of a functor in Scala with the "preserving" mapping method, map

trait Functor[M[_]] {
def map[U, V](m: M[U])(f: U => V): M[V]
}


Let's define the functor for a vector (or tensor) field. A vector field is defined as a sequence or list of fields (i.e. values or function values).

type VField[U] = List[U]

trait VField_Ftor extends Functor[VField] {
override def map[U, V](vu: VField[U])(f: U => V): VField[V] = vu.map(f)
}


This particular implementation relies on the fact that List is a category with its own functor. The next step is to define the implicit class conversion VField[U] => Functor[VField[U]] so the map method is automatically invoked for each VField instance.

implicit class vField2Functor[U](vu: VField[U]) extends VField_Ftor {
final def map[V](f: U => V): VField[V] = super.map(vu)(f)
}


By default Covariant Functors (which preserve mapping) are known simply as Functors. Let's look at the case of Covector fields.

Contravariant functors
A Contravariant functor is a map between two categories that reverses the mapping of morphisms.
$x, y\in C \,\,\, F: x \mapsto y => F(y)\mapsto F(x)$

trait CoFunctor[M[_]] {
def map[U, V](m: M[U])(f: V => U): M[V]
}


The map method of the Cofunctor implements the relation M[V->U] => M[U]->M[V]
Let's implement a co-vector field using a contravariant functor. The definition (1) describes a linear map between a vector V over a field X to the scalar product V*: V => T.
A morphism on the category V* consists of a morphism of V => T or V => _ where V is a vector field and T or _ is a scalar function value.

type CoField[V, T] = Function1[V, T]


The co-vector field type, CoField is parameterized on the vector field type V which is a input or function parameter. Therefore the functor has to be contravariant.
The higher kind type M[_] takes a single type as parameter (i.e. M[V]) but a co-vector field requires two types:
• V: Vector field
• T: The scalar function is that the result of the inner product <.>
Fortunately the contravariant functor CoField_Ftor associated with the co-vector needs to be parameterized only with the vector field V. The solution is to pre-defined (or 'fix') the scalar type T using a higher kind projector for the type L[V] => CoField[V, T]
T => ({type L[X] = CoField[X,T]})#L

trait CoField_Ftor[T] extends CoFunctor[({type L[X] = CoField[X,T]})#L ] {
override def map[U,V](vu: CoField[U,T])(f: V => U): CoField[V,T] =
(v: V) => vu(f(v))
}


As you can see the morphism over the type V on the category CoField is defined as f: V => U instead of f: U => V. A kind parameterized on the return type (Function1) would require the 'default' (covariant) functor. Once again, we define an implicit class to convert a co-vector field, of type CoField to its functor, CoField2Ftor

implicit class CoField2Ftor[U,T](vu: CoField[U,T]) extends CoField_Ftor[T] {
final def map[V](f: V => U): CoField[V,T] = super.map(vu)(f)
}


Evaluation
Let's consider a field of function values FuncD of two dimension: v(x,y) = f1(x,y).i + f2(x,y.j. The Vector field VField is defined as a list of two function values.
type DVector = Array[Double]
type FuncD = Function1[DVector, Double]
type VFieldD = VField[FuncD]


The vector is computed by assigning a vector field to a specific point (P(1.5, 2.0). The functor is applied to the vector field, vField to generate a new vector field vField2

val f1: FuncD = new FuncD((x: DVector) => x(0)*x(1))
val f2: FuncD = new FuncD((x: DVector) => -2.0*x(1)*x(1))

val vfield: VFieldD = List[FuncD](f1, f2)
val value: DVector = Array[Double](1.5, 2.0)
val vField2: VFieldD = vfield.map( _*(4.0))


A co-vector field, coField is computed as the sum of the fields (function values). The product of co-vector field and vector field (scalar field product) is simply computed by applying the co-vector (linear map) to the vector field. Once defined, the morphism _morphism is used to generate a new co-vector field through the contravariant function,

val coField: CoField[VFieldD, FuncD] = (vf: VFieldD) => vf(0) + vf(1)
val contraFtor: CoField2Functor[VFieldD, FuncD] = coField

val product = coField(vField)
val _morphism: VFieldD => VFieldD = (vf: VFieldD) => vf.map( _*(3.0))
val coField2 = contraFtor.map( _morphism )

val newProduct: FuncD = coField2(coField)


References
• Tensor Analysis on Manifolds - R. Bishop, S. Goldberg - Dover publication 1980
• Differential Geometric Structures - W. Poor - Dover publications 2007
• Functors and Natural Transformationsv- A. Tarlecki - Category Theory 2014

## Wednesday, March 4, 2015

### F-bounded type polymorphism

Overview
F-Bounded Type polymorphism or Bounded Quantification Polymorphism is parametric type polymorphism that constrains the subtypes to themselves using bounds.
Let's consider the following "classification" problem:

How can we guarantee that the SharpPencils bucket contains only sharp pencils, not small pencils or erasers?

Type Polymorphism
The first attempt to solve the problem is to rely on parametric type polymorphism. To this purpose, we create a Pencils trait sub-classed by as a bucket for sharp pencils, SharpPencils and a bucket for small pencils, SmallPencils.
For the sake of simplification, we assume that Pencils defines only 2 methods: add and pop pencils.

trait Pencils[T] {
private lazy val content = new mutable.ListBuffer[T]

def add(t: T): Unit = content.append(t)
def pop: List[T] = (content -= data.head).toList
}

class SharpPencils extends Pencils[SharpPencils]
class SmallPencils extends Pencils[SmallPencils]


This implementation does not guarantee that SharpPencils is the only bucket that contains the sharp pencils. Another bucket can be created with sharp pencils, too.

class OtherPencils extends Bucket[SharpPencils]


The solution is to specify constraints (or bounds) on the type of elements contained in each bucket.

Bounded Polymorphism
The goal is to make sure that the bucket of specific type (or containing a specific type of pencils). The first step is to make sure that a bucket does not contain other items than a Pencils

trait Pencils[T <: Pencils[T]] {
private lazy val content = new mutable.ListBuffer[T]

def add(t: T): Unit = content.append(t)
def pop: List[T] = (content -= data.head).toList
}


This implementation resolve the limitation raised in the previous paragraph. However there is nothing that prevent SharpPencils to contain small pencils and SmallPencils to contain sharp pencils, as illustrated in the next code snippet.

 // Won't compile!
class SharpPencils extends Pencils[Eraser]

// The following code compiles!
class SharpPencils extends Pencils[SmallPencils]
class SmallPencils extends Pencils[SharpPencils]


Self-referential Polymorphism
As a matter of fact, the most effective constraint on a inherited type is the self reference that list the type allows for the method to execute.

trait Pencils[T <: Pencils[T]] {
self: =>
private lazy val content = new mutable.ListBuffer[T]
def add(t: T): Unit = content.append(t)
def pop: List[T] = (content -= data.head).toList
}

// The following code fails to compile!
class SharpPencils extends Pencils[SmallPencils]


The declaration of SharpPencils as containing small pencils fails to compile because it violates the self-referential type restriction.

Reference
Getting F-Bounded Polymorphism into Shape B. Greenman, F. Muehlboeck, R. Tate - Cornell University