Friday, April 15, 2016

Managing Spark context in ScalaTest

Target audience: Beginner
Estimated reading time: 10'


This post describes a methodology to manage the Spark context while testing you application using ScalaTest


Introduction
Debugging Apache Spark application using ScalaTest seems quite simple when dealing with a single test:
  • Specify your Spark context configuration SparkConf
  • Create the Spark context
  • Add test code related to your application
  • Clean up resources (Spark context, Akka context, File handles...
However, there are cases when you may need to create and close the spark context used across multiple test or on a subset of the tests. The challenge is to make sure that the Spark context is actually close when it is no longer needed.
This post introduces two basic
ScalaTest methods beforeAll and afterAll of the trait BeforeAndAfterAll to manage the context life-cyle of your test application.


Wrapping the Spark context
The objective is to create a small framework that create or retrieve an existing Spark context before executing a test and closing it after the test is completed, independently of the status of the test.
The initialization of the Spark context consist of specifying the configuration for the sequence of tests. Some of these parameters can be dynamically defined through a simple parameterization. The context is created only if it is not already defined within the scope of the test using the getOrCreate method. 

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
trait SparkContextWrapper {
  protected[this] var sc: SparkContext = _

  def getContext: SparkContext = {
    val conf = new SparkConf().setAppName(s"Test-App")
      .set("spark.executor.instances", "2")
      .set("spark.driver.memory", "2g")
      .set("spark.executor.memory", "2g")
      .set("spark.executor.cores", "2")
      .set("spark.serializer", 
              "org.apache.spark.serializer.KryoSerializer")
      .set("spark.rdd.compress", "true")
      .set("spark.shuffle.spill", "true")
      .set("spark.shuffle.spill.compress", "true")
      .set("spark.shuffle.memoryFraction", "0.4")
      .set("spark.io.compression.codec", "snappy")
      .set("spark.network.timeout", "600")

    sc = SparkContext.getOrCreate(conf.setMaster("local[4]"))
    sc.setLogLevel("ERROR")
    sc
  }

  def close: Unit = {
    if (sc != null) sc.stop
  }
}

The solution is to let ScalaTest method manage the lifecycle of the Spark context for testing. 

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
trait SparkContextForTesting 
     extends SparkContextWrapper with BeforeAndAfterAll { 
  self: Suite =>

  override def beforeAll: Unit = {
    getContext
    super.beforeAll
  }

  override def afterAll: Unit = {
    close
    super.afterAll
  }
}

Note: The BeforeAndAfterAll trait can only be sub-classed by a test suite inheriting the Suite trait. The method beforeAll (line: 5) is automatically called before the first test of your test suite and the method afterAll (line 10)is called after your last test is completed, whether those tests succeeded or not

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
class MyTest 
  extends FlatSpec 
    with Matchers 
      with SparkContextForTesting {

 val x: Int = -4
 it should "My first test" in {
   val sqlContext = new SQLContext(sc) 
   // ....
 }
   // ... other tests

 it should "My last test" in {
   // clean up after it completes
 }
}

As they say at the end of the movie, "That's all folks!

14 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Good blog for the people who are seeking information about the technology.
    Awesome work keep it up. Simply superb.
    machine learning course
    machine learning certification
    machine learning training
    machine learning training course

    ReplyDelete
  3. Really very nice blog information for this one and more technical skills are improve,i like that kind of post.

    Guest posting sites
    Technology

    ReplyDelete
  4. A very awesome blog post. We are really grateful for your blog post. You will find a lot of approaches after visiting your post. KISS English Youtube

    ReplyDelete
  5. I have a hard time describing my thoughts on content, but I really felt I should here. Your article is really great. I like the way you wrote this information. watch video

    ReplyDelete
  6. Nice information thank you,if you want more information please visit our link machine learning online course

    ReplyDelete
  7. this is really good post here. Thanks for taking the time to post such valuable information. Quality content is what always gets the visitors coming.
    IELTS Coaching in chennai

    German Classes in Chennai

    GRE Coaching Classes in Chennai

    TOEFL Coaching in Chennai

    spoken english classes in chennai | Communication training

    ReplyDelete
  8. Movie-watching websites that are more than movie-watching websites Because we are the number 1 free movie site in Thailand for a long time, including new movies, Thai movies, Western movies, Asian movies, we have all kinds of ways for you Including new series Full of all stories without interstitial ads to keep annoying anymore. One place sa-movie.com.

    Android and IOS operating systems. Watch online movies, Thai movies, Western movies, Asian movies, Cartoon movies, Netflix Movie, Action Movies, Comedy Movies, Crime Movies, Drama Movies, Horror Movies, Adventure Movies, Crash Movies and still have many new movies to watch. You can watch for free anytime, anywhere 24 hours a day at see4k.com.


    GangManga read manga, read manga, read manga online for free, fast loading, clear images in HD quality, all titles, anywhere, anytime, on mobile, tablet, computer. Android and IOS operating systems. Read top comics, action dramas, comedy, adventure, horror and manga. New coming every day to watch many more. Can be read for free anytime anywhere 24 hours a day at gangmanga.com..

    It is no secret that football is among the most popular and widely watched sports. Everybody who likes football tries to find the best platform for free soccer streaming. So, what are the best free sports streaming sites? We are going to answer this question. On this page, you can find a detailed overview of the most widespread soccer streaming websites. Keep on reading and make the best choice for you live24th.me.

    ReplyDelete