Pattern matching
Slice and dice is defined as the process of breaking something down (for example, information) into smaller parts to examine and understand it. You can get more information about slice and dice at:
http://dictionary.reference.com/browse/slice-and-dice
Let's see this technique in action. We will try to count the number of elements in List
. There is already a length method defined on Lists:
scala> List(1,2,3).length res0: Int = 3
Rolling out one of our own list teaches us something:
object Count extends App { def count(list: List[Int]): Int = list match { case Nil => 0 // 1 case head :: tail => 1 + count(tail) // 2 } val l = List(1,2,3,4,5) println(count(l)) // prints 5 }
The preceding code counts the number of elements in the list. We match the list against two possible cases, which are as follows:
- The base case: The list is
Nil
, and an empty list matchesNil
as it has zero elements, we return0
. - The general case: This is a list that has one or more elements. A list with at least one element (head) plus possibly more elements (tail), we very well could have none. We don't know (as yet). So, we call the same method recursively with the tail. Here is the process shown pictorially:
Figure 3.2: Losing head at every iteration
Note how we've left head
aside in the preceding figure. We have taken the head value into account; in this case, we've incremented the count by 1
. When we call the method recursively, we have one less element to process. This losing of the head and reducing finally land us with the case 1—letting us terminate the processing eventually. We are iterating the list and visiting each node, albeit in a different way. Here, we don't use any mutation. There are no variables (no var
keyword) used as loop counters. By avoiding loop counters (which would have to be var), recursion promotes immutability. Immutability is a boon when we write concurrent code, as we will soon see in the following chapters.
Deconstruction with case statements
Here is how we de-structure a list to get at the first element of the list. The following case clause splits the list into the first element (the head) and the rest of the list (the tail):
case head :: tail => 1 + count(tail)
These case matches when it is matched against a list with at least one element. Open the REPL and type the following code:
scala> val head :: tail = List(1, 2, 3, 4) // 1 head: Int = 1 tail: List[Int] = List(2, 3, 4) scala> val (x,y) = (7, 10) // 2 x: Int = 7 y: Int = 10
The salient points for the preceding code are as follows:
- Deconstructs a list into its head,
1
, and its tail,List(2, 3, 4)
. - Deconstructs a pair into its constituent values—assigns
7
tox
and10
toy
And the case clause using underscore is as shown:
scala> List(1, 2, 3, 4) match { | case head :: tail => println(head) // 3 | case _ => println("Nothing") | }
Note
The case clause with just an underscore. We use it as an unnamed variable, just a placeholder.
The preceding command will print 1
.
The ::
symbol is a List extractor (refer to results in a call to the unapply
method of the ::
object. You did read it right. The ::
symbol is a case class and its companion object is::
. Like the apply
method that we saw earlier, the Scala compiler will call unapply
in a pattern matching expression. When we define a case class, we get both the apply
and unapply
methods written for us:
scala> case class MyClass (x: Int, y: Int) defined class MyClass scala> val p = MyClass(1, 2); p: MyClass = MyClass(1,2) scala> p match { | case MyClass(a, b) => println(s"a=$a, b=$b") // 1 | } a=1, b=2 scala> p match { | case a MyClass b => println(s"a=$a, b=$b") // 2 | } a=1, b=2
At the part in the code labeled as 2
, the extractor expression is written in the infix form:
scala> val p = List(1, 2, 3, 4) p: List[Int] = List(1, 2, 3, 4) scala> p match { | case ::(head, tail) => println(s"head=$head, tail=$tail") // 1 | case _ => println("What's up???") | } head=1, tail=List(2, 3, 4)
The part of code labeled as 1
, the unapply
method of ::
is called. The statement is just rewritten in an infix notation as follows:
case head :: tail => println(s"head=$head, tail=$tail")
Stack overflows
Our recursive solution works, however there is a problem waiting to strike us. The code works fine for small lists with a few elements. Let's stress test it with a big list that has 20000
elements:
- Call the count method as shown in the following code:
val l = 1 to 20000 // A range object count(l.toList) // Converts the range into a list
- Run the code, and you will get the java.lang.StackOverflowError error. The problem here is the recursive call 1 + count (tail).
Each intermediate context is remembered on a stack frame. The intermediate context here is Get me a count of the tail, and add one to it. How many such intermediate contexts are there? I think you already guessed it right, they are equal to the number of elements in a list.
In other words, the numbers of contexts to remember are proportional to n. In algorithmic sense, these are equal to O(n). So for this example list, we need 20,000 stack frames for a list that has 20,000 elements; so, we need these many stack frames. The system usually cannot allocate these many. Hence, the routine looks broken.
Now, what good is the technique if it does not work for large lists? We are in a logjam, as you can see in the following figure:
Figure 3.3: An example of stack overflow