Type Classes in Scala

Type classes are an idea originally from Haskell that you can implement in Scala via implicit parameters. This is not new or obscure, but the technique is so useful that another post just demonstrating the basics can’t hurt. I will keep this brief — I have a lot to say about type classes, but first the fundamentals need to be extremely solid.

Let’s get the imports out of the way.

import org.scalacheck._
import org.scalacheck.Prop._
import org.scalacheck.Gen._

Suppose I create a type such as a major-minor version scheme, or some such. It doesn’t really matter for this example.

case class Version(major: Int, minor: Int)

And suppose I want to serialize and deserialize it. To do this with a type class, I create a trait Serialize[T] that provides the ability to serialize/deserialize any value of type T. Implicit values implementing this trait are instances of the type class. (I prefix the methods with underscore to avoid name conflicts later, and this is not going to bother anyone because you should never call these methods directly. This is an entirely optional stylistic choice.)

trait Serialize[T] {
  def _serialize(v:T) : String
  def _deserialize(s:String) : Option[T] // May fail. In a robust implementation, we'd return Either[ReasonItFailed, T]
}
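
The comment on _deserialize hints at a richer failure type. As a rough sketch only, such a variant might look like the following. SerializeE, DeserializeError and EitherSketch are hypothetical names chosen for illustration, and the error sits on the Left by the usual Either convention.

```scala
// Hypothetical variant of the trait above; none of these names come from
// the original code.
object EitherSketch {
  // Illustrative error type: the offending input plus a short reason.
  final case class DeserializeError(input: String, reason: String)

  trait SerializeE[T] {
    def _serialize(v: T): String
    // Left carries the failure, Right the decoded value.
    def _deserialize(s: String): Either[DeserializeError, T]
  }

  implicit object serializeEInt extends SerializeE[Int] {
    def _serialize(v: Int): String = v.toString
    def _deserialize(s: String): Either[DeserializeError, Int] =
      try Right(s.toInt)
      catch {
        // Only the parse failure is translated; anything else propagates.
        case _: NumberFormatException =>
          Left(DeserializeError(s, "not a valid Int"))
      }
  }
}
```

The rest of this post sticks with Option to keep the instances short.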

Next, I define functions that utilize this trait. These are the methods of the type class. A typical place for them is in the companion object. I also include a ScalaCheck property that should hold of any instance of the type class.

object Serialize {
  def serialize[T](v: T)(implicit instance: Serialize[T]) =
    instance._serialize(v)
  
  def deserialize[T](s: String)(implicit instance: Serialize[T]) =
    instance._deserialize(s)

  def deserializeSerialize[T](v: T)(implicit instance: Serialize[T]) : Prop
    = (deserialize(serialize(v)) ?= Some(v))
}

Now the type class is defined and I move on to instantiating it. In this case, I start by defining instances for standard types to build up to my own; this is common. Often instances for the standard types will also be shipped with the type class in the same companion object, but here they are in a totally separate module (if you are a seasoned OO programmer, this should already blow your mind a little). I’m not actually trying to make this robust, as you will generally want to go through an intermediate format such as JSON or XML which is more trivially compositional.

object SerializeInstances {
  import Serialize._

  implicit object serializeString extends Serialize[String] {
    def _serialize(v: String) = v
    def _deserialize(s: String) = Some(s)
  }

  implicit object serializeInt extends Serialize[Int] {
    def _serialize(v: Int) = v.toString
    def _deserialize(s: String) =
      try { Some(s.toInt) } catch { case _: NumberFormatException => None }
  }

  implicit def serializeTuple2[A, B]
    (implicit l: Serialize[A], r: Serialize[B]) = new Serialize[(A, B)] {

      def escape(s: String) = s.replaceAll("&", "&amp;")
                               .replaceAll(",", "&comma;")

      def unescape(s: String) = s.replaceAll("&comma;", ",")
                                 .replaceAll("&amp;", "&")

      def _serialize(v: (A, B)) = 
        escape(serialize(v._1)) + "," + escape(serialize(v._2))

      def _deserialize(s: String) =
        s.split(",", -1).toList match {
          case leftS :: rightS :: Nil => {
            for(left <- deserialize[A](unescape(leftS));
                right <- deserialize[B](unescape(rightS)))
            yield (left, right)
          }
          case _ => None
        }
    }
}
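
For comparison, here is a minimal sketch of the other arrangement mentioned above, where an instance ships in the type class's companion object. The trait is repeated so the snippet stands alone, and CompanionScopeSketch is a hypothetical wrapper name. The point is that the companion is part of the implicit scope of Serialize[Int], so callers find the instance without importing it.

```scala
object CompanionScopeSketch {
  trait Serialize[T] {
    def _serialize(v: T): String
    def _deserialize(s: String): Option[T]
  }

  object Serialize {
    def serialize[T](v: T)(implicit instance: Serialize[T]): String =
      instance._serialize(v)

    def deserialize[T](s: String)(implicit instance: Serialize[T]): Option[T] =
      instance._deserialize(s)

    // Living in the companion puts this instance in the implicit scope of
    // Serialize[Int], so no extra import is needed to find it.
    implicit val serializeInt: Serialize[Int] = new Serialize[Int] {
      def _serialize(v: Int): String = v.toString
      def _deserialize(s: String): Option[Int] =
        scala.util.Try(s.toInt).toOption
    }
  }

  // Only the two functions are imported here, not the instance.
  import Serialize.{serialize, deserialize}

  val roundTrip: Option[Int] = deserialize[Int](serialize(42))
}
```

The trade-off is that a companion-object instance is always available, whereas an instance in a separate module such as SerializeInstances is opt-in via import.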

Now I do a quick check (oblique pun intended) that the property holds via the sbt console.

$ xsbt console

scala> import Serialize._
import Serialize._

scala> import SerializeInstances._
import SerializeInstances._

scala> serialize(3)
res0: String = 3

scala> serialize("foo")
res1: String = foo

scala> serialize( (3, (5, "hello")) )
res2: String = 3,5&comma;hello

scala> serialize( (("foo", 3), (7, 22)) )
res3: String = foo&comma;3,7&comma;22

scala> import org.scalacheck.Prop._
import org.scalacheck.Prop._

scala> forAll(deserializeSerialize[String] _) check
+ OK, passed 100 tests.

scala> forAll(deserializeSerialize[Int] _) check
+ OK, passed 100 tests.

scala> forAll(deserializeSerialize[(Int, String)] _) check
+ OK, passed 100 tests.

scala> forAll(deserializeSerialize[((String, Int), (Int, (Int, Int)))] _).check
+ OK, passed 100 tests.

In a real project, you probably want to use ScalaTest to manage a suite of tests. I tend to mix in Spec with Checkers.

Coming back to the Version type, we make a Serialize instance for it. To test it, we also need to make an instance of Arbitrary – ScalaCheck is a key example of a great library built on type classes. Note how in a brief and simple line I define a fuzz test generator (org.scalacheck.Gen[Version]) and wrap it into an Arbitrary[Version] instance.

object Version {
  import Serialize._
  import SerializeInstances._

  implicit def arbVersion =
    Arbitrary[Version] { resultOf(Version.apply _) }

  implicit object serializeVersion extends Serialize[Version] {
    def _serialize(v: Version) = Serialize.serialize((v.major, v.minor))
    def _deserialize(s: String) = {
      for((major, minor) <- deserialize[(Int, Int)](s))
      yield Version(major, minor)
    }
  }
}

And check the property for this instance:

$ xsbt console

...

scala> import Version._
import Version._

scala> forAll(deserializeSerialize[Version] _) check
+ OK, passed 100 tests.

There are a couple of reasonable interpretations of the definition / instantiation of type classes:

  1. A type class is an interface that a static type may implement. The implicit parameter is the method table.
  2. A type class is a property that may hold of a static type. The implicit parameter is how we get the Scala compiler to infer that the type has this property and provide evidence.

Both of the above make sense with the alternate syntax that Scala allows for implicit parameters used in this style: “def foo[T](implicit instance: Serialize[T])” may be shortened to “def foo[T: Serialize]”. In the latter syntax the instance has no name, so you cannot call its methods directly (short of summoning it with implicitly[Serialize[T]]); the context bound simply makes it possible to call serialize and deserialize by bringing the type class instance into scope.
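
A small sketch may make the two spellings concrete. The bracket* helpers and the ContextBoundSketch wrapper are hypothetical, and the trait is repeated with a single Int instance so the snippet stands alone. The last line shows the "method table" reading: the instance is an ordinary value that can be passed by hand.

```scala
object ContextBoundSketch {
  trait Serialize[T] {
    def _serialize(v: T): String
    def _deserialize(s: String): Option[T]
  }

  object Serialize {
    def serialize[T](v: T)(implicit instance: Serialize[T]): String =
      instance._serialize(v)

    implicit val serializeInt: Serialize[Int] = new Serialize[Int] {
      def _serialize(v: Int): String = v.toString
      def _deserialize(s: String): Option[Int] =
        scala.util.Try(s.toInt).toOption
    }
  }

  import Serialize.serialize

  // Long form: the instance has a name inside the method body.
  def bracketNamed[T](v: T)(implicit instance: Serialize[T]): String =
    "[" + instance._serialize(v) + "]"

  // Context-bound form: the instance is anonymous but still in implicit
  // scope, so serialize can find it.
  def bracketBound[T: Serialize](v: T): String =
    "[" + serialize(v) + "]"

  // implicitly summons the anonymous instance if it is needed by name.
  def bracketSummoned[T: Serialize](v: T): String =
    "[" + implicitly[Serialize[T]]._serialize(v) + "]"

  // The instance is a plain value, so it can also be supplied explicitly.
  val passedByHand: String = bracketNamed(7)(Serialize.serializeInt)
}
```

All three helpers produce the same string for the same input; they differ only in how the instance is referred to.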

There are a huge number of advantages to type classes over simply implementing a trait. Here are some easily tangible ones:

  1. The Version type did not have to know about the serialization or deserialization code. In many real-world scenarios the instance may actually be provided by a separate library. This is probably the number one reason to use type classes. They introduce a form of code decoupling and reuse that simply does not exist without them.
  2. The Scala compiler automatically derives serialization/deserialization for tuples. Inferring instances for lists, etc., is just as easy. In a type-class-based project, the compiler writes a ton of code for you.
  3. I was able to implement deserialize in the same type class as serialize. This cannot be done by implementing an interface as in “case class Version(...) extends Serializable”, because the method is selected by its return type, and it is obviously not an appropriate method of String! In a more Java-esque world, you tend to have a separate interface for a factory, which is extraneous in light of this. For the logically inclined: Java-style OO has an unfortunate asymmetry between introduction and elimination forms.
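
Points 2 and 3 can be seen together in a short sketch. It assumes a made-up wire format for Option ("N" for None, "S" followed by the payload for Some), and DerivedInstanceSketch and serializeOption are hypothetical names. The trait is repeated so the snippet stands alone.

```scala
object DerivedInstanceSketch {
  trait Serialize[T] {
    def _serialize(v: T): String
    def _deserialize(s: String): Option[T]
  }

  object Serialize {
    def serialize[T](v: T)(implicit instance: Serialize[T]): String =
      instance._serialize(v)

    def deserialize[T](s: String)(implicit instance: Serialize[T]): Option[T] =
      instance._deserialize(s)

    implicit val serializeInt: Serialize[Int] = new Serialize[Int] {
      def _serialize(v: Int): String = v.toString
      def _deserialize(s: String): Option[Int] =
        scala.util.Try(s.toInt).toOption
    }

    // Hypothetical format: "N" for None, "S" + payload for Some.
    // Given an instance for A, the compiler can build one for Option[A].
    implicit def serializeOption[A](
        implicit inner: Serialize[A]): Serialize[Option[A]] =
      new Serialize[Option[A]] {
        def _serialize(v: Option[A]): String = v match {
          case None    => "N"
          case Some(a) => "S" + inner._serialize(a)
        }
        def _deserialize(s: String): Option[Option[A]] =
          if (s == "N") Some(None)
          else if (s.startsWith("S"))
            inner._deserialize(s.drop(1)).map(a => Some(a))
          else None // unknown tag: deserialization fails
      }
  }

  import Serialize.{serialize, deserialize}

  // Point 2: the nested instance is assembled by the compiler.
  val nested: Option[Option[Int]] = Some(Some(3))
  val wire: String = serialize(nested) // "SS3"
  val back: Option[Option[Option[Int]]] =
    deserialize[Option[Option[Int]]](wire)

  // Point 3: same function, instance selected by the requested result type.
  val asInt: Option[Int] = deserialize[Int]("3")
  val asOpt: Option[Option[Int]] = deserialize[Option[Int]]("N")
}
```

Note that the outer Option in each result is the success/failure wrapper from deserialize, while any inner Option is the decoded value itself.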

As a side note, having the ScalaCheck property helps to avoid the black hole of defensive programming, where you carefully check for errors and unexpected results. Instead, you can enjoy what I like to call offensive programming (pun definitely intended) where you blithely assume your tests have covered all corner cases.

Should you like to see more examples and explanations:

  • An academic paper, perhaps the origin of the idea in Scala: Type Classes as Objects and Implicits by Oliveira, Moors, and Odersky.
  • ScalaCheck uses type classes heavily to make it easy to build test case generators.
  • Scalaz includes tons of type classes and instances, and much more.
  • json-scalaz uses JSONR / JSONW / JSON type classes, which I’m going to write about later.
  • scala-relaxng, authored by yours truly at Inkling, uses a pretty-printing type class to test the parser on arbitrary rnc schemas, somewhat like this post.
  • Countless blog posts. Search for them and enjoy!