Continuous Delivery for Scala with TravisCI

2016-10-02

Programming , Technology

scala

For many engineers - regardless of language - all the plumbing needed for setting up reliable continuous delivery (continuous release) is either tedious, or out of reach as the cost vs benefit equation just doesn’t make sense. For Scala users this frustration is often compounded because SBT is, for many users, black magic they simply don’t understand. Build setup and commands are often cargo cult’ed from one build to the next “because that’s what worked last time”. This is an unacceptable state of affairs.

A better way

If for no other reason that to reduce the burden on my team when it comes to open sourcing our software infrastructure, I have consolidated our open source builds into a straight forward and simple to use plugin that does all the things most users would want for releasing open source projects. This plugin is sbt-rig. Rig has the following goals:

All development should be done in the open and use Git SCM, and builds should take place on Travis (or internally hosted enterprise Travis instances). The use of Travis is important, because it mandates all build definitions are checked in, and that credentials are properly encrypted.
All pull requests should be built and tested, complete with code coverage to inform the reviewer in the best possible manner as to if this PR should merge or not.
Merging code to the master branch actually means something: any updates to master are considered release-worthy and code is automatically built and released as such.
All rebel leases should be tagged in Git and their artifacts should hosted on maven central.
sbt-rig should not interfere with anything outside of these bounds in a mandatory way. Users are free to configure their projects however they wish, and still gain this workflow. Likewise, sbt-rig should not interfere with any additional release or deployment workflow steps.
Users should not be manually setting the patch version number. Provided the version number is monotonically increasing, its actual version doesn’t matter. Users can focus on major and minor semantics, but patch numbers themselves don’t matter as the build should be both binary and semantically compatible (a la semver).

With these goals defined, lets move on to how one actually uses sbt-rig.

Administrivia

The setup of sbt-rig in your project is trivial, but there are a few steps of administration that you’ll have to go through first in order to publish to maven central:

Register your profile with maven central. This allows you to push to a given TLD, for example, my profile allows me to write artifacts for com.timperrett and all subordinate domains.
You’ll need to setup a gpg key and public ring. You’ll have to publish the public part of your key pair as well.

Thankfully these operations are a one-time thing, so its only tedious once :-D Be 100% sure to retain your credentials for maven central in a super safe place - you’ll need them for subsequent steps, shortly.

Project Setup

First and foremost, ensure that your project/build.properties file has the latest version of SBT configured:

sbt.version=0.13.12

Next, simply add the sbt-rig plugin by adding the following lines to your project/plugins.sbt:

addSbtPlugin("io.verizon.build" % "sbt-rig" % "1.1.20")

With that done, you need to decide what license you want to release your open source project as. If you’re not sure, i’d recommend you head over to opensource.org and do some reading. Personally, i’d recommend Apache 2.0, but you should use a license that makes sense for your company, team and project - seek legal advice if you are not sure what makes the most sense. Once you’ve decided on a license, you need to set some metadata about your project such that maven central constraints for POM files are satisfied. I typically do this as an autoplugin within a given project (this is not common, because sbt-rig does not want to make assumptions about the legal nature of your projects, or who contributes to them).

In project/CentralRequirementsPlugin.scala you can add the following:

package myproject

import sbt._, Keys._
import xerial.sbt.Sonatype.autoImport.sonatypeProfileName

object CentralRequirementsPlugin extends AutoPlugin {

  override def trigger = allRequirements

  override def requires = RigPlugin

  override lazy val projectSettings = Seq(
    sonatypeProfileName := "com.yourcompany.myproject",
    pomExtra in Global := {
      <developers>
        <developer>
          <id>your-github-id</id>
          <name>Your Name Goes Here</name>
          <url>http://github.com/your-github-id</url>
        </developer>
        <!-- add other developers as needed -->
      </developers>
    },
    licenses := Seq("Apache-2.0" -> url("https://www.apache.org/licenses/LICENSE-2.0.html")),
    homepage := Some(url("http://your-github-id.github.io/myproject/")),
    scmInfo := Some(ScmInfo(url("https://github.com/yourcompany/myproject"),
                                "git@github.com:yourcompany/myproject.git"))
  )
}

I know this might look a bit daunting - but I assure its not. All we’ve done here is let maven central know who it was developing this project, what license we are going to be releasing under (I strongly urge you to also place a LICENSE file in the root of your repository with the complete legal text of the chosen license), and also where users can both clone the source code, and where they can read more about the project (for the latter, I recommend github pages, but you’re free to host that where ever you want - future versions of sbt-rig will make using github pages stupidly simple).

Travis Setup

Configuring Travis is also straight forward and obvious, but if you’re not familiar, then I recommend reading the getting started documentation from travis themselves. Once familiar and you have enabled builds on your project, you are going to want a .travis.yml file that looks something like this:

language: scala
scala:
  - 2.11.7

jdk:
  - oraclejdk8

branches:
  only:
  - master

before_script:
  - "if [ $TRAVIS_PULL_REQUEST = 'false' ]; then git checkout -qf $TRAVIS_BRANCH; fi"

script:
  - |
    if [ $TRAVIS_PULL_REQUEST = 'false' ]; then
      sbt ++$TRAVIS_SCALA_VERSION 'release with-defaults'
    else
      sbt sbt ++$TRAVIS_SCALA_VERSION test
    fi
  - find $HOME/.sbt -name "*.lock" | xargs rm
  - find $HOME/.ivy2 -name "ivydata-*.properties" | xargs rm

cache:
  directories:
    - $HOME/.ivy2/cache
    - $HOME/.sbt/boot/scala-$TRAVIS_SCALA_VERSION

after_success:
  - find $HOME/.sbt -name "*.lock" | xargs rm
  - find $HOME/.ivy2 -name "ivydata-*.properties" | xargs rm

# we'll go over these missing values in a second.
env:
  global:
    - secure: "......"

For the most part, this is going to be super boilerplate for most projects and will barely change. There are however, a few key points to note:

This will enable Travis to build pull requests and submit feedback to Github
This build only targets a single version of Scala - we’ll cover more advanced uses in the next section.
The after_success and cache blocks will dramatically speed up your build. These instruct Travis to cache all the common JARs used by your project, stuff them in S3 and then download them before any subsequent builds, which avoids re-bootstrapping the ivy environment (a.k.a downloading the internet).
The branches section is important, as this prevents travis going into a build-loop and triggering builds for tags. If you do not do this, you will inadvertently find yourself DoS’ing the build server.

The main thing that is missing from this definition - and must be added once per-repository are the credentials for publishing to maven central. As-is, sbt-rig is expecting these to be set into the build environment, and to do this safely we can leverage travis’ support of variable encryption. You’ll need to this twice:

travis encrypt --add -r yourorg/yourrepo SONATYPE_USERNAME=xxxxxxx
travis encrypt --add -r yourorg/yourrepo SONATYPE_PASSWORD=zzzzzzz

These values are encrypted with a special RSA key (added by Travis to you repo) that is repository specific, and the private key is held by Travis, so essentially these are one-way encrypted from the perspective of the user, and now the secure values can be added to the env block of the build file (--add to the travis command is just a shortcut for this; feel free to do it manually if you want to preserve formatting).

That’s all there is too it for a single scala version build. The next section covers some extra credit items: cross-building multiple versions of Scala, cross building along arbitrary matrices, and adding code-coverage reporting.

Advanced Builds

Given that Scala is not binary compatible between feature versions, a really common use case is to build for both 2.10 and 2.11. To do this, simply set multiple versions of Scala in your .travis.yml:

language: scala
scala:
  - 2.10.5
  - 2.11.7

sbt-rig will automatically do the right thing, and create staging repositories on maven central for each discrete build job (meaning one build does not stomp on the other).

Another common, but much trickier use case is publishing multiple library versions that contain different dependencies and compile different sources, based on either a particular library version or language version (and combinations therein). For this, we can leverage the Travis build matrix feature. This is an extremely powerful tool, but one that is perfect for this job. Consider the following real example from the verizon/knobs project:

language: scala

matrix:
  include:
    - jdk: oraclejdk8
      scala: 2.10.5
      env: SCALAZ_STREAM_VERSION=0.7.3a
    - jdk: oraclejdk8
      scala: 2.10.5
      env: SCALAZ_STREAM_VERSION=0.8.1a
    - jdk: oraclejdk8
      scala: 2.11.8
      env: SCALAZ_STREAM_VERSION=0.8.1a
    - jdk: oraclejdk8
      scala: 2.11.8
      env: SCALAZ_STREAM_VERSION=0.7.3a

In this case, we need to build the library for both Scala 2.10, Scala 2.11 and then both of those language versions with different versions of Scalaz Stream (such that they rely on different and incompatible versions of Scalaz). By using the matrix in this manner, we can get all the combinations we need, and if needed, even exclude particular combinations to create a sparse matrix. Naturally this does not work right out of the box, because SBT needs to know which library versions to include. To do this, we simply embed a small AutoPlugin in the project directory of our repository:

package myproject

import sbt._, Keys._

object CrossLibraryPlugin extends AutoPlugin {

  object autoImport {
    val scalazStreamVersion = settingKey[String]("scalaz-stream version")
  }

  import autoImport._

  override def trigger = allRequirements

  override def requires = RigPlugin

  override lazy val projectSettings = Seq(
    scalazStreamVersion := {
      sys.env.get("SCALAZ_STREAM_VERSION").getOrElse("0.7.3a") // "0.8.1a" "0.7.3a"
    },
    unmanagedSourceDirectories in Compile += 
      (sourceDirectory in Compile).value / s"scalaz-stream-${scalazStreamVersion.value.take(3)}",
    version := {
      val suffix = if(scalazStreamVersion.value.startsWith("0.7")) "" else "a"
      val versionValue = version.value
      if(versionValue.endsWith("-SNAPSHOT"))
        versionValue.replaceAll("-SNAPSHOT", s"$suffix-SNAPSHOT")
      else s"$versionValue$suffix"
    }
  )
}

As is a community convention within the Scalaz ecosystem, we then end up with artifacts that have an a happened to the version. Naturally, you can make this work with any set of matrix vectors you want, simply by adding a new AutoPlugin that does the particular behavior you need - hopefully this gives you a sufficient template to customize it for your needs. AutoPlugin plus build matrices are awesome and can do pretty much anything.

Enjoy, fellow Scala users.

Footnote about Travis stability

After initially posting this article, I received questions about the instability of travis and how to handle flakey builds when travis-ci.org sporadically fails. That is a legitimate question, and one that doesn’t have an awesome answer. For our company open source projects, we do all the development, conversation, testing and such all in the open, and then we have a process that dynamically syncs the public repositories into our Github Enterprise, at which point our internal Travis Enterprise conducts the release build - clearly this is a heavy hammer, but given our internal systems are the same as the external systems this simply allows the release process to be deterministic, as our dedicated hardware is a lot more stable than the public travis-ci (where stability is critical for automated release builds when doing this kind of release strategy). In order to mitigate (where you don’t have your own build infra), you can either pay TravisCI for improved hosted service, or you can split-up your build such that you’re not consuming 4gb of RAM during testing or similar… most of the time builds are only killed because they ran out of space on their shared (containerized) build infrastructure. Hope that helps!