This is my Blog.

It's full of opinions on a wide range of stuff.

Unboxed new types within Scalaz7

Some time ago I started investigating the latest (and as yet unreleased) version of Scalaz. Version 7.x is markedly different to version 6.x, utilising a totally different design that makes a distinct split between core abstractions and the syntax to work with said abstractions. In any case, that’s fodder for another post; the bottom line is that Scalaz7 is really, really slick - I like the look of it a lot and I feel like there’s a lot to be learnt by simply studying the codebase and throwing around the abstractions therein (this may be less true for Haskell gurus, but for mere mortals like myself I’ve certainly found it insightful).

One of the really neat things that Scalaz7 makes use of is a very clever trick with types in order to disambiguate typeclass instances for a given type T. This was something Miles demonstrated some time ago in a gist, and I was intrigued to find it being used in Scalaz7.

So then, what the hell does all this mean you might be wondering? Well, consider a Monoid of type Int like so:

scala> import scalaz._, std.anyVal._
import scalaz._
import std.anyVal._

scala> Monoid[Int].append(10,20)
res1: Int = 30

…or with sugary syntax imported as well:

scala> import scalaz._, syntax.semigroup._, std.anyVal._
import scalaz._
import syntax.semigroup._
import std.anyVal._

scala> 10 |+| 20
res2: Int = 30

This is a simple use of the Monoid type class, and this example uses the default instance for Int, which simply sums two numbers together. But consider another case for Int: what if you needed to multiply those same numbers instead?

If there were two typeclass instances for Monoid[Int], unless they were in separate objects, packages or jars there would be an implicit ambiguity and the code would not compile: the compiler would not be able to determine implicitly which instance it should apply when considering an operation on monoids of Int.

This issue has been overcome in Scalaz7 by using the type tagging trick mentioned in the introduction. Specifically, the default behaviour for Monoid[Int] is addition, and then a second typeclass instance exists for Monoid[Int @@ Multiplication]. The @@ syntax probably looks a little funny, so let’s look at how it’s used and then talk in more detail about how that works:

scala> import Tags._
import Tags._

scala> Multiplication(2) |+| Multiplication(10)
res14: scalaz.package.@@[Int,scalaz.Tags.Multiplication] = 20

You’ll notice that the value was multiplied this time, giving the correct result of 20, but you may well be wondering about the resulting type signature of @@[Int,Multiplication]… this is where it gets interesting.

Multiplication just acts as a small bit of plumbing to produce a type of A, tagged with Multiplication; and this gives you the ability to define a type which is distinct from another - even if they are “the same” (so to speak). The definition of Multiplication is like so:

sealed trait Multiplication
def Multiplication[A](a: A): A @@ Multiplication = Tag[A, Multiplication](a)

object Tag {
  @inline def apply[A, T](a: A): A @@ T = a.asInstanceOf[A @@ T]
  ...
}

The Tag object simply tags values of type A with a specified marker T (in this case, Multiplication), where the result is A @@ T. I grant you, it looks weird to start with, but the definition of @@ is quite straightforward:

type @@[T, Tag] = T with Tagged[Tag]

Where Tagged in turn is a structural type:

type Tagged[T] = {type Tag = T}

The really clever thing here is that this whole setup exists only at the type level: the values are just Int in the underlying bytecode. That is to say, if we had something like:

def foobar(i: Int): Int @@ Multiplication = ... 

When compiled it would actually end up being:

def foobar(i: Int): Int = … 

Which is pretty awesome. This sort of strategy is obviously quite generic and there are a range of different tags within Scalaz7, including selecting the first and last operands of a monoid, zipping with applicatives, conjunctions and disjunctions etc. All very neat stuff!
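To make the trick a little more concrete, here is a minimal, self-contained sketch that needs no Scalaz on the classpath. It reuses the definitions of Tagged, @@ and Tag quoted above, but the Monoid trait, its two instances and the combine helper are cut-down stand-ins I have made up for illustration; they are not the Scalaz7 source. It also assumes a Scala 2 compiler.

```scala
object TaggedMonoidSketch {
  // Same type-level encoding as quoted in the post.
  type Tagged[T] = { type Tag = T }
  type @@[A, T] = A with Tagged[T]

  object Tag {
    // Purely a compile-time relabelling: the value itself is untouched.
    def apply[A, T](a: A): A @@ T = a.asInstanceOf[A @@ T]
  }

  sealed trait Multiplication
  def Multiplication[A](a: A): A @@ Multiplication = Tag[A, Multiplication](a)

  // Hypothetical, cut-down type class standing in for scalaz.Monoid.
  trait Monoid[A] {
    def zero: A
    def append(a: A, b: A): A
  }

  object Monoid {
    // Default instance for Int: addition.
    implicit val intAddition: Monoid[Int] = new Monoid[Int] {
      def zero: Int = 0
      def append(a: Int, b: Int): Int = a + b
    }

    // A second instance, selected only when the Int carries the Multiplication tag.
    implicit val intMultiplication: Monoid[Int @@ Multiplication] =
      new Monoid[Int @@ Multiplication] {
        def zero: Int @@ Multiplication = Multiplication(1)
        def append(a: Int @@ Multiplication, b: Int @@ Multiplication): Int @@ Multiplication =
          Multiplication((a: Int) * (b: Int))
      }
  }

  // Hypothetical helper playing the role of |+| in the REPL sessions above.
  def combine[A](a: A, b: A)(implicit m: Monoid[A]): A = m.append(a, b)

  def main(args: Array[String]): Unit = {
    println(combine(10, 20))                                // prints 30
    println(combine(Multiplication(2), Multiplication(10))) // prints 20
  }
}
```

Because Monoid is invariant in A, the compiler treats Monoid[Int] and Monoid[Int @@ Multiplication] as unrelated types, so the two implicit instances never compete with each other.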


An introduction to simpler concurrency abstractions

No matter what programming language you use to create “the next big thing™”, when it comes to running your code, there will – eventually – be a thread that has to execute and compute the result of your program. You may be wondering why you should even care about this threading lark? Well, modern computing hardware is typically not sporting faster clock speeds, but instead features multiple cores, or physical processors. If you have a single-threaded program, then you cannot make use of the abundant power that modern hardware makes available; surplus cores simply sit idle and unused. For desktop computers this is less of an issue, but for server-based applications, having wasted resources that you already paid for is quite an issue.

A common scenario that you are likely familiar with – either implicitly or explicitly – is for your program to run on a single thread. In this situation the program executes imperatively. For readers familiar with C or a scripting language like PHP or Ruby, this generally means the code runs top to bottom. Consider this trivial example:

// create a variable
var foo = 0

// loop and increment the var with each iteration
def doSomething(): Unit =
  for (i <- 1 to 10) {
    foo += 1
  }

// run the loop
doSomething()

// check the value of foo
foo

This kind of code, irrespective of the exact syntax, should be familiar to anyone who’s ever used a mainstream programming language. When this program is run with a single thread, the result will, as one might expect, be an integer value of 10. Seemingly straightforward.

Now, reconsider what would happen if you ran this same program on two concurrently executing threads that shared the same memory space. If you’ve never done any multi-threaded programming, the answer to this question may not be obvious: as each thread runs its own counting loop, both thread A and thread B will be setting the value of the foo variable. This will have some wacky side-effects, such as increments being silently lost, in that one thread will constantly be pulling the rug out from under the other’s feet. This is not constructive for either thread.

This rather unfortunate scenario has several “solutions” that are found in the majority of mainstream programming languages: one of these solutions is known as synchronisation. As you might have guessed from the name, the two concurrently executing threads are synchronised so that only one thread updates the foo variable at a time. In various programming languages, locks are often used as a synchronisation mechanism (along with derivatives like semaphores and reentrant locks). Whilst this article isn’t long enough to go into the details of all these things (and it’s a deep subject!), the general concept is that when a thread needs a shared resource, it locks it for its exclusive use whilst it does its business. For as long as thread A holds the lock, thread B (or indeed, thread n) is “blocked”, that is to say thread B is waiting on thread A and cannot do any work during that time. This gets awkward. Quickly.
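As a rough illustration of the locking approach, here is a small sketch using only the JVM’s built-in monitors. The SynchronisedCounter object and its countWith method are names invented for this example; the point is simply that every update to foo happens whilst holding the same lock.

```scala
object SynchronisedCounter {
  private var foo = 0
  private val lock = new Object

  // Only one thread at a time may run the body of this block.
  private def increment(): Unit = lock.synchronized { foo += 1 }

  // Start the given number of threads, each running the counting loop.
  def countWith(threads: Int, incrementsPerThread: Int): Int = {
    lock.synchronized { foo = 0 }
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = for (_ <- 1 to incrementsPerThread) increment()
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join()) // wait for every worker to finish
    lock.synchronized { foo }
  }

  def main(args: Array[String]): Unit =
    println(countWith(threads = 2, incrementsPerThread = 10)) // prints 20
}
```

The result is now predictable, but note the cost: whilst one thread is inside increment, the other can do nothing but wait.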

Hopefully this gives you a high-level appreciation for the issues associated with writing concurrent software. Even in a small application, concurrent operations could quickly become a practical nightmare (and often do), let alone in a large-scale distributed application that has both inter-process and inter-machine concurrency concerns. Writing software that makes effective and correct use of these traditional concurrency tools is very, very tricky.

Time for a re-think.

As it turns out, some clever folks realised back in the early seventies that manual threading and locking was a poor level of abstraction for writing concurrent software, and with that, invented the actor model. Actors essentially allow the developer to reason about concurrent operations without the need for explicit locking or thread management. Actors were later popularised by the Erlang programming language, which dates from the mid-eighties, where Erlang actors allowed telecoms companies such as Ericsson to build highly fault-tolerant, concurrent systems that achieve extreme levels of availability – famously achieving “nine nines”: 99.9999999% annual uptime. Erlang is still highly popular in the telco sector even today, and many of your phone calls, SMS and Facebook chat IMs all use Erlang actors as they wind their way across the interweb.

So what is the actor model? Well, primarily actors are a mechanism for encapsulating both state and behaviour into a single, consolidated item. Each actor within an application can only see its own state, and the only way to communicate with other actors is to send immutable “messages”. Unlike the earlier example in the introduction that involved shared state (the foo variable), and resulted in blocking synchronisation between threads, actors are inherently non-blocking. They are free to send messages to other actors asynchronously and continue working on something else - each actor is entirely independent.

In terms of the underlying actor implementation, actors have what is known as a “mailbox”. In the same way that a postman places letters through a physical mailbox, those letters collect on top of each other one by one, with the oldest letter received being at the bottom of the pile. Actors operate a similar mechanism: when an actor instance receives a message in its mailbox, if it’s not doing anything else it will action the message, but if it’s currently busy, the message just sits there until the actor gets to it. Likewise, if an actor has no messages in its mailbox, it will consume a very small amount of system resources, and won’t block any application threads: these qualities make actors a much easier abstraction to deal with and lend themselves to writing concurrent (and often distributed) software.
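To get a feel for the mailbox idea without any library at all, here is a deliberately tiny toy written against the JDK only. It is emphatically not how Akka is implemented: CounterActor, its ! method and stop are made up for this sketch, and a single-threaded executor stands in for the mailbox because it queues submitted work and runs it one item at a time, in order.

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical toy actor: private state plus a queue of pending messages.
final class CounterActor {
  // Only ever touched by the mailbox thread, so no locks are needed.
  private var count = 0

  // The executor's internal queue plays the role of the mailbox.
  private val mailbox = Executors.newSingleThreadExecutor()

  // Fire and forget: enqueue the message and return immediately.
  def !(msg: Any): Unit = mailbox.execute(new Runnable {
    def run(): Unit = msg match {
      case i: Int => count += i
      case _      => () // unknown messages are ignored
    }
  })

  // Enqueue a final request behind all pending messages, then shut down.
  def stop(): Int = {
    val result = mailbox.submit(new Callable[Int] { def call(): Int = count })
    val total = result.get()
    mailbox.shutdown()
    total
  }
}

object ToyActorDemo {
  def main(args: Array[String]): Unit = {
    val counter = new CounterActor
    counter ! 1
    counter ! 41
    counter ! "not a number"
    println(counter.stop()) // prints 42
  }
}
```

The sender never waits for a message to be processed; only stop blocks, and only because the demo wants to read the final total.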

Actors by example

Enough chatter, let’s look at an example of using actors. The samples below make use of a toolkit called Akka, which is an actor and concurrency toolkit for both Scala and the wider polyglot JVM ecosystem. I’ll be using Scala here, which is a hybrid object-functional programming language. In short that means that immutable values are favoured, and applications are typically constructed from small building blocks (functions). Scala also sports a range of language features that allow for concise, accurate programming not available in many other languages. However, the principles of these samples can be easily replicated both in imperative languages like C# or Java, and also in languages such as Erlang.

The most basic actor implementation one could make with Akka would look like this:

import akka.actor._

class MyActor extends Actor {
  def receive = {
    case i: Int => println("received an integer!")
    case _      => println("received unknown message")
  }
}

This receive method (technically a partial function) defines the actor’s behaviour. That is to say, the actions the actor will take upon receipt of different messages. With the receiving behaviour defined, let’s send this actor a message:

import akka.actor._

// boot the actor system
val system = ActorSystem("MySystem")
val sample = system.actorOf(Props[MyActor])

// send the actor message asynchronously 
sample ! 1234

Besides the bootstrapping, the ! method (called “bang”) is used to indicate a fire-and-forget (async) message send to the sample actor instance, which, as you saw from the earlier code listing, will output a message to the console. Clearly there are more interesting things to do than print messages to the console window, but you get the idea. At no point did the developer have to specify any concrete details about threading, locking or other finer details. Akka allows you fine control over thread allocation if you want it, but relatively default settings will see you through to about 20 million messages a second, with Akka being able to reach 50 million messages a second with some light configuration. This is staggering performance for a single commodity server. With all that being said, this example is of course very trivial, so let’s talk about what else actors (and Akka) can do…

It turns out that the conceptual model of actors is a very convenient fit for a range of problems commonly found in business, and Akka ships with a range of tools that make solving these problems in a robust, and correct way nice and simple. Whilst this blog post is too short to talk about all of Akka’s features, one that is no doubt of interest to most readers is fault tolerance.

Supervisor Hierarchies

Think about the way many people write programs: how many times have you written a call to an external system, or to the database, and just assumed that it would work? Such practices are widespread, and Akka is entirely based around the opposite idea: failure is embraced, and assumed to happen. With this frame of reference one can design systems that can recover from general faults gracefully, rather than having no sensible provision for error. One of the main strategies for providing such functionality is something known as Supervisor Hierarchies. As the name suggests, actors can be “supervised” by another actor that takes action in the case of failure. There are a couple of different strategies in Akka that facilitate this structure:

  • All for one: If the supervisor is monitoring several actors, all of those actors under the supervisor are restarted in the event that one has a failure.
  • One for one: If the supervisor is monitoring multiple actors, when one actor has a failure, it will be restarted in isolation, and the other actors will be unaware that the failure occurred.

Simple enough. Let’s see supervision in action:

import akka.actor.Actor

class Example extends Actor {
  import akka.actor.OneForOneStrategy
  import akka.actor.SupervisorStrategy._
  import akka.util.duration._ 
  
  override val supervisorStrategy = OneForOneStrategy(
    maxNrOfRetries = 10, withinTimeRange = 1 minute){
      case _: NullPointerException => Resume
      case _: Exception => Restart
    }
  
  def receive = {
    case msg => ....
  }
}

You can see from the listing that the structure of the actor is exactly the same as the trivial example earlier – it’s just augmented with a customisation of the supervisorStrategy. This customisation allows you to specify different behaviour upon different errors arising. This is exceedingly convenient, and very easy to implement.
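If it helps to see the two ideas in isolation (which failure maps to which action, and which children that action touches), here is a library-free sketch. Everything in it is a hypothetical stand-in: the real directives come from akka.actor.SupervisorStrategy as imported in the listing above, and Akka’s own bookkeeping is far richer than a Set of names.

```scala
object SupervisionSketch {
  // Stand-ins for the directives a supervisor can choose from.
  sealed trait Directive
  case object Resume extends Directive
  case object Restart extends Directive
  case object Escalate extends Directive

  // Stand-ins for the two strategies described above.
  sealed trait Strategy
  case object OneForOne extends Strategy
  case object AllForOne extends Strategy

  // Mirrors the decider in the listing: pick a directive per failure type.
  def decide(failure: Throwable): Directive = failure match {
    case _: NullPointerException => Resume
    case _: Exception            => Restart
    case _                       => Escalate // anything else goes to the parent
  }

  // Which children the chosen directive is applied to under each strategy.
  def affected(strategy: Strategy, failed: String, children: Set[String]): Set[String] =
    strategy match {
      case OneForOne => Set(failed)
      case AllForOne => children
    }

  def main(args: Array[String]): Unit = {
    val children = Set("db", "mailer", "cache")
    println(decide(new IllegalStateException("boom")))
    println(affected(OneForOne, "db", children))
    println(affected(AllForOne, "db", children))
  }
}
```

The order of the cases in decide matters: NullPointerException is itself an Exception, so it has to be matched first.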

Conclusion

The actor model is a simple abstraction that facilitates easier development of robust, concurrent software. This article only scratches the surface of what’s available, but there is a lot of cutting edge work being done right now in the area of concurrency and distributed systems. If this article has piqued your interest, there has never been a better time to get your hands dirty and try something new!


HTML5 Event Sources with Scala Spray

With the onset of HTML5, modern browsers (read: everyone but IE) implemented the W3C’s Server-Sent Events feature. This essentially allows you to push events from server to client without any additional hacks like long polling, or additional protocols like Web Sockets. This is exceedingly neat as it allows you to stream events with a very tiny API and minimal effort on the part of the developer.

Recently I’ve been doing a fair bit with the Scala Spray toolkit when building HTTP services, so I thought I’d take it for a spin and see how trivial it would be to implement SSE. Turns out it was pretty simple. Here’s a demo:

As mentioned in the demo video, most of the SSE tutorials you’ll find online don’t talk about streaming events, they just talk about having an event source HTTP resource from which the browser will consume events. The massive omission here is that if the server just closes the connection like any regular request, then you essentially have to wait for the retry period to pass before the browser will automatically reconnect. This is not really that great as it just gives you a more convenient API for polling (how 90s!). On the one hand, if you have a low volume of events, or a bit of lag between updates doesn’t hurt, then it may not be an issue, but in the main I’d imagine that most people wanting a stream of events would want exactly that, a stream. This is achieved by chunking the response server-side: each chunk flushed to the client with SSE’s mandatory “data:” prefix is then handed to the browser as an event, giving you a handy stream of data. Of course, you’ll have to close that connection at some point, but if you set the retry latency to a low value then you’ll just move from stream to stream with very little lag (as per the example video).
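For reference, the wire format being described is tiny. The sketch below only builds the text that would be written into each chunk; it does not touch Spray at all, and SseFrames, dataFrame and retryFrame are names I have invented for illustration. The field names “data” and “retry” and the blank line that terminates each event come from the Server-Sent Events specification.

```scala
object SseFrames {
  // One "data:" line per line of payload, then a blank line to end the event.
  def dataFrame(payload: String): String =
    payload.split("\n", -1).map(line => s"data: $line").mkString("", "\n", "\n\n")

  // Tells the browser how many milliseconds to wait before reconnecting.
  def retryFrame(millis: Long): String = s"retry: $millis\n\n"

  def main(args: Array[String]): Unit = {
    print(retryFrame(500))
    print(dataFrame("first event"))
    print(dataFrame("second event\nwith two lines"))
  }
}
```

The response itself would need to be served with the text/event-stream content type for the browser’s EventSource to accept it.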

In the broad sense, this kind of approach is useful in a wide set of circumstances to improve the user experience. I’ve rambled on about this before, but now with SSE there is a generic way of achieving it without needing a framework or platform that explicitly makes comet-style programming easy: it’s broadly accessible for all toolkits.

The code for the example application I demoed in the video can be found on github. Enjoy.


Using dust.js with Scala SBT

If you’ve been living under a rock for the past few weeks you might have missed LinkedIn blogging about their usage of dust.js. This idea of client-side templating is certainly a very interesting one, and something that is in all likelihood very useful for many large-scale applications currently in production. dust.js is quite nice in that it has no dependencies on any JavaScript library such as jQuery or YUI. This is really rather handy as it means no prescription on your style of implementation.

However, whilst the dust.js documentation has a wide range of examples demonstrating how to use the templating engine, what it does not have is any examples of how one might actually wrap it up and use it. So, with that I wanted to just write up a quick post to explain some of the things that might not be immediately obvious. Before getting to the meat of the article, it’s worth bearing in mind some terminology:

  • Template: The source dust.js template

  • Compiled template: A pre-compiled version of the template; essentially some generated JavaScript that is created at build-time (it’s also possible to create the JavaScript at runtime, but this is heavily discouraged)

  • Context: Each template can only be rendered with a context. More regularly we can think of this as the data that populates the template variables

In short, the developer writes a source template which is then compiled to JavaScript at build time and then rendered at runtime with a given context, which is likely a dynamic JSON payload.

When I first got going with dust.js I was unsure how to handle this build-time compilation. Frankly, the support for everything that isn’t JavaScript or Node was (and is still mostly) non-existent. This was problematic as I certainly didn’t want to be compiling templates on each and every request for a given HTML page. With that, I set about writing an SBT plugin for dust.js template compilation. This then allows you to have SBT build the dust.js templates into JavaScript that you can simply reference directly from your HTML (more on this later). The plugin itself takes templates it finds in src/main/dust and compiles them to the resource managed target whenever you invoke the dust task from the SBT shell. It’s also super easy to redirect the compilation output to the webapp directory during development so that you can simply use the local Jetty server to try out your app:

(resourceManaged in (Compile, DustKeys.dust)) <<= (
      sourceDirectory in Compile){
    _ / "webapp" / "js" / "view"
}  

But I digress. For more information about the plugin, see its README file in the github repo.

After making my own tooling for SBT to work with dust.js, I then got to thinking about how one would actually wrap the dust.js rendering system. Curiously, the examples I found online suggest making an AJAX call to fetch the context, or template data. This strikes me as quite sub-optimal as it would mean creating the following requests:

  • fetching the general html for the page in the first instance
  • another request for the dust.js runtime
  • another request for the dust template
  • and then another for the context data

All that, just to render one template and add unnecessary lag to the page load whilst the ancillary request for context data takes place (after the initial page load has already completed!). For general page rendering, this seems somewhat bizarre. That is to say, requesting context data via AJAX for say, a single page application could well be a reasonable use-case, but for general pages that render “in full”, it seems silly.

Assuming you’re just rendering regular page content, we can make the separation between things one would want to load and cache locally in the browser, and things that need to be dynamic. For the most part, the context is the only thing you’d want to be dynamic and the more general plumbing code for rendering would probably be reusable throughout your rendering call sites. Consider a simple sample:

The dust template (let’s call it home.dust in src/main/dust):

<div>
  <h2>Hey {name}</h2>
</div>  

This template will then be compiled to home.js and placed in the resource managed output location (as per your build config). The other thing to think about is that if you had several pre-compiled template files needed on a given page, it would make sense to merge and minify them into a single JS file. Alternatively, if you were building a single-page application it might be nice to use a client-side loading framework like LABjs to pull in the template files lazily (i.e. as you needed them). Getting back to the example, and considering the aforementioned, let’s assume an HTML page that loads the DOM skeleton and the static libraries needed for rendering:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  <script type="text/javascript" src="js/dust.min.js"></script>
  <script type="text/javascript" src="js/view/home.js"></script>
</head>
<body>
  <h1>Home</h1>
</body>
</html>

Now, as it stands this page would display the heading “Home” and load the dust library along with the compiled template; otherwise it is essentially static. In order to render the dust template one would have to add something like the following to the page <head>:

<script type="text/javascript">
//<![CDATA[
dust.render("home", { name: "Fred" }, function(err, out) {
  document.getElementById('whatever').innerHTML = out;
});
//]]>
</script>

This simple bit of JS would render the content into the element with the “whatever” ID attribute (the page above has no such element, so you would need to add one to the body and make sure the script runs after it exists). Now then, this will work, and prove the concept for sure. What it does not do however is provide a dynamic context, which is almost certainly what most users would want. In this way, I can see why people might think it makes sense to make an AJAX request to a pure JSON backend service to get the context dynamically, but as above, this would be a bit slow and cumbersome. Instead, I think it would probably make more sense to dynamically deliver a call to a rendering function with the desired context. Let’s qualify that: when the page loads, the template has been loaded and so has the dust library, so all that is actually required is to have a call site that starts the rendering. I think this need can quite nicely be served by a server-side helper function. In essence, when the page is loaded, the server generates a <script> tag. Here’s an example:

<body>
  ...
  <script type="text/javascript" src="/view/home"></script>
</body>

Which in turn would deliver something like:

draw("home", { name: "Whatever" });

That is, where the draw function wraps the dust rendering system. This is just an example though as the point is that this could be any kind of JavaScript. For instance, it could be a backbone.js view or something along those lines: you deliver the rendering code along with the context. This at least seems more optimal than making a fuckton of JavaScript requests for context objects as other blogs suggest.
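As a very rough idea of what such a server-side helper might look like, here is a sketch that just builds the JavaScript text to be served from /view/home. DrawCall, render and the hand-rolled quoting are all assumptions made for this example; a real application would lean on a proper JSON library and whatever web toolkit is serving the response, and would support richer contexts than flat string pairs.

```scala
object DrawCall {
  // Escape the handful of characters that would break a JavaScript string.
  private def escapeChar(c: Char): String = c match {
    case '"'  => "\\\""
    case '\\' => "\\\\"
    case '\n' => "\\n"
    case other => other.toString
  }

  // Minimal string quoting; a stand-in for a real JSON encoder.
  private def quote(s: String): String =
    "\"" + s.toList.map(escapeChar).mkString + "\""

  // Build the call that the browser will execute once the script is loaded.
  def render(template: String, context: Seq[(String, String)]): String = {
    val fields = context.map { case (key, value) => s"${quote(key)}: ${quote(value)}" }
    s"draw(${quote(template)}, { ${fields.mkString(", ")} });"
  }

  def main(args: Array[String]): Unit =
    println(render("home", Seq("name" -> "Fred")))
}
```

Here the draw function is the same hypothetical client-side wrapper around dust.render mentioned above; the server’s only job is to fill in the template name and context.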

In the next post I’ll be showing you how to combine all the latest hipster web programming tools like dust.js, CoffeeScript and Less with a Scala backend using Unfiltered to deliver the dynamic template contexts.


The Year in Photos, 2011

This is a fairly personal post, so I’m sorry if you are a regular reader of the Scala articles here - those will resume from the next post!

The Year’s Events

As is becoming customary, I like to try and review the year nearly past as we look forward to the delights of the next. This blog is the second in a yearly series to make a conscious effort to note what the year was made up of and recall those awesome memories made.

Alpe d’Huez, France

Most years I try to get some skiing in with friends, and this year we’d decided to go to France rather than my beloved Italy. Whilst the snow could have been better, we had a great time. I took this picture at the top of the highest gondola lift…

Bathonian Sunsets

I’m fortunate to live in a beautiful area of the UK, and spring time is quite idyllic here. As the winter thaw gives way to things growing and sprouting, the sun once again starts to bathe the evenings with glorious sunsets. This is one such evening looking out over the garden.

QCon London, Scala Community Dinner

March saw QCon return to London for a whole week and bring with it a herd of geeks from all over the globe. Many of my friends were visiting for the conference, and with that we decided that it would be a good idea to get the Scala folks together for some food, wine and all-round good times. Having people like James and Debasish there was awesome fun. Can’t wait to do something similar next year during Scala Days. James, we’re still waiting for “scala minus minus” ;-)

California Route 1: Pacific Highway

For anyone who’s been reading this blog over the past years, you may recall that last year I was stranded in mainland Europe during the volcano eruption whilst attending ScalaDays 2010. After rounding up a bunch of stranded folks for dinner, I ended up meeting Jon-Anders - making a long story short, we became good friends and one day decided that what we should do is drive the length of the Pacific Highway in a convertible Mustang. After a hastily organised flight to America we were all set. What ensued was an utterly hilarious time away where we had arranged nothing in advance and just took things as they came. From huge trees in Redwood country…

…to the winding roads and stunning coastline of Big Sur and southern California:

The trip was a complete riot and finishing it off with spending time at Xerox PARC was a real high for me. It was literally like walking the halls of technology fame.

Early Summer Thistles

Getting back from Cali was somewhat of a come down, but I often go for hikes in and around where I live and was out one day and noticed a small set of thistles just growing on this open exposed hill face. Pretty nice, especially with the city backdrop.

Canada, Upstate New York and Niagara Falls

In summer I was sent to Canada for a meeting (technically just across the border in Upstate New York) but this was a pretty awesome experience. Having booked a “Small European Saloon Car” before traveling, me and my colleague were surprised to be presented with a 4.0 litre land cruiser. Clearly “small” has some definition in Canada that I’m not familiar with. Anyway… it was fun to drive around in a tank truck for a few days and enjoy the delights of Toronto and New York scenery:

Of course, being in the area it only made sense to swing by Niagara Falls on the way back to the airport. It was my first time here and damn, it’s far, far bigger than you might imagine. It really is a stunning natural spectacle.

End of Summer: Harvest Time

Back in England, the end of the summer is yet another lovely time. The UK was drenched in sunshine for the last few days (weeks?) of August, which meant all the farms nearby were busy making the most of it, harvesting their crops and preparing the land for winter. Being out on a walk I picked up this picture just after the field had been harvested:

JavaZone and M.I.T Arctic

Come September it was time once again for the most awesome conference of the yearly calendar: JavaZone! I love this conference; it’s always well organised, has great talks and really wicked parties. This year was no exception to that rule and proved to again be most excellent. This year we were also joined by some of our American friends, which just added to the delight. Following the conference, some of us headed north from Oslo to the M.I.T Arctic lab for a few days of geekery and walking in the tundra. This was a really special highlight of the year and felt like a somewhat magical trip for everyone. Here’s a small sample of what we got up to (not all pictures taken by me, credit where appropriate):

(photo credit: Andreas Røe)

(photo credit: Andreas Røe)

And don’t forget the somewhat successful fishing in the fjords!!

It really was an amazing trip. If you’ve never visited Norway, you should; it’s a stunning country.

Lift in Action goes to production

We’re nearly at the end of the year, so I should really mention Lift in Action going to print. This was a huge, huge milestone for me personally and concluded a two-year unit of work. I should also mention that during the writing of this blog post I had originally neglected to even mention the book and that a friend had to remind me to mention it… I’m sure the psychologists would have something to say about that, but anyway!…


Documenting the Difficulties of Documentation

Over the years I’ve been involved in a wide range of diverse open source projects. Some large. Some small. Some very obscure. But every single one has, at one point or another, had a “problem” with documentation. Many of you reading this will no doubt have had a similar experience at some point in your programming career. Perhaps you were a project creator wondering how best to communicate the inherent awesome of something you’ve just created, or perhaps you were that eager n00b trying to get to grips with some new technology you came across on github - in either case, you have to create or consume “documentation”.

Now then, before we go any further I do have somewhat of a problem with the overloaded meaning of the term “documentation”. The widely accepted definition of documentation is the following:

Doc·u·men·ta·tion noun - Material that provides official information or evidence or that serves as a record.

Evidence that serves as a record? That sounds awfully vague, doesn’t it? I’d like to suggest that this is, to all intents and purposes, too vague. More specifically it is often an ambiguous umbrella term for what people are actually referring to. It’s this ambiguity that causes all manner of problems for software projects and their (would-be) users.

Disambiguating Documentation Components

There are many excellent projects in the Scala ecosystem, and many suffer this apparent “lack of documentation”. Curiously though, each project seems to suffer this affliction in its own specific way… at least, this is often what you might witness on the mailing list of any project you might care to pick at random. Most projects have some thread or other in the archives of their mailing lists where someone has complained about their documentation, or a distinct lack of whatever it is they were looking for.

With this in mind, I’d like to now illustrate what I think are the key aspects of the wider “documentation problem”, and disambiguate their respective intentions.

Tutorials and Introductions

The first thrust of documentation I’d like to define is that all-important introductory material people will need when coming to your project in the first instance. This is quite probably the most difficult type of text to write, particularly if the subject matter is highly technical in nature. It’s typically difficult for the following three reasons:

  • Assumptions - Assuming the reader fully understands a topic, line of code, operator or anything else must be one of the most common issues. In introductory texts and tutorials it’s highly likely that the reader will not be in possession of the implicit knowledge needed to properly grok the topic at hand.

  • Accessibility - “Joe Developer” can initially be easily scared by large words, complex-sounding terms that originate from academic theories, and terse writing styles. It is so vitally important that your tutorials and introduction texts are accessible. For the most part this may well mean that you have to sweep over some of the finer details, or forego some more abstract possibilities in order to effectively get the point across to newcomers. Ironically, it is often this simplification or frivolity with the facts that programmers struggle with when taking up the pen (ok - keyboard, but you know what I mean).

  • Authorship - Writing is hard. Be honest with yourself and recognise that you may not be any good at it. During the writing of Lift in Action I had to throw away a whole bunch of manuscript which was either too technical or just plain rubbish. Having a professional team of editors who could help me (re)learn about writing was really key… but I’m aware that this is hardly practical for the common case. The fact is that most of us have not written long documents since high-school or college, and it is incredibly difficult. Don’t forget this when writing your project introduction and tutorials - if you can’t do it the justice it deserves, then embrace the fact that we humans all have different skills and find someone who can plug the required gaps. Having your texts reviewed honestly by your peers is another useful strategy to ensure what you’re writing is actually any good.

Examples and Explanations

The second branch of the documentation umbrella is something of an extension to the first, but I decided to make it separate because its use-case feels distinctly different. With this in mind I’ll add the caveat that yes, examples often form parts of tutorials (and later, references), but in and of themselves you wouldn’t use verbose introductory-style writing within an example. In my mind at least, the primary difference is that example/explanatory documentation is tightly coupled with the code to which it relates. That is to say that the text is more often than not sitting next to the code itself in the comments, which usually means there are certain conventions to follow with regard to syntax and so forth (e.g. ScalaDoc, Wiki markup etc). Critically though, the explanatory text is much shorter and more concise than the introductory and tutorial manuscripts, and it focuses on the line-by-line, blow-by-blow goings on of the code. More generally, it’s reasonable to suggest that examples are about illustrating concepts, and this is where the writing style really differentiates itself from the other branches listed in this article.

Finally, if you’re going to go to the trouble of writing examples and explanations, make sure the code actually works! Nothing is more frustrating than finding an example that simply does not do what it should because the code either doesn’t compile or doesn’t run correctly.

Reference materials

Reference material is your last line of defence before forcing users to delve into the source code. Consider the type of person who might be using a reference, and what their goal might be. I’d propose that when someone is looking at a reference, they know what they are looking for: they need something specific. It’s probably also fair to assume that before arriving at the reference they will have read tutorials and examples, so there is a degree of implied understanding. Reference materials are usually heavy on details and light on fluffy writing style, which allows the author to be far more technical, and satisfy the aforementioned need for presenting the exact facts.

Unlike the other aspects of documentation writing, references have a tendency to become quite large, even for mid-sized projects. With this in mind you should take care to refactor the organisation and layout of the reference with each major change, and strive for a reference that is logically ordered and consistent throughout. If your reference is massive, then you should seriously consider having a decent search function in addition to a logical layout.

Source Code

Your last line of documentation defence is the source code. That might seem odd, as I’m not talking about comments or ScalaDoc (or similar in your language of choice). I’m talking about types (sorry dynamically typed people!). Type annotations can be extremely useful when reading code, as they concisely communicate the result or intention of a particular item. With Scala for example, consider these two lines:

// actual code irrelevant
val foo = whatever.map(...).flatMap(...).foldLeft(...).map(...)
val foo: Option[Int] = whatever.map(...).flatMap(...).foldLeft(...).map(...)

Having the simple type annotation frees me from having to mull over the code in order to understand its result; it’s right there in the type annotation. When moving code between teams, or people, having the ability to simply scan complex blocks of code and understand them is a huge win (IMHO). Sure, make use of type inference where the value is obvious (e.g. simple assignment etc), but where you think something might not be directly understood, explicitly annotating can serve as effective documentation.
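To make that concrete, here is a small self-contained sketch. The data and the names (AnnotationSketch, raw, inferred, total) are made up purely for illustration; the point is only that the second definition tells you its result type without you having to trace the chain.

```scala
import scala.util.Try

object AnnotationSketch {
  // Hypothetical input, purely for illustration.
  val raw: List[String] = List("1", "2", "x", "4")

  // No annotation: the reader has to walk the whole chain to learn the result type.
  val inferred = raw.flatMap(s => Try(s.toInt).toOption).map(_ * 2).reduceOption(_ + _)

  // Annotated: the result type is stated up front, and the compiler checks it for you.
  val total: Option[Int] = raw.flatMap(s => Try(s.toInt).toOption).map(_ * 2).reduceOption(_ + _)
}
```

Both values are computed identically; the annotation changes nothing at runtime, it only changes how quickly the next reader understands the line.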

…In any case, writing readable and well documented code could easily be the subject of a whole other blog post (or indeed, academic paper), so we’ll put a pin in that subject and move swiftly along…

In my mind at least, when your general users complain about documentation, they are typically complaining about one of the first three branches of this documentation umbrella.

Isn’t all this a lot of work?

I won’t lie to you, good reader: this will take a lot of time and dedication to do. Doing it well will take more time and a fuckton more effort. However, if you want your software or project to be used by people other than yourself, then it is imperative that the documentation is well structured and exhibits some - if not most - of the traits in this article. It’s also important that you realise that it will, in all likelihood, be “expensive” in terms of time; this is nearly unavoidable, but it will make your project more approachable and more usable.

Interestingly, one thing that you often see is projects that try to distribute the documentation effort by promoting community authorship, which, when considering what a social activity programming has become in recent times, does not seem like such a crazy idea. The reality however is somewhat different to the ideal: people are often keen to submit bugs and patches, but those same keen people are still typically reluctant to contribute documentation. One could speculate that writing documentation is too tedious, or that perhaps it is not as fun as doing the coding and people don’t want to spend time in that area… the reason is actually irrelevant, as the result is always the same: without a small core of dedicated people who write, maintain and constantly improve that wiki, the whole documentation effort will fail. To qualify that, I’m not saying that community-powered documentation never works, as that clearly isn’t the case. I am however suggesting that by-and-large for most communities it simply does not operate effectively, and this has an overall negative impact on the project as a whole.

Who’s doing it right?

I’m not going to gratify this article by pointing out projects that are “doing it wrong”, as frankly most projects are making a hash of their documentation, irrespective of language or community. I do however want to highlight a couple of projects that are setting an excellent example:

  • Akka - The Akka team are doing a superb job of documenting their project. Even with extensive changes to the codebase, they are very effective when it comes to ensuring the docs are up-to-date and cover all new or refactored features. The documentation is nearly exclusively maintained by the core team of programmers.

  • jQuery - Very different to Akka, and in a different community, jQuery has been very effective in delivering core reference materials that allow (and encourage) the wider community to write tutorials, introductions and other helpful articles. jQuery also makes extensive use of illustrative examples, and its good documentation is probably one of the reasons for its apparent ubiquity on the web today.

Both of these projects exhibit dedication on the part of the coders, who are typically the ones authoring the core reference materials and the bulk of the ancillary texts. Learn from these projects and others like them. We can all do a better job of documenting our projects, myself included. Say no to undocumented projects, and the next time you’re throwing something on GitHub, take the time to write some documentation… even if it’s just a long README, people will thank you for it.


Lift in Action Finally Completed

So here we are. The end finally happened. Lift in Action was sent to print last week, representing the conclusion of 20 months of work. Without doubt this has been one of the largest and most difficult projects I’ve ever undertaken. With this in mind I wanted to thank everyone who read the MEAP edition, contributed fixes, asked questions, issued pull requests and generally helped me with the project! Without your input the book wouldn’t be what it is, and I hope that it serves as a useful reference for learning and making the most of Lift.

The print copies should be coming out in a week or so and depending where you purchased, your copy should be arriving in the coming weeks.

Thanks again for all the support and kind words during the writing; I’m now looking forward to getting my life back! :-D


Using Apache Shiro with Lift

As it stands, Lift only has its proto* traits for user management. That system has its limitations, and you will ultimately end up replacing it in any non-trivial application as your needs change and grow. Whilst this is what those traits are designed for (quick start, short haul), you typically end up rolling your own system for users etc when using Lift, and this can often be somewhat cumbersome or not particularly easy to do well. As this whole user management piece is so often requested, I figured that I’d write a plugin library for Lift.

Apache Shiro is a Java security framework (formerly known as JSecurity) and it comes with a fairly abstract set of classes for building systems that have the familiar users, roles and permissions setup. Most applications these days have some notion of users, customers or some other subject that you care about and might want to conduct access control around. This is exactly what Shiro is designed for, and it ships with out-of-the-box inter-operation with ActiveDirectory and other such repositories commonly found in the enterprise space for managing user data.

Part of the reason that other security frameworks never really took to Lift (or vice-versa) is that Lift has its own mechanism for managing resource ACLs, and it never made sense to separate that into a different servlet filter and somehow munge that together: it’s not 1990. Fortunately Shiro was fairly easy to integrate with Lift in such a way that it allows you to simply augment your existing SiteMap setup, template markup and even dispatch resources. Currently this integration project is in its early stages, and you can find the source code here: github.com/timperrett/lift-shiro

Example

Here’s a quick walkthrough of the various ways you can use the integration within your project. Firstly, let’s assume you only want to display a section of content to users who hold a particular role:

<lift:has_role name="admin">
  <p>This content is only available for admins</p>
</lift:has_role>

There is a range of authentication snippets that allow you to define who sees what within your templates; check out the documentation for more on that. Next up, what if you want to block access to a page entirely if the user is not authenticated? Just add the following to your SiteMap:

...
Menu("Home") / "index" >> RequireAuthentication
...

By default RequireAuthentication will redirect unauthenticated users back to the URL defined in Shiro.loginURL. Likewise, you can specify whole resources to require a particular role or permission:

...
Menu("Role Test") / "restricted" >> RequireAuthentication >> HasRole("admin")
...

As you can see, the SiteMap functionality is implemented as LocParams, so you can use them within your own Loc types, or simply use them declaratively within the regular SiteMap usage.

This whole integration project wraps the Shiro types, so you only need to configure shiro.ini in the root of your classpath and enter the appropriate realm information as per the regular Shiro documentation, then away you go: password files… active directory… whatever you want.

As above, this project is still at an early stage, but it does indeed work. I’m currently looking for feedback, so if you have some thoughts or things that would be cool to see, then please check out the project on GitHub and fork away.


System Scripting with Scala

For the longest time I have used Ruby as my pocket knife for system scripting… you know, just knocking up small little executable files that run helpful tasks or automate yawnful processes. I like Ruby for this kind of thing and it works. Recently a coworker in marketing wanted some automation for something relatively simple, and instead of using Ruby I thought I’d just knock it together with Scala. In doing so I came across a neat little thing in the Scala scripting support.

Assuming you have Scala installed, you can execute bash statements and pass their results directly to your Scala code. This can be pretty handy as there are certain things that are particularly annoying to do from Scala (and more broadly with Java), like finding out where exactly on the file system you are executing from. Often if you use the good ol’ protection domain trick it will give you a location in /var/tmp, as opposed to the real location of the script. Consider the following:

#!/bin/sh
SCRIPT="$(cd "${0%/*}" 2>/dev/null; echo "$PWD"/"${0##*/}")"
DIR=`dirname "${SCRIPT}"`
exec scala "$0" "$DIR" "$SCRIPT"
::!#

import java.io.File

object App {
  def main(args: Array[String]): Unit = {
    val Array(directory,script) = args.map(new File(_).getAbsolutePath)
    println("Executing '%s' in directory '%s'".format(script, directory))
  }
}

Notice the bash code between #!/bin/sh and ::!#. This allows you to execute a bash script (or whatever script you want) before evaluating this file as a Scala script. This can be pretty handy for certain tasks when doing system scripting :-)

Assuming you saved this file as “thing” and made it executable, you can then run it like any other script: ./thing
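For comparison, the protection domain trick mentioned earlier looks roughly like the sketch below. The object and method names (WhereAmI, codeLocation) are made up for illustration; the calls themselves are standard java.lang.Class and java.security APIs. When run through the script runner, this tends to report wherever the compiled script classes ended up rather than where the script lives, which is exactly why the shell header is useful.

```scala
object WhereAmI {
  // Ask the JVM where this class was loaded from.
  // Both the code source and its location may be null, hence the Option wrapping.
  def codeLocation: Option[String] =
    for {
      source   <- Option(getClass.getProtectionDomain.getCodeSource)
      location <- Option(source.getLocation)
    } yield location.toString
}
```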

Enjoy.


Using SOAP with Scala

It seems I have inadvertently become “that guy who does SOAP with Scala”. Given this illustrious position it seemed prudent to get around to putting up an article that explains how to use the SOAP support now present in Scalaxb (http://scalaxb.org/).

Scalaxb is a nice tool for working with XSD schemas from Scala, and recently WSDL support was added. In short this means that you can use native Scala types (such as Option[T]) for interacting with SOAP methods.

Prerequisites

Before you get going, it’s necessary to install Scalaxb, which itself requires Conscript (https://github.com/n8han/conscript/), so install that first. With Conscript in place, install Scalaxb by running:

cs eed3si9n/scalaxb

This will install Scalaxb and make it available on your $PATH. You can verify this by running the scalaxb command.

Generating Contracts

With Scalaxb installed and your WSDL to hand, you can run something like:

scalaxb -p eu.getintheloop.sample \
        -d src/main/scala \
        src/main/wsdl/weather.wsdl

This will generate a range of .scala files pertaining to the WSDL contract you passed as the last argument to the scalaxb tool. In addition, this sample uses the -p argument to specify my own package and the -d argument to place the output files into my Scala source directory. In this particular case I’m using this SOAP endpoint: http://www.deeptraining.com/webservices/weather.asmx?wsdl, and you can freely do so too for the sake of this example.

Understanding Generated Classes

Scalaxb generates three separate components:

  • SOAP protocol classes: These are the ancillary classes that are used to operate the SOAP protocol for things like the message envelope.
  • Default transport implementation: The HTTP driver that actually POSTs the SOAP message to the endpoint URI. This default implementation is based on the blocking HTTP library Scala Dispatch (http://dispatch.databinder.net/Dispatch.html).
  • Your service contracts: This is the only service specific part of the generated files. These relate to your specific service whilst the other two parts are completely decoupled and reusable over implementations / services you might have.

Enough talking, let’s look at some code.

Usage

The Scalaxb SOAP mechanism makes heavy use of the cake pattern, and thus allows you to mix and match different components as your situation dictates. Here’s the default usage pattern:

package eu.getintheloop.sample

import scalaxb._

object Main {
  def main(args: Array[String]): Unit = {
    
    val remote = new WeatherSoap12s 
       with SoapClients 
       with DispatchHttpClients {}
    
    println {
      remote.service.getWeather(Some("New York"))
    }
  }
}

Note specifically that by combining the weather service contracts with the SOAP protocol classes and the transport implementation, the result is a working driver from which you can directly call the SOAP methods.

The result of the service call itself is an Either[Fault[_], Option[T]], so if you want to get at the actual value you can simply fold on the Either.
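As a rough sketch of what that fold looks like: the Fault case class below is a simplified stand-in I have made up for illustration (the type scalaxb generates is parameterised and carries more detail), and describe is a hypothetical helper, not part of the generated code.

```scala
object FoldSketch {
  // Simplified stand-in for the generated fault type; an assumption for this example only.
  final case class Fault(message: String)

  // Collapse both sides of the Either into a single String:
  // the left function handles the SOAP fault, the right function handles the (optional) value.
  def describe(result: Either[Fault, Option[String]]): String =
    result.fold(
      fault => "SOAP fault: " + fault.message,
      value => value.getOrElse("no value returned")
    )
}
```

The nice property of fold is that it forces you to deal with the fault case; you can’t get at the value without saying what happens when the call fails.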

Suggestions for Customisation

If you have multiple services to interact with then it can be nice to cake together all your service implementations so that you just have a single client that lazily loads the appropriate service classes as needed. Additionally, you could also replace the transport implementation with something asynchronous and based on futures… that also is fun.

The sample code for this example can be found at: https://github.com/timperrett/scalaxb-soap-example

Enjoy.


Running SproutCore from within Lift

If you want to run SproutCore (http://sproutcore.com) from within your Lift application there are a couple of configurations you need to apply within your Boot.scala in order to make it work as you might anticipate.

Firstly you need to ensure you set the HTML parser to use HTML5 and not XHTML, otherwise your templates will explode in extraordinary fashion when using the built template file as created by SproutCore build tools:

LiftRules.htmlProperties.default.set((r: Req) =>
  new Html5Properties(r.userAgent))

With that in place you need to do the equivalent of implementing a symlink for those setups that use the filesystem (a la PHP, Rails etc). Within Lift this is accomplished by using a stateless rewrite:

LiftRules.statelessRewrite.append {
  case RewriteRequest(ParsePath("index" :: Nil, "", true, false), _, _) =>
    RewriteResponse("static" :: "todos" :: "en" :: "1.0" :: "index" :: Nil)
}

This code tells Lift to rewrite the root path (/) to the /static/todos/en/1.0/index template. Also be sure to implement your SiteMap so that only / is accessible and not the full (direct) SproutCore template path. If I do much more stuff with SproutCore I may well end up making an SBT plugin that automatically builds the updated JS files and copies them to your src/main/webapp path… not sure yet, we’ll see. Either way, the rewriting is certainly a good candidate for stuffing into a LocParam and making it reusable based upon some project configuration. For example:

Menu("Home") / "index" >> UsesSproutCore

For more information like this, check out my book on Lift, launching this quarter from Manning Publications.


SBT gets CloudBees support (and fast deploys!)

About 18 months ago a company called Stax Networks came out of nowhere, offering the ability to host JVM-based applications in the cloud, for FREE. This was too awesome not to look into and I subsequently made a plugin for SBT that automatically deployed your WAR files with a simple command.

Stax themselves were recently taken over by CloudBees (http://www.cloudbees.com) and from what I can see the service has done nothing but become more awesome since that happened. Now that the takeover is bedding in, the API has changed somewhat, so I thought I’d rewrite my Stax plugin to work with CloudBees instead and take advantage of the ability to delta WAR files and only publish the updates.

Speedy Deploys

The way this works is the build tool creates a WAR file like it would do normally, but as this is usually fairly large it can be exceedingly annoying to have to keep waiting for the deployment to take place. Rather than wait, the build tool calculates the difference between the deployed artefact and the one you just created, and subsequently only deploys the difference (or delta): essentially reducing your deploy time from 15 mins or more to a few seconds.
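To give a feel for the idea, here is a minimal sketch of computing such a delta by comparing content checksums. This is not the CloudBees implementation, which I haven’t reproduced here; the names (DeltaSketch, checksum, delta) and the choice of SHA-1 over in-memory maps are assumptions made purely for illustration.

```scala
import java.security.MessageDigest

object DeltaSketch {
  // Hex-encoded SHA-1 of some file content.
  def checksum(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-1").digest(bytes).map("%02x".format(_)).mkString

  // deployed: path -> checksum of what the server already has
  // local:    path -> content of what was just built
  // Returns the paths that are new or whose content has changed.
  def delta(deployed: Map[String, String], local: Map[String, Array[Byte]]): Set[String] =
    local.collect {
      case (path, bytes) if !deployed.get(path).contains(checksum(bytes)) => path
    }.toSet
}
```

Only the paths returned by delta would need to travel over the wire, which is where the time saving comes from when a rebuild touches a handful of files in an otherwise large WAR.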

Gimmeh It

In order to get started with the SBT plugin you will of course need a CloudBees account, and upon registering you need to grab the key and secret from grandcentral.cloudbees.com.

These values represent the API Key and API Secret that CloudBees will use to verify your deployment rights. Once you have these two values, you can do one of two things in order to have them recognised by SBT:

  • Enter them when the plugin prompts you; this will happen every time you run a deployment to the cloud, so it is potentially a little sub-optimal.

  • Create the properties file $HOME/.bees/bees.config so that you only need to define them once per computer. This properties file needs to contain two key-value pairs, which should look something like this:

    bees.api.key=XXXXXXXXXX
    bees.api.secret=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
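If you want to sanity-check a file in that format, the standard java.util.Properties class will read it. The sketch below is not how the plugin itself loads the file (I’m making no claims about its internals here); BeesConfigSketch and parse are hypothetical names, and the input is passed as a string so the example stays self-contained.

```scala
import java.io.StringReader
import java.util.Properties

object BeesConfigSketch {
  // Parse properties-format text and pull out the two values, if present.
  // Returns (api key, api secret).
  def parse(contents: String): (Option[String], Option[String]) = {
    val props = new Properties()
    props.load(new StringReader(contents))
    (Option(props.getProperty("bees.api.key")),
     Option(props.getProperty("bees.api.secret")))
  }
}
```

Note that each pair has to be on its own line; the key ends at the first = sign, so a trailing = in the secret is kept as part of the value.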

Whichever route you choose to specify that information, you then only need to define the plugin information in any given project. Specifically, in the Plugins.scala file define the following:

  import sbt._
  class Plugins(info: ProjectInfo) extends PluginDefinition(info) {
    lazy val cloudbees = "eu.getintheloop" % "sbt-cloudbees-plugin" % "0.2.7"
    // "at" has to stay on the same line as the repository name, otherwise the expression ends early
    lazy val sonatypeRepo = "sonatype.repo" at "https://oss.sonatype.org/content/groups/public"
  }

Add the plugin to your SBT project like so:

  import sbt._
  class YourProject(info: ProjectInfo) extends DefaultWebProject(info) 
    with bees.RunCloudPlugin {
    ....
    override def beesUsername = Some("youruser")
    override def beesApplicationId = Some("whatever")
  }

Again, if you would prefer to enter these values when you deploy your application then you can of course just enter the appropriate values when prompted. Now you’re all configured and good to go, there are two commands you can run with this plugin:

  • Get a list of your configured applications: bees-applist
  • Deploy your application: bees-deploy

Upon running bees-deploy for the first time there will of course be a wait whilst the first version is deployed, but all your subsequent deployments should be much, much quicker. CloudBees is a rather awesome service and I’d highly recommend checking it out for fast, easy deployment of your JVM web applications.

Find the source code for the CloudBees SBT plugin here
