Sharing in Haskell EDSLs

Lately I’ve been trying to do some magic by way of nonstandard interpretations of abstract syntax. One of the things that I’ve managed to grok along the way has been the problem of sharing in deeply-embedded languages.

Here’s a simple illustration of the ‘vanilla’ sharing problem by way of plain Haskell; a function that computes 2^n:

naiveTree :: (Eq a, Num a, Num b) => a -> a
naiveTree 0 = 1
naiveTree n = naiveTree (n - 1) + naiveTree (n - 1)

This naive implementation is a poor way to roll as it is exponentially complex in n. Look at how evaluation proceeds for something like naiveTree 4:

> naiveTree 4
> naiveTree 3 + naiveTree 3
> naiveTree 2 + naiveTree 2 + naiveTree 2 + naiveTree 2
> naiveTree 1 + naiveTree 1 + naiveTree 1 + naiveTree 1
  + naiveTree 1 + naiveTree 1 + naiveTree 1 + naiveTree 1
> naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
  + naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
  + naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
  + naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
> 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
> 16

Each recursive call doubles the number of function evaluations we need to make. Don’t wait up for naiveTree 50 to return a value.

A better way to write this function would be:

tree :: (Eq a, Num a, Num b) => a -> a
tree 0 = 1
tree n =
  let shared = tree (n - 1)
  in  shared + shared

Here we store solutions to subproblems, and thus avoid having to recompute things over and over. Look at how tree 4 proceeds now:

> tree 4
> let shared0 =
      let shared1 =
          let shared2 =
              let shared3 = 1
              in  shared3 + shared3
          in  shared2 + shared2
      in  shared1 + shared1
  in  shared0 + shared0
> let shared0 =
      let shared1 =
          let shared2 = 2
          in  shared2 + shared2
      in  shared1 + shared1
  in  shared0 + shared0
> let shared0 =
      let shared1 = 4
      in  shared1 + shared1
  in  shared0 + shared0
> let shared0 = 8
  in  shared0 + shared0
> 16

You could say that Haskell’s let syntax enables sharing between computations; using it reduces the complexity of our tree implementation from \(O(2^n)\) to \(O(n)\). tree 50 now returns instantly:

> tree 50
1125899906842624

So let’s move everything over to an abstract syntax setting and see how the results translate there.

Let’s start with a minimalist language, known in some circles as Hutton’s Razor. While puny, it is sufficiently expressive to illustrate the subtleties of this whole sharing business:

data Expr =
    Lit Int
  | Add Expr Expr
  deriving (Eq, Ord, Show)

instance Num Expr where
  fromInteger = Lit . fromInteger
  (+)         = Add

eval :: Expr -> Int
eval (Lit d)     = d
eval (Add e0 e1) = eval e0 + eval e1

I’ve provided a Num instance so that we can conveniently write expressions in this language. We can use conventional notation and extract abstract syntax for free by specifying a particular type signature:

> 1 + 1 :: Expr
Add (Lit 1) (Lit 1)

And of course we can use eval to evaluate things:

> eval (1 + 1 :: Expr)
2

Due to the Num instance and the polymorphic definitions of naiveTree and tree, these functions will automatically work on our expression type. Check them out:

> naiveTree 2 :: Expr
Add (Add (Lit 1) (Lit 1)) (Add (Lit 1) (Lit 1))

> tree 2 :: Expr
Add (Add (Lit 1) (Lit 1)) (Add (Lit 1) (Lit 1))

Notice there’s a quirk here: each of these functions - having wildly different complexities - yields the same abstract syntax, implying that tree is no more efficient than naiveTree when it comes to dealing with this expression type. That means..

> eval (tree 50 :: Expr)
-- ain't happening

So there is a big problem here: Haskell’s let syntax doesn’t carry its sharing over to our embedded language. Equivalently, the embedded language is not expressive enough to represent sharing in its own abstract syntax.

There are a few ways to get around this.

Memoizing Evaluation

For some interpretations (like evaluation) we can use a memoization library. Here we can use Data.StableMemo to define a clean and simple evaluator:

import Data.StableMemo

memoEval :: Expr -> Int
memoEval = go where
  go = memo eval
  eval (Lit i)     = i
  eval (Add e0 e1) = go e0 + go e1

This will very conveniently handle any grimy details of caching intermediate computations. It passes the tree 50 test just fine:

> memoEval (tree 50 :: Expr)
1125899906842624

Some other interpretations are still inefficient; a similar memoPrint function will still dump out a massive syntax tree due to the limited expressiveness of the embedded language. The memoizer doesn’t really allow us to observe sharing, if we’re interested in doing that for some reason.

Observing Implicit Sharing

We can actually use GHC’s internal sharing analysis to recover any implicit sharing present in an embedded expression. This is the technique introduced by Andy Gill’s Type Safe Observable Sharing In Haskell and implemented in the data-reify library on Hackage. It’s as technically unsafe as it sounds, but in practice has the benefits of being both relatively benign and minimally intrusive on the existing language.

Here is the extra machinery required to observe implicit sharing in our Expr type:

{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Applicative
import Data.Reify hiding (Graph)
import qualified Data.Reify as Reify
import System.IO.Unsafe

data ExprF e =
    LitF Int
  | AddF e e
  deriving (Eq, Ord, Show, Functor)

instance MuRef Expr where
  type DeRef Expr        = ExprF
  mapDeRef f (Add e0 e1) = AddF <$> f e0 <*> f e1
  mapDeRef _ (Lit v)     = pure (LitF v)

We need to make Expr an instance of the MuRef class, which loosely provides a mapping between the Expr and ExprF types. ExprF itself is a so-called ‘pattern functor’, which is a parameterized type in which recursive points are indicated by the parameter. We need the TypeFamilies pragma for instantiating the MuRef class, and DeriveFunctor eliminates the need to manually instantiate a Functor instance for ExprF.

Writing MuRef instances is pretty easy. For more complicated types you can often use Data.Traversable.traverse in order to provide the required implementation for mapDeRef (example).

With this in place we can use reifyGraph from data-reify in order to observe the implicit sharing. Let’s try this on a bite-sized tree 2 and note that it is an IO action:

> reifyGraph (tree 2 :: Expr)
let [(1,AddF 2 2),(2,AddF 3 3),(3,LitF 1)] in 1

Here we get an abstract syntax graph - rather than a tree - and sharing has been made explicit.

We can write an interpreter for expressions by internally reifying them as graphs and then working on those. reifyGraph is an IO action, but since its effects are pretty tame I don’t feel too bad about wrapping calls to it in unsafePerformIO. Interpreting these graphs must be handled a little differently from interpreting a tree; a naive ‘tree-like’ evaluator will eliminate sharing undesirably:

naiveEval :: Expr -> Int
naiveEval expr = gEval reified where
  reified = unsafePerformIO $ reifyGraph expr
  gEval (Reify.Graph env r) = go r where
    go j = case lookup j env of
      Just (AddF a b) -> go a + go b
      Just (LitF d)   -> d
      Nothing         -> 0

This evaluator fails the tree 50 test:

> naiveEval (tree 50)
-- hang

We need to use a more appropriately graph-y method to traverse and interpret this (directed, acyclic) graph. Here’s an idea:

  • topologically sort the graph, yielding a linear ordering of vertices such that for every edge \(u \to v\), \(v\) is ordered before \(u\).
  • iterate through the sorted vertices, interpreting them as desired and storing the interpretation
  • look up the previously-interpreted vertices as needed

We can use the Data.Graph module from the containers library to deal with the topological sorting and vertex lookups. The following graph-based evaluator gets the job done:

import Data.Graph
import Data.Maybe

graphEval :: Expr -> Int
graphEval expr = consume reified where
  reified = unsafePerformIO (toGraph <$> reifyGraph expr)
  toGraph (Reify.Graph env _) = graphFromEdges . map toNode $ env
  toNode (j, AddF a b) = (AddF a b, j, [a, b])
  toNode (j, LitF d)   = (LitF d, j, [])

consume :: Eq a => (Graph, Vertex -> (ExprF a, a, b), c) -> Int
consume (g, vmap, _) = go (reverse . topSort $ g) [] where
  go [] acc = snd $ head acc
  go (v:vs) acc =
    let nacc = evalNode (vmap v) acc : acc
    in  go vs nacc

evalNode :: Eq a => (ExprF a, b, c) -> [(a, Int)] -> (b, Int)
evalNode (LitF d, k, _)   _ = (k, d)
evalNode (AddF a b, k, _) l =
  let v = fromJust ((+) <$> lookup a l <*> lookup b l)
  in  (k, v)

In a serious implementation I’d want to use a more appropriate caching structure and avoid the partial functions like fromJust and head, but you get the point. In any case, this evaluator passes the tree 50 test without issue:

> graphEval (tree 50)
1125899906842624

Making Sharing Explicit

Instead of working around the lack of sharing in our language, we can augment it by adding the necessary sharing constructs. In particular, we can add our own let-binding that piggybacks on Haskell’s let. Here’s an enhanced language (using the same Num instance as before):

data Expr =
    Lit Int
  | Add Expr Expr
  | Let Expr (Expr -> Expr)

The new Let constructor implements higher-order abstract syntax, or HOAS. There are some immediate consequences of this: we can’t derive instances of our language for typeclasses like Eq, Ord, and Show, and interpreting everything becomes a bit more painful. But, we don’t need to make any use of data-reify in order to share expressions, since the language now handles that 'a la carte. Here’s an efficient evaluator:

eval :: Expr -> Int
eval (Lit d)     = d
eval (Add e0 e1) = eval e0 + eval e1
eval (Let e0 e1) =
  let shared = Lit (eval e0)
  in  eval (e1 shared)

In particular, note that we need a sort of back-interpreter to re-embed shared expressions into our language while interpreting them. Here we use Lit to do that, but this is more painful if we want to implement, say, a pretty printer; in that case we need a parser such that print (parse x) == x (see here).

We also can’t use the existing tree function. Here’s the HOAS equivalent, which is no longer polymorphic in its return type:

tree :: (Num a, Eq a) => a -> Expr
tree 0 = 1
tree n = Let (tree (n - 1)) (\shared -> shared + shared)

Using that, we can see that sharing is preserved just fine:

> eval (tree 50)
1125899906842624

Another way to make sharing explicit is to use a parameterized HOAS, known as PHOAS. This requires the greatest augmentation of the original language (recycling the same Num instance):

data Expr a =
    Lit Int
  | Add (Expr a) (Expr a)
  | Let (Expr a) (a -> Expr a)
  | Var a

eval :: Expr Int -> Int
eval (Lit d)     = d
eval (Var v)     = v
eval (Add e0 e1) = eval e0 + eval e1
eval (Let e f)   = eval (f (eval e))

Here we parameterize the expression type and add both Let and Var constructors to the language. Sharing expressions explicitly now takes a slightly different form than in the HOAS version:

tree :: (Num a, Eq a) => a -> Expr b
tree 0 = 1
tree n = Let (tree (n - 1)) ((\shared -> shared + shared) . Var)

The Var term wraps the intermediate computation, which is then passed to the semantics-defining lambda. Sharing is again preserved in the language:

> eval $ tree 50
1125899906842624

Here, however, we don’t need the same kind of back-interpreter that we did when using HOAS. It’s easy to write a pretty-printer that observes sharing, for example (from here):

text e = go e 0 where
  go (Lit j)     _ = show j
  go (Add e0 e1) c = "(Add " ++ go e0 c ++ " " ++ go e1 c ++ ")"
  go (Var x) _     = x
  go (Let e0 e1) c = "(Let " ++ v ++ " " ++ go e0 (c + 1) ++
                     " in " ++ go (e1 v) (c + 1) ++ ")"
    where v = "v" ++ show c

Which yields the following string representation of our syntax:

> putStrLn . text $ tree 2
(Let v0 (Let v1 1 in (Add v1 v1)) in (Add v0 v0))

Cluing up

I’ve gone over several methods of handling sharing in embedded languages: an external memoizer, observable (implicit) sharing, and adding explicit sharing via adding a HOAS or PHOAS let-binding to the original language. Some may be more convenient than others, depending on what you’re trying to do.

I’ve dumped code for the minimal, HOAS, and PHOAS examples in some gists.

Basic EC2 Management with Ansible

(UPDATE 2016/08/15: Here be monsters. This code is ancient, the style is not really idiomatic Ansible, and it’s likely that nothing works anymore.)

EC2 is cool. The ability to dynamically spin up a whack of free-to-cheap server instances anywhere in the world at any time, is.. well, pretty mean. Need to run a long computation job? Scale up a distributed system? Reduce latency to clients in a particular geographical region? YEAH WE CAN DO THAT.

The EC2 Management Console is a pretty great tool in of itself. Well laid-out and very responsive. But for a hacker’s hacker, a great tool to manage EC2 instances (amongst other things) is Ansible, which provides a way to automate tasks over an arbitrary number of servers, concurrently.

With EC2 and Ansible you can rapidly find yourself controlling an amorphous, globally-distributed network of servers that are eager to do your bidding.

Here’s a quick example that elides most of the nitty-gritty details. I’m going to spin up three micro instances in Asia Pacific. To do that, I’m going to use an Ansible playbook, which is essentially a YAML file that describes a sequence of commands to be performed. I’m going to delegate my local machine to handle that task, so I’m first going to store the following inventory in /etc/ansible/local:

[localhost]
127.0.0.1

The following playbook, spin-up-instances.yml is what actually launches these guys. Here’s its header:

---
- name: Spin up some EC2 instances
  hosts: 127.0.0.1
  connection: local
  tasks:
    - name: Create security group
      local_action:
        module: ec2_group
        name: my-security-group
        description: Access my-security-group
        region: ap-southeast-2
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0

    - name: Launch instances
      local_action:
        module: ec2
        region: ap-southeast-2
        keypair: jtobin-aws
        group: my-security-group
        instance_type: t1.micro
        image: ami-3d128f07
        count: 3
        wait: yes
      register: ec2

    - name: Add instances to host group
      local_action: add_host hostname= groupname=my-security-group
      with_items: ec2.instances

    - name: Tag instances
      local_action: ec2_tag resource= region=ap-southeast-2 state=present
      with_items: ec2.instances
      args:
        tags:
          Name: Abrek

    - name: Give everyone a minute
      pause: minutes=1

Roughly, the tasks I want performed are each declared with a name and follow the ‘tasks:’ line. They’re relatively self-explanatory. When Ansible runs this playbook, it will execute the tasks in the order they appear in the playbook.

First I create a security group in Asia Pacific (Sydney) for all the instances I want to launch, and then go ahead and actually launch the instances. You can see that I launch them as a local action on my machine. I’m using micro instances (the most lightweight instance type available) and pick an Ubuntu LTS Server machine image for each. I then do some bookkeeping and tag each instance with the name ‘Abrek’. The final task just pauses the playbook execution long enough for the instances to get up and running.

Fun fact: ‘Abrek’ was the name of one of the first two Soviet monkeys shot into space. Today is apparently the 30th anniversary of his safe return.

Now, I also want to install some software on each of these guys. I’ll separate all that into two groups: some essentials, and a specialized stack consisting of 0MQ and supporting libraries. To do that, I’ll create two separate files called ‘install-essentials.yml’ and ‘install-specialized.yml’.

I’ll keep the essentials bare for now: git, gcc/g++, and make. Here’s install-essentials.yml:

---
- name: Install git
  apt: pkg=git update_cache=yes

- name: Install gcc
  apt: pkg=gcc

- name: Install g++
  apt: pkg=g++

- name: Install make
  apt: pkg=make

I can grab all of those via apt. ‘update_cache’ is equivalent to ‘apt-get update’, which only needs to be done once.

Next, the specialized stuff in install-specialized.yml:

---
- name: Grab 0MQ
  command: >
    wget http://download.zeromq.org/zeromq-4.0.3.tar.gz
    creates=zeromq-4.0.3.tar.gz

- name: Unpack 0MQ
  command: >
    tar -xzf zeromq-4.0.3.tar.gz
    creates=zeromq-4.0.3

- name: Get libsodium
  command: >
    wget https://download.libsodium.org/libsodium/releases/libsodium-0.4.5.tar.gz
    creates=libsodium-0.4.5
    chdir=zeromq-4.0.3

- name: Install libsodium
  shell: >
    tar xzf libsodium-0.4.5.tar.gz;
    cd libsodium-0.4.5;
    ./configure && make && make check && make install
    chdir=zeromq-4.0.3

- name: Install 0MQ
  shell: >
    ./configure; make; make install
    chdir=zeromq-4.0.3

- name: Install libtool
  apt: pkg=libtool

- name: Install automake
  apt: pkg=automake

- name: Install automake
  apt: pkg=autoconf

- name: Install uuid-dev
  apt: pkg=uuid-dev

- name: Grab CZMQ
  command: >
    wget http://download.zeromq.org/czmq-2.0.3.tar.gz
    creates=czmq-2.0.3.tar.gz

- name: Unpack CZMQ
  command: >
    tar xzf czmq-2.0.3.tar.gz
    creates=czmq-2.0.3

- name: Install CZMQ
  shell: >
    ./configure && make;
    ldconfig
    chdir=czmq-2.0.3

Lots going on here. I use a variety of apt and shell commands to download and install everything I need.

Now to add those tasks back into the spin-up-instances.yml playbook so that the software gets installed right after the instances boot up. I can append the following to that file:

- name: Install essential and specialized software
  hosts: my-security-group
  user: ubuntu
  sudo: True
  tasks:
    - include: tasks/install-essentials.yml
    - include: tasks/install-specialized.yml

Let’s run the playbook and see those instances get launched. I need to use the ‘local’ inventory that I set up, so I pass that to ‘ansible-playbook’ explicitly.

Running it, we can see the security group being created, the instances popping up, and tags getting assigned:

Our essential software getting pulled down:

And the tail end of our 0MQ stack showing up before the play ends with a summary.

For a quick sanity check to ensure that everything really did go as planned, I can look for the CZMQ header on each instance. This time I’ll run a quick ad-hoc command, identifying the hosts via the ‘Abrek’ tag:

Voila, three servers ready to roll. Great stuff.

To fill the missing details, you might want to check out the excellent Ansible documentation, as well as the great tutorials at AnswersForAws.