06 Jul 2014
Automatic differentiation is one of those things that’s famous for not being as
famous as it should be (uh..). It’s useful, it’s convenient, and yet fewer
people know about it than one would think.
This article (by one of the guys working on Venture) is the single best
introduction you’ll probably find to AD, anywhere. It gives a wonderful
introduction to the basics, the subtleties, and the gotchas of the subject.
You should read it. I’m going to assume you have.
In particular, you should note this part:
[computing gradients] can be done — the method is called reverse mode — but
it introduces both code complexity and runtime cost in the form of managing
this storage (traditionally called the “tape”). In particular, the space
requirements of raw reverse mode are proportional to the runtime of f.
In some applications this can be somewhat inconvenient - picture iterative
gradient-based sampling algorithms like Hamiltonian Monte Carlo, its famous auto-tuning version
NUTS, and Riemannian manifold variants. Tom noted in this reddit thread that
symbolic differentiation - which doesn’t need to deal with tapes and
infinitesimal-tracking - can often be orders of magnitude faster than AD for
this kind of problem. When running these algorithms we calculate gradients
many, many times, and the additional runtime cost of the reverse-mode AD dance
can add up.
An interesting question is whether this can be mitigated at all, and to what
degree. In particular: can we use automatic differentiation to implement
efficient symbolic differentiation? Here’s a possible attack plan:
- use an existing automatic differentiation implementation to calculate the
gradient for some target function at a point
- capture the symbolic expression for the gradient and optimize it by
eliminating common subexpressions or whatnot
- reify that expression as a function that we can evaluate for any input
- voila, (more) efficient gradient
Ed Kmett’s ‘ad’ library is the
best automatic differentiation library I know of, in terms of its balance
between power and accessibility. It can be used on arbitrary Haskell types
that are instances of the Num
typeclass and can carry out automatic
differentiation via a number of modes, so it’s very easy to get started with.
Rather than rolling my own AD implementation to try to do this sort of thing,
I’d much rather use his.
Start with a really basic language. In practice we’d be interested in
working with more expressive things, but this is sort of the minimal
interesting language capable of illustrating the problem:
import Numeric.AD

data Expr a =
    Lit a
  | Var String
  | Add (Expr a) (Expr a)
  | Sub (Expr a) (Expr a)
  | Mul (Expr a) (Expr a)
  deriving (Eq, Show)

instance Num a => Num (Expr a) where
  fromInteger = Lit . fromInteger
  e0 + e1 = Add e0 e1
  e0 - e1 = Sub e0 e1
  e0 * e1 = Mul e0 e1
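(abs, signum, and negate are left out of the instance; GHC will warn about it
being incomplete, but we won’t need those methods for anything that follows.)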
We can already use Kmett’s ad library on these expressions to generate symbolic
expressions. We just have to write the functions we’re interested in
generically (using +, -, and *) and then call diff or grad or whatnot on them
with a concretely-typed argument. Some examples:
> :t diff (\x -> 2 * x ^ 2)
diff (\x -> 2 * x ^ 2) :: Num a => a -> a
> diff (\x -> 2 * x ^ 2) (Lit 1)
Mul (Add (Mul (Lit 1) (Lit 1)) (Mul (Lit 1) (Lit 1))) (Lit 2)
> grad (\[x, y] -> 2 * x ^ 2 + 3 * y) [Lit 1, Lit 2]
[ Add (Lit 0) (Add (Add (Lit 0) (Mul (Lit 1) (Add (Lit 0) (Mul (Lit 2) (Add
(Lit 0) (Mul (Lit 1) (Lit 1))))))) (Mul (Lit 1) (Add (Lit 0) (Mul (Lit 2) (Add
(Lit 0) (Mul (Lit 1) (Lit 1)))))))
, Add (Lit 0) (Add (Lit 0) (Mul (Lit 3) (Add (Lit 0) (Mul (Lit 1) (Lit 1)))))
]
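As a quick sanity check: both components above are closed expressions (the
point was supplied as literals), so we can eval them directly. The gradient of
2 * x ^ 2 + 3 * y at (1, 2) should be (4, 3):

> fmap eval (grad (\[x, y] -> 2 * x ^ 2 + 3 * y) [Lit 1, Lit 2])
[4,3]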
It’s really easy to extract a proper derivative/gradient ‘function’ by
doing something like this:
> diff (\x -> 2 * x ^ 2) (Var "x")
Mul (Add (Mul (Var "x") (Lit 1)) (Mul (Lit 1) (Var "x"))) (Lit 2)
and then, given that expression, reifying a direct function for the derivative
by substituting over the captured variable and evaluating everything. Here’s
some initial machinery to handle that:
-- | Close an expression over some variable.
close :: Expr a -> String -> a -> Expr a
close (Add e0 e1) s x = Add (close e0 s x) (close e1 s x)
close (Sub e0 e1) s x = Sub (close e0 s x) (close e1 s x)
close (Mul e0 e1) s x = Mul (close e0 s x) (close e1 s x)
close (Var v) s x
  | v == s    = Lit x
  | otherwise = Var v
close e _ _ = e

-- | Evaluate a closed expression.
eval :: Num a => Expr a -> a
eval (Lit d)     = d
eval (Var _)     = error "expression not closed"
eval (Add e0 e1) = eval e0 + eval e1
eval (Sub e0 e1) = eval e0 - eval e1
eval (Mul e0 e1) = eval e0 * eval e1
So, using this on the example above yields
> let testExpr = diff (\x -> 2 * x ^ 2) (Var "x")
> eval . close testExpr "x" $ 1
2
and it looks like a basic toDerivative function for expressions could be
implemented as follows:
-- | Do some roundabout AD.
toDerivative expr = eval . close diffExpr "x" where
  diffExpr = diff expr (Var "x")
But that’s a no go. ‘ad’ throws a type error, as presumably we’d be at risk of
perturbation confusion:
Couldn't match expected type ‘AD
s (Numeric.AD.Internal.Forward.Forward (Expr c))
-> AD s (Numeric.AD.Internal.Forward.Forward (Expr c))’
with actual type ‘t’
because type variable ‘s’ would escape its scope
This (rigid, skolem) type variable is bound by
a type expected by the context:
AD s (Numeric.AD.Internal.Forward.Forward (Expr c))
-> AD s (Numeric.AD.Internal.Forward.Forward (Expr c))
at ParamBasic.hs:174:14-32
Relevant bindings include
diffExpr :: Expr c (bound at ParamBasic.hs:174:3)
expr :: t (bound at ParamBasic.hs:173:14)
toDerivative :: t -> c -> c (bound at ParamBasic.hs:173:1)
In the first argument of ‘diff’, namely ‘expr’
In the expression: diff expr (Var "x")
Instead, we can use ad’s auto combinator to write an alternate eval function:
autoEval :: Mode a => String -> Expr (Scalar a) -> a -> a
autoEval x expr = (`go` expr) where
  go _ (Lit d) = auto d
  go v (Var s)
    | s == x    = v
    | otherwise = error "expression not closed"
  go v (Add e0 e1) = go v e0 + go v e1
  go v (Sub e0 e1) = go v e0 - go v e1
  go v (Mul e0 e1) = go v e0 * go v e1
and using that, implement a working toDerivative:

toDerivative :: Num a => String -> Expr (Expr a) -> Expr a
toDerivative v expr = diff (autoEval v expr) (Var v)
which, though it has a weird-looking type, typechecks and does the trick:
> let d = toDerivative "x" (Mul (Lit 2) (Mul (Var "x") (Var "x")))
> d
Mul (Add (Mul (Var "x") (Lit 1)) (Mul (Lit 1) (Var "x"))) (Lit 2)
> eval . close d "x" $ 1
4
So now we have access to a reified AST for a derivative (or gradient), which
can be tweaked and optimized as needed. Cool.
The available optimizations depend heavily on the underlying language. For
starters, there’s easy and universal stuff like this:
-- | Reduce superfluous expressions.
elimIdent :: (Num a, Eq a) => Expr a -> Expr a
elimIdent (Add (Lit 0) e) = elimIdent e
elimIdent (Add e (Lit 0)) = elimIdent e
elimIdent (Add e0 e1)     = Add (elimIdent e0) (elimIdent e1)
-- note that 0 - e is not e, so only the right identity applies to Sub
elimIdent (Sub e (Lit 0)) = elimIdent e
elimIdent (Sub e0 e1)     = Sub (elimIdent e0) (elimIdent e1)
elimIdent (Mul (Lit 1) e) = elimIdent e
elimIdent (Mul e (Lit 1)) = elimIdent e
elimIdent (Mul e0 e1)     = Mul (elimIdent e0) (elimIdent e1)
elimIdent e = e
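A single bottom-up pass like this can still leave newly exposed identities
behind: simplifying a subexpression can expose, say, a Mul (Lit 1) pattern at a
node that has already been rebuilt. A tiny driver that iterates the rewrite to
a fixed point - a sketch of mine, leaning on the derived Eq instance - mops
those up:

-- Iterate a rewrite until it stops changing the expression.
fixpoint :: Eq a => (a -> a) -> a -> a
fixpoint f x
  | fx == x   = x
  | otherwise = fixpoint f fx
  where fx = f x

fixpoint elimIdent then behaves like a slightly more thorough version of the
optimizer.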
Which lets us do some work up-front:
> let e = 2 * Var "x" ^ 2 + Var "x" ^ 4
> e
Add (Mul (Lit 2) (Mul (Var "x") (Var "x"))) (Mul (Mul (Var "x") (Var "x")) (Mul (Var "x") (Var "x")))
> let ge = toDerivative "x" e
> ge
Add (Mul (Add (Mul (Var "x") (Lit 1)) (Mul (Lit 1) (Var "x"))) (Lit 2)) (Add (Mul (Mul (Var "x") (Var "x")) (Add (Mul (Var "x") (Lit 1)) (Mul (Lit 1) (Var "x")))) (Mul (Add (Mul (Var "x") (Lit 1)) (Mul (Lit 1) (Var "x"))) (Mul (Var "x") (Var "x"))))
> let geOptim = elimIdent ge
> geOptim
Add (Mul (Add (Var "x") (Var "x")) (Lit 2)) (Add (Mul (Mul (Var "x") (Var "x")) (Add (Var "x") (Var "x"))) (Mul (Add (Var "x") (Var "x")) (Mul (Var "x") (Var "x"))))
> eval . close geOptim "x" $ 1
8
But there are also some more involved optimizations that can be useful for some
languages. The basic language I’ve been using above, for example, has no
explicit support for sharing common subexpressions. You’ll recall from one of my previous posts that we
have a variety of methods to do that in Haskell EDSLs, including some that
allow sharing to be observed without modifying the underlying language. We can
use data-reify, for example, to observe any implicit sharing in expressions.
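For that to work, Expr first needs a MuRef instance by way of a pattern
functor, just like the machinery described in that post; here’s a sketch, with
constructor names chosen to match the output below:

{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Applicative
import Data.Reify

-- A pattern functor for Expr; recursive positions become the
-- type parameter 'e'.
data ExprF a e =
    LitF a
  | VarF String
  | AddF e e
  | SubF e e
  | MulF e e
  deriving (Eq, Show, Functor)

instance MuRef (Expr a) where
  type DeRef (Expr a) = ExprF a
  mapDeRef f (Add e0 e1) = AddF <$> f e0 <*> f e1
  mapDeRef f (Sub e0 e1) = SubF <$> f e0 <*> f e1
  mapDeRef f (Mul e0 e1) = MulF <$> f e0 <*> f e1
  mapDeRef _ (Lit a)     = pure (LitF a)
  mapDeRef _ (Var s)     = pure (VarF s)

With that in place: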
> reifyGraph $ geOptim
let [(1,AddF 2 6),(6,AddF 7 10),(10,MulF 11 12),(12,MulF 4 4),(11,AddF 4 4),
(7,MulF 8 9),(9,AddF 4 4),(8,MulF 4 4),(2,MulF 3 5),(5,LitF 2),(3,AddF 4 4),
(4,VarF "x")] in 1
And we can even make use of a handy library found on Hackage for performing
common subexpression elimination on graphs returned by reifyGraph:
> cse <$> reifyGraph geOptim
let [(5,LitF 2),(1,AddF 2 6),(3,AddF 4 4),(6,AddF 7 10),(2,MulF 3 5),
(10,MulF 3 8),(8,MulF 4 4),(7,MulF 8 3),(4,VarF "x")] in 1
With an appropriate graph evaluator
we can cut down the size of the syntax we have to traverse substantially.
Happy automasymbolic differentiating!
30 May 2014
Lately I’ve been trying to do some magic by way of nonstandard interpretations
of abstract syntax. One of the things that I’ve managed to grok along the way
has been the problem of sharing in deeply-embedded languages.
Here’s a simple illustration of the ‘vanilla’ sharing problem by way of plain
Haskell; a function that computes \(2^n\):
naiveTree :: (Eq a, Num a, Num b) => a -> b
naiveTree 0 = 1
naiveTree n = naiveTree (n - 1) + naiveTree (n - 1)
This naive implementation is a poor way to roll as it is exponentially complex
in n. Look at how evaluation proceeds for something like naiveTree 4:
> naiveTree 4
> naiveTree 3 + naiveTree 3
> naiveTree 2 + naiveTree 2 + naiveTree 2 + naiveTree 2
> naiveTree 1 + naiveTree 1 + naiveTree 1 + naiveTree 1
+ naiveTree 1 + naiveTree 1 + naiveTree 1 + naiveTree 1
> naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
+ naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
+ naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
+ naiveTree 0 + naiveTree 0 + naiveTree 0 + naiveTree 0
> 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
> 16
Each recursive call doubles the number of function evaluations we need to make.
Don’t wait up for naiveTree 50 to return a value.
A better way to write this function would be:
tree :: (Eq a, Num a, Num b) => a -> b
tree 0 = 1
tree n =
  let shared = tree (n - 1)
  in  shared + shared
Here we store solutions to subproblems, and thus avoid having to recompute
things over and over. Look at how tree 4 proceeds now:
> tree 4
> let shared0 =
      let shared1 =
            let shared2 =
                  let shared3 = 1
                  in shared3 + shared3
            in shared2 + shared2
      in shared1 + shared1
  in shared0 + shared0
> let shared0 =
      let shared1 =
            let shared2 = 2
            in shared2 + shared2
      in shared1 + shared1
  in shared0 + shared0
> let shared0 =
      let shared1 = 4
      in shared1 + shared1
  in shared0 + shared0
> let shared0 = 8
  in shared0 + shared0
> 16
You could say that Haskell’s let syntax enables sharing between computations;
using it reduces the complexity of our tree implementation from \(O(2^n)\) to
\(O(n)\). tree 50 now returns instantly:
> tree 50
1125899906842624
So let’s move everything over to an abstract syntax setting and see how the
results translate there.
Let’s start with a minimalist language, known in some circles as Hutton’s
Razor. While puny, it is sufficiently expressive to illustrate the subtleties
of this whole sharing business:
data Expr =
    Lit Int
  | Add Expr Expr
  deriving (Eq, Ord, Show)

instance Num Expr where
  fromInteger = Lit . fromInteger
  (+) = Add

eval :: Expr -> Int
eval (Lit d)     = d
eval (Add e0 e1) = eval e0 + eval e1
I’ve provided a Num instance so that we can conveniently write expressions in
this language. We can use conventional notation and extract abstract syntax
for free by specifying a particular type signature:
> 1 + 1 :: Expr
Add (Lit 1) (Lit 1)
And of course we can use eval to evaluate things:

> eval (1 + 1 :: Expr)
2
Due to the Num instance and the polymorphic definitions of naiveTree and tree,
these functions will automatically work on our expression type. Check them
out:
> naiveTree 2 :: Expr
Add (Add (Lit 1) (Lit 1)) (Add (Lit 1) (Lit 1))
> tree 2 :: Expr
Add (Add (Lit 1) (Lit 1)) (Add (Lit 1) (Lit 1))
Notice there’s a quirk here: each of these functions - having wildly different
complexities - yields the same abstract syntax, implying that tree is no more
efficient than naiveTree when it comes to dealing with this expression type.
That means..
> eval (tree 50 :: Expr)
-- ain't happening
So there is a big problem here: Haskell’s let syntax doesn’t carry its sharing
over to our embedded language. Equivalently, the embedded language is not
expressive enough to represent sharing in its own abstract syntax.
There are a few ways to get around this.
Memoizing Evaluation
For some interpretations (like evaluation) we can use a memoization library.
Here we can use Data.StableMemo to define a clean and simple evaluator:
import Data.StableMemo

memoEval :: Expr -> Int
memoEval = go where
  go = memo eval
  eval (Lit i)     = i
  eval (Add e0 e1) = go e0 + go e1
This will very conveniently handle any grimy details of caching intermediate
computations. It passes the tree 50 test just fine:
> memoEval (tree 50 :: Expr)
1125899906842624
Some other interpretations are still inefficient; a similar memoPrint function
will still dump out a massive syntax tree due to the limited expressiveness of
the embedded language. The memoizer doesn’t really allow us to observe
sharing, if we’re interested in doing that for some reason.
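To make that concrete, here’s what such a printer might look like (a sketch of
mine, built on the same memoizer): the rendered string for each shared node is
cached, but the output itself still contains one copy per occurrence, and so
still grows exponentially with the depth of the tree.

memoPrint :: Expr -> String
memoPrint = go where
  go = memo render
  render (Lit i)     = show i
  render (Add e0 e1) = "(Add " ++ go e0 ++ " " ++ go e1 ++ ")"

Try memoPrint (tree 30) if you want to watch your terminal suffer.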
Observing Implicit Sharing
We can actually use GHC’s internal sharing analysis to recover any implicit
sharing present in an embedded expression. This is the technique introduced by
Andy Gill’s Type Safe Observable Sharing In Haskell and implemented in the
data-reify library on Hackage. It’s as technically unsafe as it sounds, but in
practice has the benefits of being both relatively benign and minimally
intrusive on the existing language.
Here is the extra machinery required to observe implicit sharing in our Expr
type:
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Applicative
import Data.Reify hiding (Graph)
import qualified Data.Reify as Reify
import System.IO.Unsafe

data ExprF e =
    LitF Int
  | AddF e e
  deriving (Eq, Ord, Show, Functor)

instance MuRef Expr where
  type DeRef Expr = ExprF
  mapDeRef f (Add e0 e1) = AddF <$> f e0 <*> f e1
  mapDeRef _ (Lit v)     = pure (LitF v)
We need to make Expr an instance of the MuRef class, which loosely provides a
mapping between the Expr and ExprF types. ExprF itself is a so-called ‘pattern
functor’: a parameterized type in which recursive points are indicated by the
parameter. We need the TypeFamilies pragma for instantiating the MuRef class,
and DeriveFunctor eliminates the need to write a Functor instance for ExprF by
hand.
Writing MuRef instances is pretty easy. For more complicated types you can
often use Data.Traversable.traverse in order to provide the required
implementation for mapDeRef (example).
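To illustrate with a made-up example (the constructors here are hypothetical,
not part of the language above): if the language had an n-ary Sum node holding
a list of subexpressions, traverse would visit all of the recursive positions
in one go:

import Data.Traversable (traverse)

data Expr' = Lit' Int | Sum' [Expr']

data ExprF' e = LitF' Int | SumF' [e]
  deriving Functor

instance MuRef Expr' where
  type DeRef Expr' = ExprF'
  mapDeRef _ (Lit' i)  = pure (LitF' i)
  -- one traverse covers every subexpression in the list
  mapDeRef f (Sum' es) = SumF' <$> traverse f es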
With this in place we can use reifyGraph from data-reify in order to observe
the implicit sharing. Let’s try this on a bite-sized tree 2 and note that it
is an IO action:
> reifyGraph (tree 2 :: Expr)
let [(1,AddF 2 2),(2,AddF 3 3),(3,LitF 1)] in 1
Here we get an abstract syntax graph - rather than a tree - and sharing has
been made explicit.
We can write an interpreter for expressions by internally reifying them as
graphs and then working on those. reifyGraph is an IO action, but since its
effects are pretty tame I don’t feel too bad about wrapping calls to it in
unsafePerformIO. Interpreting these graphs must be handled a little
differently from interpreting a tree; a naive ‘tree-like’ evaluator will
eliminate sharing undesirably:
naiveEval :: Expr -> Int
naiveEval expr = gEval reified where
  reified = unsafePerformIO $ reifyGraph expr
  gEval (Reify.Graph env r) = go r where
    go j = case lookup j env of
      Just (AddF a b) -> go a + go b
      Just (LitF d)   -> d
      Nothing         -> 0
This evaluator fails the tree 50 test:
> naiveEval (tree 50)
-- hang
We need to use a more appropriately graph-y method to traverse and interpret
this (directed, acyclic) graph. Here’s an idea:
- topologically sort the
graph, yielding a linear ordering of vertices such that for every edge
\(u \to v\), \(v\) is ordered before \(u\).
- iterate through the sorted vertices, interpreting them as desired and storing
the interpretation
- look up the previously-interpreted vertices as needed
We can use the Data.Graph module from the containers library to deal with the
topological sorting and vertex lookups. The following graph-based evaluator
gets the job done:
import Data.Graph
import Data.Maybe

graphEval :: Expr -> Int
graphEval expr = consume reified where
  reified = unsafePerformIO (toGraph <$> reifyGraph expr)
  toGraph (Reify.Graph env _) = graphFromEdges . map toNode $ env
  toNode (j, AddF a b) = (AddF a b, j, [a, b])
  toNode (j, LitF d)   = (LitF d, j, [])

consume :: Eq a => (Graph, Vertex -> (ExprF a, a, b), c) -> Int
consume (g, vmap, _) = go (reverse . topSort $ g) [] where
  go [] acc     = snd $ head acc
  go (v:vs) acc =
    let nacc = evalNode (vmap v) acc : acc
    in  go vs nacc

evalNode :: Eq a => (ExprF a, b, c) -> [(a, Int)] -> (b, Int)
evalNode (LitF d, k, _) _   = (k, d)
evalNode (AddF a b, k, _) l =
  let v = fromJust ((+) <$> lookup a l <*> lookup b l)
  in  (k, v)
In a serious implementation I’d want to use a more appropriate caching
structure and avoid the partial functions like fromJust and head, but you get
the point. In any case, this evaluator passes the tree 50 test without issue:
> graphEval (tree 50)
1125899906842624
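As an aside, there’s a pleasant alternative (a sketch of mine, in the same
vein) that skips the explicit topological sort entirely: build a lazy IntMap
of results keyed by each node’s unique identifier and demand only the root.
Laziness takes care of the evaluation order, and since the map itself is
shared, every node is computed at most once:

import qualified Data.IntMap as IM

lazyGraphEval :: Expr -> Int
lazyGraphEval expr = consume (unsafePerformIO (reifyGraph expr)) where
  consume (Reify.Graph env r) = table IM.! r where
    -- values are lazy, so each node's result is a thunk forced at
    -- most once, in dependency order
    table = IM.fromList [ (k, node e) | (k, e) <- env ]
    node (LitF d)   = d
    node (AddF a b) = table IM.! a + table IM.! b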
Making Sharing Explicit
Instead of working around the lack of sharing in our language, we can augment
it by adding the necessary sharing constructs. In particular, we can add our
own let-binding that piggybacks on Haskell’s let. Here’s an enhanced language
(using the same Num instance as before):
data Expr =
    Lit Int
  | Add Expr Expr
  | Let Expr (Expr -> Expr)
The new Let constructor implements higher-order abstract syntax, or HOAS.
There are some immediate consequences of this: we can’t derive instances of
our language for typeclasses like Eq, Ord, and Show, and interpreting
everything becomes a bit more painful. But we don’t need to make any use of
data-reify in order to share expressions, since the language now handles that
itself. Here’s an efficient evaluator:
eval :: Expr -> Int
eval (Lit d)     = d
eval (Add e0 e1) = eval e0 + eval e1
eval (Let e0 e1) =
  let shared = Lit (eval e0)
  in  eval (e1 shared)
In particular, note that we need a sort of back-interpreter to re-embed shared
expressions into our language while interpreting them. Here we use Lit to do
that, but this is more painful if we want to implement, say, a pretty printer;
in that case we need a parser such that print (parse x) == x (see here).
We also can’t use the existing tree function. Here’s the HOAS equivalent,
which is no longer polymorphic in its return type:
tree :: (Num a, Eq a) => a -> Expr
tree 0 = 1
tree n = Let (tree (n - 1)) (\shared -> shared + shared)
Using that, we can see that sharing is preserved just fine:
> eval (tree 50)
1125899906842624
Another way to make sharing explicit is to use a parameterized HOAS, known as
PHOAS. This requires the greatest augmentation of the original language
(recycling the same Num instance):
data Expr a =
    Lit Int
  | Add (Expr a) (Expr a)
  | Let (Expr a) (a -> Expr a)
  | Var a

eval :: Expr Int -> Int
eval (Lit d)     = d
eval (Var v)     = v
eval (Add e0 e1) = eval e0 + eval e1
eval (Let e f)   = eval (f (eval e))
Here we parameterize the expression type and add both Let and Var constructors
to the language. Sharing expressions explicitly now takes a slightly different
form than in the HOAS version:
tree :: (Num a, Eq a) => a -> Expr b
tree 0 = 1
tree n = Let (tree (n - 1)) ((\shared -> shared + shared) . Var)
The Var term wraps the intermediate computation, which is then passed to the
semantics-defining lambda. Sharing is again preserved in the language:
> eval $ tree 50
1125899906842624
Here, however, we don’t need the same kind of back-interpreter that we did
when using HOAS. It’s easy to write a pretty-printer that observes sharing,
for example (from here):
text :: Expr String -> String
text e = go e 0 where
  go (Lit j) _     = show j
  go (Add e0 e1) c = "(Add " ++ go e0 c ++ " " ++ go e1 c ++ ")"
  go (Var x) _     = x
  go (Let e0 e1) c =
      "(Let " ++ v ++ " " ++ go e0 (c + 1) ++
      " in " ++ go (e1 v) (c + 1) ++ ")"
    where v = "v" ++ show c
Which yields the following string representation of our syntax:
> putStrLn . text $ tree 2
(Let v0 (Let v1 1 in (Add v1 v1)) in (Add v0 v0))
Cluing up
I’ve gone over several methods of handling sharing in embedded languages: an
external memoizer, observable (implicit) sharing, and explicit sharing via a
HOAS or PHOAS let-binding added to the original language. Some may be more
convenient than others, depending on what you’re trying to do. I’ve dumped
code for the minimal, HOAS, and PHOAS examples in some gists.
21 Dec 2013
(UPDATE 2016/08/15: Here be monsters. This code is ancient, the style
is not really idiomatic Ansible, and it’s likely that nothing works anymore.)
EC2 is cool. The ability to dynamically spin up a whack of free-to-cheap
server instances anywhere in the world, at any time, is.. well, pretty mean.
Need to run a long computation job? Scale up a distributed system? Reduce
latency to clients in a particular geographical region? YEAH WE CAN DO THAT.
The EC2 Management Console is a pretty great tool in and of itself. Well
laid-out and very responsive. But for a hacker’s hacker, a great tool to
manage EC2 instances (amongst other things) is Ansible, which provides a way
to automate tasks over an arbitrary number of servers, concurrently.
With EC2 and Ansible you can rapidly find yourself controlling an amorphous,
globally-distributed network of servers that are eager to do your bidding.
Here’s a quick example that elides most of the nitty-gritty details. I’m going
to spin up three micro instances in Asia Pacific. To do that, I’m going to use
an Ansible playbook, which is essentially a YAML file that describes a
sequence of commands to be performed. I’m going to delegate my local machine
to handle that task, so I’m first going to store the following inventory in
/etc/ansible/local:
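Something as simple as a file containing the single line

127.0.0.1

should do here, since the play below targets 127.0.0.1 over a local connection
anyway.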
The following playbook, spin-up-instances.yml, is what actually launches these
guys. Here’s its header:
---
- name: Spin up some EC2 instances
  hosts: 127.0.0.1
  connection: local
  tasks:
    - name: Create security group
      local_action:
        module: ec2_group
        name: my-security-group
        description: Access my-security-group
        region: ap-southeast-2
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0

    - name: Launch instances
      local_action:
        module: ec2
        region: ap-southeast-2
        keypair: jtobin-aws
        group: my-security-group
        instance_type: t1.micro
        image: ami-3d128f07
        count: 3
        wait: yes
      register: ec2

    - name: Add instances to host group
      local_action: add_host hostname={{ item.public_dns_name }} groupname=my-security-group
      with_items: ec2.instances

    - name: Tag instances
      local_action: ec2_tag resource={{ item.id }} region=ap-southeast-2 state=present
      with_items: ec2.instances
      args:
        tags:
          Name: Abrek

    - name: Give everyone a minute
      pause: minutes=1
Roughly, the tasks I want performed are each declared with a name and follow
the ‘tasks:’ line. They’re relatively self-explanatory. When Ansible runs
this playbook, it will execute the tasks in the order they appear in the
playbook.
First I create a security group in Asia Pacific (Sydney) for all the instances
I want to launch, and then go ahead and actually launch the instances. You can
see that I launch them as a local action on my machine. I’m using micro
instances (the most lightweight instance type available) and pick an Ubuntu LTS
Server machine image for each. I then do some bookkeeping and tag each
instance with the name ‘Abrek’. The final task just pauses the playbook
execution long enough for the instances to get up and running.
Fun fact: ‘Abrek’ was the name of one of the first two Soviet monkeys shot into
space. Today is apparently the 30th anniversary of his safe return.
Now, I also want to install some software on each of these guys. I’ll separate
all that into two groups: some essentials, and a specialized stack consisting
of 0MQ and supporting libraries. To do that, I’ll create two separate files
called ‘install-essentials.yml’ and ‘install-specialized.yml’.
I’ll keep the essentials bare for now: git, gcc/g++, and make. Here’s
install-essentials.yml:
---
- name: Install git
  apt: pkg=git update_cache=yes

- name: Install gcc
  apt: pkg=gcc

- name: Install g++
  apt: pkg=g++

- name: Install make
  apt: pkg=make
I can grab all of those via apt. ‘update_cache’ is equivalent to ‘apt-get
update’, which only needs to be done once.
Next, the specialized stuff in install-specialized.yml:
---
- name: Grab 0MQ
  command: >
    wget http://download.zeromq.org/zeromq-4.0.3.tar.gz
    creates=zeromq-4.0.3.tar.gz

- name: Unpack 0MQ
  command: >
    tar -xzf zeromq-4.0.3.tar.gz
    creates=zeromq-4.0.3

- name: Get libsodium
  command: >
    wget https://download.libsodium.org/libsodium/releases/libsodium-0.4.5.tar.gz
    creates=libsodium-0.4.5
    chdir=zeromq-4.0.3

- name: Install libsodium
  shell: >
    tar xzf libsodium-0.4.5.tar.gz;
    cd libsodium-0.4.5;
    ./configure && make && make check && make install
    chdir=zeromq-4.0.3

- name: Install 0MQ
  shell: >
    ./configure; make; make install
    chdir=zeromq-4.0.3

- name: Install libtool
  apt: pkg=libtool

- name: Install automake
  apt: pkg=automake

- name: Install autoconf
  apt: pkg=autoconf

- name: Install uuid-dev
  apt: pkg=uuid-dev

- name: Grab CZMQ
  command: >
    wget http://download.zeromq.org/czmq-2.0.3.tar.gz
    creates=czmq-2.0.3.tar.gz

- name: Unpack CZMQ
  command: >
    tar xzf czmq-2.0.3.tar.gz
    creates=czmq-2.0.3

- name: Install CZMQ
  shell: >
    ./configure && make;
    ldconfig
    chdir=czmq-2.0.3
Lots going on here. I use a variety of apt and shell commands to download and
install everything I need.
Now to add those tasks back into the spin-up-instances.yml playbook so that
the software gets installed right after the instances boot up. I can append
the following to that file:
- name: Install essential and specialized software
  hosts: my-security-group
  user: ubuntu
  sudo: True
  tasks:
    - include: tasks/install-essentials.yml
    - include: tasks/install-specialized.yml
Let’s run the playbook and see those instances get launched. I need to use the
‘local’ inventory that I set up, so I pass that to ‘ansible-playbook’
explicitly.
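Something along these lines does it:

$ ansible-playbook -i /etc/ansible/local spin-up-instances.yml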
Running it, we can watch the security group being created, the instances
popping up, and tags getting assigned, followed by our essential software
getting pulled down and the tail end of our 0MQ stack showing up before the
play ends with a summary.
For a quick sanity check to ensure that everything really did go as planned, I
can look for the CZMQ header on each instance by running a quick ad-hoc
command, identifying the hosts via the ‘Abrek’ tag.
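Assuming the EC2 dynamic inventory script is in play - it exposes each tag as
a group named tag_KEY_VALUE - an invocation along these lines would do it:

$ ansible tag_Name_Abrek -u ubuntu -m command -a 'ls /usr/local/include/czmq.h'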
Voila, three servers ready to roll. Great stuff.
To fill in the missing details, you might want to check out the excellent
Ansible documentation, as well as the great tutorials at AnswersForAws.