Basic Hoonery

04 Feb 2019

In my last post I first introduced hnock, a little interpreter for Nock, and then demonstrated it on a hand-rolled decrement function. In this post I’ll look at how one can handle the same (contrived, but illustrative) task in Hoon.

Hoon is the higher- or application-level programming language for working with Arvo, the operating system of Urbit. The best way I can describe it is something like “Haskell meets C meets J meets the environment is always explicit.”

As a typed, functional language, Hoon feels surprisingly low-level. One is never allocating or deallocating memory explicitly when programming in Hoon, but the experience somehow feels similar to working in C. The idea is that the language should be simple and straightforward and support a fairly limited level of abstraction. There are the usual low-level functional idioms (map, reduce, etc.), as well as a structural type system to keep the programmer honest, but at its core, Hoon is something of a functional Go (a language which, I happen to think, is not good).

It’s not a complex language, like Scala or Rust, nor a language that overtly supports sky-high abstraction, like Haskell or Idris. Hoon is supposed to exist at a sweet spot for getting work done. And I am at least willing to buy the argument that it is pretty good for getting work done in Urbit.

Recall our naïve decrement function in Haskell. It looked like this:

dec :: Integer -> Integer
dec m =
  let loop n
        | succ n == m = n
        | otherwise   = loop (succ n)
  in  loop 0

Let’s look at a number of ways to write this in Hoon, showing off some of the most important Hoon programming concepts in the process.

Cores

Here’s a Hoon version of decrement. Note that to the uninitiated, Hoon looks gnarly:

|=  m=@
=/  n=@  0
=/  loop
|%
  ++  recur
    ?:  =(+(n) m)
      n
    recur(n +(n))
  --
recur:loop

We can read it as follows:

Define a function that takes an argument, ‘m’, having type atom (recall that an atom is an unsigned integer).
Define a local variable called ‘n’, having type atom and value 0, and add it to the environment (or, if you recall our Nock terminology, to the subject).
Define a local variable called ‘loop’, with precise definition to follow, and add it to the environment.
‘loop’ is a core, i.e. more or less a named collection of functions. Define one such function (or arm), ‘recur’, that checks to see if the increment of ‘n’ is equal to ‘m’, returning ‘n’ if so, and calling itself, except with the value of ‘n’ in the environment changed to ‘n + 1’, if not.
Evaluate ‘recur’ as defined in ‘loop’.

(To test this, you can enter the Hoon line-by-line into the Arvo dojo. Just preface it with something like =core-dec to give it a name, and call it via e.g. (core-dec 20).)

Hoon may appear to be a write-only language, though I’ve found this to not necessarily be the case (just to note, at present I’ve read more Hoon code than I’ve written). Good Hoon has a terse and very vertical style. The principle that keeps it readable is that, roughly, each line should contain one important logical operation. These operations are denoted by runes, the =/ and ?: and similar ASCII digraphs sprinkled along the left hand columns of the above example. This makes it look similar to e.g. J – a language I have long loved, but never mastered – although in J the rough ‘one operator per line’ convention is not typically in play.

In addition to the standard digraph runes, there is also a healthy dose of ‘irregular’ syntax in most Hoon code for simple operations that one uses frequently. Examples used above include =(a b) for equality testing, +(n) for incrementing an atom, and foo(a b) for evaluating ‘foo’ with the value of ‘a’ in the environment changed to ‘b’. Each of these could be replaced with a more standard rune-based expression, though for such operations the extra verbosity is not usually warranted.

Cores like ‘loop’ seem, to me, to be the mainstay workhorse of Hoon programming. A core is more or less a structure, or object, or dictionary, or whatever, of functions. One defines them liberally, constructs a subject (i.e. environment) to suit, and then evaluates them, or some part of them, against the subject.

To be more precise, a core is a Nock expression; like every non-atomic value in Nock, it is a tree. Starting from the cell [l r], the left subtree, ‘l’, is a tree of Nock formulas (i.e. the functions, like ‘recur’, defined in the core). The right subtree, ‘r’ is all the data required to evaluate those Nock formulas. The traditional name for the left subtree, ‘l’, is the battery of the core; the traditional name for the right subtree is the payload.

One is always building up a local environment in Hoon and then evaluating some value against it. Aside from the arm ‘recur’, the core ‘loop’ also contains in its payload the values ‘m’ and ‘n’. The expression ‘recur:loop’ – irregular syntax for =< recur loop – means “use ‘loop’ as the environment and evaluate ‘recur’.” Et voilà, that’s how we get our decrement.

You’ll note that this should feel very similar to the way we defined decrement in Nock. Our hand-assembled Nock code, slightly cleaned up, looked like this:

[8
  [1 0]
  8
    [1
      6
        [5 [4 0 6] [0 7]]
        [0 6]
        2 [[0 2] [4 0 6] [0 7]] [0 2]
    ]
    2 [0 1] [0 2]
]

This formula, when evaluated against an atom subject, creates another subject from it, defining a ‘loop’ analogue that looks in specific addresses in the subject for itself, as well as the ‘m’ and ‘n’ variables, such that it produces the decrement of the original subject. Our Hoon code does much the same – every ‘top-level’ rune expression adds something to the subject, until we get to the final expression, ‘recur:loop’, which evaluates ‘recur’ against the subject, ‘loop’.

The advantage of Hoon, in comparison to Nock, is that we can work with names, instead of raw tree addresses, as well as with higher-level abstractions like cores. The difference between Hoon and Nock really is like the difference between C and assembly!

For what it’s worth, here is the compiled Nock corresponding to our above decrement function:

[8
  [1 0]
  8
    [8
      [1
        6
          [5 [4 0 6] 0 30]
          [0 6]
          9 2 10 [6 4 0 6] 0 1
      ]
      0 1
    ]
    7 [0 2] 9 2 0 1
]

It’s similar, though not identical, to our hand-rolled Nock. In particular, you can see that it is adding a constant conditional formula, including the familiar equality check, to the subject (note that the equality check, using Nock-5, refers to address 30 instead of 7 – presumably this is because I have more junk floating around in my dojo subject). Additionally, the formulas using Nock-9 and Nock-10 reduce to Nock-2 and Nock-0, just like our hand-rolled code does.

But our Hoon is doing more than the bespoke Nock version did, so we’re not getting quite the same code. Worth noting is the ‘extra’ use of Nock-8, which is presumably required because I’ve defined both ‘recur’, the looping function, and ‘loop’, the core to hold it, and the hand-rolled Nock obviously didn’t involve a core.

Doors

Here’s another way to write decrement, using another fundamental Hoon construct, the door:

|=  m=@
=/  loop
|_  n=@
  ++  recur
    ?:  =(+(n) m)
      n
    ~(recur ..recur +(n))
  --
~(recur loop 0)

A door is a core that takes an argument. Here we’ve used the |_ rune, instead than |%, to define ‘loop’, and note that it takes ‘n’ as an argument. So instead of ‘n’ being defined external to the core, as it was in the previous example, here we have to specify it explicitly when we call ‘recur’. Note that this is more similar to our Haskell example, in which ‘loop’ was defined as a function taking ‘n’ as an argument.

The two other novel things here are the ~(recur ..recur +(n)) and ~(recur loop 0) expressions, which actually turn out to be mostly the same thing. The syntax:

~(arm door argument)

is irregular, and means “evaluate ‘arm’ in ‘door’ using ‘argument’”. So in the last line, ~(recur loop 0) means “evaluate ‘recur’ in ‘loop’ with n set to 0.” In the definition of ‘recur’, on the other hand, we need to refer to the door that contains it, but are in the very process of defining that thing. The ‘..recur’ syntax means “the door that contains ‘recur’,” and is useful for exactly this task, given we can’t yet refer to ‘loop’. The syntax ~(recur ..recur +(n)) means “evaluate ‘recur’ in its parent door with n set to n + 1.”

Let’s check the compiled Nock of this version:

[8
  [8
    [1 0]
    [1
      6
        [5 [4 0 6] 0 30]
        [0 6]
        8
          [0 1]
          9 2 10 [6 7 [0 3] 4 0 6] 0 2
    ]
    0 1
  ]
  8
    [0 2]
    9 2 10 [6 7 [0 3] 1 0] 0 2
]

There’s even more going on here than in our core-implemented decrement, but doors are a generalisation of cores, so that’s to be expected.

Hoon has special support, though, for one-armed doors. This is precisely how functions (also called gates or traps, depending on the context) are implemented in Hoon. The following is probably the most idiomatic version of naïve decrement:

|=  m=@
=/  n  0
|-
?:  =(+(n) m)
  n
$(n +(n))

The |= rune that we’ve been using throughout these examples really defines a door, taking the specified argument, with a single arm called ‘$’. The |- rune here does the same, except it immediately calls the ‘$’ arm after defining it. The last line, $(n +(n)), is analogous to the recur(n +(n)) line in our first example: it evaluates the ‘$’ arm, except changing the value of ‘n’ to ‘n + 1’ in the environment.

(Note that there are two ‘$’ arms defined in the above code – one via the use of |=, and one via the use of |-. But there is no confusion as to which one we mean, since the latter has been the latest to be added to the subject. Additions to the subject are always prepended in Hoon – i.e. they are placed at address 2. As the topmost ‘$’ in the subject is the one that corresponds to |-, it is resolved first.)

The compiled Nock for this version looks like the following:

[8
  [1 0]
  8
    [1
      6
        [5 [4 0 6] 0 30]
        [0 6]
        9 2 10 [6 4 0 6] 0 1
    ]
    9 2 0 1
]

And it is possible (see the appendix) to show that, modulo some different addressing, this reduces exactly to our hand-rolled Nock code.

UPDATE: my colleague Ted Blackman, an actual Hoon programmer, recommended the following as a slightly more idiomatic version of naïve decrement:

=|  n=@
|=  m=@
^-  @
?:  =(+(n) m)
  n
$(n +(n))

Note that here we’re declaring ‘n’ outside of the gate itself by using another rune, =|, that gives the variable a default value based on its type (an atom’s default value is 0). There’s also an explicit type cast via ^- @, indicating that the gate produces an atom (like type signatures in Haskell, it is considered good practice to include these, even though they may not strictly be required).

Declaring ‘n’ outside the gate is interesting. It has an imperative feel, as if one were writing the code in Python, or were using a monad like ‘State’ or a ‘PrimMonad’ in Haskell. Like in the Haskell case, we aren’t actually doing any mutation here, of course – we’re creating new subjects to evaluate each iteration of our Nock formula against. And the resulting Nock is very succinct:

[6
  [5 [4 0 14] 0 6]
  [0 14]
  9 2 10 [14 4 0 14] 0 1
]

Basic Generators

If you tested the above examples, I instructed you to do so by typing them into Arvo’s dojo. I’ve come to believe that, in general, this is a poor way to teach Hoon. It shouldn’t be done for all but the most introductory examples (such as the ones I’ve provided here).

If you’ve learned Haskell, you are familiar with the REPL provided by GHCi, the Glasgow Haskell Compiler’s interpreter. Code running in GHCi is implicitly running in the IO monad, and I think this leads to confusion amongst newcomers who must then mentally separate “Haskell in GHC” from “Haskell in GHCi.”

I think there is a similar problem in Hoon. Expressions entered into the dojo implicitly grow or shrink or otherwise manipulate the dojo’s subject, which is not, in general, available to standalone Hoon programs. Such standalone Hoon programs are called generators. In general, they’re what you will use when working in Hoon and Arvo.

There are four kinds of generators: naked, %say, %ask, and %get. In this post we’ll just look at the first two; the last couple are out of scope, for now.

Naked Generators

The simplest kind of generator is the ‘naked’ generator, which just exists in a file somewhere in your Urbit’s “desk.” If you save the following as naive-decrement.hoon in an Urbit’s home/gen directory, for example:

|=  m=@
=/  n  0
|-
?:  =(+(n) m)
  n
$(n +(n))

Then you’ll be able to run it in a dojo via:

~zod:dojo> +naive-decrement 20
19

A naked generator can only be a simple function (technically, a gate) that produces a noun. It has no access to any external environment – it’s basically just a self-contained function in a file. It must have an argument, and it must have only one argument; to pass multiple values to a naked generator, one must use a cell.

Say Generators

Hoon is a purely functional language, but, unlike Haskell, it also has no IO monad to demarcate I/O effects. Hoon programs do not produce effects on their own at all – instead, they construct nouns that tell Arvo how to produce some effect or other.

A %say generator (where %say is a symbol) produces a noun, but it can also make use of provided environment data (e.g. date information, entropy, etc.). The idea is that the generator has a specific structure that Arvo knows how to handle, in order to supply it with the requisite information. Specifically, %say generators have the structure:

:-  %say
|=  [<environment data> <list of arguments> <list of optional arguments>]
:-  %noun
<code>

I’ll avoid discussing what a list is in Hoon at the moment, and we won’t actually use any environment data in any examples here. But if you dump the following in home/gen/naive-decrement.hoon, for example:

:-  %say
|=  [* [m=@ ~] ~]
:-  %noun
=/  n  0
|-
?:  =(+(n) m)
  n
$(n +(n))

you can call it from the dojo via the mechanism as before:

~zod:dojo> +naive-decrement 20
19

The generator itself actually returns a particularly-structured noun; a cell with the symbol %say as its head, and a gate returning a pair of the symbol %noun and a noun as its tail. The %noun symbol describes the data produced by the generator. But note that this is not displayed when evaluating the generator in the dojo – instead, we just get the noun itself, but this behaviour is dojo-dependent.

I think one should almost get in the habit of writing %say generators for most Hoon code, even if a simple naked generator or throwaway dojo command would do the trick. They are so important for getting things done in Hoon that it helps to learn about & start using them sooner than later.

Fin

I’ve introduced Hoon and given a brief tour of what I think are some of the most important tools for getting work done in the language. Cores, doors, and gates will get you plenty far, and early exposure to generators, in the form of the basic naked and %say variants, will help you avoid the habit of programming in the dojo, and get you writing more practically-structured Hoon code from the get-go.

I haven’t had time in this post to describe Hoon’s type system, which is another very important topic when it comes to getting work done in the language. I’ll probably write one more to create a small trilogy of sorts – stay tuned.

Appendix

Let’s demonstrate that the compiled Nock code from our door-implemented decrement reduces to the same as our hand-rolled Nock, save different address use. Recall that our compile Nock code was:

[8
  [1 0]
  8
    [1
      6
        [5 [4 0 6] 0 30]
        [0 6]
        9 2 10 [6 4 0 6] 0 1
    ]
    9 2 0 1
]

An easy reduction is from Nock-9 to Nock-2. Note that *[a 9 b c] is the same as *[*[a c] 2 [0 1] 0 b]. When ‘c’ is [0 1], we have that *[a c] = a, such that *[a 9 b [0 1]] is the same as *[a 2 [0 1] 0 b], i.e. that the formula [9 b c] is the same as the formula [2 [0 1] 0 b]. We can thus reduce the use of Nock-9 on the last line to:

[8
  [1 0]
  8
    [1
      6
        [5 [4 0 6] 0 30]
        [0 6]
        9 2 10 [6 4 0 6] 0 1
    ]
    2 [0 1] 0 2
]

The remaining formula involving Nock-9 evaluates [10 [6 4 0 6] 0 1] against the subject, and then evaluates [2 [0 1] [0 2]] against the result. Note that, for some subject ‘a’, we have:

*[a 10 [6 4 0 6] 0 1]
= #[6 *[a 4 0 6] *[a 0 1]]
= #[6 *[a 4 0 6] a]
= #[3 [*[a 4 0 6] /[7 a]] a]
= #[1 [/[2 a] [*[a 4 0 6] /[7 a]]] a]
= [/[2 a] [*[a 4 0 6] /[7 a]]]
= [*[a 0 2] [*[a 4 0 6] *[a 0 7]]]
= *[a [0 2] [4 0 6] [0 7]]

such that [10 [6 4 0 6] 0 1] = [[0 2] [4 0 6] [0 7]]. And for c = [[0 2] [4 0 6] [0 7]] and some subject ‘a’, we have:

*[a 9 2 c]
= *[*[a c] 2 [0 1] 0 2]

and for b = [2 [0 1] 0 2]:

*[*[a c] b]
= *[a 7 c b]
= *[a 7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]]

such that:

[9 2 [0 2] [4 0 6] [0 7]] = [7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]]

Now. Note that for any subject ‘a’ we have:

*[a 7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]]
= *[a 7 [[0 2] [4 0 6] [0 7]] *[a 0 2]]

since *[a 2 [0 1] 0 2] = *[a *[a 0 2]]. Thus, we can reduce:

*[a 7 [[0 2] [4 0 6] [0 7]] *[a 0 2]]
= *[*[a [0 2] [4 0 6] [0 7]] *[a 0 2]]
= *[a 2 [[0 2] [4 0 6] [0 7]] [0 2]]

such that

[7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]] = [2 [[0 2] [4 0 6] [0 7]] [0 2]]

and, so that, finally, we can reduce the compiled Nock to:

[8
  [1 0]
  8
    [1
      6
        [5 [4 0 6] 0 30]
        [0 6]
        2 [[0 2] [4 0 6] [0 7]] 0 2
    ]
    2 [0 1] 0 2
]

which, aside from the use of the dojo-assigned address 30 (and any reduction errors on this author’s part), is the same as our hand-rolled Nock.

jtobin.io