11 Feb 2020
(UPDATE 2024/09/08: while hosting your own mailserver is not covered
in this post, I recommend you check out Simple NixOS Mailserver
for a borderline trivial way to do it.)
A couple of people recently asked about my email setup, so I figured it might
be best to simply document some of it here.
I run my own mail server for jtobin.io, plus another domain or two, and usually
wind up interacting with gmail for work. I use offlineimap to fetch and sync
mail with these remote servers, msmtp and msmtpq to send mail, mutt as my MUA,
notmuch for search, and tarsnap for backups.
There are other details; vim for writing emails, urlview for dealing with URLs,
w3m for viewing HTML, pass for password storage, etc. etc. But the
mail setup proper is as above.
I’ll just spell out some of the major config below, and will focus on the
configuration that works with gmail, since that’s probably of broader appeal.
You can get all the software for it via the following in nixpkgs:
mutt offlineimap msmtp notmuch notmuch-mutt
offlineimap
offlineimap is used to sync local and remote email; I use it to manually grab
emails occasionally throughout the day. You could of course set it up to run
automatically as a cron job or what have you, but I like deliberately fetching
my email only when I actually want to deal with it.
Here’s a tweaked version of one of my .offlineimaprc files:
[general]
accounts = work
[Account work]
localrepository = work-local
remoterepository = work-remote
postsynchook = notmuch new
[Repository work-local]
type = Maildir
localfolders = ~/mail/work
sep = /
restoreatime = no
[Repository work-remote]
type = Gmail
remoteuser = FIXME_user@domain.tld
remotepass = FIXME_google_app_password
realdelete = no
ssl = yes
sslcacertfile = /usr/local/etc/openssl/cert.pem
folderfilter = lambda folder: folder not in\
['[Gmail]/All Mail', '[Gmail]/Important', '[Gmail]/Starred']
You should be able to figure out the gist of this. Pay particular attention to
the ‘remoteuser’, ‘remotepass’, and ‘folderfilter’ options. For ‘remotepass’
in particular you’ll want to generate an app-specific password from Google.
The ‘folderfilter’ option lets you specify the gmail folders that you actually
want to sync; folder in [..]
and folder not in [..]
are probably all you’ll
want here.
If you don’t want to store your password in cleartext, and instead want to
grab it from an encrypted store, you can use the ‘remotepasseval’ option. I
don’t bother with this for Google accounts that have app-specific passwords,
but do for others.
This involves a little bit of extra setup. First, you can make some Python
functions available to the config file with ‘pythonfile’:
[general]
accounts = work
pythonfile = ~/.offlineimap.py
Here’s a version of that file that I keep, which grabs the desired from
pass(1)
:
#! /usr/bin/env python2
from subprocess import check_output
def get_pass():
return check_output("pass FIXME_PASSWORD", shell=True).strip("\n")
Then you can just call the get_pass
function in ‘remotepasseval’ back in
.offlineimaprc:
[Repository work-remote]
type = Gmail
remoteuser = FIXME_user@domain.tld
remotepasseval = get_pass()
realdelete = no
ssl = yes
sslcacertfile = /usr/local/etc/openssl/cert.pem
folderfilter = lambda folder: folder not in\
['[Gmail]/All Mail', '[Gmail]/Important', '[Gmail]/Starred']
When you’ve got this set up, you should just be able to run offlineimap
to
fetch your email. If you maintain multiple configuration files, it’s helpful
to specify a specific one using -c
, e.g. offlineimap -c .offlineimaprc-foo
.
msmtp, msmtpq
msmtp is used to send emails. It’s a very simple SMTP client. Here’s a
version of my .msmtprc
:
defaults
auth on
tls on
tls_starttls on
tls_trust_file /usr/local/etc/openssl/cert.pem
logfile ~/.msmtp.log
account work
host smtp.gmail.com
port 587
from FIXME_user@domain.tld
user FIXME_user@domain.tld
password FIXME_google_app_password
account default: work
Again, very simple.
You can do a similar thing here if you don’t want to store passwords in
cleartext. Just use ‘passwordeval’ and the desired shell command directly,
e.g.:
account work
host smtp.gmail.com
port 587
from FIXME_user@domain.tld
user FIXME_user@domain.tld
passwordeval "pass FIXME_PASSWORD"
I occasionally like to work offline, so I use msmtpq to queue up emails to send
later. Normally you don’t have to deal with any of this directly, but
occasionally it’s nice to be able to check the queue. You can do that with
msmtp-queue -d
:
$ msmtp-queue -d
no mail in queue
If there is something stuck in the queue, you can force it to send with
msmtp-queue -r
or -R
. FWIW, this has happened to me while interacting with
gmail under a VPN in the past.
mutt
Mutt is a fantastic MUA. Its tagline is “all mail clients suck, this one just
sucks less,” but I really love mutt. It may come as a surprise that working
with email can be a pleasure, especially if you’re accustomed to working with
clunky webmail UIs, but mutt makes it so.
Here’s a pruned-down version of one of my .muttrc files:
set realname = "MyReal Name"
set from = "user@domain.tld"
set use_from = yes
set envelope_from = yes
set mbox_type = Maildir
set sendmail = "~/.nix-profile/bin/msmtpq -a work"
set sendmail_wait = -1
set folder = "~/mail/work"
set spoolfile = "+INBOX"
set record = "+[Gmail]/Sent Mail"
set postponed = "+[Gmail]/Drafts"
set smtp_pass = "FIXME_google_app_password"
set imap_pass = "FIXME_google_app_password"
set signature = "~/.mutt/.signature-work"
set editor = "vim"
set sort = threads
set sort_aux = reverse-last-date-received
set pgp_default_key = "my_default_pgp@key"
set crypt_use_gpgme = yes
set crypt_autosign = yes
set crypt_replysign = yes
set crypt_replyencrypt = yes
set crypt_replysignencrypted = yes
bind index gg first-entry
bind index G last-entry
bind index B imap-fetch-mail
bind index - collapse-thread
bind index _ collapse-all
set alias_file = ~/.mutt/aliases
set sort_alias = alias
set reverse_alias = yes
source $alias_file
auto_view text/html
alternative_order text/plain text/enriched text/html
subscribe my_favourite@mailing.list
macro index <F8> \
"<enter-command>set my_old_pipe_decode=\$pipe_decode my_old_wait_key=\$wait_key nopipe_decode nowait_key<enter>\
<shell-escape>notmuch-mutt -r --prompt search<enter>\
<change-folder-readonly>`echo ${XDG_CACHE_HOME:-$HOME/.cache}/notmuch/mutt/results`<enter>\
<enter-command>set pipe_decode=\$my_old_pipe_decode wait_key=\$my_old_wait_key<enter>i" \
"notmuch: search mail"
macro index <F9> \
"<enter-command>set my_old_pipe_decode=\$pipe_decode my_old_wait_key=\$wait_key nopipe_decode nowait_key<enter>\
<pipe-message>notmuch-mutt -r thread<enter>\
<change-folder-readonly>`echo ${XDG_CACHE_HOME:-$HOME/.cache}/notmuch/mutt/results`<enter>\
<enter-command>set pipe_decode=\$my_old_pipe_decode wait_key=\$my_old_wait_key<enter>" \
"notmuch: reconstruct thread"
macro index l "<enter-command>unset wait_key<enter><shell-escape>read -p 'notmuch query: ' x; echo \$x >~/.cache/mutt_terms<enter><limit>~i \"\`notmuch search --output=messages \$(cat ~/.cache/mutt_terms) | head -n 600 | perl -le '@a=<>;chomp@a;s/\^id:// for@a;$,=\"|\";print@a'\`\"<enter>" "show only messages matching a notmuch pattern"
# patch rendering
# https://karelzak.blogspot.com/2010/02/highlighted-patches-inside-mutt.html
color body green default "^diff \-.*"
color body green default "^index [a-f0-9].*"
color body green default "^\-\-\- .*"
color body green default "^[\+]{3} .*"
color body cyan default "^[\+][^\+]+.*"
color body red default "^\-[^\-]+.*"
color body brightblue default "^@@ .*"
# vim: ft=muttrc:
Some comments on all that:
set mbox_type = Maildir
set sendmail = "~/.nix-profile/bin/msmtpq -a work"
set sendmail_wait = -1
set folder = "~/mail/work"
set spoolfile = "+INBOX"
set record = "+[Gmail]/Sent Mail"
set postponed = "+[Gmail]/Drafts"
Note here that we’re specifying msmtpq as our sendmail program. The -a work
command here refers to the account defined in your .msmtprc file, so if you
change the name of it there, you have to do it here as well. Ditto for the
folder.
(If you’re tweaking these config files for your own use, I’d recommend just
substituting all instances of ‘work’ with your own preferred account name.)
The negative ‘sendmail_wait’ value handles queueing mails up appropriately
when offline, IIRC.
set smtp_pass = "FIXME_google_app_password"
set imap_pass = "FIXME_google_app_password"
Here are the usual cleartext app passwords. If you want to store them
encrypted, there’s a usual method for doing that: add the following to the top
of your .muttrc:
source "gpg -d ~/.mutt/my-passwords.gpg |"
where .mutt/my-passwords.gpg
should be the above smtp_pass
and imap_pass
assignments, encrypted with your desired private key.
Continuing with the file at hand:
set signature = "~/.mutt/.signature-work"
set editor = "vim"
These should be self-explanatory. The signature file should just contain the
signature you want appended to your mails (it will be appended under a pair of
dashes). And if you want to use some other editor to compose your emails, just
specify it here.
set pgp_default_key = "my_default_pgp@key"
set crypt_use_gpgme = yes
set crypt_autosign = yes
set crypt_replysign = yes
set crypt_replyencrypt = yes
set crypt_replysignencrypted = yes
Mutt is one of the few programs that has great built-in support for PGP. It
can easily encrypt, decrypt, and sign messages, grab public keys, etc. Here
you can see that I’ve set it to autosign messages, reply to encrypted messages
with encrypted messages, and so on.
bind index gg first-entry
bind index G last-entry
bind index B imap-fetch-mail
bind index - collapse-thread
bind index _ collapse-all
These are a few key bindings that I find helpful. The first bunch are familiar
to vim users and are useful for navigating around; the second two are really
useful for compressing or expanding the view of your mailbox.
set alias_file = ~/.mutt/aliases
set sort_alias = alias
set reverse_alias = yes
source $alias_file
auto_view text/html
alternative_order text/plain text/enriched text/html
The alias file lets you define common shortcuts for single or multiple
addresses. I get a lot of use out of multiple address aliases, e.g.:
alias chums socrates@ago.ra, plato@acade.my, aristotle@lyce.um
The MIME type stuff below the alias config is just a sane set of defaults for
viewing common mail formats.
subscribe my_favourite@mailing.list
Mutt makes interacting with mailing lists very easy just by default, but you
can also indicate addresses that you’re subscribed to, as above, to unlock a
few extra features for them (‘list reply’ being a central one). To tell mutt
that you’re subscribed to haskell-cafe, for example, you’d use:
subscribe haskell-cafe@haskell.org
The three longer macros that follow are for notmuch. I really only find myself
using the last one, ‘l’, for search. FWIW, notmuch’s search functionality is
fantastic; I’ve found it to be more useful than gmail’s web UI search, I think.
The patch rendering stuff at the end is just a collection of heuristics for
rendering in-body .patch files well. This is useful if you subscribe to a
patch-heavy mailing list, e.g. LKML or git@vger.kernel.org
, or if you just
want to be able to better-communicate about diffs in your day-to-day emails
with your buddies.
Fin
There are obviously endless ways you can configure all this stuff, especially
mutt, and common usage patterns that you’ll quickly find yourself falling into.
But whatever you find those to be, the above should at least get you up and
running pretty quickly with 80% of the desired feature set.
04 Feb 2019
In my last post I first introduced hnock, a little interpreter
for Nock, and then demonstrated it on a hand-rolled decrement function.
In this post I’ll look at how one can handle the same (contrived, but
illustrative) task in Hoon.
Hoon is the higher- or application-level programming language for working with
Arvo, the operating system of Urbit. The best way I can
describe it is something like “Haskell meets C meets J meets the environment is
always explicit.”
As a typed, functional language, Hoon feels surprisingly low-level. One is
never allocating or deallocating memory explicitly when programming in Hoon,
but the experience somehow feels similar to working in C. The idea is that the
language should be simple and straightforward and support a fairly limited
level of abstraction. There are the usual low-level functional idioms (map,
reduce, etc.), as well as a structural type system to keep the programmer
honest, but at its core, Hoon is something of a functional Go (a language
which, I happen to think, is not good).
It’s not a complex language, like Scala or Rust, nor a language that overtly
supports sky-high abstraction, like Haskell or Idris. Hoon is supposed to
exist at a sweet spot for getting work done. And I am at least willing to buy
the argument that it is pretty good for getting work done in Urbit.
Recall our naïve decrement function in Haskell. It looked like this:
dec :: Integer -> Integer
dec m =
let loop n
| succ n == m = n
| otherwise = loop (succ n)
in loop 0
Let’s look at a number of ways to write this in Hoon, showing off some of the
most important Hoon programming concepts in the process.
Cores
Here’s a Hoon version of decrement. Note that to the uninitiated, Hoon looks
gnarly:
|= m=@
=/ n=@ 0
=/ loop
|%
++ recur
?: =(+(n) m)
n
recur(n +(n))
--
recur:loop
We can read it as follows:
- Define a function that takes an argument, ‘m’, having type atom (recall
that an atom is an unsigned integer).
- Define a local variable called ‘n’, having type atom and value 0, and add it
to the environment (or, if you recall our Nock terminology, to the
subject).
- Define a local variable called ‘loop’, with precise definition to follow, and
add it to the environment.
- ‘loop’ is a core, i.e. more or less a named collection of functions.
Define one such function (or arm), ‘recur’, that checks to see if the
increment of ‘n’ is equal to ‘m’, returning ‘n’ if so, and calling itself,
except with the value of ‘n’ in the environment changed to ‘n + 1’, if not.
- Evaluate ‘recur’ as defined in ‘loop’.
(To test this, you can enter the Hoon line-by-line into the Arvo dojo.
Just preface it with something like =core-dec
to give it a name, and call it
via e.g. (core-dec 20)
.)
Hoon may appear to be a write-only language, though I’ve found this to not
necessarily be the case (just to note, at present I’ve read more Hoon code than
I’ve written). Good Hoon has a terse and very vertical style. The principle
that keeps it readable is that, roughly, each line should contain one important
logical operation. These operations are denoted by runes, the =/
and ?:
and similar ASCII digraphs sprinkled along the left hand columns of the above
example. This makes it look similar to e.g. J – a language I have
long loved, but never mastered – although in J the rough ‘one operator per
line’ convention is not typically in play.
In addition to the standard digraph runes, there is also a healthy dose of
‘irregular’ syntax in most Hoon code for simple operations that one uses
frequently. Examples used above include =(a b)
for equality testing, +(n)
for incrementing an atom, and foo(a b)
for evaluating ‘foo’ with the value of
‘a’ in the environment changed to ‘b’. Each of these could be replaced with a
more standard rune-based expression, though for such operations the extra
verbosity is not usually warranted.
Cores like ‘loop’ seem, to me, to be the mainstay workhorse of Hoon
programming. A core is more or less a structure, or object, or dictionary, or
whatever, of functions. One defines them liberally, constructs a subject (i.e.
environment) to suit, and then evaluates them, or some part of them, against
the subject.
To be more precise, a core is a Nock expression; like every non-atomic value in
Nock, it is a tree. Starting from the cell [l r]
, the left subtree, ‘l’, is
a tree of Nock formulas (i.e. the functions, like ‘recur’, defined in the
core). The right subtree, ‘r’ is all the data required to evaluate those Nock
formulas. The traditional name for the left subtree, ‘l’, is the battery of
the core; the traditional name for the right subtree is the payload.
One is always building up a local environment in Hoon and then evaluating some
value against it. Aside from the arm ‘recur’, the core ‘loop’ also contains in
its payload the values ‘m’ and ‘n’. The expression ‘recur:loop’ – irregular
syntax for =< recur loop
– means “use ‘loop’ as the environment and
evaluate ‘recur’.” Et voilà, that’s how we get our decrement.
You’ll note that this should feel very similar to the way we defined decrement
in Nock. Our hand-assembled Nock code, slightly cleaned up, looked like this:
[8
[1 0]
8
[1
6
[5 [4 0 6] [0 7]]
[0 6]
2 [[0 2] [4 0 6] [0 7]] [0 2]
]
2 [0 1] [0 2]
]
This formula, when evaluated against an atom subject, creates another subject
from it, defining a ‘loop’ analogue that looks in specific addresses in the
subject for itself, as well as the ‘m’ and ‘n’ variables, such that it produces
the decrement of the original subject. Our Hoon code does much the same –
every ‘top-level’ rune expression adds something to the subject, until we get
to the final expression, ‘recur:loop’, which evaluates ‘recur’ against the
subject, ‘loop’.
The advantage of Hoon, in comparison to Nock, is that we can work with names,
instead of raw tree addresses, as well as with higher-level abstractions like
cores. The difference between Hoon and Nock really is like the difference
between C and assembly!
For what it’s worth, here is the compiled Nock corresponding to our above
decrement function:
[8
[1 0]
8
[8
[1
6
[5 [4 0 6] 0 30]
[0 6]
9 2 10 [6 4 0 6] 0 1
]
0 1
]
7 [0 2] 9 2 0 1
]
It’s similar, though not identical, to our hand-rolled Nock. In particular,
you can see that it is adding a constant conditional formula, including the
familiar equality check, to the subject (note that the equality check, using
Nock-5, refers to address 30 instead of 7 – presumably this is because I have
more junk floating around in my dojo subject). Additionally, the formulas
using Nock-9 and Nock-10 reduce to Nock-2 and Nock-0, just like our hand-rolled
code does.
But our Hoon is doing more than the bespoke Nock version did, so we’re not
getting quite the same code. Worth noting is the ‘extra’ use of Nock-8, which
is presumably required because I’ve defined both ‘recur’, the looping function,
and ‘loop’, the core to hold it, and the hand-rolled Nock obviously didn’t
involve a core.
Doors
Here’s another way to write decrement, using another fundamental Hoon
construct, the door:
|= m=@
=/ loop
|_ n=@
++ recur
?: =(+(n) m)
n
~(recur ..recur +(n))
--
~(recur loop 0)
A door is a core that takes an argument. Here we’ve used the |_
rune,
instead than |%
, to define ‘loop’, and note that it takes ‘n’ as an argument.
So instead of ‘n’ being defined external to the core, as it was in the previous
example, here we have to specify it explicitly when we call ‘recur’. Note that
this is more similar to our Haskell example, in which ‘loop’ was defined as a
function taking ‘n’ as an argument.
The two other novel things here are the ~(recur ..recur +(n))
and ~(recur
loop 0)
expressions, which actually turn out to be mostly the same thing. The
syntax:
is irregular, and means “evaluate ‘arm’ in ‘door’ using ‘argument’”. So in the
last line, ~(recur loop 0)
means “evaluate ‘recur’ in ‘loop’ with n set to 0.”
In the definition of ‘recur’, on the other hand, we need to refer to the door
that contains it, but are in the very process of defining that thing. The
‘..recur’ syntax means “the door that contains ‘recur’,” and is useful for
exactly this task, given we can’t yet refer to ‘loop’. The syntax ~(recur
..recur +(n))
means “evaluate ‘recur’ in its parent door with n set to n + 1.”
Let’s check the compiled Nock of this version:
[8
[8
[1 0]
[1
6
[5 [4 0 6] 0 30]
[0 6]
8
[0 1]
9 2 10 [6 7 [0 3] 4 0 6] 0 2
]
0 1
]
8
[0 2]
9 2 10 [6 7 [0 3] 1 0] 0 2
]
There’s even more going on here than in our core-implemented decrement, but
doors are a generalisation of cores, so that’s to be expected.
Hoon has special support, though, for one-armed doors. This is precisely how
functions (also called gates or traps, depending on the context) are
implemented in Hoon. The following is probably the most idiomatic version of
naïve decrement:
|= m=@
=/ n 0
|-
?: =(+(n) m)
n
$(n +(n))
The |=
rune that we’ve been using throughout these examples really defines a
door, taking the specified argument, with a single arm called ‘$’. The |-
rune here does the same, except it immediately calls the ‘$’ arm after defining
it. The last line, $(n +(n))
, is analogous to the recur(n +(n))
line in
our first example: it evaluates the ‘$’ arm, except changing the value of ‘n’
to ‘n + 1’ in the environment.
(Note that there are two ‘$’ arms defined in the above code – one via the use
of |=
, and one via the use of |-
. But there is no confusion as to which
one we mean, since the latter has been the latest to be added to the subject.
Additions to the subject are always prepended in Hoon – i.e. they are
placed at address 2. As the topmost ‘$’ in the subject is the one that
corresponds to |-
, it is resolved first.)
The compiled Nock for this version looks like the following:
[8
[1 0]
8
[1
6
[5 [4 0 6] 0 30]
[0 6]
9 2 10 [6 4 0 6] 0 1
]
9 2 0 1
]
And it is possible (see the appendix) to show that, modulo some different
addressing, this reduces exactly to our hand-rolled Nock code.
UPDATE: my colleague Ted Blackman, an actual Hoon programmer, recommended
the following as a slightly more idiomatic version of naïve decrement:
=| n=@
|= m=@
^- @
?: =(+(n) m)
n
$(n +(n))
Note that here we’re declaring ‘n’ outside of the gate itself by using another
rune, =|
, that gives the variable a default value based on its type (an
atom’s default value is 0). There’s also an explicit type cast via ^- @
,
indicating that the gate produces an atom (like type signatures in Haskell, it
is considered good practice to include these, even though they may not strictly
be required).
Declaring ‘n’ outside the gate is interesting. It has an imperative feel,
as if one were writing the code in Python, or were using a monad like ‘State’
or a ‘PrimMonad’ in Haskell. Like in the Haskell case, we aren’t actually
doing any mutation here, of course – we’re creating new subjects to evaluate
each iteration of our Nock formula against. And the resulting Nock is very
succinct:
[6
[5 [4 0 14] 0 6]
[0 14]
9 2 10 [14 4 0 14] 0 1
]
Basic Generators
If you tested the above examples, I instructed you to do so by typing them into
Arvo’s dojo. I’ve come to believe that, in general, this is a poor way to
teach Hoon. It shouldn’t be done for all but the most introductory examples
(such as the ones I’ve provided here).
If you’ve learned Haskell, you are familiar with the REPL provided by GHCi, the
Glasgow Haskell Compiler’s interpreter. Code running in GHCi is implicitly
running in the IO monad, and I think this leads to confusion amongst newcomers
who must then mentally separate “Haskell in GHC” from “Haskell in GHCi.”
I think there is a similar problem in Hoon. Expressions entered into the dojo
implicitly grow or shrink or otherwise manipulate the dojo’s subject, which is
not, in general, available to standalone Hoon programs. Such standalone Hoon
programs are called generators. In general, they’re what you will use when
working in Hoon and Arvo.
There are four kinds of generators: naked, %say
, %ask
, and %get
. In this
post we’ll just look at the first two; the last couple are out of scope, for
now.
Naked Generators
The simplest kind of generator is the ‘naked’ generator, which just exists in
a file somewhere in your Urbit’s “desk.” If you save the following as
naive-decrement.hoon
in an Urbit’s home/gen
directory, for example:
|= m=@
=/ n 0
|-
?: =(+(n) m)
n
$(n +(n))
Then you’ll be able to run it in a dojo via:
~zod:dojo> +naive-decrement 20
19
A naked generator can only be a simple function (technically, a gate) that
produces a noun. It has no access to any external environment – it’s
basically just a self-contained function in a file. It must have an argument,
and it must have only one argument; to pass multiple values to a naked
generator, one must use a cell.
Say Generators
Hoon is a purely functional language, but, unlike Haskell, it also has no IO
monad to demarcate I/O effects. Hoon programs do not produce effects on their
own at all – instead, they construct nouns that tell Arvo how to produce
some effect or other.
A %say
generator (where %say
is a symbol) produces a noun, but it can also
make use of provided environment data (e.g. date information, entropy, etc.).
The idea is that the generator has a specific structure that Arvo knows how to
handle, in order to supply it with the requisite information. Specifically,
%say
generators have the structure:
:- %say
|= [<environment data> <list of arguments> <list of optional arguments>]
:- %noun
<code>
I’ll avoid discussing what a list is in Hoon at the moment, and we won’t
actually use any environment data in any examples here. But if you dump the
following in home/gen/naive-decrement.hoon
, for example:
:- %say
|= [* [m=@ ~] ~]
:- %noun
=/ n 0
|-
?: =(+(n) m)
n
$(n +(n))
you can call it from the dojo via the mechanism as before:
~zod:dojo> +naive-decrement 20
19
The generator itself actually returns a particularly-structured noun; a cell
with the symbol %say
as its head, and a gate returning a pair of the symbol
%noun
and a noun as its tail. The %noun
symbol describes the data produced
by the generator. But note that this is not displayed when evaluating the
generator in the dojo – instead, we just get the noun itself, but this
behaviour is dojo-dependent.
I think one should almost get in the habit of writing %say
generators for
most Hoon code, even if a simple naked generator or throwaway dojo command
would do the trick. They are so important for getting things done in Hoon that
it helps to learn about & start using them sooner than later.
Fin
I’ve introduced Hoon and given a brief tour of what I think are some of the
most important tools for getting work done in the language. Cores, doors, and
gates will get you plenty far, and early exposure to generators, in the form of
the basic naked and %say
variants, will help you avoid the habit of
programming in the dojo, and get you writing more practically-structured Hoon
code from the get-go.
I haven’t had time in this post to describe Hoon’s type system, which is
another very important topic when it comes to getting work done in the
language. I’ll probably write one more to create a small trilogy of sorts –
stay tuned.
Appendix
Let’s demonstrate that the compiled Nock code from our door-implemented
decrement reduces to the same as our hand-rolled Nock, save different address
use. Recall that our compile Nock code was:
[8
[1 0]
8
[1
6
[5 [4 0 6] 0 30]
[0 6]
9 2 10 [6 4 0 6] 0 1
]
9 2 0 1
]
An easy reduction is from Nock-9 to Nock-2. Note that *[a 9 b c]
is the same
as *[*[a c] 2 [0 1] 0 b]
. When ‘c’ is [0 1]
, we have that *[a c] = a
,
such that *[a 9 b [0 1]]
is the same as *[a 2 [0 1] 0 b]
, i.e. that the
formula [9 b c]
is the same as the formula [2 [0 1] 0 b]
. We can thus
reduce the use of Nock-9 on the last line to:
[8
[1 0]
8
[1
6
[5 [4 0 6] 0 30]
[0 6]
9 2 10 [6 4 0 6] 0 1
]
2 [0 1] 0 2
]
The remaining formula involving Nock-9 evaluates [10 [6 4 0 6] 0 1]
against
the subject, and then evaluates [2 [0 1] [0 2]]
against the result. Note
that, for some subject ‘a’, we have:
*[a 10 [6 4 0 6] 0 1]
= #[6 *[a 4 0 6] *[a 0 1]]
= #[6 *[a 4 0 6] a]
= #[3 [*[a 4 0 6] /[7 a]] a]
= #[1 [/[2 a] [*[a 4 0 6] /[7 a]]] a]
= [/[2 a] [*[a 4 0 6] /[7 a]]]
= [*[a 0 2] [*[a 4 0 6] *[a 0 7]]]
= *[a [0 2] [4 0 6] [0 7]]
such that [10 [6 4 0 6] 0 1] = [[0 2] [4 0 6] [0 7]]
. And for
c = [[0 2] [4 0 6] [0 7]]
and some subject ‘a’, we have:
*[a 9 2 c]
= *[*[a c] 2 [0 1] 0 2]
and for b = [2 [0 1] 0 2]
:
*[*[a c] b]
= *[a 7 c b]
= *[a 7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]]
such that:
[9 2 [0 2] [4 0 6] [0 7]] = [7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]]
Now. Note that for any subject ‘a’ we have:
*[a 7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]]
= *[a 7 [[0 2] [4 0 6] [0 7]] *[a 0 2]]
since *[a 2 [0 1] 0 2] = *[a *[a 0 2]]
. Thus, we can reduce:
*[a 7 [[0 2] [4 0 6] [0 7]] *[a 0 2]]
= *[*[a [0 2] [4 0 6] [0 7]] *[a 0 2]]
= *[a 2 [[0 2] [4 0 6] [0 7]] [0 2]]
such that
[7 [[0 2] [4 0 6] [0 7]] [2 [0 1] 0 2]] = [2 [[0 2] [4 0 6] [0 7]] [0 2]]
and, so that, finally, we can reduce the compiled Nock to:
[8
[1 0]
8
[1
6
[5 [4 0 6] 0 30]
[0 6]
2 [[0 2] [4 0 6] [0 7]] 0 2
]
2 [0 1] 0 2
]
which, aside from the use of the dojo-assigned address 30 (and any reduction
errors on this author’s part), is the same as our hand-rolled Nock.
31 Jan 2019
I wrote a little Nock interpreter called hnock some months ago
and just yesterday updated it to support the latest version of Nock, 4K. Nock
– the base layer VM of Urbit – is a very simple little “functional
assembly language” of sorts. It is of particular interest in that it is
capable of practical computation (indeed, it is Turing complete) but is not
defined in terms of the lambda calculus. So, no variable naming and
capture-avoidance and the like to deal with, which is kind of neat.
Nock (at 4K) itself has a small specification, repeated below. Famously, it
fits on a t-shirt:
nock(a) *a
[a b c] [a [b c]]
?[a b] 0
?a 1
+[a b] +[a b]
+a 1 + a
=[a a] 0
=[a b] 1
/[1 a] a
/[2 a b] a
/[3 a b] b
/[(a + a) b] /[2 /[a b]]
/[(a + a + 1) b] /[3 /[a b]]
/a /a
#[1 a b] a
#[(a + a) b c] #[a [b /[(a + a + 1) c]] c]
#[(a + a + 1) b c] #[a [/[(a + a) c] b] c]
#a #a
*[a [b c] d] [*[a b c] *[a d]]
*[a 0 b] /[b a]
*[a 1 b] b
*[a 2 b c] *[*[a b] *[a c]]
*[a 3 b] ?*[a b]
*[a 4 b] +*[a b]
*[a 5 b c] =[*[a b] *[a c]]
*[a 6 b c d] *[a *[[c d] 0 *[[2 3] 0 *[a 4 4 b]]]]
*[a 7 b c] *[*[a b] c]
*[a 8 b c] *[[*[a b] a] c]
*[a 9 b c] *[*[a c] 2 [0 1] 0 b]
*[a 10 [b c] d] #[b *[a c] *[a d]]
*[a 11 [b c] d] *[[*[a c] *[a d]] 0 3]
*[a 11 b c] *[a c]
*a *a
Perhaps you are a neophyte Haskeller and have never implemented an interpreter
for a language before. Nock makes for an excellent target to practice on.
Expressions
A noun in Nock is either an atom, i.e. an unsigned integer, or a cell,
i.e. an ordered pair of nouns:
data Noun =
Atom Integer
| Cell Noun Noun
deriving Eq
Very simple. A Nock expression is then a noun, or some operator applied to
a noun. Per the spec, the operators are denoted by ?
, +
, =
, /
, #
,
and *
, and the fun, convenient, and authentically Urbit-y way to pronounce
these are wut, lus, tis, net, hax, and tar respectively:
data Expr =
Noun Noun
| Wut Noun
| Lus Noun
| Tis Noun
| Net Noun
| Hax Noun
| Tar Noun
deriving Eq
So, equipped with the above definitions, you can write Nock expressions in
Haskell. For example, the following expression:
maps to:
Tar
(Cell (Atom 57)
(Cell (Atom 4)
(Cell (Atom 0) (Atom 1))))
Note per the spec that cells associate to the right, thus something like [4 0
1]
maps to [4 [0 1]]
, and thus Cell (Atom 4) (Cell (Atom 0) (Atom 1))
.
Evaluation and Semantics
For evaluation one can more or less copy out the production rules from the Nock
spec. We can use an Either type to denote a lack of defined semantics:
data Error = Error Noun
deriving Show
type Possibly = Either Error
and then simply define the various evaluation functions appropriately. ‘wut’,
for example, checks if something is a cell:
wut :: Noun -> Possibly Noun
wut noun = return $ case noun of
Cell {} -> Atom 0
Atom {} -> Atom 1
(note that in Nock and Urbit more generally, 0 denotes ‘true’).
‘tis’ checks if the elements of a cell are equal:
tis :: Noun -> Possibly Noun
tis noun = case noun of
Atom {} -> Left (Error noun)
Cell m n -> return $
if m == n
then Atom 0
else Atom 1
And so on. There are other operators for lookups, substitution, et cetera.
The most involved operator is ‘tar’, which constitutes the Nock interpreter
proper. One can simply match on the cases appropriately:
tar :: Noun -> Possibly Noun
tar noun = case noun of
Cell a (Cell (Cell b c) d) -> do
tard0 <- tar (Cell a (Cell b c))
tard1 <- tar (Cell a d)
return (Cell tard0 tard1)
Cell a (Cell (Atom 0) b) ->
net (Cell b a)
Cell _ (Cell (Atom 1) b) ->
return b
Cell a (Cell (Atom 2) (Cell b c)) -> do
tard0 <- tar (Cell a b)
tard1 <- tar (Cell a c)
tar (Cell tard0 tard1)
-- ... and so on
It is particularly useful to look at ‘tar’ to get an idea of what is going on
in Nock. One is always evaluating a noun to another noun. Semantics are only
defined when the noun being evaluated is a cell, and that cell is said to have
the structure:
The subject is essentially the environment, reified explicitly. The formula
is the code to execute against the subject.
The resulting noun is called a ‘product’. So, evaluation proceeds as:
[subject formula] -> product
In any valid application of ‘tar’, i.e. for some *[x y]
, ‘x’ is the subject,
and ‘y’ is the formula. The first case of ‘tar’, for example, is defined via:
Recall that cells in Nock are right-associative, so [a 0 b]
is really [a [0
b]]
. In other words: a
is the subject, [0 b]
is the formula. The formula
[0 b]
, read as ‘Nock-zero’, means “look up the value at address ‘b’ in the
subject,” the subject here being ‘a’. So *[a 0 b]
means “look up the value
at address ‘b’ in ‘a’.”
Other formulas in the various ‘tar’ cases denote additional useful language
features. A formula [1 b]
is the constant function, [6 b c d]
is a
conditional, [7 b c]
is function (viz. formula) composition, and so on.
There is a Turing-complete amount of power packed onto that t-shirt!
In any case, once one has constructed all of these requisite evaluation
functions, we can stitch them together in a single evaluator:
eval :: Expr -> Possibly Noun
eval expr = case expr of
Noun noun -> return noun
Wut e -> wut e
Lus e -> lus e
Tis e -> tis e
Net e -> net e
Hax e -> hax e
Tar e -> tar e
Parsing
Writing a combinator-based parser is an easy job. You can use parsec,
for example, and define your combinators as follows.
Given low-level ‘atom’ and ‘cell’ parsers, a noun is just an atom or a cell:
noun :: Monad m => P.ParsecT T.Text u m Noun
noun =
P.try cell
<|> atom
and an expression is either an operator, followed by an noun, or just a noun:
expr :: Monad m => P.ParsecT T.Text u m Expr
expr =
P.try operator
<|> fmap Noun noun
operator :: Monad m => P.ParsecT T.Text u m Expr
operator = do
op <- P.oneOf "?+=/#*"
case op of
'?' -> fmap Wut noun
'+' -> fmap Lus noun
'=' -> fmap Tis noun
'/' -> fmap Net noun
'#' -> fmap Hax noun
'*' -> fmap Tar noun
_ -> fail "op: bad token"
Note that we don’t allow expressions like the following, taken from the Nock
spec:
*[a *[[c d] 0 *[[2 3] 0 *[a 4 4 b]]]]
These are used to define the production rules, but are not valid Nock
expressions per se.
The end parser is simply:
parse :: T.Text -> Either P.ParseError Expr
parse = P.runParser expr [] "input"
Example
The final interpreter is pretty simple. Here we distinguish between cases in
order to report either parsing or evaluation errors in basic fashion,
respectively:
hnock :: T.Text -> Noun
hnock input = case parse input of
Left perr -> error (show perr)
Right ex -> case eval ex of
Left err -> error (show err)
Right e -> e
It’s easy to test. Take the simple Nock-zero example given previously, where
*[a 0 b]
means “look up the value at address ‘b’ in ‘a’”. Nock’s addressing
scheme is simple; address ‘2’ means the leftmost component of a cell, and ‘3’
means the rightmost component. For every additional cell you want to recurse
into, you just multiply by two and add one as is necessary. Address ‘1’ is
just the cell itself.
So, let’s test:
~ hnock "*[[[1 2] [3 4]] [0 1]"
[[1 2] [3 4]]
Again, the formula [0 1]
means look up at address one, which is the cell
itself.
~ hnock "*[[[1 2] [3 4]] [0 2]]"
[1 2]
The formula [0 2]
means look up at address 2, which is the leftmost component
of the cell.
~ hnock "*[[[1 2] [3 4]] [0 7]]"
4
To understand [0 7]
, we first look up the rightmost component [3 4]
at
address 3, then multiply by two and add one for the rightmost component of
that.
The most famous Nock learner’s problem is to implement a function for
decrementing an atom (if you could do this in ~2010 you could receive an
Urbit galaxy for it, which is now probably worth a serious chunk of change).
Nock’s only arithmetic operator is increment, so the trick is to decrement by
actually counting up from zero.
(N.b. in practice this is not how one actually evaluates decrement in Nock.
The compiler is ‘jetted’, such that alternate implementations, e.g. the CPU’s
decrement instruction directly, can be used to evaluate semantically-identical
expressions.)
But the idea is simple enough. To warm up, we can implement this kind of
decrement in Haskell. Don’t handle the underflow condition, just to keep
things simple:
dec :: Integer -> Integer
dec m =
let loop n
| succ n == m = n
| otherwise = loop (succ n)
in loop 0
Look at the language features we’re using to accomplish this. There’s:
- Variable naming / declaration, for the ‘m’, ‘loop’, and ‘n’ arguments
- Integer increment, via the built-in ‘succ’
- Equality testing, via ‘==’
- Conditional expressions, via the ‘|’ guards
- A function call, in ‘loop 0’
Let’s try and duplicate this in Nock, step by step. We’ll want analogues for
all of our declared variables, to start.
A Nock formula that simply returns the subject it’s evaluated against is easy:
[0 1]
. We used it above.
That’s the equivalent of our ‘m’ declaration in the Haskell code.
Similarly, we can just ignore the subject and return a ‘0’ via [1 0]
.
That’s the initial value, 0, that appears in ‘loop 0’.
We can use Nock-8 to pair these together. It evaluates a formula against the
subject, and then uses the original subject, augmented with the product, to
evaluate another one:
~ hnock "*[10 [8 [1 0] [0 1]]]"
[0 10]
The last thing we need define is the ‘loop’ function itself. We’re going to
place it at the head of the subject, i.e. at address two. Our ‘n’ and ‘m’
variables will then be at addresses six and seven, respectively. When we
define our ‘loop’ analogue, we just need to refer to those addresses, instead
of our variable names.
(N.b. Nock’s addressing scheme seems more or less equivalent to a
tree-flavoured variant of de Bruijn indices.)
The ‘loop’ function in the Haskell code tests to see the successor of ‘n’ is
equal to ‘m’, and if so, returns ‘n’. Otherwise it loops on the successor of
‘n’.
Nock-4 computes the successor. Remember that ‘n’ is at address six, so our
analogue to ‘succ n’ is [4 0 6]
. We want to check if it’s equal to ‘m’, at
address seven, gotten via [0 7]
. The equality check is then handled by
Nock-5:
Nock-6 is the conditional operator. If the equality check is true, we just
want to return ‘n’ via [0 6]
. The whole formula will look like this:
[6 [5 [4 0 6] [0 7]] [0 6] <loop on successor of n>]
To loop, we’ll use Nock-2. We’ll compute a new subject – consisting of our
‘loop’ analogue at address two, an updated ‘n’ at address six, and our
original ‘m’ at address seven – and then evaluate the same looping formula
against it. In Nock, the formula is:
[2 [[0 2] [4 0 6] [0 7]] [0 2]]
And our ‘loop’ analogue in full is thus:
[6 [5 [4 0 6] [0 7]] [0 6] [2 [[0 2] [4 0 6] [0 7]] [0 2]]]
Finally, we now need to actually store this in the subject. Again, we’ll
ignore the subject with Nock-1, and use Nock-8 to tuple the formula up with the
‘n’ and ‘m’ analogues:
(N.b. at this point we’re pretty much at line noise – spacing it out doesn’t
make it that much more readable, but it does avoid word wrap.)
[8
[1 0]
[8
[1
[6
[5 [4 0 6] [0 7]]
[0 6]
[2 [[0 2] [4 0 6] [0 7]] [0 2]]
]
]
[2 [0 1] [0 2]]
]
]
Let’s check if it works. We’re embedded in Haskell, after all, so let’s get
some support from our host, for convenience:
~ let dec = \subj -> "*[" <> subj <> " [8 [1 0] [8 [1 [6 [5 [4 0 6] [0 7]] [0 6] [2 [[0 2] [4 0 6] [0 7]] [0 2]]]] [2 [0 1] [0 2]]]]]"
~ hnock (dec "10")
9
~ hnock (dec "25")
24
~ hnock (dec "200")
199
Phew!
Fin
Nock makes for an interesting “functional assembly language” of sorts. Its
spec is so simple that writing a naïve interpreter is more or less a process of
copying out the production rules. You can do it without a lot of thought.
The time-honoured decrement-writing exercise, on the other hand, really makes
one think about Nock. That the environment is always explicit, you can access
it via a nice addressing scheme, that you can refer to functions via pointers
(viz. addresses), etc. It’s pretty intuitive, when you dig into the mechanics.
In any case – writing a decrement (or similar function) and writing your own
interpreter to test it is a fun exercise. From a practical perspective, it
also gives one a lot of intuition about programming in Hoon – it’s
probably not too far from the truth to compare the relationship between Nock
and Hoon to that between, say, x86 and C. If one is programming in Hoon, it
can’t hurt to to know your Nock!