Robust-first computing: Demon Horde Sort (short version)
Articles,  Blog

Robust-first computing: Demon Horde Sort (short version)

I want to talk about robust first computing, it’s what I’ve been
working on… and show a demo of the demon horde sort which is a particular
approach to sorting that exemplifies this robust-first idea now we have computers that are extremely
cheap and, less weak, fairly powerful and that culture of correctness and efficiency
only — c_e_o_ software — is creating incredible security
problems, the security nightmares that we’re living with today that we kind of
think they are sort of normal when in fact they’re crazy so if we think robust-first we say let’s
make sure we’re in the ballpark and then try to get it dead-on.
the architectural goal is to be able to take this computational
model and scale it indefinitely large, as long as we have money to buy the nodes and and power to power them and space real
estate to plug them all together to do that we need to focus on robustness this
computer potentially could be so big we cannot finish it before we start using it we’re gonna be able plug in more stuff
while it’s going it’s going to be so big that some
fraction of it’s gonna be you know getting repaired, or crapping out one way or another as we go the computations that we build have to
be inherently robust so that we can slap this stuff together. in each site it can either be empty or we can have
an atom one atom and here is an atom. it’s called ‘dreg’ uh… stands for dynamic regulator this diamond shape is called the event
window that is the limits of the number of
sites that this guy in the middle can access when he executes so there’s forty one sites total uh… they’re all within a distance of
four counting in city block distance. and when this guy in the center is executing he can read
and write to the event window arbitrarily but once he gives up control at the end of
his event then some other guy gets picked and he can read and write the event window which
might be overlapping, might be completely separate the movable feast machine as a whole can be executing many events
simultaneously as long as none of their event windows overlap now the way dreg specifically works is
this when it acts when it behaves it looks north south east or west one of
those locations picked at random to see whether it’s empty or
occupied and for example suppose it looks north it’s empty when it’s empty it picks some random number and with some low odds it will create a resource atom that I’m calling
rez in that spot and with a very low
probability instead of creating a rez it’ll create another dreg okay so dreg can reproduce into empty spots but is very low odds on the other hand if it looks in this in the direction in
this case if it looks west uh… it finds it’s occupied and picks a different random number with a
different probability and maybe just erases the thing whatever it is okay at the end of the step it diffuses and in general diffusion respects whatever constraints there are on the local layout ok? that’s it that’s the dreg rule we’ve got one dreg in it and we’ll start it up so there it was just diffusing around and eventually it will get lucky and decide to make something. now there’s, you
can’t destroy anything ’cause it’s the only thing that there is in the universe uh… there. it made a rez let’s get rid of the grid ’cause it gets visually a little crowded uh… we’re gonna we’re artificially slowing
this down so that we can get a little clue what’s going on. let’s let it run at
its natural speed uh… uh… and you see what starts to happen if we, if
you can read these numbers over here uh… there’s currently two one.. one.. there’s fifty rez… uh sixty rez and so on uh… overall there’s eighteen. nineteen percent of this little grid is occupied by
something. if we let it run, given the particular probabilities I picked for
creating stuff versus destroying stuff twice as likely to destroy as to create
and that tends to make it float around a third of the sites of the grid occupied so now we’re at thirty two percent it’ll end up going a little about the
third thirty five percent something when it reaches equilibrium. one of the essential things
about parallel computing is you have to deal with deadlock uh… crowding, queues full, and so on in this robust-first approach with this particular demonstration we’re saying let’s just deal with that by putting in this lower level that keeps the world about one-third full we can do more with this suppose i put in another element. he’s a
sorter the demon horde sort uh… but one of the behaviors that he’s got is he looks around himself when it’s his
turn to go and if he sees rez he converts them into more sorters okay so if we let this run just for a
brief bit the rez have been taken over we can take a dreg and rez base and build things on top of it that will compete for the available resources and that they will
they will automatically keep their total population under control because of they
can only get as big as the number of resources okay this is a cartoon of the world of a sorter he looks at to his right and up to see if there’s a number that’s bigger
than the number he’s carrying because that’s a number — we’re gonna try
to sort smaller numbers up bigger numbers down — uh… uh… so if he sees a bigger number
above him he would like to move it down if he sees a smaller number below him
and to the right he’d like to move it up and to the left ’cause in all cases we’re going
to sort from right to left so in this case the seventy-nine is
bigger than the threshold and so he’d be interested in moving it down the twenty-four is smaller than
threshold he’d be interested in moving it up but the destination for moving it up is all
currently occupied but we’ve got an empty spot so what the sorter decides to do is take the seventy-nine atom copy him down to this empty spot erase him from where he used to be.
moving it sort of a little quantum, a little
increment of sorting and then the sorter updates its threshold to be the value of
the atom it just moved so at the end of this event the forty-three will have changed to a
seventy-nine. not trying to do the entire sorting job yourself you just trying
to make things a little better and by making the spatial assumptions
saying that were going to sort from the right to the left small is up big is down that allows the individual sorting
element to make things better so here we’ve got a much bigger grid than we
had last last time. it doesn’t look bigger because the atoms are all drawn smaller and so on the field there’s these little eggs uh… at regular intervals each consists
of two sorters and two dregs uh… and if we look on down there’s more
and there’s more here in the middle another egg in addition there’s
this guy who’s an output guy, and the idea is we’re
sorting right to left so eventually the data guys will reach the left and these out guys will suck ’em out of
the system and do whatever the rest of the computation needs with them in our case its just gonna score them
it’s gonna say well i know what what numbers ought to be coming here
we’ll check to see how well the sorting is going. similarly over at the other end
on the right we’ve got an input guy he’s gonna generate data just at random for purposes
of this demo uh… to be sorted by the demon horde sort, and the guys
in the middle are the demon horde. so when i start this up in a sec it’ll be a
little confusing because a bunch of stuff will be happening at once but the first thing you’ll see is the input grid will spread because in addition to creating inputs,
the input elements are designed to look above, look below, and if the spots where there should be
another input guy are don’t have an input guy and they’re empty it goes ahead and
puts a new input guy there and the output grid makes a denser
row of two uh… uh… so we’ll see the grids grow as the input grids grow they will start
generating data — the blue atoms — and then the sorters will do their
diffusion and the dreg will do their diffusion and create rez and so on.
so let’s take a look at its behavior all right. so pretty quickly uh… we’ve now got a full stack of
green input grid guys that.. it takes a while for the machine to warm up.
the sorting error is seven positions on average that’s pretty bad. if it should be the
neighboring guy that’s a sorting error of one; off by two guys and so on uh… so a sorting error of.. well now
it’s down to five uh… um… is, you know, better, still not very good.
let’s let this run quicker and start to equilibrate a little bit uh… um… now we’re doing a little better the
sorting error is down to about three and a half positions on average
plus or minus, which is not too bad and the categorical error is a tenth of a
percent sometimes never over the course of a five
hundred step period not too bad And, this thing is embarrassingly robust unlike the correct-first c_e_o_ approach where do you do somethin once remember it and then trust it
from then on, here we are continually in the process of
recreating ourselves. so we can blow like two-thirds of the entire machine away and again now the score is definitely.. yeah, the categorical
error is up to a third, sorting error is terrible the thing’s getting all jammed up here so it takes a little longer to recover one last point here, this is a picture that’s showing just the sorters but instead of showing them as red it shows them colored by what their
threshold is. and you know it looks like there isn’t a whole lot happening in the last third of this grid. the colors seem
fairly uniform but they’re making increasingly fine
distinctions. if we want we can start stealing resources from this sorting thing the machine adjusts uh… uh… eventually its sorting
performance starts to degrade uh… but now we’ve got all of the space
over here there’s still events happening here okay. so that’s it this is obviously a different way to
compute than the way that we’ve gotten used to it but in a way it’s a lot more like how society
computes all of the stuff that we do in the world
tolerates these little errors the idea is that each computing element instead of assuming that everything else
is absolutely perfect should assume that everything else is
trying but might not be
perfect so computational elements that give a
little better than they get like the sorter doesn’t try to get the correct answer, it
just tries to give a little better than it gets thanks for watching


  • Roger Keulen

    Hi, Dave. I'm also trying to create a new computer. I want to tackle the problem of having a central controle unit. So yes, also mine computer you can turn on while it's not finished.

    I gone watch you movie now.. Good luck !

  • Alfonso F R

    A small and humble gift, sir: 
    Source code in Mathematica: <code>

  • Dan McNeill

    This is great. Might I suggest that you make a simpler "static" version — no "res" and the sorting demons stay in one place for educational purposes. It would be a kind of "filter" sort — really just a parallel bubble sort.

Leave a Reply

Your email address will not be published. Required fields are marked *