The "trick" of the CAM-6 was to encode the rules as a lookup table accessed
by an "address" formed from the states of the neighborhood. For example, a
one-bit cell state plus eight one-bit neighbors forms a 9-bit address, giving
512 possible table entries. There was also a "rule compiler" that built the
transition table and other computations from a special programming language.
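As a hypothetical sketch of the idea (not the CAM-6's actual rule compiler or
its language), here is how a rule such as Conway's Life could be "compiled"
into a 512-entry transition table indexed by the 9 neighborhood bits:

```python
# Hypothetical sketch: "compiling" a rule into a 512-entry lookup table.
# Bit 8 of the address is the center cell; bits 0-7 are its eight neighbors.
def compile_rule(rule):
    """Build the transition table by evaluating `rule` for every
    possible 9-bit neighborhood address."""
    table = [0] * 512
    for address in range(512):
        center = (address >> 8) & 1
        live_neighbors = bin(address & 0xFF).count("1")
        table[address] = rule(center, live_neighbors)
    return table

def life(center, live_neighbors):
    # Conway's Life: birth on 3 live neighbors, survival on 2 or 3.
    return 1 if live_neighbors == 3 or (center == 1 and live_neighbors == 2) else 0

table = compile_rule(life)
```

Once the table is built, one update step is a single table index per cell,
which is what made the hardware so fast.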
You might also want to look at [HTA92], which describes a 3D automata engine
in which computation is performed through reconfiguration (at a very low
level) of the hardware.
Are there any simulators for CAM?
Contributions by: Don Hopkins <hopkins@turing.ac.uk>
I (Don Hopkins) have written a recreational CAM-6 simulator (Toffoli &
Margolus's Cellular Automata Machine) and ported it to HyperLook (the user
interface development system I'm working on at Turing). It displays animated
cellular automata that you can edit in real time with the mouse! And it comes
with a free Lava Lamp!
The Cellular Automata Machine simulator (a SPARC binary with a bunch of
PostScript and data files) and the HyperLook runtime system are now available
for anonymous ftp! HyperLook and the CAM simulator run under OpenWindows Version
3 on color SPARC workstations.
See also CAM-PC and CAMEX.
What about running CA's on parallel or distributed machines?
Contributions by: Don Hopkins <hopkins@turing.ac.uk>
Yes, CAs are pretty trivial to run on massively parallel computers. If you
have data-parallel software (e.g. CM-Fortran or C* on the Connection Machine),
you just create a virtual processor for each cell and have each cell fetch its
neighbors and do a table lookup or computation.
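Here is a minimal serial sketch of that per-cell update (Conway's Life as the
example rule, and the function name is mine); on a data-parallel machine each
cell's body would run in its own virtual processor instead of these two loops:

```python
# Minimal serial sketch of the per-cell update: each cell fetches its eight
# neighbors (with toroidal wrap-around) and applies the rule -- Life here.
def step(grid):
    h, w = len(grid), len(grid[0])
    new = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Fetch the eight neighbors, wrapping at the edges.
            n = sum(grid[(y + dy) % h][(x + dx) % w]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
            new[y][x] = 1 if n == 3 or (grid[y][x] and n == 2) else 0
    return new
```

A blinker (three live cells in a row) oscillates with period two under this
update, which makes a quick sanity check.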
If you are not using data-parallel software, but instead are using
message-passing, it is still pretty simple. I typically partition the 2-D
array into a set of ``stripes''. E.g. on a 128-node machine, if I want to run
a 1024x1024 array, I give each processor a 1024x8 patch of cells (plus one
extra boundary row at the top and one at the bottom, so actually 1024x10 with
some redundancy). Each processor updates its local array, and then exchanges
its top and bottom rows with its neighbors. So you alternate between a
computation step, where you loop over your patch of cells (lots of work if you
have a big patch), and a communication step of 2 sends and 2 receives
(hopefully pretty quick). Imagine the processors connected as a ring; you
don't need any more connectivity than that (although it's good to have some
nice way to dump data out to disk or to a machine for analysis).
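The layout can be sketched in serial Python, with the ring of processors
simulated as a list of stripes and the two sends/receives replaced by copying
each ring neighbor's edge row into the ghost rows (Life on a toroidal grid is
the example rule; the function names are mine, not from any message-passing
library):

```python
# Serial sketch of the striped message-passing layout: each "processor" owns
# a stripe of rows plus one ghost row above and below, exchanges edge rows
# with its ring neighbors, then updates its own rows independently.
def life_row(above, row, below):
    # One Life step for a single row, given the rows above and below
    # (toroidal wrap-around in the horizontal direction).
    w = len(row)
    out = [0] * w
    for x in range(w):
        n = sum(r[(x + dx) % w]
                for r in (above, row, below)
                for dx in (-1, 0, 1)) - row[x]
        out[x] = 1 if n == 3 or (row[x] and n == 2) else 0
    return out

def striped_step(grid, nprocs):
    h = len(grid)
    rows_per = h // nprocs                       # assume nprocs divides h evenly
    stripes = [grid[p * rows_per:(p + 1) * rows_per] for p in range(nprocs)]
    new = []
    for p, stripe in enumerate(stripes):
        # "Exchange": receive the bottom row of the stripe above and the
        # top row of the stripe below (the ring wraps around).
        ghost_top = stripes[(p - 1) % nprocs][-1]
        ghost_bot = stripes[(p + 1) % nprocs][0]
        padded = [ghost_top] + stripe + [ghost_bot]
        # Compute: update only the rows this processor owns.
        new.extend(life_row(padded[i - 1], padded[i], padded[i + 1])
                   for i in range(1, len(padded) - 1))
    return new
```

Because the ghost rows are exact copies of the neighbors' edge rows, the
striped update is bit-identical to a global update of the whole array.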
Partitioning the array into stripes minimizes the ``surface area'' of the
cell patches, and so minimizes the communication you have to do (if you
partitioned it as a ``checkerboard,'' you'd have more data to exchange with more
neighbors). It also makes your inner loops more efficient, because you have
really wide rows to loop over, instead of a bunch of short rows. It also makes
each boundary a contiguous block of memory, so it's easy to send to its
neighbor.
If you have a high overhead for sending, you may want to consider keeping two
boundary rows on each side and doing a little bit of redundant computation, so
that you only communicate every other step.
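A sketch of that trick, assuming two ghost rows per side (helper names are
mine): after each local step one more boundary row goes stale, so with two
ghost rows a processor can take two steps before stale data reaches its own
rows; the recomputation of the inner ghost rows is the "little bit of
redundant computation" mentioned above.

```python
# Sketch of the double-boundary-row trick: with TWO ghost rows on each side
# of a stripe, two local update steps can be taken before any exchange.
def life_row(above, row, below):
    # One Life step for a single row (toroidal wrap-around horizontally).
    w = len(row)
    out = [0] * w
    for x in range(w):
        n = sum(r[(x + dx) % w]
                for r in (above, row, below)
                for dx in (-1, 0, 1)) - row[x]
        out[x] = 1 if n == 3 or (row[x] and n == 2) else 0
    return out

def two_local_steps(patch):
    # `patch` is the owned rows plus two ghost rows at each end. The
    # outermost rows are never updated; they go stale one row per step,
    # which is exactly what the extra ghost row absorbs.
    for _ in range(2):
        patch = ([patch[0]] +
                 [life_row(patch[i - 1], patch[i], patch[i + 1])
                  for i in range(1, len(patch) - 1)] +
                 [patch[-1]])
    return patch[2:-2]  # the rows this processor actually owns
```

After two local steps the owned rows still match two steps of a global
toroidal update exactly.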
In fact, I implemented a CA simulation using a network of Sun workstations
using the above layout, and BSD sockets. Using 16 Suns, I think I had CA code
running about 10-12 times faster than a single Sun. This was a few years ago on
Sun 3/50's at RPI. I had grand visions of turning the whole campus into a
monster CA simulation environment, but shortly after that, I got access to a
CM-2 and forgot about the Suns. :-) Actually, there's no need to stay on Suns
only - you could have some other machines in there as well, as long as they can
do socket communication to exchange data with the neighbors.
N. M. Allinson and M. J. Sales. CART - A cellular automata research tool.
Microprocessors and Microsystems, 16(8):4093, 1992. The authors
discuss the principles of cellular automata and present a detailed design of
an expandable machine together with an example maze-solving program.
[HTA92] N. Howard, R. Taylor, and N. Allinson. The design and implementation
of a massively-parallel fuzzy architecture. Proc. IEEE, pages 545-552,
March 1992.