Understanding Neuromorphic Computing

The phrase neuromorphic computing has a long history, dating back at least to the 1980s, when legendary Caltech researcher Carver Mead proposed designing ICs to mimic the organization of living neuron cells. But recently the term has taken on a much more specific meaning, to denote a branch of neural network research that has diverged significantly from the orthodoxy of convolutional deep-learning networks. So, what exactly is neuromorphic computing now? And does it have a future of important applications, or is it just another fertile ground for sowing thesis projects?

A Matter of Definition

As the name implies—if you read Greek, anyway—neuromorphic networks model themselves closely on biological nerve cells, or neurons. This is quite unlike modern deep-learning networks, so it is worthwhile to take a quick look at biological neurons.

Living nerve cells have four major components (Figure 1). Electrochemical pulses enter the cell through tiny interface points called synapses. The synapses are scattered over the surfaces of tree-root-like fibers called dendrites, which reach out into the surrounding nerve tissue, gather pulses from their synapses, and conduct the pulses back to the heart of the neuron, the cell body.

Figure 1. A schematic diagram shows synapses, dendrites, the cell body, and an axon.

In the cell body are structures that transform the many pulse trains arriving over the dendrites into an output pulse train. At least 20 different transform types have been identified in nature, ranging from simple logic-like functions to some rather sophisticated transforms. One of the most interesting for researchers—and the most widely used in neuromorphic computing—is the leaky integrator: a function that adds up pulses as they arrive, while constantly decrementing the sum at a fixed rate. If the sum exceeds a threshold, the cell body outputs a pulse.
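
To make the idea concrete, here is a minimal sketch of a leaky integrate-and-fire neuron in Python. The leak rate, weight, and threshold values are arbitrary choices for illustration, not parameters of any real cell or chip.

```python
# A minimal leaky integrate-and-fire sketch. All parameter values here are
# arbitrary illustrations, not taken from any real cell or chip.

def leaky_integrate_and_fire(spike_times, leak=0.1, weight=0.4,
                             threshold=1.0, n_steps=20):
    """Add `weight` for each arriving pulse, leak a fixed amount every time
    step, and emit an output pulse when the running sum crosses `threshold`."""
    potential = 0.0
    output_spikes = []
    arrivals = set(spike_times)
    for t in range(n_steps):
        potential = max(0.0, potential - leak)   # constant leak
        if t in arrivals:
            potential += weight                  # integrate the input pulse
        if potential >= threshold:
            output_spikes.append(t)              # fire...
            potential = 0.0                      # ...and reset
    return output_spikes

# Two bursts of closely spaced input pulses each produce one output spike.
print(leaky_integrate_and_fire([1, 2, 3, 4, 10, 11, 12, 13]))   # [3, 12]
```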

Synapses, dendrites, and cell bodies are three of the four components. The fourth is the axon: the tree-like fiber that conducts output pulses from the cell body out into the nervous tissue, ending at synapses on other cells’ dendrites or at junctions with muscle or organ cells.

So neuromorphic computers use architectural structures modeled on neurons. But there are many different implementation approaches, ranging from pure software simulations to dedicated ICs. The best way to define the field as it exists today may be to contrast it against traditional neural networks. Both are networks in which relatively simple computations occur at the nodes. But beyond that generalization there are many important differences.

Perhaps the most fundamental difference is in signaling. The nodes in traditional neural networks communicate by sending numbers across the network, usually represented as either floating-point or integer digital quantities. Neuromorphic nodes send pulses, or sometimes strings of pulses, in which timing and frequency carry the information—in other words, forms of pulse code modulation. This is similar to what we observe in biological nervous systems.
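
To make the contrast concrete, the sketch below shows one simple pulse code, rate coding, in which a larger value produces a denser spike train. It is only an illustration; real neuromorphic systems use a variety of rate and timing codes.

```python
# One simple way to carry a number as pulses: rate coding, where a larger
# value produces more spikes per unit time. (Illustrative sketch only.)
import random

def rate_encode(value, n_steps=20, max_value=1.0, seed=0):
    """Return a list of 0/1 spikes whose average rate tracks `value`."""
    rng = random.Random(seed)
    p = min(max(value / max_value, 0.0), 1.0)   # spike probability per step
    return [1 if rng.random() < p else 0 for _ in range(n_steps)]

print(rate_encode(0.3))   # a few scattered spikes
print(rate_encode(0.9))   # mostly spikes
```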

A second important difference is in the function performed in each node. Conventional network nodes do arithmetic: they multiply the numbers arriving on each of their inputs by predetermined weights and add up the products. Mathematicians see this as a simple dot product of the input vector and the weight vector. The resulting sum may then be subjected to some non-linear function such as normalization, min or max setting, or whatever other creative impulse moves the network designer. The number is then sent on to the next layer in the network.
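
In code, a conventional node reduces to a few lines: a dot product followed by a non-linearity. The ReLU used here is just one common choice.

```python
# A conventional network node: dot product of the input vector and the
# weight vector, followed by a non-linearity (ReLU in this sketch).

def conventional_node(inputs, weights, bias=0.0):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias  # dot product
    return max(0.0, total)                                      # ReLU

print(conventional_node([0.5, -1.0, 2.0], [0.8, 0.1, 0.4]))     # 1.1
```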

In contrast, neuromorphic nodes, like neuron cell bodies, can perform a large array of pulse-oriented functions. Most commonly used, as we have mentioned, is the leaky integrate and spike function, but various designers have implemented many others. Like real neurons, neuromorphic nodes usually have many input connections feeding in, but usually only one output. In reference to living cells, neuromorphic inputs are often called synapses or dendrites, the node may be called a neuron, and the output tree an axon.

The topologies of conventional and neuromorphic networks also differ significantly. Conventional deep-learning networks comprise strictly cascaded layers of computing nodes. The outputs from one layer of nodes go only into selected inputs of the next layer (Figure 2). In inference mode—when the network is already trained and is in use—signals flow only in one direction. (During training, signals flow in both directions, as we will discuss in a moment.)

Figure 2. The conventional deep-learning network is a cascaded series of computing nodes.

There are no such restrictions on the topology of neuromorphic networks. As in real nervous tissue, a neuromorphic node may get inputs from any other node, and its axon may extend to anywhere (Figure 3). Thus, configurations such as feedback loops and delay-line memories, anathema in conventional neural networks, are in principle quite acceptable in the neuromorphic field. This allows the topologies of neuromorphic networks to extend well beyond what can be done in conventional networks, into areas of research such as long short-term memory (LSTM) networks and other recurrent networks.
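
The sketch below illustrates the point: when the whole topology is just a weight matrix, nothing prevents a cycle. Two toy neurons wired in a loop keep a pulse bouncing back and forth indefinitely, something a strictly layered feed-forward network cannot do. Sizes, weights, and thresholds are arbitrary.

```python
# With an arbitrary weight matrix, any neuron may feed any other, including
# feedback loops. weights[i][j] is the synapse from neuron j to neuron i.

def step_spiking_network(potentials, spikes, weights, leak=0.1, threshold=1.0):
    """Advance every neuron in the network by one time step."""
    new_potentials, new_spikes = [], []
    for i, v in enumerate(potentials):
        v = max(0.0, v - leak)
        v += sum(w * s for w, s in zip(weights[i], spikes))  # inputs from anywhere
        fired = v >= threshold
        new_spikes.append(1 if fired else 0)
        new_potentials.append(0.0 if fired else v)
    return new_potentials, new_spikes

# Two neurons wired in a loop (0 -> 1 -> 0): a single seed pulse keeps
# bouncing back and forth, a configuration a strictly layered network forbids.
weights = [[0.0, 1.2],
           [1.2, 0.0]]
potentials, spikes = [0.0, 0.0], [1, 0]
for _ in range(6):
    potentials, spikes = step_spiking_network(potentials, spikes, weights)
    print(spikes)   # alternates [0, 1], [1, 0], [0, 1], ...
```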

Figure 3. Connections between living neurons can be complex and three-dimensional.

 

Implementation

Carver Mead may have dreamt of implementing the structure of a neuron in silicon, but developers of today’s deep-learning networks have abandoned that idea for a much simpler approach. Modern, conventional neural networks are in effect software simulations—computer programs that perform the matrix arithmetic defined by the neural network architecture. The network is just a graphic representation of a large linear algebra computation.

Given the inefficiencies of simulation, developers have been quick to adopt optimizations to reduce the computing load, and hardware accelerators to speed execution. Data compression, use of shorter number formats for the weights and outputs, and use of sparse-matrix algorithms have all been applied. GPUs, clever arrangements of multiply-accumulator arrays, and FPGAs have been used as accelerators. An interesting recent trend has been to explore FPGAs or ASICs organized as data-flow engines with embedded RAM, in an effort to reduce the massive memory traffic loads that can form around the accelerators—in effect, extracting a data-flow graph from the network and encoding it in silicon.
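
As a small example of one of these optimizations, the sketch below quantizes floating-point weights to 8-bit integers, trading a little precision for roughly a fourfold reduction in weight storage and memory traffic. Production toolchains use more elaborate calibration, but the idea is the same.

```python
# Sketch of one optimization mentioned above: quantizing 32-bit float weights
# to 8-bit integers. (Simple symmetric quantization for illustration.)

def quantize_int8(weights):
    """Map floats into the range [-127, 127] and return (ints, scale)."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.73, -0.12, 0.005, -0.98]
q, scale = quantize_int8(weights)
print(q)                      # small integers, e.g. [95, -16, 1, -127]
print(dequantize(q, scale))   # close to the original weights
```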

In contrast, silicon implementations of neuromorphic processors tend to resemble architecturally the biological neurons they consciously mimic, with identifiable hardware blocks corresponding to synapses, dendrites, cell bodies, and axons. The implementations are usually, but not always, digital, allowing them to run much faster than organic neurons or analog emulations, but they retain the pulsed operation of the biological cells and are often event-driven, offering the opportunity for huge energy savings compared to software or to synchronous arithmetic circuits.

Some Examples

The grandfather of neuromorphic chips is IBM’s TrueNorth, a 2014 spin-off from the US DARPA research program Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE). (Now that is really working for an acronym.) The heart of TrueNorth is a digital core that is replicated within a network-on-chip interconnect grid. The core contains five key blocks:

  1. The neuron: a time-multiplexed pulse-train engine that implements the cell-body functions for a group of 256 virtual neurons.
  2. A local 256 x 410-bit SRAM which serves as a crossbar connecting synapses to neurons and axons to synapses, and which stores neuron state and parameters.
  3. A scheduler that manages sequencing and processing of pulse packets.
  4. A router that manages transmission of pulse packets between cores.
  5. A controller that sequences operations within the core.

The TrueNorth chip includes 4,096 such cores.

The components in the core cooperate to perform a hardware emulation of neuron activity. Pulses move through the crossbar switch from axons to synapses to the neuron processor, and are transformed for each virtual neuron. Pulse trains pass through the routers to and from other cores as encoded packets. Since transforms like leaky integration depend on arrival time, the supervisory hardware in the cores maintains a time-stamping mechanism so that it knows the intended arrival time of each packet.
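
The toy model below captures the flavor of that arrangement: one pulse engine time-multiplexed over a handful of virtual neurons, with a small crossbar table deciding which input axons reach which synapses. It is only an illustration of the concept; the sizes, data structures, and scheduling bear no relation to IBM's actual design.

```python
# A toy core in the spirit of the description above -- NOT IBM's design.
# One pulse engine is time-multiplexed over a few "virtual" neurons, and a
# small crossbar table says which input axons reach which neurons' synapses.

class ToyCore:
    def __init__(self, n_neurons=4, n_axons=4, leak=0.1, threshold=1.0):
        self.leak, self.threshold = leak, threshold
        self.potentials = [0.0] * n_neurons
        # crossbar[axon][neuron] = synaptic weight (0.0 means not connected)
        self.crossbar = [[0.0] * n_neurons for _ in range(n_axons)]

    def step(self, input_spikes):
        """input_spikes[a] is 1 if a pulse arrived on axon a this tick.
        One engine visits each virtual neuron in turn (time multiplexing)."""
        output = []
        for n in range(len(self.potentials)):
            v = max(0.0, self.potentials[n] - self.leak)
            v += sum(self.crossbar[a][n] * s for a, s in enumerate(input_spikes))
            fired = v >= self.threshold
            output.append(1 if fired else 0)
            self.potentials[n] = 0.0 if fired else v
        return output

core = ToyCore()
core.crossbar[0][2] = 0.6   # axon 0 drives neuron 2
core.crossbar[1][2] = 0.6   # axon 1 drives neuron 2
print(core.step([1, 1, 0, 0]))   # neuron 2 integrates both pulses: [0, 0, 1, 0]
```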

Like many other neuromorphic implementations, TrueNorth’s main neuron function is a leaky pulse integrator, but designers have added a number of other functions, selectable via control bits in the local SRAM. As an exercise, IBM designers showed that their neuron was sufficiently flexible to mimic 20 different functions that have been observed in living neurons.

Learning

So far we have discussed mostly behavior of conventional and neuromorphic networks that have already been fully trained. But of course that is only part of the story. How the networks learn defines another important distinction between conventional and neuromorphic networks. And that subject will introduce another IC example.

Let’s start with networks of living neurons. Learning in living organisms is not well understood, but a few of the things we do know are relevant here. First, there are two separate aspects to learning: real nerve cells are able to reach out and establish new connections, in effect rewiring the network as they learn, and they also have a wide variety of functions available in their cell bodies. So learning can involve both changing connections and changing functions. Second, real nervous systems learn very quickly. Humans can learn to recognize a new face or a new abstract symbol from one or two instances. Conventional convolutional deep-learning networks might require tens of thousands of training examples to master the same new item.

This observation suggests, correctly, that training of deep-learning networks is profoundly different from biological learning. To begin with, the two aspects of learning are separated. Designers specify a topology before training, and it does not change unless the network requires redesign. Only the weights applied to the inputs at each node are altered during training.

The process itself is also different. The implementation of the network that gets trained is generally a software simulation running on server CPUs, often with graphics processing unit (GPU) acceleration. Trainers must assemble huge numbers—often tens or hundreds of thousands—of input data sets, and label each one with the correct classification values. Then one by one, trainers feed an input data set into the simulation’s inputs, and simultaneously input the labels. The software compares the output of the network to the correct classification and adjusts the weights of the final stage to bring the output closer to the right answers, generally using a gradient descent algorithm. Then the software moves back to the previous stage and repeats the process, and so on, until all the weights in the network have been adjusted to be a bit closer to yielding the correct classification for this example. Then on to the next example. Obviously this is time- and compute-intensive.
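
Stripped to its essentials for a single node, the weight-update step looks like the sketch below: compute the output, compare it with the label, and nudge each weight down the error gradient. A real training run repeats this across every layer and many thousands of examples; this is only the core idea, not a full backpropagation engine.

```python
# Gradient-descent weight updates for a single linear node (toy illustration).

def train_single_node(examples, labels, lr=0.1, epochs=200):
    weights = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, target in zip(examples, labels):
            y = sum(w * xi for w, xi in zip(weights, x))   # node output
            error = y - target
            # gradient of the squared error w.r.t. each weight is error * input
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights

# Learn y = 2*x0 - 1*x1 from a few labeled examples.
examples = [[1, 0], [0, 1], [1, 1], [2, 1]]
labels = [2, -1, 1, 3]
print(train_single_node(examples, labels))   # approaches [2.0, -1.0]
```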

Once the network has been trained and tested—there is no guarantee that training on a given network and set of examples will be successful—designers extract the weights from the trained network, optimize the computations, and port the topology and weights to an entirely different piece of software with a quite different sort of hardware acceleration, this time optimized for inference. This is how a convolutional network that required days of training in a GPU-accelerated cloud can end up running in a smart phone.

Neuromorphic Learning

Learning in TrueNorth is quite a different matter. The system includes its own programming language that allows users to set up the parameters in each core’s local SRAM, defining synapses within the core, selecting weights to apply to them, and choosing the functions for the virtual neurons, as well as setting up the routing table for connections with other cores. There is no learning mode per se, but apparently the programming environment can be set up so that TrueNorth cores can modify their own SRAMs, allowing for experiments with a wide variety of learning models.

That brings us to one more example, the Loihi chip described this year by Intel. Superficially, Loihi resembles TrueNorth rather closely. The chip is built as an orthogonal array of cores that contain digital emulations of cell-body functions and SRAM-based synaptic connection tables. Both use digital pulses to carry information. But that is about the end of the similarity.

Instead of one time-multiplexed neuron processor in each core, each Loihi core contains 1,024 simple pulse processors, preconnected in what Intel describes as tree-like groups. Communications between these little pulse processors are said to be entirely asynchronous. The processors themselves perform leaky integration via a digital state machine. Synapse weights vary the influence of each synapse on the neuron body. Connectivity is hierarchical, with direct tree connections within a group, links between groups within a core, and a mesh packet network connecting the 128 cores on the die.

The largest difference between Loihi and TrueNorth is in learning. Each Loihi core includes a microcoded Learning Engine that captures trace data from each neuron’s synaptic inputs and axon outputs and can modify the synaptic weights during operation. The fact that the engine is programmable allows users to explore different kinds of learning, including unsupervised approaches, where the network learns without requiring tagged examples.
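
Intel has not published the learning engine's microcode, but the general idea of trace-based plasticity can be sketched as below: each synapse keeps a decaying trace of its recent input spikes, and when the neuron fires, synapses whose traces are strong (the recent, causal inputs) are strengthened while the rest are weakened slightly. This is a generic spike-timing-style rule for illustration, not Loihi's actual algorithm.

```python
# A generic sketch of trace-based synaptic plasticity -- not Loihi's microcode.
# Each synapse keeps a decaying "trace" of recent input spikes; when the
# neuron fires, synapses with a strong trace are strengthened, others weakened.

def update_weights(weights, traces, post_spike, pre_spikes,
                   decay=0.8, lr=0.05, depress=0.01):
    # decay old traces, then bump the trace of any synapse that just spiked
    traces = [t * decay + s for t, s in zip(traces, pre_spikes)]
    if post_spike:
        weights = [min(1.0, w + lr * t - depress) for w, t in zip(weights, traces)]
    return weights, traces

weights = [0.5, 0.5]
traces = [0.0, 0.0]
# synapse 0 spikes just before the neuron fires; synapse 1 stays silent
for pre, post in [([1, 0], 0), ([1, 0], 1), ([0, 0], 0)]:
    weights, traces = update_weights(weights, traces, post, pre)
print(weights)   # synapse 0 strengthened, synapse 1 slightly weakened
```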

Where are the Apps?

We have described only two digital implementations of neuromorphic networks. There are many more examples, both digital and mixed-signal, as well as some rather speculative projects such as an MIT analog device using crystalline silicon-germanium to implement synapses. But are these devices only research aids and curiosities, or will they have practical applications? After all, conventional deep-learning networks, for all their training costs and—probably under-appreciated—limitations, are quite good at some kinds of pattern recognition.

It is just too early to say. Critics point out that in the four years TrueNorth has been available to researchers, the most impressive demo has been a pattern recognition implementation that was less effective than convolutional neural networks, and to make things even less impressive, was constructed by emulating a conventional neural network in the TrueNorth architecture. As for the other implementations, some were intended only for neurological research, some have been little-used, and some, like Loihi, are too recent to have been explored much.

But neuromorphic networks offer two tantalizing promises. First, because they are pulse-driven, potentially asynchronous, and highly parallel, they could be a gateway to an entirely new way of computing at high performance and very low energy. Second, they could be the best vehicle for developing unsupervised learning—a goal that may prove necessary for key applications like autonomous vehicles, security, and natural-language comprehension. Succeed or fail, they will create a lot more thesis projects.


Categories: AI | Author: Ron Wilson

5 comments to “Understanding Neuromorphic Computing”

  1. Arthur Sheiman says:

    Enjoyable and educational article. Thank you!

  2. Dan Buskirk says:

    You’ve established a reputation for interesting and informative articles, and this note is no exception. I’m hoping that you will be able to supply a reference to the “20 transform types” you mention.

    I would also like to make an observation concerning biologically relevant neural networks. Since we are, at present, living in a digital world, we must live with what you call “pulses”, but nature is under no such constraint. The action potential evolved as a way of transmitting electrical signals over relatively long distances using leaky wires in a saline environment. However, many neurons, perhaps the majority, have no need for and do not generate “pulses”. Local neurons are constantly integrating analog signals over time and space; calculations of daunting complexity are already complete when action potentials carry results to other parts of the brain.

  3. Great article – thank you!
    I’m just wondering if you consider BrainChip SNAPvision implementation to be first on the market (as they actually have real customers)?

  4. Interesting article. I doubt that a true neuromorphic processor can be implemented as an electronic chip. Light is intrinsically parallel, while electronics is inherently time-sequential. Thus, a hardware implementation of a true neuromorphic chip should be optical: for example, a 3D version of a silicon photonics chip with light waveguides instead of TSVs. Dynamically reconfigurable optical packet-switching network technology used in optical interconnects could be used for intra-chip digital pulse routing.

  5. Lambert Spaanenburg says:

    It is rare to find a text that clarifies such a seemingly complex affair. But clarity often comes with simplification, and basic definitions cannot bear simplification. Your definition of neuromorphy is not really wrong. It has a lot of things going for it, like the person who blamed the cat for the disappearance of the bird from the cage: he licks his mouth, sits underneath the cage, and the cage door is open! The Joplin hardware designed in 1991 at IMS Stuttgart for driver-less lane driving fits your definition but was, for marketing reasons, simply introduced as an adaptive controller.
    What I miss in your text is the proud reference to a long history of FPGAs nicely supporting neural systems. Development boards were freely distributed in the academic world, though I fear that Altera did not fully gather the fruit. These designs worked (in contrast to the TrueNorth)!
