Neural Networks, Volume 33, September 2012, Pages 42–57

Advancing interconnect density for spiking neural network hardware implementations using traffic-aware adaptive network-on-chip routers

https://doi.org/10.1016/j.neunet.2012.04.004

Abstract

The brain is highly efficient in how it processes information and tolerates faults. Arguably, the basic processing units are neurons and synapses that are interconnected in a complex pattern. Computer scientists and engineers aim to harness this efficiency and build artificial neural systems that can emulate the key information processing principles of the brain. However, existing approaches cannot provide the dense interconnect for the billions of neurons and synapses that are required. Recently, a reconfigurable and biologically inspired paradigm based on network-on-chip (NoC) and spiking neural networks (SNNs) has been proposed as a new method of realising an efficient, robust computing platform. However, the use of the NoC as an interconnection fabric for large-scale SNNs demands a good trade-off between scalability, throughput, neuron/synapse ratio and power consumption. This paper presents a novel traffic-aware, adaptive NoC router, which forms part of a proposed embedded mixed-signal SNN architecture called EMBRACE (EMulating Biologically-inspiRed ArChitectures in hardwarE). The proposed adaptive NoC router provides the inter-neuron connectivity for EMBRACE, maintaining router communication and avoiding dropped router packets by adapting to router traffic congestion. Results are presented on the throughput, power and area performance of the adaptive router, implemented in a 90 nm CMOS technology, which outperforms existing NoCs in this domain. The adaptive behaviour of the router is also verified on a Stratix II FPGA implementation of a 4 × 2 router array under real-time traffic congestion. The presented results demonstrate the feasibility of using the proposed adaptive NoC router within the EMBRACE architecture to realise large-scale SNNs on embedded hardware.

Introduction

Brain functionality is based on the specialised signal processing capabilities of neurons (Trappenberg, 2010). A neuron consists of a cell body (soma) that possesses many input branches called dendrites, which carry information from other neurons. The neuron output (axon) communicates information to other neurons in the form of action potential pulses or spikes. Spikes are transmitted between neurons via weighted synaptic connections (synapses). Each neuron maintains a membrane potential, which is a function of the incoming spike rate, the neural network spatial distribution (network topology) and the synaptic weights (Maass, 1997). A neuron fires (generates an output spike) when the sum of its weighted input spikes exceeds a firing threshold value.
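To make the firing rule above concrete, the following C++ sketch models a simple leaky integrate-and-fire neuron that accumulates weighted input spikes and emits an output spike when its membrane potential crosses a threshold. The class name, leak factor and threshold value are illustrative assumptions and are not taken from any particular neuron model discussed in this paper.

    #include <cstddef>
    #include <vector>

    // Minimal leaky integrate-and-fire neuron sketch (illustrative only).
    // The leak factor and firing threshold are arbitrary example values,
    // not parameters of any neuron cell described in this paper.
    class LifNeuron {
    public:
        explicit LifNeuron(double threshold = 1.0, double leak = 0.9)
            : threshold_(threshold), leak_(leak), membrane_(0.0) {}

        // Integrate one time step of weighted input spikes and report
        // whether the neuron fires (membrane potential exceeds threshold).
        bool step(const std::vector<double>& weights,
                  const std::vector<bool>& spikes) {
            membrane_ *= leak_;  // passive decay of the membrane potential
            for (std::size_t i = 0; i < weights.size() && i < spikes.size(); ++i) {
                if (spikes[i]) {
                    membrane_ += weights[i];  // weighted contribution of each input spike
                }
            }
            if (membrane_ > threshold_) {
                membrane_ = 0.0;  // reset after firing
                return true;      // output spike generated
            }
            return false;
        }

    private:
        double threshold_;
        double leak_;
        double membrane_;
    };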

The brain contains an estimated 10¹⁰ neuron cells and 10¹⁵ synapses arranged in a parallel manner (Gerstner & Kistler, 2002). Large-scale artificial neural networks (ANNs) attempt to emulate (to a degree) the information processing function of the mammalian brain (Jain, Mao, & Mohiuddin, 1996). Researchers have studied and developed various mathematical neuron models to emulate neuron behaviour. Spiking neural networks (SNNs) (Maass, 1997) are a type of ANN which use the timing of spikes, the network topology and synaptic weights to code information. SNNs offer the possibility of developing biologically plausible neuron models and exhibit the ability to adapt quickly and to provide fault-tolerant capabilities (Paugam-Moisy & Bohte, 2009). These attributes, and the ability of SNNs to provide good solutions in partially observable environments and for data with uncertainty, make SNNs suitable for implementing resilient classifiers and control applications.

The brain is highly efficient at processing information and tolerating faults. Research aims to harness these efficiencies and to build artificial neural systems that can emulate the key information processing principles of the brain. However, spiking neuron models have become very complex and the number of neurons anticipated for large-scale simulations is significant. Currently, high-performance computing systems are required to perform simulation of large-scale neural networks. Software-based SNNs are too slow to execute large-scale SNN-based algorithms and do not scale efficiently to an ever-increasing number of neurons. The complexity of inter-neuron connectivity makes it difficult to develop biological-scale SNNs in hardware, where the rapid increase in the ratio of fixed connections to the number of neurons limits the size of the network (Maguire et al., 2007). The traditional bus-based approach does not provide the mechanisms to overcome the SNN interconnection problem. The challenge is therefore to develop a dense synapse/neuron interconnection pattern, implemented in an embedded electronic device with low power consumption, reconfigurable capabilities, intrinsic parallelism and a high level of scalability; however, this remains a significant engineering problem.

Recently, research has focused on the network-on-chip (NoC) interconnect paradigm as a possible mechanism to support SNN scalability. The NoC paradigm replaces the bus-based approach to overcome the problems of wiring and of a single point of arbitration. Furthermore, it enables parallel communication of concurrently handled data packets. In the case of SNNs, the NoC paradigm uses an array of routers to provide the communication infrastructure for neuron interconnection. Because the NoC serves as the interconnection fabric for large-scale SNNs, the router architecture plays an important role and demands a good trade-off between power consumption, throughput and traffic congestion, for example:

Power consumption. The NoC router is the communication point to which the neurons are attached. The number of routers increases proportionally with the number of neurons. Therefore, the power consumption for large-scale SNN hardware implementations increases since the major contributor towards power consumption is the interconnection fabric, rather than the neuron itself (Hale, Grot, & Keckler, 2009). The neuron typically has a power consumption approximately six orders of magnitude smaller than the interconnection fabric (Livi & Indiveri, 2009).

Throughput. The router is responsible for managing spike events. SNN traffic patterns are highly asynchronous and non-uniform (Gerstner & Kistler, 2002). Hence, an effective arbitration policy is required. In addition to offering adaptive routing based on the traffic behaviour, a router should also maximise spike throughput, without affecting the traffic performance and without incurring any significant hardware overhead.

Traffic congestion. Although the typical interspike interval for a biological neuron is between 10 and 30 ms (Trappenberg, 2010), the number of spike events increases with the number of neurons. The NoC router must achieve real-time performance and offer a low probability of spike packet drop-out as the neuron density increases. Therefore, routing algorithms should implement traffic congestion management features, such as the channel congestion check sketched below.
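As a rough illustration of the congestion management mentioned in the last point, the C++ sketch below flags a router channel as congested once its spike-packet buffer occupancy reaches a watermark. The buffer depth and watermark are assumed example values, not those of the congestion detector used in the proposed router.

    #include <cstdint>

    // Illustrative channel congestion detector: a channel is reported as
    // congested when its spike-packet buffer occupancy reaches a watermark.
    // The buffer depth and watermark are assumed example values.
    struct ChannelMonitor {
        uint32_t occupancy = 0;   // packets currently queued on this channel
        uint32_t capacity  = 16;  // assumed buffer depth
        uint32_t watermark = 12;  // congestion watermark (75% full)

        bool push() {             // a spike packet arrives
            if (occupancy >= capacity) {
                return false;     // buffer full: the packet would be dropped
            }
            ++occupancy;
            return true;
        }

        void pop() {              // a packet is forwarded downstream
            if (occupancy > 0) --occupancy;
        }

        bool congested() const { return occupancy >= watermark; }
    };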

The authors have investigated and proposed EMBRACE (EMulating Biologically-inspiRed ArChitectures in hardwarE), a scalable, mixed-signal embedded hardware SNN device (Harkin et al., 2009). An overview of the EMBRACE framework is presented in Fig. 1. The proposed EMBRACE SNN hardware, which is still to be fully realised, incorporates a low-area/power CMOS-compatible analogue neuron/synapse cell architecture (Chen, McDaid, Hall, & Kelly, 2008). EMBRACE implements inter-neuron connectivity through the use of a digital packet-based NoC communication architecture, illustrated in Fig. 1(a), which provides flexible, time-multiplexed communication channels, scalable interconnect and reconfigurability (Carrillo et al., 2011, Carrillo et al., 2010). The main building block of EMBRACE is the neural tile, which merges the analogue neuron/synapse circuitry with the digital NoC interconnect to provide a scalable and reconfigurable neural building block. Spike events are passed between neural tiles via the NoC routers. Additionally, EMBRACE research incorporates SystemC modelling and SNN NoC traffic analysis (Pande et al., 2010) and an evolvable hardware SNN training platform and configuration toolset (Cawley et al., 2011), as shown in Fig. 1(b).

This paper presents a novel adaptive NoC router which provides the inter-neuron connectivity within the EMBRACE architecture. The novel adaptive NoC router maintains SNN communication and avoids dropped SNN packets by adapting to SNN traffic congestion in large-scale neural network-based hardware computing systems. The adaptability of the proposed router can be described in two dimensions: (1) an adaptive arbitration policy which combines the fairness policy of a round-robin arbiter and the priority scheme of a first-come first-served approach, enabling improved router throughput according to the traffic behaviour presented across the network; (2) an adaptive routing decision module which enables the selection of different router paths to avoid traffic congestion, based on traffic patterns and a channel congestion detector.
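One possible reading of the hybrid arbitration policy in point (1) is sketched below in C++: under light traffic the arbiter rotates fairly through the router ports (round-robin), while under heavier load it grants the request that has waited longest (first-come first-served). The five-port layout, the load threshold at which the policy switches and the cycle-count timestamps are assumptions made for illustration, not the exact policy implemented in the proposed router.

    #include <array>
    #include <cstdint>

    // Sketch of a hybrid arbitration policy: round-robin under light load,
    // first-come first-served (oldest request wins) under heavy load.
    // The 5-port layout (N, E, S, W, local) and load threshold are assumptions.
    class HybridArbiter {
    public:
        static constexpr int kPorts = 5;

        // 'requests[i]' is true if port i has a pending spike packet and
        // 'arrival[i]' is the cycle at which that request was raised.
        // Returns the granted port, or -1 if there are no requests.
        int grant(const std::array<bool, kPorts>& requests,
                  const std::array<uint64_t, kPorts>& arrival) {
            int pending = 0;
            for (bool r : requests) pending += r ? 1 : 0;
            if (pending == 0) return -1;

            if (pending <= 2) {
                // Light traffic: rotate fairly through the ports (round-robin).
                for (int i = 1; i <= kPorts; ++i) {
                    int port = (last_grant_ + i) % kPorts;
                    if (requests[port]) { last_grant_ = port; return port; }
                }
            }
            // Heavy traffic: grant the request that has waited longest (FCFS).
            int oldest = -1;
            for (int port = 0; port < kPorts; ++port) {
                if (requests[port] &&
                    (oldest == -1 || arrival[port] < arrival[oldest])) {
                    oldest = port;
                }
            }
            last_grant_ = oldest;
            return oldest;
        }

    private:
        int last_grant_ = kPorts - 1;  // start so port 0 is first in rotation
    };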

Therefore, the proposed NoC router is able to sustain throughput under different spike traffic loads and to adapt to NoC router congestion and broken router connections by reconfiguring the routing topology to select an alternative route. This provides a fault-tolerant capability, which is of paramount importance if large-scale SNNs are to be achieved in hardware. Router adaptive behaviour in the presence of applied real-time traffic congestion has been demonstrated on an Altera Stratix II FPGA for a 4 × 2 router array. Performance results on power and area analysis of the proposed adaptive router, using a 90 nm CMOS technology, indicate the feasibility of using the proposed adaptive NoC router within a scalable EMBRACE hardware SNN architecture.
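The routing adaptation described above can be pictured with the short C++ sketch below, which tries the preferred output port first and falls back to an alternative direction when that port is congested or its link is marked broken. The port encoding and the simple fallback scan are illustrative assumptions and do not reproduce the routing algorithm of the proposed router.

    #include <array>

    // Sketch of an adaptive routing decision: try the preferred output port
    // first, and fall back to an alternative when that port is congested or
    // its link is marked broken.
    enum Direction { NORTH = 0, EAST = 1, SOUTH = 2, WEST = 3 };

    struct LinkState {
        bool congested = false;  // e.g. driven by a channel congestion detector
        bool broken    = false;  // link or neighbouring router failure detected
        bool usable() const { return !congested && !broken; }
    };

    // Choose an output port given the preferred direction towards the
    // destination and the current state of each outgoing link.
    // Returns -1 when no usable port exists (the packet must wait).
    int selectOutputPort(Direction preferred,
                         const std::array<LinkState, 4>& links) {
        if (links[preferred].usable()) return preferred;

        // Fallback: scan the remaining directions for any usable link.
        for (int p = 0; p < 4; ++p) {
            if (p != preferred && links[p].usable()) return p;
        }
        return -1;  // all outgoing links congested or broken
    }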

The remainder of the paper is organised as follows: In Section 2, the motivation for this research and a summary of previous work on SNN-based NoC hardware implementations is presented. In Section 3, the proposed adaptive NoC router architecture incorporated within the EMBRACE architecture is detailed. In Section 4, results and analysis of the proposed adaptive NoC router architecture in terms of area utilisation, power consumption and spike packet throughput are discussed. Additionally, an FPGA-based hardware implementation of the proposed NoC router is presented along with results to validate the router performance. Finally, in Section 5, future trends for large-scale SNN hardware implementations and conclusions are discussed.

Section snippets

Background

This section summarises relevant related work regarding the interconnect strategies employed for hardware SNN implementations. In particular, network-on-chip (NoC) architectures are discussed and their suitability in supporting large-scale SNN hardware implementations is highlighted. A detailed review of artificial neural network hardware implementations can be found in Misra and Saha (2010).

Adaptive NoC router

This section presents the proposed adaptive NoC router incorporated within the EMBRACE architecture. More specifically, the two main components of the proposed router, i.e. the adaptive arbitration policy and the adaptive routing scheme, which provide the adaptive mechanisms for the proposed NoC router, are discussed and detailed.

Performance analysis

This section presents results on the throughput capability, area utilisation and power consumption of the proposed adaptive NoC router for varied SNN traffic loads and compares the performance of the adaptive NoC router against other reported approaches. Firstly, the methodology of evaluation used to carry out the experiments and the testbench environment are explained. Secondly, trade-offs between throughput, area utilisation and power requirements of the proposed adaptive NoC router are …

Conclusion and future work

This paper has summarised a range of the inherent challenges associated with the development of efficient adaptive NoC router architectures to support large-scale NoC-based hardware SNNs. Results demonstrate that software-based SNNs face scalability and performance limitations due to the lack of an inherently parallel capability. On the other hand, approaches such as FPGAs and GPUs are power hungry and are not efficient for large-scale SNN hardware implementations.

Acknowledgements

Snaider Carrillo Lindado is supported by a Vice-Chancellor’s Research Scholarship (VCRS) from the University of Ulster.

References (42)

  • S. Carrillo et al. Adaptive routing strategies for large scale spiking neural network hardware implementations.
  • S. Carrillo et al. An efficient, high-throughput adaptive NoC router for large scale spiking neural network hardware implementations.
  • S. Cawley et al. Hardware spiking neural network prototyping and application. Genetic Programming and Evolvable Machines (2011).
  • Y. Chen, L. McDaid, S. Hall, & P. Kelly (2008). A programmable facilitating synapse device. In Neural networks, ...
  • G.-M. Chiu. The odd-even turn model for adaptive routing. IEEE Transactions on Parallel and Distributed Systems (2000).
  • W. Dally et al. Principles and practices of interconnection networks (2004).
  • J. Duato et al. Interconnection networks: an engineering approach (2003).
  • S. Furber, & A. Brown (2009). Biologically-inspired massively-parallel architectures — computing beyond a million...
  • S. Furber et al. Neural systems engineering. Journal of the Royal Society Interface (2007).
  • S. Furber, S. Temple, & A. Brown (2006). High-performance computing for systems of spiking neurons. In AISB06...
  • W. Gerstner et al. Spiking neuron models: single neurons, populations, plasticity (2002).