
1999 | Book

Neural Networks and Analog Computation

Beyond the Turing Limit

Author: Hava T. Siegelmann

Publisher: Birkhäuser Boston

Book series: Progress in Theoretical Computer Science


About this book

Humanity's most basic intellectual quest to decipher nature and master it has led to numerous efforts to build machines that simulate the world or communicate with it [Bus70, Tur36, MP43, Sha48, vN56, Sha41, Rub89, NK91, Nyc92]. The computational power and dynamic behavior of such machines are a central question for mathematicians, computer scientists, and, occasionally, physicists. Our interest is in computers called artificial neural networks. In their most general framework, neural networks consist of assemblies of simple processors, or "neurons," each of which computes a scalar activation function of its input. This activation function is nonlinear, and is typically a monotonic function with bounded range, much like neural responses to input stimuli. The scalar value produced by a neuron affects other neurons, which then calculate a new scalar value of their own. This describes the dynamical behavior of parallel updates. Some of the signals originate from outside the network and act as inputs to the system, while other signals are communicated back to the environment and are thus used to encode the end result of the computation.
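As a brief illustration of these dynamics (a sketch in notation of our choosing; the formal model is developed in Chapter 2), a parallel update step can be written as
$$ x_i(t+1) = \sigma\!\left( \sum_{j=1}^{N} a_{ij}\,x_j(t) + \sum_{j=1}^{M} b_{ij}\,u_j(t) + c_i \right), \qquad i = 1,\ldots,N, $$
where the $x_j$ are the neuron activations, the $u_j$ are the external input signals, the weights $a_{ij}$, $b_{ij}$, $c_i$ parametrize the network, and $\sigma$ is the nonlinear, bounded, monotonic activation function; at each tick all neurons apply this rule simultaneously.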

Table of contents

Frontmatter
Chapter 1. Computational Complexity
Abstract
Although neural networks are based on continuous operations, we still analyze their computational power using the standard framework of computational complexity. In this chapter we provide the background material required for the study of the computational fundamentals of neural networks and analog computational models. Our presentation starts with elementary definitions from the theory of computation, but gradually builds to advanced topics; each computational term introduced is immediately related to neural models.
Hava T. Siegelmann
Chapter 2. The Model
Abstract
In this chapter we introduce the formal model of the neural network to be utilized and analyzed in this book.
Hava T. Siegelmann
Chapter 3. Networks with Rational Weights
Abstract
The full neural network model with real weights will be analyzed in the next chapter. Here, we discuss the model with the weights constrained to the set of rationals. In contrast to the case described in the previous chapter, where we dealt only with integer weights and each neuron could assume only two values, here a neuron can take on countably many different values. The analysis of networks with rational weights is a prerequisite for the proofs of the real weight model in the next chapter. It also sheds light on the role of different types of weights in determining the computational capabilities of the model.
Hava T. Siegelmann
Chapter 4. Networks with Real Weights
Abstract
This chapter describes the network in its full richness: the internal weights may assume any real number. One may argue that such networks are, for all practical purposes, useless since systems with infinitely precise constants cannot be built. However, the real weights are appealing for the mathematical modeling of analog computation that occurs in nature, as discussed in Chapter 2. In nature, the fact that the constants are not known to us, or cannot even be measured, is irrelevant for the true evolution of the system. For example, the planets revolve according to the exact values of G, π, and their masses, regardless of our inability to gauge these values. Although one could replace these constants with rational numbers in any finite time interval, and observe the same qualitative behavior, the long-term infinite-time characteristics of the system depend on the precise real values.
Hava T. Siegelmann
Chapter 5. Kolmogorov Weights: Between P and P/poly
Abstract
In previous chapters, we showed that the computational power of our neural networks depends on the type of numbers utilized as weights. Neural networks with rational weights, just like Turing machines, are finite objects, in the sense that they can be described with a finite amount of information. This is not true for networks with real weights; these have access to a potentially infinite source of information, which may allow them to compute nonrecursive functions. This chapter proves the intuitive notion that as the real numbers used grow richer in information, more functions become computable. To formalize this statement, we need a measure by which to quantify the information contained in real numbers.
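For orientation, one standard measure of this kind (stated here informally in illustrative notation; the chapter develops the resource-bounded variant it actually uses) is the Kolmogorov complexity of the prefixes of a weight's binary expansion,
$$ K(w_1 w_2 \cdots w_n) = \min \{\, |p| : U(p) = w_1 w_2 \cdots w_n \,\}, $$
where $U$ is a fixed universal machine and $w_1 w_2 \cdots w_n$ are the first $n$ bits of the weight $w$: the faster this quantity grows with $n$, the more information the weight can feed into the computation.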
Hava T. Siegelmann
Chapter 6. Space and Precision
Abstract
So far, we have considered neural networks with two types of resource constraints: time, and the Kolmogorov complexity of the weights. Here, we consider rational-weight neural networks in which a bound is set on the precision available for the neurons. The issue of precision comes up when simulating a neural network on a digital computer. Any implementation of real arithmetic in hardware will handle “reals” of limited precision, seldom larger than 64 bits. When more precision is necessary, one must resort to a software implementation of real arithmetic (sometimes provided by the compiler), and even in this case a physical limitation on the length of the mantissa of each state of a neural network under simulation is imposed by the amount of available memory. This observation suggests that some connection can be established between the space requirements needed to solve a problem and the precision required by the activations of the neural networks that solve it.
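The counting behind this observation is simple (a back-of-the-envelope sketch in illustrative notation, not the chapter's formal statement): if each of the $N$ neurons is simulated with $q$ bits of precision, its activation is stored as the truncation
$$ \tilde{x}_i = \frac{\lfloor 2^{q} x_i \rfloor}{2^{q}}, $$
so the entire network state occupies only $N \cdot q$ bits, and a Turing machine with work space proportional to $N \cdot q$ can carry out the truncated updates; this is the kind of trade-off between precision and space that the chapter makes precise.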
Hava T. Siegelmann
Chapter 7. Universality of Sigmoidal Networks
Abstract
Up to this point we considered only the saturated linear activation function. In this chapter, we investigate the computational power of networks with sigmoid activation functions, such as those widely considered in the neural network literature, e.g.,
$$ \varrho(x) = \frac{1}{1 + e^{-x}} \qquad \text{or} \qquad \varrho(x) = \frac{2}{1 + e^{-x}} - 1. \qquad (7.1) $$
In Chapter 10 we will see that a large class of activation functions, which also includes the sigmoid, yields networks whose computational power is bounded from above by P/poly. In this chapter we obtain a lower bound on the computational power of sigmoidal networks. We prove that there exists a universal architecture of sigmoidal neurons that can be used to compute any recursive function, with exponential slowdown. Our proof techniques can be applied to a much more general class of “sigmoidal-like” activation functions, suggesting that Turing universality is a common property of recurrent neural network models. In conclusion, the computational capabilities of sigmoidal networks are located in between Turing machines and advice Turing machines.
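The two sigmoids in (7.1) are essentially the same function up to an affine change of variables, as a short calculation shows:
$$ \frac{2}{1 + e^{-x}} - 1 = \frac{1 - e^{-x}}{1 + e^{-x}} = \tanh\!\left(\frac{x}{2}\right), $$
so statements proved for one transfer to the other by rescaling and shifting the output.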
Hava T. Siegelmann
Chapter 8. Different-limits Networks
Abstract
In Chapter 7 we showed that the saturated-linear activation function is not unique in its Turing universality, but rather that various sigmoidal-like activation functions can form finite-size architectures which are Turing universal as well. The class of activation functions considered in this chapter is much wider than that of the previous chapter, and as a result the lower bound on its computational power is weaker. We prove that any function for which the left and right limits exist and are different can serve as an activation function for the neurons to yield a network that is at least as strong computationally as a finite automaton.
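A rough intuition for why different limits suffice (an informal sketch in illustrative notation, not the chapter's proof): if $\varphi$ has limits $L^{-} \neq L^{+}$ at $-\infty$ and $+\infty$, then for a sufficiently large gain $\alpha$,
$$ \varphi\big(\alpha (x - \theta)\big) \approx \begin{cases} L^{-}, & x < \theta, \\ L^{+}, & x > \theta, \end{cases} $$
so each neuron can act as a two-valued threshold around $\theta$, and networks of such threshold-like units can encode the state set and transition table of a finite automaton.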
Hava T. Siegelmann
Chapter 9. Stochastic Dynamics
Abstract
Having understood the power of deterministic analog recurrent neural networks, we now turn our attention to networks that exhibit stochastic and random behavior. Randomness is a basic characteristic of large distributed systems. It may result from the activity of the individual agents, from unpredictable changes in the communication pattern among the agents, or even just from the different update paces. All previous work that examined stochasticity in networks, e.g., [vN56, Pip90, Adl78, Pip88, Pip89, DO77a, DO77b], studied only acyclic architectures of binary gates, while we study general architectures of analog components. Due to these two qualitative differences, our results are totally different from the previous ones, and require new proof techniques.
Hava T. Siegelmann
Chapter 10. Generalized Processor Networks
Abstract
Up to this point we have analyzed in detail the computational properties of the analog recurrent neural network. From here on we turn to consider more general models of analog computation, and place our network within this wider framework.
Hava T. Siegelmann
Chapter 11. Analog Computation
Abstract
This chapter summarizes previous work in the field of analog computation, allowing us to view our model from this perspective. It also serves as a good introduction to the next chapter.
Hava T. Siegelmann
Chapter 12. Computation Beyond the Turing Limit
Abstract
This chapter differs from the rest of the book. In addition to containing mathematical proofs, it also discusses possible philosophical consequences in the realm of super-Turing theories.
Hava T. Siegelmann
Backmatter
Metadata
Title
Neural Networks and Analog Computation
Author
Hava T. Siegelmann
Copyright year
1999
Publisher
Birkhäuser Boston
Electronic ISBN
978-1-4612-0707-8
Print ISBN
978-1-4612-6875-8
DOI
https://doi.org/10.1007/978-1-4612-0707-8