
1999 | Book

Neural Networks and Analog Computation

Beyond the Turing Limit

Author: Hava T. Siegelmann

Publisher: Birkhäuser Boston

Book series: Progress in Theoretical Computer Science


About this book

Humanity's most basic intellectual quest to decipher nature and master it has led to numerous efforts to build machines that simulate the world or communicate with it [Bus70, Tur36, MP43, Sha48, vN56, Sha41, Rub89, NK91, Nyc92]. The computational power and dynamic behavior of such machines are a central question for mathematicians, computer scientists, and, occasionally, physicists. Our interest is in computers called artificial neural networks. In their most general framework, neural networks consist of assemblies of simple processors, or "neurons," each of which computes a scalar activation function of its input. This activation function is nonlinear, and is typically a monotonic function with bounded range, much like neural responses to input stimuli. The scalar value produced by a neuron affects other neurons, which then calculate a new scalar value of their own. This describes the dynamical behavior of parallel updates. Some of the signals originate from outside the network and act as inputs to the system, while other signals are communicated back to the environment and are thus used to encode the end result of the computation.
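As a brief illustration of these dynamics (a sketch in notation of our choosing; the formal model is developed in Chapter 2), a parallel update step can be written as
$$ x_i(t+1) = \sigma\!\left( \sum_{j=1}^{N} a_{ij}\,x_j(t) + \sum_{j=1}^{M} b_{ij}\,u_j(t) + c_i \right), \qquad i = 1,\ldots,N, $$
where the $x_j$ are the neuron activations, the $u_j$ are the external input signals, the weights $a_{ij}$, $b_{ij}$, $c_i$ parametrize the network, and $\sigma$ is the nonlinear, bounded, monotonic activation function; at each tick all neurons apply this rule simultaneously.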

Table of contents

Frontmatter
Chapter 1. Computational Complexity
Abstract
Although neural networks are based on continuous operations, we still analyze their computational power using the standard framework of computational complexity. In this chapter we provide the background material required for the study of the computational fundamentals of neural networks and analog computational models. Our presentation starts with elementary definitions from the theory of computation, but gradually builds to advanced topics; each computational term introduced is immediately related to neural models.
Hava T. Siegelmann
Chapter 2. The Model
Abstract
In this chapter we introduce the formal model of the neural network to be utilized and analyzed in this book.
Hava T. Siegelmann
Chapter 3. Networks with Rational Weights
Abstract
The full neural network model with real weights will be analyzed in the next chapter. Here, we discuss the model with the weights constrained to the set of rationals. In contrast to the case described in the previous chapter, where we dealt only with integer weights and each neuron could assume only two values, here a neuron can take on countably many different values. The analysis of networks with rational weights is a prerequisite for the proofs of the real weight model in the next chapter. It also sheds light on the role of different types of weights in determining the computational capabilities of the model.
Hava T. Siegelmann
Chapter 4. Networks with Real Weights
Abstract
This chapter describes the network in its full richness: the internal weights may assume any real number. One may argue that such networks are, for all practical purposes, useless since systems with infinitely precise constants cannot be built. However, the real weights are appealing for the mathematical modeling of analog computation that occurs in nature, as discussed in Chapter 2. In nature, the fact that the constants are not known to us, or cannot even be measured, is irrelevant for the true evolution of the system. For example, the planets revolve according to the exact values of G, π, and their masses, regardless of our inability to gauge these values. Although one could replace these constants with rational numbers in any finite time interval, and observe the same qualitative behavior, the long-term infinite-time characteristics of the system depend on the precise real values.
Hava T. Siegelmann
Chapter 5. Kolmogorov Weights: Between P and P/poly
Abstract
In previous chapters, we showed that the computational power of our neural networks depends on the type of numbers utilized as weights. Neural networks with rational weights, just like Turing machines, are finite objects, in the sense that they can be described with a finite amount of information. This is not true for networks with real weights; these have access to a potentially infinite source of information, which may allow them to compute nonrecursive functions. This chapter proves the intuitive notion that as the real numbers used grow richer in information, more functions become computable. To formalize this statement, we need a measure by which to quantify the information contained in real numbers.
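For orientation, one standard measure of this kind (stated here informally in illustrative notation; the chapter develops the resource-bounded variant it actually uses) is the Kolmogorov complexity of the prefixes of a weight's binary expansion,
$$ K(w_1 w_2 \cdots w_n) = \min \{\, |p| : U(p) = w_1 w_2 \cdots w_n \,\}, $$
where $U$ is a fixed universal machine and $w_1 w_2 \cdots w_n$ are the first $n$ bits of the weight $w$: the faster this quantity grows with $n$, the more information the weight can feed into the computation.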
Hava T. Siegelmann
Chapter 6. Space and Precision
Abstract
So far, we have considered neural networks with two types of resource constraints: time, and the Kolmogorov complexity of the weights. Here, we consider rational-weight neural networks in which a bound is set on the precision available for the neurons. The issue of precision comes up when simulating a neural network on a digital computer. Any implementation of real arithmetic in hardware will handle “reals” of limited precision, seldom larger than 64 bits. When more precision is necessary, one must resort to a software implementation of real arithmetic (sometimes provided by the compiler), and even in this case a physical limitation on the length of the mantissa of each state of a neural network under simulation is imposed by the amount of available memory. This observation suggests that some connection can be established between the space requirements needed to solve a problem and the precision required by the activations of the neural networks that solve it.
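The counting behind this observation is simple (a back-of-the-envelope sketch in illustrative notation, not the chapter's formal statement): if each of the $N$ neurons is simulated with $q$ bits of precision, its activation is stored as the truncation
$$ \tilde{x}_i = \frac{\lfloor 2^{q} x_i \rfloor}{2^{q}}, $$
so the entire network state occupies only $N \cdot q$ bits, and a Turing machine with work space proportional to $N \cdot q$ can carry out the truncated updates; this is the kind of trade-off between precision and space that the chapter makes precise.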
Hava T. Siegelmann
Chapter 7. Universality of Sigmoidal Networks
Abstract
Up to this point we considered only the saturated linear activation function. In this chapter, we investigate the computational power of networks with sigmoid activation functions, such as those widely considered in the neural network literature, e.g.,
$$ \varrho(x) = \frac{1}{1 + e^{-x}} \qquad \text{or} \qquad \varrho(x) = \frac{2}{1 + e^{-x}} - 1. \qquad (7.1) $$
In Chapter 10 we will see that a large class of activation functions, which also includes the sigmoid, yields networks whose computational power is bounded from above by P/poly. In this chapter we obtain a lower bound on the computational power of sigmoidal networks. We prove that there exists a universal architecture of sigmoidal neurons that can be used to compute any recursive function, with exponential slowdown. Our proof techniques can be applied to a much more general class of “sigmoidal-like” activation functions, suggesting that Turing universality is a common property of recurrent neural network models. In conclusion, the computational capabilities of sigmoidal networks are located in between Turing machines and advice Turing machines.
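The two sigmoids in (7.1) are essentially the same function up to an affine change of variables, as a short calculation shows:
$$ \frac{2}{1 + e^{-x}} - 1 = \frac{1 - e^{-x}}{1 + e^{-x}} = \tanh\!\left(\frac{x}{2}\right), $$
so statements proved for one transfer to the other by rescaling and shifting the output.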
Hava T. Siegelmann
Chapter 8. Different-limits Networks
Abstract
In Chapter 7 we showed that the saturated-linear activation function is not unique in its Turing universality, but rather that various sigmoidal-like activation functions can form finite-size architectures which are Turing universal as well. The class of activation functions considered in this chapter is much wider than that of the previous chapter, and as a result the lower bound on its computational power is weaker. We prove that any function for which the left and right limits exist and are different can serve as an activation function for the neurons to yield a network that is at least as strong computationally as a finite automaton.
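A rough intuition for why different limits suffice (an informal sketch in illustrative notation, not the chapter's proof): if $\varphi$ has limits $L^{-} \neq L^{+}$ at $-\infty$ and $+\infty$, then for a sufficiently large gain $\alpha$,
$$ \varphi\big(\alpha (x - \theta)\big) \approx \begin{cases} L^{-}, & x < \theta, \\ L^{+}, & x > \theta, \end{cases} $$
so each neuron can act as a two-valued threshold around $\theta$, and networks of such threshold-like units can encode the state set and transition table of a finite automaton.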
Hava T. Siegelmann
Chapter 9. Stochastic Dynamics
Abstract
Having understood the power of deterministic analog recurrent neural networks, we now turn our attention to networks that exhibit stochastic and random behavior. Randomness is a basic characteristic of large distributed systems. It may result from the activity of the individual agents, from unpredictable changes in the communication pattern among the agents, or even just from the different update paces. All previous work that examined stochasticity in networks, e.g., [vN56, Pip90, Adl78, Pip88, Pip89, DO77a, DO77b], studied only acyclic architectures of binary gates, while we study general architectures of analog components. Due to these two qualitative differences, our results are totally different from the previous ones, and require new proof techniques.
Hava T. Siegelmann
Chapter 10. Generalized Processor Networks
Abstract
Up to this point we have analyzed in detail the computational properties of the analog recurrent neural network. From here on we turn to consider more general models of analog computation, and place our network within this wider framework.
Hava T. Siegelmann
Chapter 11. Analog Computation
Abstract
This chapter summarizes previous work in the field of analog computation, allowing us to view our model from this perspective. It also serves as a good introduction to the next chapter.
Hava T. Siegelmann
Chapter 12. Computation Beyond the Turing Limit
Abstract
This chapter differs from the rest of the book. In addition to containing mathematical proofs, it also discusses possible philosophical consequences in the realm of super-Turing theories.
Hava T. Siegelmann
Backmatter
Metadata
Title
Neural Networks and Analog Computation
Author
Hava T. Siegelmann
Copyright year
1999
Publisher
Birkhäuser Boston
Electronic ISBN
978-1-4612-0707-8
Print ISBN
978-1-4612-6875-8
DOI
https://doi.org/10.1007/978-1-4612-0707-8