A formal theory of inductive inference. Part I*,

https://doi.org/10.1016/S0019-9958(64)90223-2Get rights and content
Under an Elsevier user license
open archive

1. Summary

In Part I, four ostensibly different theoretical models of induction are presented, in which the problem dealt with is the extrapolation of a very long sequence of symbols—presumably containing all of the information to be used in the induction. Almost all, if not all problems in induction can be put in this form.

Some strong heuristic arguments have been obtained for the equivalence of the last three models. One of these models is equivalent to a Bayes formulation, in which a priori probabilities are assigned to sequences of symbols on the basis of the lengths of inputs to a universal Turing machine that are required to produce the sequence of interest as output.

Though it seems likely, it is not certain whether the first of the four models is equivalent to the other three.

Few rigorous results are presented. Informal investigations are made of the properties of these models. There are discussions of their consistency and meaningfulness, of their degree of independence of the exact nature of the Turing machine used, and of the accuracy of their predictions in comparison to those of other induction methods.

In Part II these models are applied to the solution of three problems—prediction of the Bernoulli sequence, extrapolation of a certain kind of Markov chain, and the use of phrase structure grammars for induction.

Though some approximations are used, the first of these problems is treated most rigorously. The result is Laplace's rule of succession.

The solution to the second problem uses less certain approximations, but the properties of the solution that are discussed, are fairly independent of these approximations.

The third application, using phrase structure grammars, is least exact of the three. First a formal solution is presented. Though it appears to have certain deficiencies, it is hoped that presentation of this admittedly inadequate model will suggest acceptable improvements in it. This formal solution is then applied in an approximate way to the determination of the “optimum” phrase structure grammar for a given set of strings. The results that are obtained are plausible, but subject to the uncertainties of the approximation used.

Cited by (0)

*

This research was supported by Air Force Office of Scientific Research Contract No. AF 49(638)-376, Grant No. AF-AFOSR 62-377, and Public Health Service NIH Grant No. GM 11021-01.

A paper given at the Conference on “Cerebral Systems and Computers” held at the California Institute of Technology, February 8–11, 1960 was the subject of Zator Technical Bulletin No. 138. Much of the material of Sections 3.1 to 3.4 first appeared in Zator Technical Bulletins 138 and 139 of November 1960 and January 1961, respectively. Sections 4.1 and 4.2 are more exact presentations of Zator Technical Bulletins 140 and 141 of April 1961 and April 1962, respectively.