2022 | Book

# Computability

written by: Prof. George Tourlakis

This survey of computability theory offers the techniques and tools that computer scientists (as well as mathematicians and philosophers studying the mathematical foundations of computing) need to mathematically analyze computational processes and investigate the theoretical limitations of computing. Beginning with an introduction to the mathematisation of “mechanical process” using URM programs, this textbook explains basic theory such as primitive recursive functions and predicates and sequence-coding, partial recursive functions and predicates, and loop programs.

Advanced chapters cover the Ackermann function, Tarski's theorem on the non-representability of truth, Gödel's and Rosser's incompleteness theorems, two short proofs of the incompleteness theorem that are based on Löb's derivability conditions, Church's thesis, the second recursion theorem and applications, a provably recursive universal function for the primitive recursive functions, oracle computations and various classes of computable functionals, the arithmetical hierarchy, Turing reducibility, Turing degrees and the priority method, a thorough exposition of various versions of the first recursion theorem, Blum's complexity, hierarchies of primitive recursive functions, and a machine-independent characterisation of Cobham's feasibly computable functions.

Abstract

As noted in the Preface, I will be assuming of the reader at least a solid course in discrete math, including simple proof techniques (e.g., by induction, by contradiction) and some elementary logic. This chapter will repeat much of these topics, but I will ask forgiveness if there are no strict linear dependencies between the sections in Chap. 0.

For example, I talk about sets in the section below, which is about induction (though sets are treated carefully only in later sections), and I use proof by contradiction (to be redone formally in the later section on logic).

There are two mitigating considerations for this: One, this section is a review of topics assumed known to the reader (except, probably, for the topic on topology). It is not a careful construction of a body of mathematics based on previous knowledge. I am only making sure that we start "on the same page", as it were.

Two, the concept of natural number is fundamentally grounded in our intuition and it is safe, with some care, to start this review with the topics of induction and the least principle in the set of natural numbers prior to a review of other topics such as sets and logic.

The most important topics in Chap. 0 are the existence and use of nontotal functions —and the realisation that in this volume we deal with both total(ly defined) functions and nontotal functions, the two types collectively falling under the category of partial functions. These play a major role in computability.

Other important topics in our review here are definitions by induction in a partially ordered set, and Cantor’s diagonal method or diagonalisation, which is an important tool in computability. We will also do a tiny bit of point set topology, only used (sparsely) in Chap. 13.
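
Cantor's diagonal method, mentioned above as an important tool in computability, can be illustrated in a few lines of code (the helper names here are ours, purely for illustration): given any enumeration \(f_0, f_1, f_2, \ldots\) of total functions on the naturals, the "diagonal" function \(d(n) = f_n(n) + 1\) differs from every \(f_n\) at input \(n\), so it cannot occur in the enumeration.

```python
def diagonal(enumeration):
    """Given enumeration(i) = f_i, return d with d(n) = f_n(n) + 1.
    Since d(n) != f_n(n), d differs from every f_n in the list."""
    return lambda n: enumeration(n)(n) + 1

# Toy enumeration: f_i is the function that adds i to its argument.
fs = lambda i: (lambda x: x + i)

d = diagonal(fs)
assert all(d(n) != fs(n)(n) for n in range(100))  # d escapes the list
```

The same one-step trick, applied to enumerations of computable functions, is what yields many of the non-computability results later in the book.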

Abstract

Computability is the part of logic and theoretical computer science that gives a mathematically precise formulation to the concepts of algorithm, mechanical process, and calculable or computable function (or relation).

Its development was strongly motivated, in the 1930s, by Hilbert's program to found mathematics on a (provably —Metamathematically) consistent (i.e., free from contradiction) axiomatic basis, in particular by his belief that the Entscheidungsproblem, or decision problem, for axiomatic theories —that is, the problem "is this formula a theorem of that theory?"— was solvable by a mechanical procedure. Naturally, an important ingredient in investigating this belief was to establish what, precisely, a mechanical procedure is.

Now, since antiquity, mathematicians have invented "mechanical procedures", e.g., Euclid's algorithm for the "greatest common divisor" —that is, the largest positive integer that is a common divisor of two given integers— and had no problem recognising such procedures when they encountered them. But how do you mathematically prove the nonexistence of such a mechanical procedure for a particular problem, such as the Entscheidungsproblem, for example? You cannot just try all such procedures on the problem to see whether one solves it or none does. There are infinitely many mechanical procedures!
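
Euclid's algorithm, mentioned above as the classic example of a "mechanical procedure", can be written in a few lines; this sketch (ours, not the book's) uses the repeated-remainder form.

```python
def gcd(a, b):
    """Greatest common divisor of two positive integers, by Euclid's
    repeated-remainder method: gcd(a, b) = gcd(b, a mod b)."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 18))  # prints 6
```

The point of the surrounding discussion is that recognising such a procedure as mechanical is easy; proving that *no* mechanical procedure exists for a given problem is the hard part.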

We encounter the same difficulty towards establishing any result of the type "there is no algorithm that solves problem X". Problem "X" could be, e.g., to check whether an arbitrary program, say written in C, will ever print the number 42.

You need a mathematical formulation of what is a “mechanical procedure” or “algorithm” in order to hope that you can address the Entscheidungsproblem, or problem X above. By the way, Church proved (Church (J. Symb. Logic 1:40–41, 101–102, 1936a; Amer. J. Math. 58:345–363, 1936b)) that Hilbert’s fundamental Entscheidungsproblem admits no solution by functions that are calculable within any of the known equivalent mathematical frameworks of computability, and in this volume we will see that problem X has no algorithmic solution either, if we use the term “algorithm” in the technical, mathematical sense of this volume.

Intensive activity by many researchers, notably Post (J Symbolic Logic 1:103–105, 1936; Bull Amer Math Soc 50:284–316, 1944), Kleene (Math Annalen 112:727–742, 1936), Church (Amer. J. Math. 58:345–363, 1936b), Turing (Proc London Math Soc 2(42, 43):230–265, 544–546, 1936, 1937) —and, some 30 years later, Markov (Trans Am Math Soc 2(15), 1960)— led in the first instance, in the 1930s, to several alternative formulations, each purporting to mathematically characterise the concepts algorithm, mechanical procedure, and calculable function. All these formulations were soon proved to be equivalent; that is, the calculable functions admitted by any one of them were the same as those admitted by any other. This led Alonzo Church to formulate his conjecture, known as "Church's Thesis", that any intuitively computable function is also calculable within any of these equivalent mathematical frameworks of calculability or computability.

The eventual introduction of computers provided further motivation for research on the various mathematical frameworks of computation, “models of computation” as we often say, and computability is nowadays a deep and very broad research field. The model of computation that I will present here, due to Shepherdson and Sturgis (J ACM 10:217–255, 1963), is a later model that has been informed by developments in computer science, in particular, by the advent of so-called high level —the level is “higher” the more the programming language is distanced from machine-dependent details— programming languages.

Abstract

We saw that the successor, zero, and the generalised identity functions —which we name S, Z and \(U^{n}_{i}\) respectively— are in \({\mathcal {P}}\); thus, not only are they “intuitively computable”, but they are so in a precise mathematical sense: each is computable by a URM.

We have also shown that “computability” of functions is preserved by the operations of composition, primitive recursion, and unbounded search.
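
The three operations just named can be sketched as Python combinators (a hedged illustration with names of our own choosing, not the book's notation). Note that `mu_search`, like unbounded search \((\mu y)\), need not terminate; this is exactly how nontotal functions arise.

```python
def compose(f, *gs):
    """Composition: lambda x. f(g_1(x), ..., g_m(x))."""
    return lambda *x: f(*(g(*x) for g in gs))

def prim_rec(base, step):
    """Primitive recursion: h(0, x) = base(x);
    h(n + 1, x) = step(n, h(n, x), x)."""
    def h(n, *x):
        acc = base(*x)
        for i in range(n):
            acc = step(i, acc, *x)
        return acc
    return h

def mu_search(p):
    """Unbounded search (mu y) p(y, x): least y with p(y, x) == 0.
    May loop forever if no such y exists."""
    def f(*x):
        y = 0
        while p(y, *x) != 0:
            y += 1
        return y
    return f

# Addition n + x by primitive recursion from identity and successor:
add = prim_rec(lambda x: x, lambda i, acc, x: acc + 1)
```

For instance, `add(3, 4)` evaluates to 7, and `mu_search(lambda y, x: 0 if y * y >= x else 1)(10)` finds 4, the least y with y² ≥ 10.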

The primitive recursive functions are an important tool in computability theory, providing coding tools and insights into everyday programming (with FORTRAN-like loops). We introduce them in this chapter and will come back to them very often. They are the tools Gödel used in proving his incompleteness theorems.

Most people introduce the primitive recursive functions —as Dedekind did— via derivations, just as one introduces the theorems of logic via proofs, as we will detail in the definition below.

Abstract

The loop programs were introduced by Meyer and Ritchie (Computational complexity and program structure, Technical Report RC-1817, IBM, 1967) as a program theoretic counterpart to the number theoretic introduction of the set of primitive recursive functions \(\mathcal {P}\mathcal {R}\). This programming formalism among other things connects the definitional (or structural) complexity of primitive recursive functions with their (run time) computational complexity, but this latter phenomenon will be addressed in Chap. 15. In the present chapter we will offer an easy informal proof that \(\mathcal {P}\mathcal {R}\) is an incomplete formalism for the concept of “computable function”.
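
To make the loop-program idea concrete, here is a toy interpreter in the spirit of Meyer and Ritchie's formalism (the encoding and names are ours, purely illustrative): the only iteration construct repeats its body a number of times fixed *before* the loop starts, so every loop program halts, computing exactly total functions.

```python
def run(prog, env):
    """Interpret a loop program over an environment of variables.
    Instructions: ('inc', X)      X = X + 1
                  ('zero', X)     X = 0
                  ('mov', X, Y)   X = Y
                  ('loop', X, body)  repeat body X times (X read once)."""
    for ins in prog:
        if ins[0] == 'inc':
            env[ins[1]] = env.get(ins[1], 0) + 1
        elif ins[0] == 'zero':
            env[ins[1]] = 0
        elif ins[0] == 'mov':
            env[ins[1]] = env.get(ins[2], 0)
        elif ins[0] == 'loop':
            # The iteration count is fixed at loop entry; changing the
            # loop variable inside the body cannot prolong the loop.
            for _ in range(env.get(ins[1], 0)):
                run(ins[2], env)
    return env

# X + Y: copy X into Z, then increment Z exactly Y times.
add = [('mov', 'Z', 'X'), ('loop', 'Y', [('inc', 'Z')])]
```

Running `run(add, {'X': 3, 'Y': 4})` leaves 7 in `Z`. The absence of any unbounded construct is what makes this an *incomplete* formalism for "computable function", as the chapter goes on to show.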

Abstract

The “Ackermann function” was proposed, of course, by Ackermann. The version here is a simplification by Robert Ritchie.

It provides us with an example of a recursive function that is not in \(\mathcal {P}\mathcal {R}\). Unlike the example in Chap. 3, which provided an alternative such function by diagonalisation, the proof that the Ackermann function is not primitive recursive is by a majorisation argument. Simply, it is too big to be primitive recursive!

But this function is more than just intuitively computable! It is computable —no hedging— as we will show by proving it to be a member of \({\mathcal {R}}\).

Another thing it does is that it provides us with an example of a function \(\lambda \vec x.f(\vec x)\) that is "hard to compute" (in the sense \(f\notin \mathcal {P}\mathcal {R}\)) but whose graph —\(\lambda y\vec x.y=f(\vec x)\)— is nevertheless "easy to compute" (\({}\in \mathcal {P}\mathcal {R}_{*}\)). (Here the colloquialisms "easy to compute" and "hard to compute" are aliases for "primitive recursive" and "not primitive recursive", respectively. This is a hopelessly coarse rendering of easy/hard, and a much better gauge for the runtime complexity of a problem is on which side of \(O(2^{n})\) it lies. However, our gauge will have to do for now: all I want to leave you with is that for some functions it is easier to compute the graph —to the quantifiable extent that it is in \(\mathcal {P}\mathcal {R}_{*}\)— than the function itself, meaning that the latter fails to be primitive recursive.)
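
As a concrete point of reference, here is the familiar two-argument Ackermann(-Péter) variant in Python (a common textbook form; the book follows Robert Ritchie's simplification, which may differ in detail). It is total and plainly computable, yet it eventually majorises every primitive recursive function.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def ackermann(x, y):
    """A(0, y) = y + 1; A(x+1, 0) = A(x, 1);
    A(x+1, y+1) = A(x, A(x+1, y)). Memoised to tame the recursion."""
    if x == 0:
        return y + 1
    if y == 0:
        return ackermann(x - 1, 1)
    return ackermann(x - 1, ackermann(x, y - 1))

print(ackermann(3, 3))  # prints 61; A(3, y) = 2^(y + 3) - 3
```

Even tiny first arguments produce explosive growth: A(4, 2) already has 19,729 decimal digits, a hands-on hint at the majorisation argument sketched above.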

Abstract

The aim of Computability is to mathematically capture (for example, via URMs) the informal notions of “algorithm” and “computable function” (or “computable relation”).

Several mathematical models of computation, that were very different in their syntactic details and semantics, have been proposed in the 1930s by several researchers (Post, Church, Kleene, Turing), and, more recently, by Markov (Transl. Amer. Math. Soc. 2(15), 1960) and Shepherdson and Sturgis (J. ACM 10:217–255, 1963). They were promptly proved to be equivalent, that is, they all computed exactly the same number-theoretic functions, those in \({\mathcal {P}}\) that we introduced earlier.

This provided “empirical” evidence that prompted Church to state his belief, known as “Church’s Thesis”, that

Every informal algorithm (pseudo-program) that we propose for the computation of a function can be implemented in each of the known mathematical models of computation. In particular, it can be “programmed” as a URM.

We note that at the present state of our understanding of the concept of “algorithm” or “algorithmic process”, there is no known way to define —via a pseudo-program of sorts— an “intuitively computable” function on the natural numbers, which is outside of \({\mathcal {P}}\). Thus, \({\mathcal {P}}\) appears to formalise the largest —i.e., most inclusive— set of “intuitively computable” functions (on the natural numbers) known.

Church’s Thesis is not a theorem. It cannot be, as it “connects” precise mathematical objects (URM, \({\mathcal {P}}\)) with imprecise informal ones (“algorithm”, “computable function”).

However, in use it provides the operational convenience and pedagogical advantage of concentrating on the high level of detail (essentially, not much detail at all) of why a construction does what we say it does —or why a mathematical definition produces a computable function— without overdoing the mathematics in the process.

In the early chapters of this book we will use Church’s Thesis —to which we will refer, in short, as “CT”— to justify that various constructions we jot down yield computable functions. This will make arguments less detailed and more connected with experience, thus also easier; and shorter.

In the literature, Rogers (Theory of Recursive Functions and Effective Computability (McGraw-Hill, New York, 1967)) relies on CT almost exclusively, despite the advanced level of his book. Cutland (Computability; An Introduction to Recursive Function Theory (Cambridge University Press, Cambridge, 1980)) also delegates to Church’s thesis the more involved proofs.

On the other hand, Davis (Computability and Unsolvability (McGraw-Hill, New York, 1958a)), Tourlakis (Computability (Reston Publishing, Reston, 1984); Theory of Computation (Wiley, Hoboken, 2012)) never use CT, and give all the necessary mathematical constructions (and proofs) in their full gory details. We will follow the fully mathematical approach in the later more advanced chapters, but even when we do use Church’s thesis we will, invariably, also provide a fully mathematical proof.

Abstract

This chapter introduces the semi-recursive relations \(Q(\vec x)\). These play a central role in computability. As the name suggests, these are, in a sense, "half" recursive. Indeed, we can program a URM M to verify membership in such a Q, but if an input is not in Q, then our verifier will not tell us so; it will loop forever. That is, M verifies membership but does not decide it in a yes/no manner, that is, by halting and "printing" the appropriate 0 (yes) or 1 (no).

Can’t we tweak M into an M′ that is a decider of such a Q? No, not in general! For example, our halting set K has a verifier but (provably) has no decider! This much we know: having a decider for K means \(K\in \mathcal {R}_{*}\), and we know that this is not the case.

Since the “yes” of a verifier M is signaled by halting but the “no” is signaled by looping forever, the definition below does not require the verifier to print 0 for “yes”. In this case “yes” equals “halting”.
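
The verifier idea can be made concrete with a small sketch (helper names are ours): for a total computable f, the set \(Q = \mathrm{range}(f)\) is semi-recursive, and the obvious verifier searches for a witness. It halts exactly on members of Q; on non-members it loops forever, so it verifies without deciding.

```python
def verifier(f):
    """Return a verifier for Q = range(f), f total computable.
    Halting itself signals "yes"; there is never a "no" answer."""
    def verify(q):
        y = 0
        while f(y) != q:   # halts iff some witness y has f(y) == q
            y += 1
        return             # reached only on "yes"
    return verify

# Example: Q = the set of perfect squares, the range of y -> y * y.
is_square = verifier(lambda y: y * y)
is_square(49)   # halts: 49 is in Q
# is_square(50) would loop forever: the verifier never says "no".
```

This is exactly the asymmetry the definition above builds in: "yes" equals "halting", and nothing is printed.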

The chapter introduces a general reduction technique to prove that relations are not recursive or not semi-recursive, as the case may be. We prove the closure properties of the set of all semi-recursive relations and also the important graph theorem. We prove the projection theorem and also give a characterisation of semi-recursive sets as images of recursive functions. The startling theorem of Rice is included: all sets of the form \(A=\{x:\phi _{x}\in {\mathcal {C}}\subseteq {\mathcal {P}}\}\) are not recursive, unless A = ∅ or \(A={\mathbb {N}}\).

Abstract

In Davis (Computability and Unsolvability (McGraw-Hill, New York, 1958b)), Kleene (Introduction to Metamathematics (North-Holland, Amsterdam, 1952)), Tourlakis (Computability (Reston Publishing, Reston, VA, 1984)) one finds an alternative inductive characterisation of \({\mathcal {P}}\) based on an augmented set of initial functions, but primitive recursion as a primary operation (definitionally) is absent (but, of course, is a derived operation). Moreover unbounded search is primitive, but is a different version than the (μy) of Definition 1.1.25, and is defined on total functions only. We introduce the tools for coding from Smullyan (Theory of Formal Systems. Annals of Mathematics Studies, vol. 47 (Princeton University Press, Princeton, 1961)) and Bennett (On spectra. PhD thesis, Princeton University, 1962), based on 2-adic (pronounced “dyadic” as opposed to binary) concatenation. The inductive characterisation of \({\mathcal {R}}\) derived in this chapter will be used in Chap. 12.

Abstract

It is rather surprising that Unprovability and Uncomputability are intimately connected. Gödel’s original proof of his first incompleteness theorem did not use methods of computability —indeed computability theory was not yet developed.

Before we turn to an elementary proof of the incompleteness theorem that is based on the unsolvability of the halting problem —hence our opening remarks— we show here, in outline, how he did it in Gödel (Monatshefte für Math Phys 38:173–198, 1931).

He constructed a sentence (closed formula) D ("D" for diagonal, since diagonalisation was at play in Gödel's proof) in the language of Peano Arithmetic (PA) that meant —when interpreted in the standard model— "I am not a (PA) theorem".

This idea was based on a variation —namely, “I am lying”— of the “liar’s paradox” of the ancient Cretan philosopher Epimenides. (His original version was “All Cretans are liars”).

Abstract

The recursion theorem is attributed to Kleene, but it was embedded, in a somewhat different format, in Gödel's first incompleteness theorem proof (Gödel (Monatshefte für Math Phys 38:173–198, 1931)), where, via his substitution function and the fixed point lemma, he constructed a sentence of Peano Arithmetic (PA) that said "I am not a theorem".

The analogues in Kleene's proof of his second recursion theorem (He also stated and proved a "first recursion theorem". See Chap. 13.) are the S-m-n functions (instead of the Gödel substitution function) and the second recursion theorem itself (instead of the fixpoint lemma of Gödel (Monatshefte für Math Phys 38:173–198, 1931); Carnap (Logische Syntax der Sprache (Springer, Berlin, 1934))).

Abstract

This chapter shows how to assign a numerical code to every function f of \(\mathcal {P}\mathcal {R}\) so that the code encapsulates a specific derivation of f. This is similar to Exercise 3.5.15, but here we give a number-theoretic version.

In turn, this tool allows a proof of the result \(\mathcal {P}\mathcal {R}\subsetneq {\mathcal {R}}\) different from that of said exercise, from those via Ackermann's function, and from the informal one in Sect. 3.4: We show here that there is a universal function for \(\mathcal {P}\mathcal {R}\), and diagonalise (alternatively, majorise).

In more detail, unpacking said code provides us with the tools to compute f. In effect, we will build a total computable function Eval (stands for “evaluate”) such that, essentially, for every \(\lambda \vec x_{n}.f(\vec x_{n})\in \mathcal {P}\mathcal {R}\), we have \(Eval(\widehat f, \vec x_{n})=f(\vec x_{n})\) for all \(\vec x_{n}\), where \(\widehat f\) is some code that encapsulates a derivation for f. Thus Eval is universal for \(\mathcal {P}\mathcal {R}\). Provably, it will not be in \(\mathcal {P}\mathcal {R}\).
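
A hedged sketch of such an evaluator, with derivations represented as nested tuples rather than the chapter's numerical codes (the tuple encoding and names here are ours): each code records how a function was built up from the initial functions by composition and primitive recursion, and `Eval` recurses on that structure.

```python
def Eval(code, *args):
    """Evaluate a PR derivation code on the given arguments.
    Codes: ('Z',)              the zero function
           ('S',)              successor
           ('U', n, i)         projection U^n_i
           ('C', f, (g1, ..)) composition f(g1(x), ..., gm(x))
           ('P', b, s)         primitive recursion from basis b, step s:
                               h(0, x) = b(x); h(n+1, x) = s(n, h(n, x), x)."""
    tag = code[0]
    if tag == 'Z':
        return 0
    if tag == 'S':
        return args[0] + 1
    if tag == 'U':
        return args[code[2] - 1]
    if tag == 'C':
        f, gs = code[1], code[2]
        return Eval(f, *(Eval(g, *args) for g in gs))
    if tag == 'P':
        b, s = code[1], code[2]
        n, rest = args[0], args[1:]
        acc = Eval(b, *rest)
        for i in range(n):
            acc = Eval(s, i, acc, *rest)
        return acc

# add(n, x) = n + x: basis U^1_1, step S applied to U^3_2.
add = ('P', ('U', 1, 1), ('C', ('S',), (('U', 3, 2),)))
```

Here `Eval(add, 3, 4)` yields 7. Since the codes range over *all* PR derivations, `Eval` is universal for \(\mathcal {P}\mathcal {R}\); the diagonal argument then shows no such evaluator can itself be primitive recursive.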

Abstract

We saw that the semi-recursive sets are computably enumerable and vice versa (Sect. 6.4). In this chapter we will look more closely at enumerations of semi-recursive —especially the special cases of recursive and finite— sets.

Abstract

This chapter introduces the very important topic of productive and creative sets, the simple sets of Post, and also takes an in depth look at the phenomenon of the incompleteness of “strong” mathematical theories like Peano Arithmetic. It turns out that there is a connection between the logic and the computability topics and we give several mathematically complete proofs of the first incompleteness theorem (Gödel (Monatshefte für Math Phys 38:173–198, 1931 (Also in English in M. Davis, The Undecidable (Raven Press, Hewlett, 1965), pp. 5–38))) including the version by Rosser (J Symb Logic 1:87–91, 1936). We also prove Church’s theorem of the undecidability of the set of theorems of logic (Church (J Symbolic Logic 1:40–41, 101–102, 1936a); Church (Am J Math 58:345–363, 1936b (Also in M. Davis, The Undecidable (Raven Press, Hewlett, 1965), pp. 89–107)); this provides a negative answer to Hilbert’s Entscheidungsproblem). Our inevitable arithmetisation of logic towards obtaining the above results is based on concatenation, in the style of Smullyan (Gödel’s Incompleteness Theorems (Oxford University Press, Oxford, 1992)), Bennett (On Spectra. Ph.D. Thesis, Princeton University, 1962), and Quine (J Symb Logic 11:105–114, 1946).

Abstract

We have had the briefest glimpse of computing relative to a given function f (cf. Definition 9.5.1) —or in a function f— that is, having the computation rely on obtaining from time to time values f(a) for various a using an external agent, the computation being totally uninterested in how the agent computes, or obtains in some perhaps non effective (not “computable”) way, said values.

The details of how the agent obtains its answers being unknown and irrelevant suggested the nickname "oracle for f" for the agent. So we have a "secret" process —not part of our explicit computation— which, when our computation requires a value f(a) of the function, furnishes that value if f(a) ↓ (i.e., the function call "f(a)" returns); if f(a) ↑, it returns nothing, in which case, by convention, the explicit part of the computation diverges as well, being "stuck" waiting for an answer that will not be forthcoming.
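
In programming terms, an oracle computation is simply one that receives f as an opaque callable and queries it pointwise (an illustrative sketch with names of our own choosing):

```python
def sum_below(oracle, n):
    """Compute f(0) + f(1) + ... + f(n-1) relative to an oracle for f.
    The computation only issues queries; it never looks inside the oracle."""
    total = 0
    for a in range(n):
        total += oracle(a)   # external call: "please supply f(a)"
    return total

# Any total function may play the oracle, whether or not it is
# computable in principle; the computation above cannot tell the difference.
print(sum_below(lambda a: a * a, 4))   # 0 + 1 + 4 + 9 = 14
```

If the oracle call for some queried a never returned, `sum_below` would hang at that line, mirroring the convention above that the whole computation diverges with the oracle.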

Abstract

Our investigations so far were aimed mainly towards understanding limitations of computing and logic, exploring uncomputability, undecidability and unprovability (in ROB (Definition 12.6.1) and its recursive extensions, the most notable of which is PA), with short studies of what is computable —e.g., in the early chapters on primitive recursive functions and loop programs— the latter task undertaken (so far) mostly for "tooling" reasons, for example, obtaining coding tools. Our last excursion into the unsolvable was the Kleene-Mostowski arithmetical hierarchy of increasingly undecidable relations and sets.

There is another kind of limitation, this one found among the computable functions. In "real life" the programmer is interested not only in finding a correct computational solution to a problem, but also one that is "efficient", meaning it spends, relative to other solutions, fewer computational resources such as computer memory or computer (execution) time.

Thus, the theoretician classifies problems according to their degree of unsolvability —a measure of problem complexity (the higher it is in the Kleene-Mostowski hierarchy, the more unsolvable, or complex, it is). The serious programming practitioner (computer programming can be quite a mathematically challenging activity; see, for example, the series of volumes titled "The Art of Computer Programming" —each volume having an appropriate subtitle that fits its specific topics— starting with Vol. 1, Fundamental Algorithms, Knuth (The Art of Computer Programming. Fundamental Algorithms, vol. 1 (Addison-Wesley, Reading, 1973))), on the other hand, would rather classify solvable, decidable, problems according to the amount of computational resources they consume —their computational complexity.

Mathematicians and computer scientists have contributed to the body of knowledge known as computational complexity at least since the mid 1960s. We divide our brief review of this vast area into a chapter (the present one) on the complexity theory of the set \({\mathcal {P}}\) of partial recursive functions and into a chapter (the next one) on the complexity of the primitive recursive functions. In both cases —right at the outset in the case of \(\mathcal {P}\mathcal {R}\)— we adopt a hierarchy point of view.

In the present chapter we follow the axiomatic approach of Blum (J ACM 14:322–336, 1967). The cited work is a marvelous mathematical feat where just two “intuitively obvious and acceptable” axioms lead to deep and at times unexpected (by intuition) results.

We already saw in the case of the arithmetical hierarchy that the classification of relations by complexity was at first “static” or definitional but Post’s theorem demonstrated the tight connection between static and dynamic (or computational) complexity.

This chapter is solely about the computational, or dynamic, complexity of the functions in \({\mathcal {P}}\). A gauge for such complexity, intuitively, may be the number of "steps" or the "time" a computation takes to conclude (halt), or it may be the amount of memory, e.g., the maximum total size of stored data —we often say "space"— in all the variables of a URM that provides instructions for the computation.

Chapter 15, on the other hand, starts by classifying the \(\mathcal {P}\mathcal {R}\) functions in a provably infinite hierarchy —actually we introduce several different hierarchies that we compare and prove equivalent in a very strong sense— according to static or definitional complexity, and concludes by showing via the "Ritchie-Cobham property" that the dynamic or computational complexity of the functions —say, computed by a URM— at any level of the hierarchy is accurately predicted by their static complexity!

This static complexity of a function could be the size of a (URM or loop) program that computes it (e.g., the program length, as a string of symbols) or, more usually, some structural attribute of program complexity, e.g., the depth of nesting of the Loop-end instruction; equivalently, but more mathematically, the static complexity of a function (rather than of a program) may be the maximum depth of nesting over all primitive recursions needed in a given primitive recursive derivation of said function. Or it could even be the size of the function's output (as is the case with the Grzegorczyk hierarchy).

This discussion will make more sense in Chap. 15. In particular, “the maximum depth of nesting over all primitive recursions” will be carefully and simply defined.

Abstract

The literature refers to the complexity theory of the partial recursive functions as "high level complexity" due to its generality and its mostly theoretical, far-removed-from-practice character. The present chapter is about more down-to-earth functions.