nach oben

2009 | Buch

Kapitel lesen Erstes Kapitel lesen

Quantifiers in Action

Generalized Quantification in Query, Logical and Natural Languages

verfasst von: Antonio Badia

Verlag: Springer US

Buchreihe : Advances in Database Systems

Enthalten in: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

Einloggen, um Zugang zu erhalten

Über dieses Buch

The database industry is a multi-billion, world-wide, all-encompassing part of the software world. Quantifiers in Action: Generalized Quantification in Query, Logical and Natural Languages introduces a query language called GQs—Generalized Quantification in Query. Most query languages are simply versions of First Order Logic (FOL). GQs are an extension of the idea of quantifier in FOL. GQs are a perfect example of a practical theory within databases.

This book provides a brief background in logic and introduces the concept of GQs, and then develops a query language based on GQs. Using Query Language with Generalized Quantifiers, the reader explores the efficient implementation of the concept, always a primary consideration in databases. This professional book also includes several extensions for use with documents employing question and answer techniques.

Designed for practitioners and researchers within the database management field; also suitable for advanced-level students in computer science.

Inhaltsverzeichnis

Frontmatter

1. Introduction

This monograph is written with two purposes: to help bridge the gap between theory and practice in databases, and to help bridge the gap between research in query language and research in other areas (outside databases, and some outside Computer Science) that are clearly related to querying. The motiva- tion behind the book, then, is a belief that the gaps exist, and that this is a bad situation.

Databases are a strange area. On the one hand, they are clearly an applied field. Databases (and related services) are a multi-billion, world-wide software industry. Some advances (for instance, in query optimization) make it to mar- ket in relatively short time. Changes in the industry (for instance, the move to data warehouses) provoke comparable changes in the field, creating whole subareas of research. On the other hand, databases are a theoretical field. Since the definition of the relational model by Codd, its main concepts have been tied to logic ideas, and logician’s methods have been used to study the model abstractly. This study has blossomed in entirely new subareas, like descriptive complexity ([56]) and finite model theory ([27, 69]). Thus, databases is, at the same time, a clearly applied and a clearly theoretical field of study.

This would be a good thing if theory and practice walked hand in hand. However, this does not always happen. To quote a familiar dictum, “In the- ory, theory and practice are the same. In practice, they are not.”1. While occasionally theory occupies itself with questions of practical interest, and practice does generate questions of theoretical interest, the mainstream body of theory and practice remain separated. This is not due to the will of the re- searchers -many theoreticians find delight in seeing theoretical developments put to good use in practice; and they are also eager to apply their skills to problems motivated by new areas of practice. It is also the case, in my opinion, that many practitioners consider the establishment of a solid, formal founda- tion for a practical problem a positive development. In fact, most database people (certainly this author) adhere to another familiar dictum, “There is nothing as practical as a good theory”. But theory and practice have different goals, demand different approaches and produce different results. Most of the time, the disconnect is present. It is for this reason that it is a pleasure to work on a subject that gives an opportunity to bridge this gap.

Antonio Badia

2. Basic Concepts

In this chapter we introduce some basic logic concepts, focusing on those ideas related to quantification. This chapter is for readers with no background in logic, and it only includes some core concepts that facilitate further reading.

Antonio Badia

3. Generalized Quantifiers

GeneralizedQuantifiers (henceforthGQs) were originally introduced byMostowski ([76]). Lindstrom generalized the original idea and set the definition used nowadays ([70])1. GQs remained mostly a concept of interest for logicians until years later. In a seminal paper, Barwise and Cooper showed their im- portance in natural language formalization ([16]). Following in a tradition started by Montague, Barwise and Cooper show how GQs are a useful and powerful tool in the formal analysis of linguistic phenomena. This insight es- tablished an interesting area of research, which we discuss briefly in chapter 8. Also, work on descriptive complexity has produced recently a large body of results tied to GQs (see [47, 65, 93] as a small sample). Finally, some work has focused on practical applications of the concept in diverse areas of Computer Science, like query languages ([52, 85]), Artificial Intelligence ([90], [82]) and graphical interfaces ([17]). The research described in this book continues in this tradition, focusing on practical uses in query languages, and Information Systems in general.

In this chapter we introduce the concept of GQ formally, provide several examples and describe some basic properties. Here we focus on fixing notation and providing some basic facts that will be used in the rest of the book. Additional properties will be introduced later as needed.

Antonio Badia

4. QLGQ: A Query Language with Generalized Quantifiers

The idea of using GQs in query languages can be traced back to the develop- ments of extensions of relational algebra that hinge on the idea that dealing with sets is a necessary tool for a query language. SQL contains a rudimen- tary version of generalized quantification in the EXISTS predicate1. However, SQL’s syntax does not allow for any other kind of quantification except uni- versal quantification which is achieved by combining negation and existential quantification (the NON EXISTS predicate).

There have been three kinds of proposals that are directly relevant to the present work. The first one is the incorporation of extended quantification, in the form of a universal quantifier, to the relational algebra. The second one is the incorporation of set predicates to the algebra. The third one is a direct attempt to incorporate Generalized Quantifiers to SQL. All these proposals work only at the language level, adding new constructs to the query language. Other proposals work at the implementation level, proposing more efficient support for already existing constructs.

Our motivation for creating a new query language is to be able to study issues of generalized quantification in a setting as general as possible. To this end, the language introduced is tailored to the use of GQs, while being kept as simple as possible. We will also study the issue of how to use GQs in SQL; we will see then that there are several choices one can make with respect to the exact syntax required. However, we don’t want such choices to interfere with the analysis of quantification. Therefore, in the next section we introduce the language that we will use throughout the rest of the book.

Antonio Badia

5. Implementation and Optimization of Standard GQs

For the rest of this chapter, we will restrict our attention to standard quanti- fiers; that is, monadic binary quantifiers (of type (1, 1)). In addition, we will make one further restrictions on the quantifiers to be considered in this sec- tion: we will assume that they obey EXT. This was defined in chapter 3; here we repeat definition 3.2.5 simplified to the (1, 1) type:

Definition 5.1.1 (EXT) A quantifier Q of type (1, 1) follows EXT if for all M,M', A ₁,A ₂ ⊆ M ⊆ M', Q _M(A ₁,A ₂) iff Q _M' (A ₁,A ₂).

What will allow us to define effective ways to deal with these quantifiers is the following: when dealing with monadic quantifiers, all arguments are sets. Sets do not carry structural information, like relations do. Thus, isomorphisms between sets are simply bijections. In other words, all that matters to check if sets are isomorphic is the cardinality of the sets. When dealing with finite models, as we do, all sets are finite. Thus, we restrict ourselves to finite quan- tifiers, those where all arguments are finite sets. Hence, matters of cardinality can be seen as simply arithmetic predicates on natural numbers. To define quantifiers, then, we can give properties on natural numbers.

Antonio Badia

6. Quantifier Prefixes

In this chapter we restrict ourselves to quantifiers of type (1) and analyze quantifier prefixes. We already gave some basic properties for first order pre- fixes in subsection 2.3.1, but here we will study prefixes with Generalized Quantifiers, as well as other tuples of prefixes not present in FOL.

Since we establish in the previous chapter that all monadic quantifiers can be captured by cardinality properties, it would seem that quantifiers of type (1) can be trivially defined. Indeed, the definition 5.1.3 can be applied to these quantifiers and immediately shows that a quantifier QM(A) can be defined by two numbers, |M - A| and |A|. When the quantifier is domain independent, the first of these two numbers is irrelevant, and only |A| matters. Note, though, that not all type (1) quantifiers are domain independent: all(1) is not; quantifiers like \(\textbf{n}\%\textbf{of}M(A) = \{A \subseteq M | |A| = \frac{|M| \times n}{100}\}\) is not domain independent either. It is arguable that in some of these cases, the relativization to a type (1, 1) quantifier, which yields a domain independent quantifier, is more meaningful. But in any case, quantifiers of type (1) can be easily captured by number properties, and hence the approach of the previous chapter applies. Why, then, study such quantifiers separately? The answer is that its type allows such quantifiers to be easily combined into quantifier prefixes, and that such prefixes exhibit interesting properties. Prefixes, not isolated quantifiers, are the real theme of this chapter.

Antonio Badia

7. Cooperative Query Answering

In this chapter we add a pragmatic dimension to our previous analysis. This is consistent with our goal of treating both questions and queries, and their rela- tionship, as a serious and legitimate area of research in databases. Pragmatic issues come about because of the way we use natural language to express questions. The claim is that queries (and query languages) that pay attention to such use are more likely to be of relevance to an end user.

This claim is not new; it is the basis for the field of Cooperative Query Answering (henceforth CQA). CQA is very diverse; for an overview see [32]. More in-depth information can be found in the book [55] or in the work- shop proceedings of [21], [67]. Here we will focus only on the aspects directly relevant to us. Our main point here is that Generalized Quantification is a very useful tool to develop CQA, and we will support this point by showing concrete examples in which GQs illustrate and exemplify CQA techniques.

We will start with a brief overview of CQA for those readers not familiar with the area, and then show how QLGQ fits quite well with the framework.

Antonio Badia

8. Generalized Quantifiers and Natural Language

In this chapter we continue the linguistic turn that we initiate in the previous one by addressing pragmatic issues. Here we deal squarely with linguistic issues, that is, with questions instead of queries. We can do this because one of the appeals of the concept of GQ is that it has been widely used in linguistic analysis. In particular, the notion of interrogative quantifier has been proposed and analyzed ([94, 20, 42, 35]. Here, we build on such analysis to show how QLGQ can be used to formalize questions, thus providing a solid formal framework for work in Question Answering (henceforth QA).

We first provide a (very brief) overview of QA, in order to make the material as self-contained as possible. We then show in section 8.3 how GQs have been used by linguists for the formal analysis of natural language. We then show how QLGQ can be used to support QA. The approach presented here, however, opens up more questions than it answers, and hence we close the chapter with a detailed discussion of our motivation, and where to go from here.

Antonio Badia

9. Extensions

So far, we have presented a basic language which is not all that dissimilar from other, existing query languages. The main reason d’etre for QLGQ was to allow us to incorporate Generalized Quantifiers into the query language in an easy, natural way.

However, in order to allow for efficient processing, we have constructed a limited environment, and it is natural to ask if this environment could be extended, and if so, in which ways. Here we discuss several extensions to the language that seem of special interest, each one for its own reason. Since QLGQ is so close to Datalog, adding adding recursion (fixpoint) to the language is an obvious step to take. Adding aggregates to the language is also an obvious step for a language that wants to have practical uses. Another, less obvious step is to allow for second order variables; we also explore this idea because of its connections to tasks like data mining. An extension to distributed environments is worthwhile studying, as the issue of quantification in distributed databases has barely been studied. Finally, an extension to other data models besides relational emphasizes that the concept of Generalized Quantifier is high-level and data model independent.

Antonio Badia

10. Conclusion

In the previous chapters, we have explored the use of Generalized Quantifica- tion in practical applications. We have centered the work on query languages, but have progressively extended the scope to deal with questions and (limited) natural language analysis, as it is used in Question Answering. After intro- ducing the basic logical concepts in chapter 2 and the concept of Generalized Quantifier in chapter 3, we presented the query language QLGQ in chapter 4. This language was conceived basically as a vehicle to explore the use of general- ized quantification, and is not very different from existing (logic-based) query languages, but it allow us to explore adding (and efficiently implementing) standard (type (1, 1)) quantifiers to the language in chapter 5, and complex quantifier prefixes in chapter 6. We then took a linguistic turn ¹ in chapter 7 by introducing pragmatic considerations in query processing, and completed the transition from queries to questions in chapter 8 by focusing in Question Answering. Throughout, the use of Generalized Quantification provided the glue that held all the material together as a coherent whole -hopefully!

Antonio Badia

Backmatter

Titel: Quantifiers in Action
verfasst von: Antonio Badia
Verlag: Springer US
Electronic ISBN: 978-0-387-09564-6
Print ISBN: 978-0-387-09563-9
DOI: https://doi.org/10.1007/978-0-387-09564-6

Springer Professional

Über dieses Buch

Inhaltsverzeichnis

Frontmatter

1. Introduction

2. Basic Concepts

3. Generalized Quantifiers

4. QLGQ: A Query Language with Generalized Quantifiers

5. Implementation and Optimization of Standard GQs

6. Quantifier Prefixes

7. Cooperative Query Answering

8. Generalized Quantifiers and Natural Language

9. Extensions

10. Conclusion

Backmatter

Premium Partner