
About this Book

This book presents an exciting new synthesis of directed and undirected, discrete and continuous graphical models. Combining elements of Bayesian networks and Markov random fields, the newly introduced hybrid random fields are an interesting approach to get the best of both these worlds, with an added promise of modularity and scalability. The authors have written an enjoyable book---rigorous in the treatment of the mathematical background, but also enlivened by interesting and original historical and philosophical perspectives.
-- Manfred Jaeger, Aalborg Universitet

The book not only marks an effective direction of investigation with significant experimental advances, but it is also---and perhaps primarily---a guide for the reader through an original trip in the space of probabilistic modeling. While digesting the book, one is enriched with a very open view of the field, full of stimulating connections. [...] Everyone specifically interested in Bayesian networks and Markov random fields should not miss it.
-- Marco Gori, Università degli Studi di Siena


Graphical models are sometimes regarded, incorrectly, as an impractical approach to machine learning, on the assumption that they only work well in low-dimensional applications and discrete-valued domains. While guiding the reader through the major achievements of this research area in a technically detailed yet accessible way, the book is concerned with the presentation and thorough (mathematical and experimental) investigation of a novel paradigm for probabilistic graphical modeling, the hybrid random field. This model subsumes and extends both Bayesian networks and Markov random fields. Moreover, it comes with well-defined learning algorithms, both for discrete and continuous-valued domains, which fit the needs of real-world applications involving large-scale, high-dimensional data.

Table of Contents

Frontmatter

Introduction

“Graphical models are a marriage between probability theory and graph theory” (Michael Jordan, 1999 [154]). The basic idea underlying probabilistic graphical models is to offer “a mechanism for exploiting structure in complex distributions to describe them compactly, and in a way that allows them to be constructed and utilized effectively” (Daphne Koller and Nir Friedman, 2009 [174]). They “have their origin in several scientific areas”, and “their fundamental and universal applicability is due to a number of factors” (Steffen Lauritzen, 1996 [189]). For example, the generality of graphical models is due to the fact that they “reveal the interrelationships between multiple variables and features of the underlying conditional independence” (Joe Whittaker, 1990 [315]). Moreover, “[c]omplex computations, required to perform inference and learning in sophisticated models, can be expressed in terms of graphical manipulations, in which underlying mathematical expressions are carried along implicitly” (Christopher Bishop, 2006 [30]).
Antonino Freno, Edmondo Trentin

Bayesian Networks

For some mysterious reason, Bayesian networks are often introduced in the literature with examples concerning weather forecasts (e.g., ‘Is it likely to be sunny on Sunday, given that it is raining on Saturday and that my granny’s back hurts?’), or gardening issues (e.g., ‘How plausible is it that my courtyard is wet, given that it is raining and that the sprinkler is on?’). This may give the reader the somewhat embarrassing impression that Bayesian networks are the sort of tool you might buy in the “Lawn & Outdoor” department of the shopping mall at the corner. As a matter of fact, there are more intriguing, yet equally useful, domains: for instance, soccer games. Suppose you want to predict whether your favorite soccer team will win, lose, or draw tonight’s International League game. You can describe the outcome of the match as a random variable Y, taking one of the three values w (win), l (lose), or d (draw) according to a certain (yet unknown) probability distribution. As unknown as this distribution may be, some probabilistic reasoning, together with your prior knowledge of the situation, may help you predict the final outcome.
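As an illustration of the kind of reasoning involved, the outcome variable Y can be sketched as the child of a single hypothetical parent variable (say, the opponent’s strength S) in a two-node Bayesian network; all variable names and probabilities below are invented for the example, not taken from the book:

```python
# Minimal sketch of a two-node Bayesian network S -> Y, where
# S = opponent's strength and Y = match outcome (all numbers hypothetical).

# Prior P(S): how strong tonight's opponent is likely to be.
p_s = {"weak": 0.3, "average": 0.5, "strong": 0.2}

# Conditional probability table P(Y | S): one outcome distribution per state of S.
p_y_given_s = {
    "weak":    {"w": 0.60, "d": 0.25, "l": 0.15},
    "average": {"w": 0.40, "d": 0.30, "l": 0.30},
    "strong":  {"w": 0.20, "d": 0.30, "l": 0.50},
}

# Marginal P(Y) = sum over s of P(S = s) * P(Y | S = s).
p_y = {y: sum(p_s[s] * p_y_given_s[s][y] for s in p_s) for y in ("w", "d", "l")}
print(p_y)  # a distribution over {w, d, l}, summing to 1
```

Conditioning on observed evidence (e.g., knowing S = "strong") would simply replace the prior with a point mass, which is the elementary inference step that Bayesian networks generalize to many variables.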
Antonino Freno, Edmondo Trentin

Markov Random Fields

Let’s give Bayesian networks a break and go back to our favorite topic, namely soccer. Suppose you want to develop a probabilistic model of the ranking of your team in the domestic soccer league championship at any given time t throughout the current season. In this setup, it is reasonable to assume that t is a discrete time index, denoting the t-th game in the season and ranging from t = 1 (first match of the tournament) to t = T (season finale). Assuming the championship is organized as a round-robin tournament among N teams, we have T = 2(N − 1). The ranking of your team at time t + 1 changes according to a certain probability distribution which (i) accounts for the randomness of the results at the end of the corresponding matchday, and (ii) depends on the ranking at time t.
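The ranking dynamics just described can be sketched as a small Markov chain; the coarse three-zone state space and all transition probabilities below are hypothetical simplifications, not taken from the book:

```python
# Sketch of the team's ranking as a Markov chain over matchdays t = 1, ..., T,
# with T = 2(N - 1) for a round-robin league of N teams.
# The ranking is coarsened to three zones (hypothetical transition numbers).

N = 20                # number of teams in the league
T = 2 * (N - 1)       # number of matchdays: 38 for N = 20

states = ["top", "mid", "bottom"]
# P(ranking at t+1 | ranking at t): each row sums to 1.
transition = {
    "top":    {"top": 0.70, "mid": 0.25, "bottom": 0.05},
    "mid":    {"top": 0.20, "mid": 0.60, "bottom": 0.20},
    "bottom": {"top": 0.05, "mid": 0.35, "bottom": 0.60},
}

# Start mid-table at t = 1 and propagate the distribution forward to t = T.
dist = {"top": 0.0, "mid": 1.0, "bottom": 0.0}
for t in range(1, T):
    dist = {s2: sum(dist[s1] * transition[s1][s2] for s1 in states)
            for s2 in states}
print(T, dist)  # still a probability distribution after T - 1 steps
```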
Antonino Freno, Edmondo Trentin

Introducing Hybrid Random Fields: Discrete-Valued Variables

Both Bayesian networks and Markov random fields are widely used tools for statistical data analysis. Applying Markov random fields can be relatively expensive because of the computational cost of learning the model weights from data. As we saw in Chapter 3, this task involves optimization over a set of continuous parameters, and the number of these parameters can become very large (on the order of several thousand or more) when we address problems involving many variables. For example, the link-prediction application we will describe in Section 6.3.4.3, which involves 1,682 variables, requires a Markov random field containing 16,396 feature functions, with the corresponding weights.
Antonino Freno, Edmondo Trentin

Extending Hybrid Random Fields: Continuous-Valued Variables

In Section 1.2 we introduced the concept of feature, or attribute. The idea is that, in order to apply statistics or learning machines to phenomena occurring in the real world, it is necessary to describe them in a proper feature space, such that each phenomenon can be thought of as a random variable (if a single attribute is used) or a random vector (if multiple features are extracted). The feature extraction process is crucial to the success of an application, since no model, however good, can compensate for wrong or poor features. Although several types of attributes are referred to in the literature (e.g., categorical, nominal, ordinal, etc.), two main families of features can be singled out, namely discrete and continuous-valued (or, simply, continuous). Attributes are discrete when they belong to a countable domain, so that each feature can be seen as a symbol from a given alphabet. Examples of discrete feature spaces are subsets of the integers, or arbitrary collections of alphanumeric characters. Continuous-valued feature spaces are compact subsets of ℝ (in which case the attribute is thought of as the outcome of a real-valued random variable) or of ℝ^d for some integer d > 1 (in which case the d-dimensional feature vector is assumed to be the outcome of a real-valued random vector).
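The distinction can be made concrete with a toy example; the attribute names and measurements below are hypothetical:

```python
# The same phenomenon (a soccer match) described in two feature spaces.

# Discrete attribute: a symbol from a finite alphabet (a countable domain).
ALPHABET = {"w", "l", "d"}
outcome = "w"

# Continuous attributes: a point in R^d (here d = 3), e.g. possession share,
# expected goals, and shots per minute -- hypothetical real-valued measurements.
x = (0.57, 1.8, 0.21)
d = len(x)

print(outcome in ALPHABET, d)  # the symbolic and the vector-valued description
```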
Antonino Freno, Edmondo Trentin

Applications

As we have said, hybrid random fields are not meant merely as a general graphical model with nice theoretical properties, featuring algorithms for inference and learning over discrete and continuous variables. Above all, they are expected to prove useful. This means that our ultimate goal is to exploit the flexibility of HRFs in modeling independence structures, as well as the scalability of the algorithms for learning them, in order to tackle real-world problems. Improvements over traditional approaches are sought, both in terms of prediction accuracy and of computational efficiency.
Antonino Freno, Edmondo Trentin

Probabilistic Graphical Models: Cognitive Science or Cognitive Technology?

This chapter is an attempt to make explicit the philosophical and cognitive perspective that the scientific work presented in Chapters 2–6 should be viewed from. This does not mean that the scientific material collected in this work needs a philosophical foundation in order to make sense or to be really interesting. The only aim of embedding scientific results within a philosophical framework is “to understand how things in the broadest possible sense of the term hang together in the broadest possible sense of the term” [275], which is what Wilfrid Sellars regarded as the general aim of philosophy. In other words, while proposing a philosophical reflection on the meaning of the technical results collected in the previous chapters, we do not think that the value of those results depends in any important way on their philosophical meaning. Our standpoint is rather that, if we ask how those results in AI “hang together” with other results in the cognitive sciences and with particular views advocated in the philosophy of mind, then the philosophical remarks contained in this chapter are the answer we give to that question. But the reader should keep in mind that our ‘philosophical’ reflections are more properly meant as a scientific contribution to philosophy, rather than a philosophical contribution to science, where the guiding idea is that science can take care of itself.
Antonino Freno, Edmondo Trentin

Conclusions

The time has come to summarize the main contributions we have attempted to make throughout the book. To this aim, let us look back at each one of the main chapters, in turn:
  • Chapters 2-3 presented the main theoretical concepts and engineering techniques related to the representation and estimation of joint probability distributions in Bayesian networks and Markov random fields. After reading them, the reader should have acquired some of the basic competences needed on the one hand to implement and use directed and undirected graphical models in application domains involving discrete variables only, and on the other hand to read more advanced literature dealing with learning techniques for (discrete) probabilistic graphical models;
  • Chapter 4 introduced the hybrid random field model, exploring its mathematical properties and explaining how it can be used for representing joint probabilities (via Gibbs sampling) and pseudo-likelihood functions (via simple factorization into local conditional distributions), for estimating pseudo-likelihood distributions in large-scale domains involving discrete variables (via suitable parameter and structure learning algorithms), and for accomplishing arbitrarily shaped inference tasks (again, via Gibbs sampling). After reading this chapter, readers should be able to exploit hybrid random fields as an alternative for virtually any learning and reasoning application addressed through the techniques drawn from Chapters 2-3;
  • Chapter 5 explained how directed, undirected, and (more explicitly) hybrid graphical models can be applied to domains involving continuous-valued variables. To this aim, three different families of (conditional) density estimation techniques were presented, i.e. parametric techniques (based on Gaussian models), semiparametric techniques (based on the nonparanormal model), and nonparametric techniques (based on kernel methods). Moreover, the issue of structure learning in hybrid random fields was tackled from the perspective of continuous application domains, which required suitably adapting the Markov Blanket Merging algorithm developed in Chapter 4. This chapter should have provided the reader with the tools needed for extending probabilistic graphical modeling to various kinds of densities underlying high-dimensional, continuous random vectors;
  • Chapter 6 presented a number of applications of probabilistic graphical models to feature selection, pattern classification, and link prediction tasks, featuring both synthetic and real-world benchmarks involving discrete and continuous variables. These applications allowed us to evaluate the behavior of hybrid random fields in terms of both prediction accuracy and learning scalability, with very promising results. After considering this chapter, readers should have gained insight into a few representative kinds of applications allowing for learning and inference through probabilistic graphical models, and especially into the practical advantages of hybrid random fields over more traditional graphical models;
  • Finally, Chapter 7 investigated the relationships between the theoretical and application-oriented framework inspiring the research presented in this book and some epistemological questions traditionally dealt with in the philosophy of cognitive science. After delving into the arguments put forward in this chapter, the philosophically-minded reader should have realized the relevance of even the most overtly technical sections of the book to a number of interesting issues in epistemology and cognitive science.
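The pseudo-likelihood factorization into local conditional distributions mentioned in the summary of Chapter 4 can be written in its standard form (our notation, not quoted from the book, with mb(X_i) denoting the Markov blanket of variable X_i among X_1, …, X_n):

```latex
\mathrm{PL}(x_1, \ldots, x_n) \;=\; \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{mb}(X_i)\bigr)
```

Each factor involves only a variable and its Markov blanket, which is what makes pseudo-likelihood estimation tractable in large-scale domains.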
Antonino Freno, Edmondo Trentin

Backmatter
