
1997 | Book

Modern Multidimensional Scaling

Theory and Applications

Authors: Ingwer Borg, Patrick Groenen

Publisher: Springer New York

Book series: Springer Series in Statistics


About this book

Multidimensional scaling (MDS) is a technique for the analysis of similarity or dissimilarity data on a set of objects. Such data may be intercorrelations of test items, similarity ratings of political candidates, or trade indices for a set of countries. MDS attempts to model such data as distances among points in a geometric space. The main reason for doing this is that one wants a graphical display of the structure of the data, one that is much easier to understand than an array of numbers and, moreover, one that displays the essential information in the data, smoothing out noise. There are numerous varieties of MDS. Some facets for distinguishing among them are the particular type of geometry into which one wants to map the data, the mapping function, the algorithms used to find an optimal data representation, the treatment of statistical error in the models, or the possibility of representing not just one but several similarity matrices at the same time. Other facets relate to the different purposes for which MDS has been used, to various ways of looking at or "interpreting" an MDS representation, or to differences in the data required for the particular models. In this book, we give a fairly comprehensive presentation of MDS. For the reader with applied interests only, the first six chapters of Part I should be sufficient. They explain the basic notions of ordinary MDS, with an emphasis on how MDS can be helpful in answering substantive questions.

Table of Contents

Frontmatter

Fundamentals of MDS

Frontmatter
1. The Four Purposes of Multidimensional Scaling
Abstract
Multidimensional scaling (MDS) is a method that represents measurements of similarity (or dissimilarity) among pairs of objects as distances between points in a low-dimensional space. The data, for example, may be correlations among intelligence tests, and the MDS representation is then a plane that shows the tests as points that lie closer together the more positively the tests are correlated. The graphical display of the correlations provided by MDS enables the data analyst to literally “look” at the data and to explore their structure visually. This often reveals regularities that remain hidden when studying arrays of numbers. Another application of MDS is to use some of its mathematics as models for dissimilarity judgments. For example, given two objects of interest, one may explain their perceived dissimilarity as the result of a mental arithmetic that mimics the distance formula. According to this model, the mind generates an impression of dissimilarity by adding up the perceived differences of the two objects over their properties.
Ingwer Borg, Patrick Groenen
2. Constructing MDS Representations
Abstract
An MDS representation is found by using an appropriate computer program. The program, of course, proceeds by computation. But one- or two-dimensional MDS representations can also be constructed by hand, using nothing but a ruler and compass. In the following, we discuss such constructions in some detail, both for ratio MDS and for ordinal MDS. This leads to a better understanding of the geometry of MDS. In this context, it is also important to see that MDS is almost always done in a particular family of geometries, i.e., in flat geometries.
Ingwer Borg, Patrick Groenen
3. MDS Models and Measures of Fit
Abstract
MDS models are defined by specifying how given similarity or dissimilarity data, the proximities $p_{ij}$, are mapped into distances of an $m$-dimensional MDS configuration $X$. The mapping is specified by a representation function, $f: p_{ij} \mapsto d_{ij}(X)$, which specifies how the proximities should be related to the distances. In practice, one usually does not attempt to satisfy $f$ strictly. Rather, what is sought is a configuration (in a given dimensionality) whose distances satisfy $f$ as closely as possible. The condition “as closely as possible” is quantified by a badness-of-fit measure or loss function. The loss function is a mathematical expression that aggregates the representation errors, $e_{ij} = f(p_{ij}) - d_{ij}(X)$, over all pairs $(i, j)$. A normed sum-of-squares of these errors defines Stress, the most common loss function in MDS. How Stress should be evaluated is a major issue in MDS. It is discussed at length in this chapter, and various criteria are presented.
Ingwer Borg, Patrick Groenen
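
To make Stress concrete, here is a minimal NumPy sketch of the raw (unnormalized) loss for ratio MDS, where $f$ is the identity; the function name and setup are illustrative, not the book's code.

```python
import numpy as np

def raw_stress(P, X):
    """Raw Stress for ratio MDS (f = identity): the sum of squared
    representation errors e_ij = p_ij - d_ij(X) over all pairs i < j."""
    # Euclidean distances among the rows (points) of X.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    i, j = np.triu_indices(len(P), k=1)
    return ((P[i, j] - D[i, j]) ** 2).sum()
```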
4. Three Applications of MDS
Abstract
Three applications of MDS are discussed in some depth. Emphasis is given to the questions of how to choose a particular MDS solution and how to interpret it. First, data on the perceived similarity of colors are studied. The predicted MDS configuration is a color circle, which is indeed found to be the best representation for the data. Second, confusion data on Morse codes are investigated. The MDS space shows two regional patterns, which reflect two physical properties of the signals. Third, global similarity judgments on different facial expressions are studied. A dimensional system can be found which relates to three empirical scales for the faces.
Ingwer Borg, Patrick Groenen
5. MDS and Facet Theory
Abstract
Regional interpretations of MDS solutions are a very general and particularly successful approach for linking MDS configurations to substantive knowledge about the represented objects. Facet theory (FT) provides a systematic framework for regional interpretations. FT structures a domain of interest by partitioning it into types. The typology is generated by coding the objects of interest on some facets of their content. The logic is similar to stratifying a sample of persons or constructing stimuli in a factorial design. What is then tested by MDS is whether the distinctions made on the conceptual (design) side are mirrored in the MDS representation of the objects’ similarity coefficients, such that different types of objects fall into different regions of the MDS space.
Ingwer Borg, Patrick Groenen
6. How to Obtain Proximities
Abstract
Proximities are either collected by directly judging the (dis-)similarity of pairs of objects, or they are derived from score or attribute vectors associated with each of these objects. Direct proximities typically result from similarity ratings on object pairs, from rankings, or from card-sorting tasks. The anchor stimulus method leads to conditional proximities that have a restricted comparability and require special MDS procedures. Derived proximities are, in practice, most often correlations of item scores over individuals. Since there is so much work involved in building a complete proximity matrix, it is important to know about the performance of incomplete proximity matrices (with missing data) in MDS. It turns out that MDS is quite robust against randomly distributed missing data. MDS is also robust when used with coarse proximities, e.g., dichotomous proximities.
Ingwer Borg, Patrick Groenen

MDS Models and Solving MDS Problems

Frontmatter
7. Matrix Algebra for MDS
Abstract
In this chapter, we build a basis for a more technical understanding of MDS. Matrices are of particular importance here. They bring together, in one single mathematical object, such notions as a whole configuration of points, all of the distances among the points of this configuration, or a complete matrix of proximities. Mathematicians developed a sophisticated algebra for matrices that allows one to derive, for example, how a configuration that represents a matrix of distances can be computed, or how the distances among all points can be derived from a configuration. Most of these operations can be written in just a few lines, in very compact notation, which helps tremendously to see what is going on. The reader does not have to know everything in this chapter to read on in this book. It suffices to know the main concepts and theorems and then later come back to this chapter when necessary. Proofs in this chapter are meant to better familiarize the reader with the various notions. One may opt to skip the proofs and accept the respective theorems, as is common practice in mathematics (“It can be shown that...”).
Ingwer Borg, Patrick Groenen
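
As a small taste of this compact notation, the following sketch (NumPy assumed; the helper is hypothetical) derives all squared Euclidean distances of a configuration from its scalar-product matrix $B = XX'$ in one line of algebra.

```python
import numpy as np

def squared_distances(X):
    """Squared Euclidean distances from a coordinate matrix X via the
    scalar-product matrix B = XX': d_ij^2 = b_ii + b_jj - 2*b_ij."""
    B = X @ X.T
    b = np.diag(B)
    return b[:, None] + b[None, :] - 2 * B
```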
8. A Majorization Algorithm for Solving MDS
Abstract
An elegant algorithm for computing an MDS solution is discussed in this chapter. We reintroduce the Stress function, which measures the deviation between the distances among points in a geometric space and their corresponding dissimilarities. Then, we focus on how a function can be minimized. An easy and powerful strategy is the principle of minimizing a function by iterative majorization. This method is applied in the SMACOF algorithm for minimizing Stress.
Ingwer Borg, Patrick Groenen
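
For unit weights, the workhorse of SMACOF is the Guttman transform. A minimal sketch, assuming NumPy, a complete dissimilarity matrix Delta, and a starting configuration X (all names illustrative):

```python
import numpy as np

def smacof(Delta, X, n_iter=100):
    """Minimal SMACOF sketch with unit weights: repeatedly apply the
    Guttman transform X <- B(X) X / n, which majorizes Stress so that
    each iteration can only decrease the loss."""
    n = len(Delta)
    for _ in range(n_iter):
        D = np.sqrt(((X[:, None] - X[None, :]) ** 2).sum(axis=-1))
        with np.errstate(divide="ignore", invalid="ignore"):
            B = np.where(D > 0, -Delta / D, 0.0)
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))  # b_ii = -sum_{j != i} b_ij
        X = B @ X / n                        # Guttman transform
    return X
```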
9. Metric and Nonmetric MDS
Abstract
In the previous chapter, we derived a majorization algorithm for fixed dissimilarities. In practical research, however, we often have only rank-order information about the dissimilarities (or proximities), so that transformations that preserve the rank order of the dissimilarities become admissible. In this chapter, we discuss optimal ways of estimating this and other transformations. One strategy for ordinal MDS is to use monotone regression. A different strategy, rank-images, is not optimal for minimizing Stress, but it has other properties that can be useful in MDS. An attractive class of transformations is splines, which contain ordinal and linear transformations as special cases.
Ingwer Borg, Patrick Groenen
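
Monotone regression is typically computed with the pool-adjacent-violators algorithm (PAVA). The following NumPy sketch is illustrative rather than the book's implementation.

```python
import numpy as np

def monotone_regression(d):
    """Pool-adjacent-violators: given target values d sorted by
    increasing proximity rank, return the closest (least-squares)
    nondecreasing sequence -- the disparities of ordinal MDS."""
    # Each block holds (mean, count); violating blocks are pooled.
    blocks = []
    for v in map(float, d):
        blocks.append((v, 1))
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            (m2, n2), (m1, n1) = blocks.pop(), blocks.pop()
            blocks.append(((m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2))
    return np.concatenate([[m] * n for m, n in blocks])
```

For instance, monotone_regression([1.0, 3.0, 2.0, 4.0]) pools the violating middle pair into its mean and returns [1.0, 2.5, 2.5, 4.0].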
10. Confirmatory MDS
Abstract
If more is known about the proximities or the objects, then additional restrictions (or constraints) can be imposed on the MDS model. This usually means that the MDS solutions must satisfy certain additional properties of the points’ coordinates or the distances. These properties are derived from substantive considerations. The advantage of enforcing such additional properties onto the MDS model is that one thus gets direct feedback about the validity of one’s theory about the data. If the Stress of a confirmatory MDS solution is not much higher than the Stress of a standard (“unconstrained”) MDS solution, the former is accepted as an adequate model. Several procedures that allow one to impose such external constraints are described and illustrated.
Ingwer Borg, Patrick Groenen
11. MDS Fit Measures, Their Relations, and Some Algorithms
Abstract
A problem in MDS is how to evaluate the Stress value. Once a solution is found, how good is it? In Chapter 3, several statistical simulation studies were reported. Here we give an interpretation of normalized Stress in terms of the proportion of the explained sum-of-squares of the disparities. We also show that normalized Stress is equal to Stress-1 at a minimum and that the configurations only differ by a scale factor. Then, other common measures of fit for MDS are discussed. For these fit measures, we refer to some recent algorithmic work.
Ingwer Borg, Patrick Groenen
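
For orientation, the two measures discussed here can be computed as follows, assuming flat arrays of disparities dhat and configuration distances d over all pairs; this is a sketch of the standard formulas, not the book's code.

```python
import numpy as np

def stress_measures(dhat, d):
    """Two common MDS fit measures, given disparities dhat and the
    distances d of a configuration (flat arrays over all pairs)."""
    sq_err = ((dhat - d) ** 2).sum()
    normalized_stress = sq_err / (dhat ** 2).sum()  # explained-SS view
    stress_1 = np.sqrt(sq_err / (d ** 2).sum())
    return normalized_stress, stress_1
```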
12. Classical Scaling
Abstract
Since the first practical method available for MDS was a technique due to Torgerson (1952, 1958) and Gower (1966), classical scaling is also known under the names Torgerson scaling and Torgerson-Gower scaling. It is based on theorems by Eckart and Young (1936) and by Young and Householder (1938). The basic idea of classical scaling is to assume that the dissimilarities are distances and then find coordinates that explain them. In (7.5), a simple matrix expression relates the matrix of squared distances $D^{(2)}(X)$ (written $D^{(2)}$ for short) to the coordinate matrix $X$; it shows how to get squared Euclidean distances from a given matrix of coordinates, and then scalar products from these distances. In Section 7.9, the reverse was discussed, i.e., how to find the coordinate matrix given a matrix of scalar products $B = XX'$. Classical scaling uses the same procedure but operates on squared dissimilarities $\Delta^{(2)}$ instead of $D^{(2)}$, since the latter is unknown. This method is popular because it gives an analytical solution, requiring no iterations.
Ingwer Borg, Patrick Groenen
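
Because the solution is analytic, classical scaling fits in a few lines. A sketch assuming NumPy and a complete dissimilarity matrix Delta (the function name is hypothetical):

```python
import numpy as np

def classical_scaling(Delta, m=2):
    """Torgerson scaling sketch: treat the dissimilarities as distances,
    double-center their squares to get pseudo scalar products, then
    factor the result into m-dimensional coordinates."""
    n = len(Delta)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (Delta ** 2) @ J          # pseudo scalar products
    w, V = np.linalg.eigh(B)                 # eigenvalues, ascending
    idx = np.argsort(w)[::-1][:m]            # keep the m largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))
```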
13. Special Solutions, Degeneracies, and Local Minima
Abstract
In this chapter, we explain several technical peculiarities of MDS. First, we consider MDS of a constant dissimilarity matrix (all dissimilarities are equal) and indicate what configurations are found. Then we discuss degenerate solutions in ordinal MDS, where Stress approaches zero even though the MDS distances do not represent the data properly. Another problem in MDS is the existence of multiple local minima. This problem is especially severe for unidimensional scaling; for this case, several strategies are discussed that are less prone to local minima. For full-dimensional scaling, in contrast, it is shown that the majorization algorithm always finds a globally optimal solution. For other dimensionalities, several methods for finding a global minimum exist, e.g., the tunneling method.
Ingwer Borg, Patrick Groenen

Unfolding

Frontmatter
14. Unfolding
Abstract
The unfolding model is a model for preferential choice. It assumes that different individuals perceive various objects of choice in the same way but differ with respect to what they consider an ideal combination of the objects’ attributes. In unfolding, the data are usually preference scores (such as rank-orders of preference) of different individuals for a set of choice objects. These data can be conceived as proximities between the elements of two sets, individuals and choice objects. Technically, unfolding can be seen as a special case of MDS where the within-set proximities are missing. Individuals are represented as “ideal” points in the MDS space so that the distances from each ideal point to the object points correspond to the preference scores. We indicate how an unfolding solution can be computed by the majorization algorithm. Two variants for incorporating transformations are discussed: the conditional approach, which only considers the relations of the data values within rows (or columns), and the unconditional approach, which considers the relations among all data values as meaningful. It is found that if transformations are allowed on the data, then unfolding solutions are subject to many potential degeneracies. Stress forms that reduce the chances for degenerate solutions are discussed. Also, a mixed ordinal-linear method is suggested as a reasonable compromise.
Ingwer Borg, Patrick Groenen
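
The "MDS with missing within-set proximities" view can be made concrete by placing both sets in one supermatrix. A sketch under that reading (NumPy; NaN marks the missing entries, which an MDS program would handle with zero weights):

```python
import numpy as np

def unfolding_supermatrix(pref):
    """Embed an individuals-by-objects preference matrix pref (n x k)
    in an (n+k) x (n+k) proximity matrix whose within-set blocks are
    missing, so unfolding becomes a special case of MDS."""
    n, k = pref.shape
    P = np.full((n + k, n + k), np.nan)
    P[:n, n:] = pref          # individuals -> choice objects
    P[n:, :n] = pref.T        # symmetric counterpart
    return P
```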
15. Special Unfolding Models
Abstract
In this chapter, some special unfolding models are discussed. First, we distinguish internal and external unfolding. In the latter, one first derives an MDS configuration of the choice objects from proximity data and afterwards inserts ideal points to represent preference data. Then, the vector model for unfolding is introduced as a special case of the ideal-point model. In the vector model, individuals are represented by vectors and choice objects as points such that the projections of the objects on an individual’s vector correspond to his or her preference scores. Then, in weighted unfolding, dimensional weights are chosen freely for each individual. A closer investigation reveals that these weights must be positive to yield a sensible model. A variant of metric unfolding is discussed that builds on the BTL choice theory.
Ingwer Borg, Patrick Groenen
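
The vector model's composition rule is simply a projection. A minimal sketch (NumPy; names are illustrative):

```python
import numpy as np

def vector_model_scores(objects, v):
    """Vector model of unfolding: an individual's predicted preference
    for each object is the projection of the object's point onto the
    individual's vector v."""
    v = np.asarray(v, dtype=float)
    return objects @ v / np.linalg.norm(v)
```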

MDS Geometry as a Substantive Model

Frontmatter
16. MDS as a Psychological Model
Abstract
MDS has been used not only as a tool for data analysis but also as a framework for modeling psychological phenomena. This is made clear by equating an MDS space with the notion of psychological space. A metric geometry is interpreted as a model that explains perceptions of similarity. Most attention has been devoted to investigations where the distance function was taken as a composition rule for generating similarity judgments from dimensional differences. Minkowski distances are one family of such composition rules. Guided by such modeling hypotheses, psychophysical studies on well-designed simple stimuli such as rectangles uncovered interesting regularities of human similarity judgments. This model also allows one to study how responses conditioned to particular stimuli are generalized to other stimuli.
Ingwer Borg, Patrick Groenen
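
The Minkowski family of composition rules can be stated in one line; $p = 1$ gives the city-block metric, $p = 2$ the Euclidean metric, and large $p$ approaches the dominance metric. A sketch, assuming NumPy:

```python
import numpy as np

def minkowski(x, y, p=2.0):
    """Minkowski composition rule: perceived dissimilarity as the
    p-norm of the dimensionwise differences between two stimuli."""
    return (np.abs(np.asarray(x) - np.asarray(y)) ** p).sum() ** (1 / p)
```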
17. Scalar Products and Euclidean Distances
Abstract
Scalar products are functions that are closely related to Euclidean distances. They are often used as an index for the similarity of a pair of vectors. A particularly well-known variant is the product-moment correlation for (deviation) scores. Scalar products have convenient mathematical properties and, thus, it seems natural to ask whether they can serve not only as indexes but as models for judgments of similarity. Although there is no direct way to collect scalar product judgments, it seems possible to derive scalar products from “containment” questions such as “How much of A is contained in B?” Since distance judgments can be collected directly, but scalar products are easier to handle numerically, it is also interesting to study whether distances can be converted into scalar products.
Ingwer Borg, Patrick Groenen
18. Euclidean Embeddings
Abstract
Distances are functions that can be defined on any set of objects. Euclidean distances, in contrast, are functions that can only be defined on sets that possess a particular structure. Given a set of proximities, one can test whether these values are distances and, moreover, whether they can even be interpreted as Euclidean distances. More generally, one can ask the same questions allowing for particular transformations of the given proximities such as adding a constant to each value. For ordinal transformations, the hypothesis that proximities are Euclidean distances is trivially true. Hence, in ordinal MDS, we learn nothing from the fact that the proximities can be represented in a Euclidean space. In interval MDS, in contrast, Euclidean embedding is nontrivial. If the data can be mapped into Euclidean distances, one can ask how many dimensions at most are necessary for a perfect representation. A further question, related to classical MDS, is how to find an interval transformation that leads to approximate Euclidean distances, while keeping the dimensionality of the MDS space as low as possible.
Ingwer Borg, Patrick Groenen
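
One standard test, in the spirit of the Young-Householder theorem cited in Chapter 12, checks whether the double-centered matrix of squared dissimilarities is positive semidefinite; its rank then gives the smallest embedding dimensionality. A NumPy sketch (names illustrative):

```python
import numpy as np

def euclidean_check(Delta, tol=1e-10):
    """Dissimilarities are Euclidean distances iff the double-centered
    matrix B = -0.5 * J * Delta^2 * J is positive semidefinite; the
    rank of B is the smallest perfect embedding dimensionality."""
    n = len(Delta)
    J = np.eye(n) - np.ones((n, n)) / n
    w = np.linalg.eigvalsh(-0.5 * J @ (Delta ** 2) @ J)
    return bool(w.min() >= -tol), int((w > tol).sum())
```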

MDS and Related Methods

Frontmatter
19. Procrustes Procedures
Abstract
The Procrustes problem is concerned with fitting a configuration (testee) to another (target) as closely as possible. In the simplest case, both configurations have the same dimensionality and the same number of points, which can be brought into a 1-1 correspondence by substantive considerations. Under orthogonal transformations, the testee can be rotated and reflected arbitrarily in an effort to fit it to the target. In addition to such rigid motions, one may also allow for dilations and for shifts. In the oblique case, the testee can also be distorted linearly. Further generalizations include an incompletely specified target configuration, different dimensionalities of the configurations, and different numbers of points in both configurations.
Ingwer Borg, Patrick Groenen
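
For the orthogonal case, the optimal rotation/reflection has a well-known closed form via the singular value decomposition. A minimal sketch, assuming NumPy and commensurate matrices A (testee) and B (target):

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Rotate/reflect the testee A to fit the target B: the orthogonal
    T minimizing ||AT - B|| is UV' from the SVD of A'B."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return A @ (U @ Vt)
```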
20. Three-Way Procrustean Models
Abstract
In this chapter, we look at some varieties of generalized Procrustes analysis. The simplest task is to fit several given coordinate matrices $X_k$ ($k = 1, \ldots, K$) to each other in such a way that uninformative differences are eliminated. We also consider generalizations of the Procrustean problem that first find an optimal average configuration for all $X_k$ and then attempt to explain each individual $X_k$ in turn by some simple transformation of the average configuration. One important case is to admit different weights on the dimensions of the average configuration. This case defines an interesting model for individual differences scaling: if the fit is good, then the perceptual space of individual $k$ corresponds to the group’s perceptual space, except that $k$ weights the space’s dimensions in his or her own idiosyncratic way.
Ingwer Borg, Patrick Groenen
21. Three-Way MDS Models
Abstract
In the Procrustean context, the dimension-weighting model was used in order to better match a set of $K$ given configurations $X_k$ to each other. We now ask how a solution of the dimension-weighting model can be found directly from the set of $K$ proximity matrices, without first deriving an individual MDS space $X_k$ for each individual $k$. We discuss how dimension weighting can be incorporated into a framework for minimizing Stress. Another popular algorithm for solving this problem, INDSCAL, is considered in some detail. Then, some algebraic properties of dimension-weighting models are investigated. Finally, matrix-conditional and -unconditional approaches are distinguished, and some general comments on dimension-weighting models are made. Table 21.1 gives an overview of the (three-way) Procrustean models discussed so far and the three-way MDS models of this chapter.
Ingwer Borg, Patrick Groenen
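
The dimension-weighting composition rule itself is easy to state: individual $k$'s distances are ordinary Euclidean distances in the group space after its dimensions are stretched by $k$'s weights. A sketch, assuming NumPy (names illustrative):

```python
import numpy as np

def weighted_distances(X, w):
    """Dimension-weighting (INDSCAL-type) model: an individual's
    distances are Euclidean distances in the group space X after each
    dimension is stretched by that individual's nonnegative weight."""
    Xw = X * np.sqrt(w)                      # one weight per dimension
    diff = Xw[:, None, :] - Xw[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))
```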
22. Methods Related to MDS
Abstract
In this chapter, some other techniques are discussed that have something in common with MDS. First, we discuss the analysis of a variables-by-objects data matrix by principal components analysis and show how it is related to MDS. Then, some approaches for the analysis of asymmetric dissimilarity data are discussed, all with emphasis on a graphic display of the asymmetry. Finally, we discuss correspondence analysis, a technique particularly suited for the analysis of a contingency table of two categorical variables.
Ingwer Borg, Patrick Groenen
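
As a rough illustration of one common formulation of correspondence analysis (via the SVD of standardized residuals; this sketch is an assumption-laden outline, not the book's algorithm):

```python
import numpy as np

def correspondence_analysis(N, m=2):
    """Correspondence analysis sketch for a contingency table N: the
    SVD of the standardized residuals yields row and column scores
    (principal coordinates) for a joint m-dimensional display."""
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)      # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    rows = U[:, :m] * s[:m] / np.sqrt(r)[:, None]
    cols = Vt.T[:, :m] * s[:m] / np.sqrt(c)[:, None]
    return rows, cols
```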
Backmatter
Metadata
Title
Modern Multidimensional Scaling
Authors
Ingwer Borg
Patrick Groenen
Copyright year
1997
Publisher
Springer New York
Electronic ISBN
978-1-4757-2711-1
Print ISBN
978-1-4757-2713-5
DOI
https://doi.org/10.1007/978-1-4757-2711-1