Elsevier

Journal of Choice Modelling

Volume 27, June 2018, Pages 19-36
A model for broad choice data

https://doi.org/10.1016/j.jocm.2017.09.001

Abstract

This paper analyzes a discrete choice model where the observed outcome is not the exact alternative chosen by a decision maker but rather the broad group of alternatives that contains the chosen alternative. The model is designed for situations where the choice behavior at a particular level is of interest but only broader level data are available. For example, consider analyzing a household's choice of a vehicle at the make-model-trim level when only choice data at the make-model level are observed. The proposed model is a generalization of the multinomial logit model and collapses to it when there is full observability of the exact choices. We show that the parameters in the model are at least locally identified, but for certain configurations of the data, they are only weakly identified. Methods to address weak identification are proposed when there are data available on the overall market shares of all alternatives, and both maximum likelihood and Bayesian estimation methods are explored.

Introduction

Discrete choice models are usually estimated with data on the exact choices made by the decision makers from a well-specified choice set, as well as with observable attributes that are related to the choices, decision makers, or both. With these data standard discrete choice models like multinomial logit, probit, and generalized extreme value models can easily be estimated.

In contrast to this standard setting, our paper focuses on the situation where the econometrician does not observe the choices made by the decision makers at the level of interest, but rather only observes the broad groups of alternatives to which the chosen alternatives belong. We refer to the choices at the original level of interest as exact choice data and the broader level group data as broad choice data. As a running example, suppose that it is of interest to model a household's vehicle choice at the make-model-trim level where the choice set contains a Honda Civic LX, Honda Civic Hybrid, Toyota Camry LE, and Toyota Camry XLE Hybrid. Instead of observing the household's exact choices from this four-vehicle choice set, the econometrician only observes the broad make-model group that contains the chosen vehicle, either the Honda Civic group or the Toyota Camry group. The main objective of our paper is to use only the broad choice data to make inferences for the parameters belonging to the original exact choice data (e.g., alternative-specific constants).

There have been a few directions in the literature to address this data observability issue. From the statistics side, this type of data is referred to as either grouped data (Heitjan, 1989, Gjeddebaek, 1956a, Gjeddebaek, 1956b, Gjeddebaek, 1961, Gjeddebaek, 1949, Gjeddebaek, 1957, Gjeddebaek, 1959), partially categorized data (Blumenthal, 1968, Nordheim, 1984), or coarse data (Heitjan and Rubin, 1990, Heitjan and Rubin, 1991). These three concepts are closely related and generally address the problem of only observing power sets from the sample space for the original random variable of interest. They differ only in terms of the type of random variable being analyzed: grouped data generally refer to observing interval data from continuous random variables, partially categorized data refer to observing set data from discrete random variables, and coarse data refer to observing general power set data from any random variable. Our definition of broad choice data is similar to partially categorized data and is similar to coarse data for discrete random variables.

Another direction of research is to redefine both the choice and attribute data into a common level of observability so that standard methods can be applied. For instance, in the vehicle choice example, the observable attributes at the make-model-trim level are either aggregated or averaged into the make-model level prior to estimation (e.g., average Honda Civic miles per gallon is used instead of specific trim-level fuel consumption). The attributes and choice data at the matching make-model level are then analyzed using standard discrete choice models. This approach has two major drawbacks. One is that using aggregate or average attributes will result in loss of precision for the parameter estimates when the members within a make-model level group are not homogeneous with respect to their attributes. For example, the miles per gallon ratings are significantly different between Honda Civic hybrids and non-hybrids, so averaging over this attribute within the make-model set will create measurement error, which will lead to inconsistent parameter estimates. McFadden (1978) shows that if the distribution of attribute values being aggregated can be approximated by a multivariate normal distribution, then this inconsistency can be removed by including the covariances of the attributes within the group as well as the log of the number of alternatives in the group as additional explanatory variables. The second drawback is that, by averaging over the make-model-trim level attributes, there may not be enough variation to identify the parameters specific to the make-model-trim level, which is the level at which we wish to make inferences. We may need to identify these parameters to analyze, for example, the impacts of fuel economy standards.

Multiple imputation is another direction of research. Intuitively, this approach imputes the exact choices from the original choice set of interest for each decision maker, estimates the model using the imputed exact choice data and attributes, and averages the parameter estimates over the numerous sets of imputed data. This is an attractive method since, given each set of imputed exact choices, standard discrete choice models can be used. Unfortunately, a key requirement for multiple imputation estimators to be consistent is that the estimator must be consistent for each completed data set based on a single set of imputations (Rubin, 2004, Chapter 4). Unless the imputed alternative is the one actually chosen by the household, the estimates on each completed set of data are not consistent.

In this paper, we propose a formal regression-based model for broad choice data that addresses the drawbacks in the current literature. In particular, our model differs from the work in the statistics literature in that it is a discrete choice model (i.e., the probabilities are based on utility maximization) and is a regression-based (i.e., attribute or covariate-based) model, while most of the current literature in statistics is based on general analysis of the data observability mechanism without covariates. In our framework, the broad choice data can be used together with the attributes at the exact choice level, which avoids the need in the previous literature to redefine data into a common level prior to estimation. The estimators we propose are either maximum likelihood or Bayes estimators, so they are fully efficient. Finally, our paper is unique in that we closely analyze the issue of identification when broad choice data are used instead of exact choice data. We show that the parameters in our model are locally identified, but for certain cases the parameters are only weakly identified (i.e., the likelihood function is almost completely flat). To address this weak identification issue, we introduce a novel technique to incorporate external information into the model in the form of parameter constraints or informative priors (in the Bayesian sense), and we also show how this information can be easily incorporated into maximum likelihood and Bayesian estimation routines. We only consider the case where the underlying choice model is conditional logit, but extensions to other discrete choice models are straightforward. Wong et al. (2017) provide Monte Carlo results showing that the broad choice model described in this paper performs much better than McFadden's procedure, averaging over aggregated alternatives, or using a “representative alternative” in a realistic vehicle choice situation.

The paper proceeds as follows. The model for broad choice data is formally stated in Section 2, and the likelihood-based quantities are derived in Section 3. Using the quantities from the preceding section, Section 4 discusses the identification issues associated with using the broad choice data. The details for maximum likelihood and Bayesian estimation of the parameters are discussed in Section 5, and Section 6 illustrates the various estimators on simulated data. Concluding remarks are in Section 7.

Model for broad choice data

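The model specification is similar to that of a multinomial logit model and is based on random utility theory. Formally, the model is expressed as

$$U_{ij} = \delta_j + x_{ij}\beta + \varepsilon_{ij}, \qquad \varepsilon_{ij} \overset{\text{i.i.d.}}{\sim} \text{Type 1 Extreme Value}, \tag{1}$$
$$Y_i = j \quad \text{if } U_{ij} \ge U_{ik} \;\; \forall\, k \in C = \{1, 2, \dots, J\}, \tag{2}$$
$$Y_i^{B} = m \quad \text{if } Y_i \in C_m, \tag{3}$$

for decision makers $i = 1, \dots, N$, alternatives $j = 1, \dots, J$, and groups $m = 1, 2, \dots, M$.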
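As a rough illustration of this data-generating process, the following Python sketch draws exact choices from a random utility model with Type 1 Extreme Value errors and then records only the group containing each choice. The function name, the parameter values, the 0-based group labels, and the four-alternative grouping (mirroring the vehicle example) are all assumptions made for illustration, not values from the paper.

```python
import numpy as np

def simulate_broad_choices(delta, beta, x, groups, rng):
    """Draw exact and broad choices from the random utility model.

    delta  : (J,) alternative-specific constants
    beta   : (K,) attribute coefficients
    x      : (N, J, K) attributes of each alternative for each decision maker
    groups : (J,) 0-based group label of each alternative (defines the sets C_m)
    """
    n, j, _ = x.shape
    eps = rng.gumbel(size=(n, j))        # standard Type 1 Extreme Value errors
    utility = delta + x @ beta + eps     # U_ij = delta_j + x_ij beta + eps_ij
    exact = utility.argmax(axis=1)       # utility-maximizing alternative
    broad = groups[exact]                # only the containing group is observed
    return exact, broad

# Hypothetical setup: two groups of two alternatives each,
# e.g. {Civic LX, Civic Hybrid} and {Camry LE, Camry XLE Hybrid}.
rng = np.random.default_rng(0)
groups = np.array([0, 0, 1, 1])
delta = np.array([0.0, -0.5, 0.3, -0.2])   # illustrative values only
beta = np.array([1.0])
x = rng.normal(size=(1000, 4, 1))
exact, broad = simulate_broad_choices(delta, beta, x, groups, rng)
```

In an applied setting only `broad` and `x` would be available to the econometrician; `exact` is kept here so that the two levels of observability can be compared.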

The latent utility that decision maker $i$ obtains from alternative $j$ is given by $U_{ij}$ in (1). It is a function of an “average” level of utility that is constant for alternative $j$ across all

Likelihood function and associated quantities

This section discusses the likelihood function of the sample, score function, and Hessian matrix of the log-likelihood function for the model with broad choice data assuming that the underlying choice model is conditional logit. The Hessian matrix is simple and provides insight into the likelihood function. It is also useful for the discussions on identification, information loss, and estimation in the subsequent sections.
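Because a group is observed whenever any of its members is chosen, the probability of a broad outcome under a conditional logit model is the sum of the logit probabilities of the alternatives in that group. The sketch below computes the resulting log-likelihood and checks numerically that it coincides with the multinomial logit log-likelihood when every group is a singleton. Function names, parameter values, and the simulated data are illustrative assumptions, not the paper's notation or code.

```python
import numpy as np

def logit_probs(delta, beta, x):
    """Conditional logit probabilities P_ij, shape (N, J)."""
    v = delta + x @ beta
    ev = np.exp(v - v.max(axis=1, keepdims=True))   # numerically stable softmax
    return ev / ev.sum(axis=1, keepdims=True)

def broad_loglik(delta, beta, x, groups, broad):
    """Log-likelihood when only the group containing the choice is observed."""
    p = logit_probs(delta, beta, x)
    n_groups = groups.max() + 1
    # member[m, j] = 1 if alternative j belongs to group m
    member = (groups[None, :] == np.arange(n_groups)[:, None]).astype(float)
    p_group = p @ member.T                  # P(choice in C_m), shape (N, M)
    return np.log(p_group[np.arange(len(broad)), broad]).sum()

# Hypothetical data for the check
rng = np.random.default_rng(1)
n, j = 500, 4
delta = np.array([0.0, -0.5, 0.3, -0.2])
beta = np.array([1.0])
x = rng.normal(size=(n, j, 1))
exact = (delta + x @ beta + rng.gumbel(size=(n, j))).argmax(axis=1)

groups = np.array([0, 0, 1, 1])
ll_broad = broad_loglik(delta, beta, x, groups, groups[exact])

# Singleton groups: every exact choice is observed, so the two likelihoods agree.
singletons = np.arange(j)
ll_full = broad_loglik(delta, beta, x, singletons, exact)
ll_mnl = np.log(logit_probs(delta, beta, x)[np.arange(n), exact]).sum()
```

Since each group probability is at least as large as the probability of any one of its members, `ll_broad` is never below `ll_full` at the same parameter values, which is one way to see that grouping discards information about the exact choice.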

Before discussing the likelihood function, some additional notation is

Identification

Identification is assessed by analyzing whether the information matrix is nonsingular. The results in this section utilize Theorem 1 from Rothenberg (1971). Using the notation from the previous section, the theorem states that $\theta$ is locally identified if and only if the information matrix $I_B(\theta) = -E(H_B(\theta))$ is nonsingular or, equivalently, has rank $G$ (the number of elements in $\theta$).

The information matrix corresponding to (9) is equal to

$$I_B(\theta) = -E(H_B(\theta)) = F - I_L = I_E(\theta) - I_L,$$

where $F = I_E(\theta)$ from McFadden (1973)
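The rank condition can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the parameter vector stacks the alternative-specific constants (with the first normalized to zero) and the attribute coefficients, the expected information is built from the score of the group probability, and the rank is counted from eigenvalues above a relative tolerance. The function names, the tolerance, and both test configurations are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def broad_information(delta_free, beta, x, groups):
    """Expected information for theta = (delta_2, ..., delta_J, beta), delta_1 = 0."""
    n, j, k = x.shape
    delta = np.concatenate(([0.0], delta_free))
    v = delta + x @ beta
    p = np.exp(v - v.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                          # P_ij, (N, J)
    dummies = np.broadcast_to(np.eye(j)[:, 1:], (n, j, j - 1))
    z = np.concatenate([dummies, x], axis=2)                   # dV_ij/dtheta, (N, J, G)
    z_bar = np.einsum("nj,njg->ng", p, z)                      # mean of z over all alternatives
    info = np.zeros((j - 1 + k, j - 1 + k))
    for m in np.unique(groups):
        in_m = groups == m
        p_m = p[:, in_m].sum(axis=1)                           # P(choice in C_m)
        z_m = np.einsum("nj,njg->ng", p[:, in_m], z[:, in_m]) / p_m[:, None]
        g = z_m - z_bar                                        # score if group m is observed
        info += np.einsum("n,ng,nh->gh", p_m, g, g)            # weight by P(group m)
    return info

def numerical_rank(a, rel_tol=1e-8):
    """Count eigenvalues that are non-negligible relative to the largest one."""
    eig = np.linalg.eigvalsh(a)
    return int((eig > rel_tol * eig.max()).sum())

rng = np.random.default_rng(2)
groups = np.array([0, 0, 1, 1])
delta_free = np.array([-0.5, 0.3, -0.2])

# Configuration A (assumed): one attribute varying over people and alternatives.
x_a = rng.normal(size=(500, 4, 1))
rank_a = numerical_rank(broad_information(delta_free, np.array([1.0]), x_a, groups))

# Configuration B (assumed): constants only, no attributes at all.
x_b = np.zeros((500, 4, 0))
rank_b = numerical_rank(broad_information(delta_free, np.zeros(0), x_b, groups))
```

In configuration B the choice probabilities are the same for every decision maker, so the broad data can reveal at most the M - 1 free group shares and the information matrix cannot have rank above M - 1, which is below the G = J - 1 constants. Configuration A has full rank in this sketch, although the smallest eigenvalue may still be small relative to the largest, which is the numerical face of weak identification.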

Estimation

This section describes maximum likelihood (ML) and Bayesian estimation of θ. Throughout, we assume the second case mentioned in Section 4, which implies that θ is at least locally identified. For scenarios in which the broad choice data are not informative about θ (see Section 4 for an example), we propose incorporating additional information, in the form of population market shares, into the problem.
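One way that market share information can pin down alternative-specific constants is to require the model's predicted shares to match the known shares. The sketch below uses a fixed-point iteration of the kind proposed by Berry et al. (1995) for this purpose. It is an assumption made for illustration and is not necessarily how the paper imposes the market share information; the function names, tolerance, and data are hypothetical.

```python
import numpy as np

def predicted_shares(delta, beta, x):
    """Average logit choice probabilities over decision makers, shape (J,)."""
    v = delta + x @ beta
    p = np.exp(v - v.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return p.mean(axis=0)

def match_shares(shares, beta, x, tol=1e-12, max_iter=5000):
    """Find constants (first one fixed at 0) whose predicted shares equal `shares`."""
    delta = np.zeros(len(shares))
    for _ in range(max_iter):
        step = np.log(shares) - np.log(predicted_shares(delta, beta, x))
        step -= step[0]                  # keep the normalization delta[0] = 0
        delta = delta + step
        if np.abs(step).max() < tol:
            break
    return delta

# Hypothetical check: build target shares from known constants, then recover them.
rng = np.random.default_rng(3)
x = rng.normal(size=(2000, 4, 1))
beta = np.array([1.0])
delta_true = np.array([0.0, -0.5, 0.3, -0.2])
target = predicted_shares(delta_true, beta, x)
delta_hat = match_shares(target, beta, x)
```

With the constants expressed as a function of the remaining parameters in this way, an optimizer or sampler only needs to search over those remaining parameters, which is the appeal of share constraints when the broad choice likelihood is nearly flat in the constants.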

For ML estimation, the market share information is implemented as

Simulation results

This section applies the estimation methods developed in Section 5 to simulated data. The results are used to compare the different estimators and to highlight some key points regarding the inclusion of market share information. For this simulation study, the maximum likelihood and Bayesian estimators are analyzed in both repeated sample and single sample settings.

We use 250 repeated samples. Each sample is a random draw of 20,000 decision makers from a large population based on (1)-(3).

Conclusion

This paper introduces a new discrete choice model to analyze choice outcomes that only broadly represent the actual choices made by the decision makers. It is useful in analyzing situations where the choice behavior at a lower level is desired but only higher level choice data are observed. The parameters from the proposed model are locally identified, but in some perverse yet interesting cases, they may only be weakly identified. To efficiently recover the parameter estimates in these

Acknowledgements

We appreciate the thorough and thoughtful comments and references from an anonymous referee. We also want to thank Ivan Jeliazkov, Dale Poirier, Jinwon Kim, Alicia Lloro, Tim Wong, Jonathan Cook, and Simpson Zhang for their thoughtful comments. We acknowledge funding provided by the University of California, Irvine through the Department of Economics, Institute of Transportation Studies, School of Social Sciences and Center for Economics and Public Policy, the University of California Center

References (29)

  • S. Chib et al., Tailored randomized block MCMC methods with application to DSGE models, J. Econ. (2010)
  • T. Amemiya, Advanced Econometrics (1985)
  • T.W. Anderson et al., Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes, Ann. Math. Stat. (1952)
  • S. Berry et al., Automobile prices in market equilibrium, Econometrica (1995)
  • S. Blumenthal, Multinomial sampling with partially categorized data, J. Am. Stat. Assoc. (1968)
  • G. Casella, An introduction to empirical Bayes data analysis, Am. Statistician (1985)
  • S. Chib et al., Understanding the Metropolis-Hastings algorithm, Am. Statistician (1995)
  • J.-P. Dubé et al., Improving the numerical performance of static and dynamic aggregate discrete choice random coefficients demand estimation, Econometrica (2012)
  • N. Gjeddebaek, Contribution to the study of grouped observations: application of the method of maximum likelihood in case of normally distributed observations, Scand. Actuar. J. (1949)
  • N. Gjeddebaek, Contributions to the study of grouped observations. ii. loss of information caused by grouping of normally distributed observations, Skand. Aktuarietidskr. (1956)
  • N. Gjeddebaek, Contributions to the study of grouped observations. v. three-class grouping of normal observations, Skand. Aktuarietidskr. (1956)
  • N. Gjeddebaek, Contribution to the study of grouped observations. iii: the distribution of estimates of the mean, Scand. Actuar. J. (1957)
  • N. Gjeddebaek, Contribution to the study of grouped observations. iv. some comments on simple estimates, Biometrics (1959)
  • N. Gjeddebaek, Contributions to the study of grouped observations. vi, Skand. Aktuarietidskr. (1961)