Enabled by modern interaction-logging technologies, managers increasingly have access to outcome data from customer interactions. We consider the direct marketing targeting problem in situations where 1) the customer’s outcomes vary randomly and independently from occasion to occasion, 2) the firm has measures of the outcomes experienced by each customer on each occasion, and 3) the firm can customize marketing according to these measures and the customer’s behaviors. A primary contribution of this paper is a framework and methodology that uses customer outcome data to model a customer’s evolving beliefs related to the firm and how these beliefs combine with marketing to influence purchase behavior. This allows the manager to assess the marketing response of a customer with any specific outcome and behavior history, which in turn can be used to decide which customers to target for marketing. This research develops a novel, tractable way to estimate and introduce flexible heterogeneity distributions into Bayesian dynamic discrete choice learning models on large datasets. The model is estimated using data from the casino industry, an industry that generates more than $60 billion in U.S. revenues but has attracted surprisingly little academic, econometric research. The counterfactuals suggest that casino profitability can increase substantially when marketing incorporates gamblers’ beliefs and past outcome sequences into the targeting decision.
This paper focuses on the direct marketing problem of whom to target and with what offers in situations where the customer learns about the firm through multiple interactions. The situations we focus on have the following four specific characteristics: (a) The customer’s outcomes are random in that they vary from occasion to occasion according to independent draws from some distribution. (b) The customer uses the outcomes to learn and form beliefs about some key characteristics of the distribution. These beliefs affect the customer’s expected utility from future interactions with the firm, which in turn affects future decisions on whether to interact with the firm. (c) The firm has access to measures of the customer’s outcomes. (d) The firm can use information on these measures to make different offers to different customers. These four characteristics hold in a large number of industries, such as the airline, financial services, restaurant, and casino industries. To take an example from the financial services industry: A private banking advisor’s performance is often random. A customer may use the performance stream to assess the account’s likely long-run return and volatility, which may then influence his/her decision on whether to continue doing business with the firm. The firm observes the performance stream and can make different offers to different customers, potentially customizing according to the performance stream experienced by a customer and according to the customer’s behaviors in response to those performances.
A primary contribution of this paper is a framework and methodology to use data on customer outcomes to model a customer’s evolving beliefs related to the firm and how these beliefs combine with marketing to influence purchase behavior. This allows the manager to forecast the likely marketing response for any customer with any specific experience and behavior history, which in turn can be used to decide which customers to target for marketing. In this paper we apply the framework to gambling outcomes from casino visitors. An important insight identified from this methodology is that the optimal targeting decision may depend on a customer’s expectations of outcomes because those expectations influence marketing responsiveness. For instance, a customer whose early outcomes are atypically low will likely form a belief distribution with a low expectation of outcomes from future visits and, on the basis of this belief, may reduce or altogether stop interactions with the firm. So, if the private banking customer we talked about earlier sees a string of losses, he/she will likely believe that the return-generating ability of the firm’s team is low and reduce further dealings with that firm. Attractive direct marketing offers targeted at such a customer can increase the customer’s expected utility from a future visit, incentivize the customer to interact further with the firm, improve the belief distribution based on the new experience, increase the likelihood of future visits, and increase the future profits of the firm from that customer. This discussion suggests that if the marketing offers are costly and the firm can make the offers to only a limited set of customers, then it should target those customers whose future profits will be most increased by the offer, possibly those customers whose belief distributions can be improved the most, which may be those customers who have had atypically low levels of outcomes. However, this intuition should not be taken to imply that the firm should simply direct offers to the customers with the lowest outcomes. One has to balance the cost of an offer against the benefits, which will depend on marketing responsiveness and the extent to which the offer will influence the customer’s future behaviors and the firm’s profit from those behaviors. In this paper, we present a model to do exactly that and provide evidence that incorporating a customer’s beliefs of future outcomes into the targeting decision can significantly increase a firm’s profit.
The targeting framework and methodology developed in this paper can be applied to a broad set of industries and contexts where there is randomness in customer outcomes, the firm has data on the customer’s experienced outcomes for each customer interaction, and where each customer forms evolving beliefs depending on the realized outcomes. In air transportation, there is randomness in customer experience related to flight delays. Airlines often track the flight on-time performance history experienced by the customer and may take action accordingly. Emirates Air, for example, gives reward points as an offsetting mechanism when a passenger has experienced unusual delays. Whenever a customer uses Uber, the driver-rider marketplace firm, the quality of the driver they receive is random. Passengers rate the quality of their experience, which Uber can then use for targeting purposes. With new interaction-logging technologies in delivery systems, online channels, call centers and mobile devices, firms are increasingly in a situation where they have access to data on outcomes on a customer-by-customer, occasion-by-occasion basis. This makes timely the paper’s main contribution: a framework and methodology for using such customer data for optimal targeting of marketing offers via a model of customer belief evolution and the impact of marketing.
To address the direct marketing decision problem in this important class of situations where the manager can react to a customer’s observed experiences and behavior history, we employ a Bayesian, dynamic learning framework. A customer starts with a prior belief distribution on the mean value of future outcomes with the firm and updates this belief distribution according to new experiences accumulated with each interaction with the firm. For a rational consumer, the decision to engage in repeat interactions with the firm, thereby forming more accurate beliefs, will depend on the utility from the interactions, the value of the increased belief accuracy, and the consequent utilities from potential future visits. The model presented in this paper allows the manager to determine the extent to which direct marketing influences these utilities, the customer’s consequent interactions with the firm, and the firm’s consequent profits.
Because prior beliefs, marketing responsiveness and utility function parameters can vary from customer to customer, it is important that the model and estimation methodology allow for across-customer heterogeneity. Incorporating learning into a dynamic discrete choice model is difficult because the optimal choice is the solution to a Bellman equation, with a correspondingly difficult likelihood function. Including unobserved heterogeneity makes the problem even more difficult. An additional contribution of this paper is that we develop a tractable solution to this class of problems by combining a forward simulation algorithm with Markov Chain Monte Carlo ideas. By deploying our solution on scalable, parallelizable cloud computers we are able to easily estimate complex dynamic discrete choice models that incorporate consumer learning.
We illustrate this paper’s framework and methodology in the context of the casino industry. The outcome we consider relates to the payoff from gambling as mediated by what is called the house advantage. Gambling outcomes vary randomly and independently from occasion to occasion. The casino observes these outcomes through its transaction records and can customize marketing according to its inferences about the customer’s beliefs on the average payoff and the moderating effect of firm marketing as evidenced by previous behaviors and hierarchical Bayesian priors across all customers. We focus on the gaming industry for a few reasons. First, it is a moderately large industry in the United States: Gaming revenues are now at a historic peak of nearly $100 billion, with nearly 900 commercial and tribal casinos operating in nearly all states. In 2022, 84 million gamblers spent $60 billion in gaming revenues at commercial casinos. The pervasiveness of the industry is as significant as its size; nearly one third of Americans have gambled at a casino within the previous twelve months [2]. Therefore, insights into this industry may be valuable. Second, there is limited research on the impacts of casino marketing, probably due primarily to the difficulty of obtaining sensitive casino data rather than to any lack of importance of marketing in the gaming industry. Third, it is behaviorally interesting because these are real gamblers responding to uncertainty, as opposed to lab participants. Fourth, an advantage of studying casinos is that gambling outcomes are exogenous. Even though casinos can control the overall house advantage and its distribution, the outcome of any one trip for any individual is independent of that individual’s history and can take any value from a diverse set. This can greatly help in model identification. Finally, many casinos base marketing offer values on gamblers’ past expected losses, and not their actual outcomes. We discuss this later in more detail, but the implication is that much of the offer endogeneity is removed.
We use data on real gamblers to understand how their outcomes influence the time until their return trip. Specifically, we answer the following: How do past outcomes influence gamblers’ beliefs on the house advantage, and how can marketers use this information in their one-to-one targeting decisions? To estimate these impacts, we specify a dynamic learning model that incorporates the belief uncertainty into the utility specification. In the traditional random utility framework, consumers know the attributes of their choices perfectly. Learning models extend the traditional framework by recognizing that consumers may have incomplete information and thus make choices based on perceived rather than actual attributes [7]. In this model, gamblers learn about the casino’s house advantage by gambling at slot machines. Gamblers use their beliefs to form future cost expectations, which influence their decision to return to the casino. The uncertainty in these beliefs can also influence the decision to return. By fully modeling the learning process (rather than simply conditioning on the last outcome), the model permits gamblers to use their entire trip history when forming future cost expectations. The reduced-form evidence supports a full structural model of the learning process.
Academic researchers have shown considerable interest in decision making under uncertainty for decades, roughly starting with Tversky and Kahneman [29]. Meyer [16] found that temporal variability increases the cost of information gathering, which suggests that variability comes with a premium. More recent work studies uncertainty in customer satisfaction [4], service quality [5, 27], and product attributes [10]. An important difference between this paper and these previous papers is that our focus is on how a customer’s evolving belief (and the uncertainty of this belief) can influence a firm’s targeted marketing actions. The intent is similar to that of Narayanan and Manchanda [18], who also study learning behavior and its impact on targeting. However, a significant difference with respect to our research is in the way beliefs evolve as a function of experienced outcomes and the way customers strategically weigh the value of future interactions in forming more accurate beliefs and, as a consequence, making better decisions. Gamblers can make tradeoffs between today’s knowns and tomorrow’s potential upsides to decide whether to continue engaging with the firm.
The dataset used in model estimation comes from a large destination casino in the United States. We observe the complete trip histories and marketing offers for a random sample of gamblers. The empirical strategy has two parts. First, we show descriptive and reduced-form evidence that motivates the need for a structural learning model. Gamblers who incur a single loss return to the casino about ten days later than gamblers who incur a single win, but the return time increases as more losses are incurred: gamblers with three past losses return about forty days later than gamblers with three past wins. These findings suggest that gamblers incorporate outcomes from multiple past trips into the decision process. Reduced-form evidence also shows gamblers’ return times are significantly influenced by their beliefs about the house advantage and the uncertainty surrounding their beliefs.
Next, we estimate a structural model of the return time using a dynamic discrete choice framework. As is widely recognized, structural methods allow for counterfactual predictions about how changes in marketing policies will affect consumer behavior [22]. Our structural model incorporates the dynamic forward-looking behavior of individuals. One obstacle to estimating such structural models has been the attendant computational burden, which has two main sources. First, the likelihood is based on the explicit solution of a dynamic programming (DP) model. This requires us to obtain the fixed point of a Bellman operator for each possible point in the state space. Second, the number of points in the state space increases exponentially with the dimensionality of the state space, a phenomenon referred to as the “curse of dimensionality”. Imai et al. [12] introduce a full-solution Bayesian approach to the estimation of structural parameters. An important innovation in their algorithm is that they only need to conduct a single iteration of the Bellman operation during each estimation step (i.e., each MCMC draw). While conventional methods estimate the model only after solving the DP problem, their approach simultaneously solves the DP problem during parameter estimation. Because of this, the computational burden of their method is similar to that of non-Bayesian approaches but still intractable for dynamic learning models. In this paper we use forward simulation (see Bajari et al. [3] and Hotz et al. [11]). This significantly reduces the computation time and makes Bayesian estimation of a complex learning model feasible.
Learning models were first applied to marketing by Roberts and Urban [23] and Eckstein et al. [9]. The initial models were relatively simple due to limitations on computer processing speeds and estimation algorithms available at the time. Erdem and Keane [10] represents a significant methodological advance because it expanded the class of learning models that became feasible to estimate. They used the method of Keane and Wolpin [13] to obtain a fast and accurate approximate solution to the dynamic optimization problem and used simulation methods to approximate the likelihood function.
Recently, a few research papers applied the modified Bayesian MCMC algorithm first proposed by Imai et al. [12] to estimate learning models with forward-looking consumers (see Roos et al. [24]). Osborne [20] is the first paper to allow for both learning and switching costs as sources of state dependence in a forward-looking learning model. He also incorporated unobserved heterogeneity; however, his paper assumed a “one-shot” learning model in which a single purchase occasion is enough to learn everything about the product. While this is defensible in the product category he analyzed (laundry detergent), the purchase-to-purchase variability in many settings, including gambling, undermines the basic assumption underlying this method. The model considered in our paper allows for learning to happen over multiple exposures and also allows for individual parameters to be estimated using Bayesian methods with flexible mixture distributions on the heterogeneous parameters.
Our empirical results suggest that gamblers’ prior beliefs translate to an overestimation of the house advantage by a factor of about four. Further, our counterfactual analyses suggest that this overestimation may be costing the casino substantial amounts of revenue. When gamblers overestimate the house advantage, they overestimate projected future expenditures, which reduces the probability of revisiting the casino within any given time period. The counterfactuals also illustrate the value of incorporating the beliefs and outcomes into the targeted marketing decision. We show that naive targeting strategies based on simple outcome heuristics are not sufficient and that more sophisticated targeting strategies can lead to higher revenue. The final simulation shows that marketing strategies which vary offer amounts based on gamblers’ beliefs in the house advantage improve profitability by close to 20%.
1 The Casino Industry
The gaming industry is a critical component of the U.S. economy. Gaming revenues are now at a historic peak of nearly $100 billion, with nearly 900 commercial and tribal casinos operating in nearly all states. In 2022, 84 million gamblers spent $60 billion in gaming revenues at commercial casinos [2]. Despite its substantial contribution to the economy, there are only a handful of papers that study the effects of casino marketing [17, 19, 21].
The casino industry has long understood the importance of effective customer relationship management [8]. In today’s gaming environment, a sophisticated tracking system is essential to remain competitive, especially in saturated markets like those of Las Vegas and Atlantic City [14]. Casino marketing offers typically include a combination of free room nights (if the casino has a hotel) and slot promotional credits. Offers can also include additional complimentaries (or “comps”) for virtually any other amenities available at the casino, such as shows, spa treatments, or dining.
To determine the optimal level of comps to offer their players, managers need to estimate gaming revenue at the player level. Casinos use player rating systems to track and record individual plays and player information. The marketing department uses these data to segment customers into tiers for targeting marketing offers.
Casinos record player information by having them enroll in the casino’s loyalty program. Most casinos have enrollment centers on the casino floor. To incentivize gamblers to have their play tracked, casinos offer rewards programs. Rewards programs are different across casinos, but one feature they share is that the magnitude of the reward is a function of play volume and possibly on-property purchases. A rewards system increases the likelihood that the casino has the complete play history recorded because without rewards players are unlikely to allow the casino to track their play. While no-card play (that is, gambling without a loyalty card) still contributes a substantial portion to a casino’s revenues, the amount of no-card play per person is often insignificant and unlikely to be of interest to the casino. Frequent gamblers typically understand that it is in their best interest to have all of their play tracked to increase rewards earning.
1.1 Slot Machines
We limit our empirical analysis to gamblers who play only slot machines, for three reasons. First, electronic gaming machines are the most popular game among casino visitors, as more than half (51 percent) choose slot machines or video poker as their favorite game [1]. Second, tracking table games activity at the individual player level is still a very manual process and often inaccurate. On slot machines, all play is recorded electronically through each gambler’s loyalty card and, because of this, revenue is exact down to the penny. Third, the skill of a table-game player can to some degree dictate the outcome. For example, a skilled blackjack player can reduce the house advantage to nearly zero (or negative if they are counting cards) but an unskilled player can lose far more than expected in the long run. On slot machines it does not matter who is pulling the lever (or nowadays more often pressing the “spin” button); the outcome is completely random and in the long run the hold percentage should converge to the house advantage regardless of the individual gambler.
Before proceeding further, we introduce a few terms and concepts which are used by industry practitioners and by us in the exposition of our learning model. Handle measures the total amount of money wagered on the machine. This measure of volume allows management to monitor the overall popularity of games. The hold percentage is the percentage of handle that the machine keeps in any particular play event. For example, if a player puts in $5 and gets out $6, then the hold is -20%, but if the player loses all of the $5 then the hold is 100%. The hold percentage can vary quite a lot across play events, and is governed by the randomness programmed into the particular machine. Different machines can have different hold distributions programmed into them. The hold percentage for a consumer on a certain trip is the percentage of that trip’s play volume lost to the casino, and is an aggregation of the hold percentages of the machines that he/she played on. The house advantage is the expectation over the hold percentage distribution of the slot machine and depends on the payouts and odds specific to that machine. The house advantage can also be seen, based on the weak law of large numbers, as the hold percentage aggregated over a large number of plays. Slot machine advantages range from as low as 0.5% to as much as 25%. The Las Vegas Strip house advantage is around 7%; in Reno it is about 5%.1 Many gaming jurisdictions have established minimum levels at which slot machines must pay back in order to prevent casino operators from placing too great a disadvantage on players [14]. The important point to remember is that the hold percentage reflects the actual, empirically realized amount of money kept by the casino on any one occasion, while the house advantage is what the casino keeps on average across occasions.
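To make the distinction concrete, the following toy simulation (with an assumed 7% edge and a stylized hold distribution; none of these numbers come from the data) illustrates how the realized hold over many plays converges to the house advantage:

```python
import numpy as np

rng = np.random.default_rng(1)

house_advantage = 0.07      # assumed long-run edge of a hypothetical machine
bet = 1.0                   # $1 wagered per play
n_plays = 100_000

# Stylized per-play hold draws: highly variable, but with mean equal to the edge
holds = rng.normal(loc=house_advantage, scale=1.0, size=n_plays)

handle = bet * n_plays                # total amount wagered
casino_keep = np.sum(bet * holds)     # amount the casino keeps across all plays
print(f"Realized hold over {n_plays:,} plays: {casino_keep / handle:.2%}")
# By the weak law of large numbers, the realized hold approaches 7% as the
# number of plays grows, even though single-play holds vary widely.
```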
1.2 Theoretical and Actual Outcomes
Casino operators track both actual and theoretical player losses. The theoretical loss (also called “theo”) is the amount of money the player was expected to lose. It is based on the following formula:
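presumably the standard industry calculation, consistent with the cost specification in Section 4.4,

$$ \text {Theoretical loss}=\text {average bet}\times \text {decisions per hour}\times \text {hours played}\times \text {house advantage}. $$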
As mentioned earlier, for targeting purposes many casinos typically ignore the actual outcomes and instead value their players on theoretical losses alone. The primary reason for doing so is to control for the randomness of outcomes. This creates a significant advantage for an analyst studying the impact of outcomes because marketing offer values are not endogenous, in the sense that the offers are not directly tied to the actual outcomes. For example, if one gambler loses $500 and another wins $4,000, but they were both expected to lose $300, they will receive the same offer. While some casinos do employ sophisticated targeting strategies that incorporate actual outcomes, it is important to recognize that at the casino providing the data, and at many other casinos without dedicated analytics teams, this is simply not the case: targeting tends to be based solely on past expected losses.
2 Data
The dataset used to estimate the structural model comes from a large destination casino in the United States. The dataset includes the complete trip histories from over 28,000 randomly selected slot gamblers with around 110,000 trips occurring between February 2006 and May 2015. We observe basic demographic information such as gender, distance to the casino, age, and current loyalty card level (either “Silver” or “Gold”).2 A “trip” is defined as a distinct period of time where gambling activity is observed. For instance, a new trip is initiated when either 1) a player inserts their loyalty card into a slot machine for the first time or 2) a significant lapse in play occurs. The lapse used to demarcate a new trip is set by management in a way that makes it very likely that each trip record captures a distinct return to the casino rather than simply a suspension in play within a single trip. Typically, the cutoff is three days, meaning that if no activity is observed for three days, the trip is considered to have ended and any further play initiates a new trip record.3
Table 1  Descriptive statistics

                          All data     Gamblers >= 3 Trips
Gamblers                  28,362       13,964
Average age               54           56
Male                      36%          35%
Trips                     113,752      94,139
Months between Trips      9.6          8.9
Trip length               2.5 days     2.8 days
Slot average bet          $1.95        $1.69
Spins/Minute              7.7          7.8
Hours played/Trip         5.3          5.9
Hold % experienced        12%          10%
Expected loss/Trip        $387         $416
Each trip record includes detailed information about the gambling activity from that trip. The variables of interest are the start and end dates of the trip, actual and theoretical loss values, time played, average bet, promotional credits redeemed, comps received, and whether they stayed at the hotel or not. We also observe all marketing activity for these gamblers. Over the observed period, the gamblers redeemed over 2,500 separate offers. For each offer, we observe the period for which the offer is active (typically about 2.5 months), the date when the offer was sent to the gamblers (the “drop” date), the total number of promotional credits in the offer, and the comp room type. A slot promotional credit is essentially free slot play where any winnings from the promotional credit can be kept. The promotional value itself cannot be converted to cash.
3 Descriptive Analysis and Reduced-Form Evidence
Before discussing the structural estimation procedure, we first describe the data and show reduced-form evidence that gambling outcomes impact the timing of the return trip. We exclude players whose play is high enough to warrant a casino host. This ensures that the only marketing communication between the casino and the gambler is done through direct marketing offers. We keep only gamblers who play slot machines exclusively, for reasons discussed previously; these represent 37% of the low-end player base. We remove gamblers whose first trip to the casino occurred before the first available marketing offer data so that we have the complete marketing and trip history for each gambler, right from his or her first gambling interaction with the firm. This cautionary step ensures that the customers we analyze are indeed those likely to be learning and evolving in their beliefs about gambling at the firm. Table 1 summarizes the cleaned data. The gamblers tend to be older and mostly female. The casino providing the data informed us that because it is a destination casino, the gamblers tend to have above-average disposable income, which we confirmed based on the median incomes of the gambler zip codes. For estimation, we only use gamblers with at least three trips to the casino. This is done to ensure that each gambler observes a sufficient amount of variance in the experienced hold percentages. In general, their statistics are quite similar to the aggregate-level statistics.
Figure 1 shows the distribution of average uncensored intervisit times across gamblers with at least three trips at the casino. The median return time is about ten months.
If gamblers learn from experience, their sensitivity to a single trip’s gambling outcomes should decline over time. Experienced gamblers have more certainty about expected outcomes and therefore should be less likely to be swayed by their most recent outcome. Inexperienced gamblers (those with only a few trips) project expected outcomes using only a few signals, which can vary greatly between players and cause biases depending on the sequence realized. Over time gambler beliefs will converge to the truth and a single outcome will not have as much of an impact on the return decision. We see evidence for this in the data. Figure 2 shows the difference in median return times after a gambler lost compared with after a gambler won. The differences are grouped by experience, represented by the number of trips to the casino when the win or loss was realized. For example, when gamblers have fewer than five casino trips, they tend to return about ten days later when they lose versus when they win. With more trips (and more experience) the difference in return times diminishes and gamblers become less impacted by the most recent trip’s win or loss. To handle potential selection bias in Figs. 2 and 3 (that is, gamblers who eventually have many trips at the casino are inherently different from those with fewer trips), we only include players who eventually have between 15 and 30 total trips at the casino.
Additional evidence for learning across trips is given by Fig. 3 in which we show the difference in return times between winning and losing streaks, where the streak occurs over the past one through four trips. Ignoring any streaks (where the gambling streak equals one), players that lose tend to come back 10 days later than those that win. Moreover, as players lose multiple times in a row, they delay the return trip by even greater lengths. For example, a gambler who lost three trips in a row will return about forty days later compared with a gambler who won three trips in a row. Figure 3 also provides evidence against gamblers’ budgets dictating return times. If losing gamblers return only after their lost wagers have been recouped elsewhere (say from employment), the time until the next trip should remain relatively constant and not increase as more losses are incurred across trips.
Next we provide reduced-form evidence that gamblers’ return times to the casino are influenced by their beliefs in the house advantage. We estimate return times using a Weibull hazard model and include the posterior mean of the house advantage and its posterior variance as covariates; the posteriors are generated from a Bayesian learning process (the updating process is discussed in detail later). We also include several demographic and last-trip variables: age, sex, card level, distance to the casino, whether they stayed at the hotel on the last trip, whether they redeemed a promotion on the last trip, the log of the total comps received, trip length, and last-trip theoretical loss.
In a Bayesian learning process, the prior mean and prior variance dictate the evolution of the posterior mean and variance. Because of this, in a reduced-form hazard model a prior mean and variance needs to be selected in order to generate the posterior beliefs on the house advantage (the mean and its uncertainty). In the structural model, these priors are estimated, but for reduced-form evidence we estimate 150 hazard models over a grid of 15 prior mean and 10 prior variance starting points. Figure 4 shows examples of the truncated normal shapes to illustrate the variety of prior settings that are considered for the house advantage beliefs. The idea is that by estimating many hazard models over this grid we can determine if the reduced-form coefficient estimates are sensitive to the learning process priors. The specific gridpoints are available in the Web Appendix.
Figure 5 plots the coefficient on the posterior mean across all 150 gridpoints. The latticed plane is positioned where the z-axis equals zero; any points above this plane are positive, points below are negative, and filled-in points are significant at the .05 level. Except for very low prior variance values (where the Weibull model does not converge), the coefficients on the posterior mean tend to be positive and significant, which means that the return time increases with the posterior mean of the house advantage. In other words, as the belief in the house advantage increases, gamblers take longer to return. Figure 6 shows a similar plot but for the coefficient on the posterior variance. At very low prior variance settings the coefficient cannot be estimated. The coefficients that can be estimated are significant, suggesting that uncertainty in the gamblers’ beliefs on the house advantage influences the return decision.
The reduced-form evidence suggests that 1) learning should be incorporated into a model of the return times, and 2) posterior beliefs in the house advantage (the mean and its uncertainty) influence the return time. The drawback of a reduced-form approach is that it does not account for any forward-looking behavior of the gamblers.
4 A Model of Gambler Learning
In this section, we propose a structural model of the casino return decision process. We first outline the dynamic optimization problem somewhat generally and then introduce the learning component and specific utility function.
4.1 Gambler Dynamic Optimization Problem
We model casino return times in the framework of a dynamic discrete choice model, which can be interpreted as a generalization of a structural hazard model [30]. We estimate an infinite horizon model of a forward looking agent. In each decision period, the gambler decides to return to the casino or not by comparing his/her current and discounted future utilities of each action.
Let \(\theta \) be the J-dimensional parameter vector. Let S be the finite set of state space points and s be an element of S. Let A be the finite set of all possible actions and a be an element of A. Let \(u\left( s,a,\varepsilon _{a},\theta \right) \) be the current-period utility of choosing action a given state s, where \(\varepsilon \) is a vector whose ath element, \(\varepsilon _{a}\), is a random shock to the current returns of choice a. The transition probability of next-period state \(s^{\prime }\), given current state s and action a, is \(f\left( s^{\prime }|s,a,\theta \right) \). Given a discount rate \(\beta \), the time-invariant value function can be defined as the maximum of the discounted sum of expected utilities:
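presumably, consistent with the two expectations described below,

$$ V\left( s,\varepsilon ,\theta \right) =\max _{a\in A}\left\{ E\left[ u\left( s,a,\varepsilon _{a},\theta \right) \right] +\beta \sum _{s^{\prime }\in S}f\left( s^{\prime }|s,a,\theta \right) E_{\varepsilon ^{\prime }}\left[ V\left( s^{\prime },\varepsilon ^{\prime },\theta \right) \right] \right\} . $$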
The first expectation is included because even when making the decision to return to the casino the utility is not known until after the trip has been realized. The second expectation is taken with respect to the next period shock \(\varepsilon ^{\prime }\) and the next period state \(s^{\prime }\).
If we define \(EV\left( s,a,\varepsilon _{a},\theta \right) \) to be the expected value of choosing action a then
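presumably

$$ EV\left( s,a,\varepsilon _{a},\theta \right) =E\left[ u\left( s,a,\varepsilon _{a},\theta \right) \right] +\beta \sum _{s^{\prime }\in S}f\left( s^{\prime }|s,a,\theta \right) E_{\varepsilon ^{\prime }}\left[ V\left( s^{\prime },\varepsilon ^{\prime },\theta \right) \right] , $$

so that \(V\left( s,\varepsilon ,\theta \right) =\max _{a\in A}EV\left( s,a,\varepsilon _{a},\theta \right) \).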
The dataset for estimation includes variables which correspond to state vector s and choice a but the choice shock \(\varepsilon \) is not observed. We observe data for \(i=1,\ldots ,N\) gamblers, and each gambler i has \(T_{i}\) observations. The observed data for individual i is denoted \(y_{i}^{d}\equiv \left\{ a_{i,t}^{d},s_{i,t}^{d}\right\} _{t=1}^{T_{i}}\) and \(Y^{d}\equiv \left\{ y_{i}^{d}\right\} _{i=1}^{N}\), with superscript d representing that this is observable data. Furthermore,
Let \(\pi \left( \cdot \right) \) be the prior distribution of \(\theta \) and let \(L\left( Y^{d}|\theta \right) \) be the likelihood of the model, given the parameter \(\theta \) and the value function \(V\left( \cdot ,\cdot ,\theta \right) \), which is the solution of the dynamic programming problem. Then we have a posterior distribution function of \(\theta \):
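which, by Bayes’ rule, is presumably

$$ \pi \left( \theta |Y^{d}\right) \propto L\left( Y^{d}|\theta \right) \pi \left( \theta \right) . $$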
Let \(\varepsilon _{i}\equiv \left\{ \varepsilon _{i,t}\right\} _{t=1}^{T_{i}}\) and \(\varepsilon \equiv \left\{ \varepsilon _{i}\right\} _{i=1}^{N}\). The expressions above are conditional on \(\varepsilon \). Because \(\varepsilon \) is not observed to the analyst, the unconditional likelihood needs to be used, obtained by integrating over it. That is, if we define \(L\left( Y^{d}|\varepsilon ,\theta \right) \) to be the likelihood conditional on \(\left( \varepsilon ,\theta \right) \), then
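presumably

$$ L\left( Y^{d}|\theta \right) =\int L\left( Y^{d}|\varepsilon ,\theta \right) \,dP\left( \varepsilon \right) , $$

where \(P\left( \varepsilon \right) \) denotes the distribution of the choice shocks.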
We assume that each \(\varepsilon _{a}\) is independently drawn from the same extreme value distribution. In addition, we introduce a hierarchical structure so that each V and \(\theta \) are specific to gambler i. In the empirical application since A contains two actions, either return to the casino \(\left( a=1\right) \) or not \(\left( a=0\right) \), the conditional choice probabilities take the following form:
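presumably the familiar binary logit; writing \(\overline{EV}_{i}\left( s,a\right) \) for the choice-specific value of action a net of the shock \(\varepsilon _{a}\) (our notation),

$$ \Pr \left( a_{it}=1|s_{it},\theta _{i}\right) =\frac{\exp \left( \overline{EV}_{i}\left( s_{it},1\right) \right) }{\exp \left( \overline{EV}_{i}\left( s_{it},0\right) \right) +\exp \left( \overline{EV}_{i}\left( s_{it},1\right) \right) }. $$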
To allow for parameter estimates to vary by individual characteristics, we introduce a hierarchical structure. Understanding individual differences is crucial in strategic CRM applications when developing targeted marketing strategies [26]. The hierarchical parameters are specified as a function of an individual’s observable characteristics. We have nz observable characteristics on each individual. If we let Z denote a matrix with N rows and nz columns and similarly let \(\Theta \) be a matrix of N rows and J columns, where the ith row of \(\Theta \) contains the parameter estimates for individual i, then we have:
$$ \Theta =Z\Delta +U $$
where \(\Delta \) is an \(nz\times J\) matrix of coefficients on the observables and U is an \(N\times J\) matrix of residuals. This is simply a multivariate regression of \(\Theta \) on Z. In each row of U,
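the distributional assumption is presumably the standard one,

$$ u_{i}\sim \mathcal {N}\left( 0,\Sigma _{\theta }\right) , $$

consistent with the first-stage prior \(\Sigma _{\theta }\) referenced below.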
Hierarchical models for panel data structures are ideally suited for MCMC methods. A Gibbs-style Markov chain can be constructed by considering the two sets of conditionals:
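presumably, in the notation below,

$$ \theta _{i}\mid \Delta ,\Sigma _{\theta },y_{i}^{d},\qquad i=1,\ldots ,N, $$

$$ \Delta ,\Sigma _{\theta }\mid \left\{ \theta _{i}\right\} _{i=1}^{N}. $$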
The first line exploits the fact that the \(\theta _{i}\) are independent, conditional on the first stage priors \(\tau =\left\{ \Delta ,\Sigma _{\theta }\right\} \). The second line exploits the fact that \(\left\{ \theta _{i}\right\} \) are sufficient for \(\tau \). That is, once the individual-level parameters are drawn they serve as “data” for inference on the priors. Due to the non-linearity of the model proposed, there is no way to conveniently sample from the conditional posterior (i.e., using a Gibbs sampler). For this reason, we employ a Metropolis algorithm to draw \(\theta _{i}\). For each gambler i, we draw candidate random effects parameters \(\theta _{i}^{n}\) by perturbing the current draw \(\theta _{i}^{o}\): \(\theta _{i}^{n}=\theta _{i}^{o}+\varepsilon \), where \(\varepsilon \sim \mathcal {N}\left( 0,s^{2}\Sigma \right) \). We then compare the likelihood of the new parameters with the old parameters and accept the new parameters with probability \(\alpha \):
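presumably of the standard Metropolis-Hastings form,

$$ \alpha =\min \left\{ 1,\frac{L\left( y_{i}^{d}|\theta _{i}^{n}\right) \pi \left( \theta _{i}^{n}|\tau \right) q\left( \theta _{i}^{n},\theta _{i}^{o}\right) }{L\left( y_{i}^{d}|\theta _{i}^{o}\right) \pi \left( \theta _{i}^{o}|\tau \right) q\left( \theta _{i}^{o},\theta _{i}^{n}\right) }\right\} $$

$$ =\min \left\{ 1,\frac{L\left( y_{i}^{d}|\theta _{i}^{n}\right) \pi \left( \theta _{i}^{n}|\tau \right) }{L\left( y_{i}^{d}|\theta _{i}^{o}\right) \pi \left( \theta _{i}^{o}|\tau \right) }\right\} . $$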
The second line is a result of the symmetry of the transition density \(q\left( \cdot ,\cdot \right) \).
4.3 Learning About the House Advantage
In this section, we introduce the learning process. As gamblers play slot machines, they receive signals about that casino’s house advantage. Before receiving any information, they have a truncated normal prior belief on the house advantage:
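presumably of the form

$$ A_{i}\sim \mathcal {TN}\left( A_{0i},\sigma _{0i}^{2};0,1\right) , $$

that is, a normal distribution with mean \(A_{0i}\) and variance \(\sigma _{0i}^{2}\) truncated to the interval \(\left[ 0,1\right] \).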
In other words, before a gambler’s first trip to the casino, they expect to lose a certain percentage of every dollar cycled through the slot machine. The house advantage is bounded from below at zero because it is irrational for a gambler to expect to win money from a slot machine in the long run. It is also bounded from above at one because in the long run it is impossible for a machine to pay out more money than is put into it. Again, in the short term the hold percentage can fall outside of these bounds, but the gambler’s beliefs on the house advantage cannot reasonably be outside of this range.
The player’s experience at the casino does not fully reveal the house advantage because of the inherent variability of gambling outcomes. As previously discussed, there is quite often a difference between the hold percentage and the house advantage for gambler i on occasion t. We denote the hold percentage as \(H_{it}\), which can be interpreted as the “experienced” house advantage, and the house advantage as \(A_{i}\). The hold percentage is thus a noisy signal of the house advantage:
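presumably of the additive form

$$ H_{it}=A_{i}+\eta _{it},\qquad \eta _{it}\sim \mathcal {N}\left( 0,\sigma _{\eta _{i}}^{2}\right) , $$

with \(\sigma _{\eta _{i}}^{2}\) the hold-percentage variance observed by both the gambler and the analyst.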
After the trip ends, the gamblers update their posterior mean and variance of the house advantages using a Bayesian updating process. This model does not capture any within-trip learning and belief updating is only based on the outcomes at the end of the trip. The updating formulas are given below:
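up to the truncation (which, as shown in the Appendix, is preserved by the update), presumably the standard conjugate normal formulas,

$$ \sigma _{it}^{2}=\left( \frac{1}{\sigma _{0i}^{2}}+\frac{N_{i}\left( t\right) }{\sigma _{\eta _{i}}^{2}}\right) ^{-1},\qquad A_{it}=\sigma _{it}^{2}\left( \frac{A_{0i}}{\sigma _{0i}^{2}}+\frac{\sum _{s\le t}d_{is}H_{is}}{\sigma _{\eta _{i}}^{2}}\right) , $$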
where \(N_{i}\left( t\right) \) is the number of gambling experiences realized up through time t and \(d_{it}\) is an indicator for whether the player gambled at time t. The Appendix contains a proof showing that if the prior is truncated normal and the signal is an unbounded normal then the corresponding posterior is also a truncated normal.
Figure 7 plots the distribution of the hold percentage and house advantages across all trips in the dataset. The hold percentage distribution is what the gamblers experience and the house advantage is what gamblers are attempting to learn about. The dashed vertical line is the mean house advantage: with a sufficient number of exposures, the gamblers will learn this value with certainty if slot machines are selected at random. Note that each gambler’s experienced house advantage is observed by the analyst, so even if gamblers do not select slot machines at random (for instance, they only play one machine that happens to have a very low house advantage) the analyst can still determine if their estimated posterior beliefs are above or below the true house advantage. However, because the distribution of the house advantages is so tightly centered, we make the defensible simplifying assumption that the machines are selected at random and only the mean house advantage matters.
4.4 Cost of Gambling
When gamblers consider a return trip to the casino they need to form projections on the cost of gambling. This influences the expected future utilities. Under perfect knowledge expected cost is the same as the theoretical loss:
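presumably, writing \(C_{it}\) for the prospective trip’s gambling cost (a notational placeholder),

$$ E\left[ C_{it}\right] =\text {BDH}_{it}\times A_{i}. $$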
However, since gamblers have imperfect knowledge on the house advantages there is uncertainty in projections of their gambling costs. This uncertainty depends on their current beliefs at time t:
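presumably with the posterior mean replacing the true house advantage,

$$ E_{t}\left[ C_{it}\right] =\text {BDH}_{it}\times A_{it}, $$

with the uncertainty in this projection governed by the posterior variance \(\sigma _{it}^{2}\).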
“BDH” represents the product of average bet, decisions per hour, and hours played. These three variables are completely within the gambler’s control (i.e., there is no uncertainty) and represent the gambler’s play style.
It is important to note that the gambler’s projected average bet, game speed, and time may be a function of their current beliefs on the house advantage. For example, if faced with a relatively high house advantage players may decide to decrease their average bet to reduce projected gambling costs (everything else held constant). Similarly, higher uncertainty in the cost may lead to play that is more likely to result in a lower cost. Furthermore, gamblers may also adjust their play style based on currently available marketing offers. For instance, a gambler returning to the casino on a free room offer may play more aggressively than usual since the comped room frees up money that could be used for gambling. To account for this, the BDH value can vary during the forward simulation (discussed in more detail later).
4.5 Utility Specification
In this section we introduce the utility function. The utility associated with returning to the casino is given by the following expression:
Here \(u_{it}\) is the utility for gambler i at time t. \(\text {BDH}\) is the product of average bet, pulls per hour, and hours played.4 \(\text {BDH}\) multiplied by the hold \(H_{it}\) captures the gambling expense realized from that trip. Importantly, this expense is not known at the time of the decision and is only realized after experiencing the outcome. \(\theta _{1}\) represents the utility weight gamblers attach to this cost, r is the risk coefficient, \(\theta _{2}\) is the utility weight of the offer’s gaming value, and \(\theta _{3}\) is the utility weight of the offer’s room value. \(\Omega \) is the vector of utility weights associated with a function of the time since the last trip (w), which we specify as polynomials: \(\Omega f\left( w\right) =\omega _{1}w+\omega _{2}w^{2}+\omega _{3}w^{3}+\omega _{4}w^{4}+\omega _{5}w^{5}\). \(\Gamma \) is the vector of utility weights associated with the month the decision to return was made, in order to capture impacts from seasonality: \(\Gamma \text {Month}=\gamma _{1}\mathbb {I}\left[ \text {Month}=1\right] +\ldots +\gamma _{11}\mathbb {I}\left[ \text {Month}=11\right] \). \(\theta _{0}\) is an intercept. \(\varepsilon \) is the random component associated with this choice, which is known to the gambler but not observed by the analyst. \(\Omega \) and \(\Gamma \) are common across individuals, while \(\theta _{0}\), \(\theta _{1}\), \(\theta _{2}\), and r are specific to the individual.
Given the utility specification and the learning process, expected utility is given by the following:
Under this specification, utility is linear in the cost of gambling. As in Erdem and Keane [10], the formulation is such that given a strictly negative \(\theta _{1}\), utility is concave in A for \(r>0\), linear in A for \(r=0\), and convex for \(r<0\). Thus if there is uncertainty about the house advantage, the consumer is risk averse, risk neutral or risk seeking as \(r>0\), \(r=0\), or \(r<0\), respectively. As noted earlier, the uncertainty is in the beliefs on the house advantage, even in the “current” decision period. Furthermore, while the offer values are known in the current period they are not known in future periods, so gamblers form expectations over these values as well. In the simulation we draw values from the empirical joint distribution of room and gaming offer values.
Importantly, we emphasize that the model specified focuses on learning from prior outcomes and in the interest of generalizing outside of the casino industry does not incorporate gambling addiction directly into the model. Studying addictive behavior in casinos is an important topic but outside the scope of this analysis. Furthermore, Narayanan and Manchanda [19] found that only 7% of all casino gamblers exhibited evidence of addiction. Finally, based on our discussions with the casino management, we expect addictive prevalence to be even lower in this particular dataset given that this is a destination casino where gamblers tend to have above-average disposable income.
5 Model Estimation
5.1 The Estimation Procedure
The structural parameters of interest are \(\left\{ \theta _{0i},\theta _{1i},\theta _{2i},\theta _{3i},r_{i},\Omega ,\Gamma \right\} \) and the priors on each individual’s learning process \(\left\{ A_{0i},\sigma _{0i}^{2}\right\} \). The proposed estimation procedure uses the advantages of Bayesian estimation (versus classical estimation methods) while remaining computationally feasible. The biggest challenge presented when estimating structural learning models is that the state space is incredibly large. When discounting future expectations, a forward-looking gambler needs to consider the impacts of all potential outcomes and the associated implications on the learning process itself. For example, the specific hold percentage a gambler expects to experience on a return trip will influence how their posterior beliefs update, which in turn influences later return decisions. Clearly, evaluating every potential learning path is daunting, and because of this a full-solution Bayesian approach, such as the method proposed by Imai et al. [12], is not feasible.
Erdem and Keane [10] use backwards induction to solve their learning model. However, the entire backwards induction needs to be re-solved at every parameter estimate. This is not feasible for Bayesian methods, which typically rely on tens of thousands of MCMC draws to converge onto the posteriors. The impracticality of their method is not limited simply to the desire to use Bayesian rather than classical methods: the complexity of the proposed utility function and the hierarchical structure also render their approach infeasible.
Rather than attempt to visit every single learning path we forward simulate over many potential paths and discount the simulated values. More likely paths will be simulated more often and averaging over many simulated paths provides a consistent estimate of the discounted future returns. The advantage of this approach is that if the utility function is linear in parameters we only have to simulate the paths once for each considered starting state since the current parameter estimates do not affect the simulated values (see Hotz et al. [11] for a discussion of this method). The discounted terms can be separated from the parameters such that the parameters simply scale the discounted values during estimation.
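As a minimal illustration (not the paper’s production code), the sketch below forward simulates the discounted “basis” terms for a linear-in-parameters utility; the `policy`, `transition`, and `basis` callables are hypothetical placeholders standing in for the estimated policy function, the state transition (including hold draws), and the utility components.

```python
import numpy as np

def forward_simulate_basis(start_state, policy, transition, basis,
                           beta=0.995, horizon=520, n_paths=100, seed=0):
    """Average discounted basis vectors over simulated paths.

    With a utility that is linear in parameters, u(s, a) = theta @ b(s, a),
    the expected discounted value under a fixed policy is
    theta @ E[sum_t beta**t * b(s_t, a_t)], so only the bracketed sums need
    to be simulated and stored; theta merely rescales them during MCMC.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros_like(basis(start_state, 0), dtype=float)
    for _ in range(n_paths):
        state, acc = start_state, np.zeros_like(total)
        for t in range(horizon):
            action = policy(state, rng)                 # draw from estimated policy
            acc += (beta ** t) * basis(state, action)   # accumulate discounted basis terms
            state = transition(state, action, rng)      # draw next state (incl. hold draw)
        total += acc
    return total / n_paths  # stored once; scaled by candidate theta in the likelihood
```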
One challenge is that in the utility function specified some variables enter non-linearly, namely the prior mean and prior variance of the beliefs in the house advantage. Note that the variability in the hold percentage (\(\sigma _{\eta _{i}}^{2}\)) is observed by both the analyst and the gambler, so there is no need to estimate this. To handle the non-linearity of the learning priors, we forward simulate over a grid of prior mean and prior variance values and during likelihood evaluation use bi-linear interpolation to fill in areas near the simulated prior mean and variance gridpoints. The intuition is that the observed data should reflect a specific learning process with a particular prior mean and prior variance and during the MCMC iterations we search over the prior learning parameters that maximize the likelihood. One disadvantage of the estimation approach is that the discount factor cannot be estimated and needs to be selected prior to the forward simulation procedure. However, the estimation strategy makes it relatively easy to compare a few candidate discount factors by simply adding additional parallelized forward simulations. More details on the forward simulation algorithm and bi-linear interpolation are available in the Appendix and the Web Appendix.
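For concreteness, a generic bilinear interpolation over the prior mean/variance grid might look as follows (a sketch; grid names and array shapes are illustrative rather than taken from the paper):

```python
import numpy as np

def bilinear_interp(m, v, mean_grid, var_grid, values):
    """Bilinearly interpolate a quantity tabulated on a (prior mean x prior
    variance) grid at an off-grid point (m, v). `values` has shape
    (len(mean_grid), len(var_grid))."""
    i = int(np.clip(np.searchsorted(mean_grid, m) - 1, 0, len(mean_grid) - 2))
    j = int(np.clip(np.searchsorted(var_grid, v) - 1, 0, len(var_grid) - 2))
    m0, m1 = mean_grid[i], mean_grid[i + 1]
    v0, v1 = var_grid[j], var_grid[j + 1]
    tm = (m - m0) / (m1 - m0)          # relative position within the grid cell
    tv = (v - v0) / (v1 - v0)
    return ((1 - tm) * (1 - tv) * values[i, j]
            + tm * (1 - tv) * values[i + 1, j]
            + (1 - tm) * tv * values[i, j + 1]
            + tm * tv * values[i + 1, j + 1])
```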
After forward simulation is complete the MCMC draws can proceed at usual speed. At first glance, it appears that we have simply pushed the computational intractability to the front of the estimation process, but it is important to note that each forward simulation for each grid point and each starting state can be run at the same time. With enough computers the whole procedure can be completed in minutes due to its massively parallel nature. Once the forward simulation is complete the discounted expected values are simply plugged into the likelihood and Bayesian estimation proceeds as usual.
5.2 Play Style Estimation
Since a portion of the gambling cost is within the player’s control (average bet, decisions per hour, and time, or “BDH”), we account for potential adjustments in a gambler’s play style during the forward simulation. For instance, if a gambler believes that the house advantage is very high they may decrease their next trip’s average bet to reduce expected gambling costs. To estimate these effects, we estimate a regression of the log ratio of the next return trip’s BDH relative to the previous trip’s BDH:
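plausibly of the form (the \(\beta \) coefficients are illustrative notation, consistent with the covariates described below)

$$ \log \left( \frac{\text {BDH}_{i,t+1}}{\text {BDH}_{it}}\right) =\beta _{0}+\beta _{1}A_{it}+\beta _{2}\sigma _{it}^{2}+\beta _{3}g_{it}+\beta _{4}r_{it}+\beta _{5}o_{it}+\varepsilon _{it}, $$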
where g is the offer gaming value, r is the offer room value, and o is the outcome, represented as the casino’s revenue from the player (positive values indicate a player loss), and \(\varepsilon _{it}\sim \mathcal {N}\left( 0,\sigma _{\varepsilon }^{2}\right) \). The coefficients on promotional credits and comp values control for any changes in play behavior attributed to reductions in the overall trip cost. For instance, if a player is returning on a free room offer they may increase their BDH. It is important to note that these coefficients pertain to the play style, not the return decision. For example, if the coefficient on house advantage is positive it simply means that when the player returns they tend to play more aggressively; it does not imply that higher gambling costs increase the utility of returning to the casino.
For each of the 150 prior mean and prior variance gridpoint combinations, we run 10,000 MCMC iterations (keeping only every 10th draw) and save the posterior means. The posterior means are used during the forward simulation for adjusting BDH values as more experience signals are realized. The priors are specified as follows:
The coefficient estimates and prior settings are available in the Web Appendix.
5.3 Policy Function
In this section we outline the policy function used in forward simulation. The policy function estimates the probability of return given the current state (see Bajari et al. [3] for more details on using policy functions). We use a Bayesian non-parametric method as outlined by Rossi [25] to estimate this policy function. Non-parametrically, a regression models the conditional distribution of y given x, and a fully non-parametric approach uses this entire conditional distribution as the object of interest for inference. For the policy function we model the joint distribution of y and x and then use this joint distribution to compute the conditional distribution of y|x. This approach does not require assumptions about specific functional forms for how the x variables influence the conditional distribution of y.
For the policy function we estimate a five component mixture model. The covariates in x are the posterior mean and variance, predicted next trip BDH, the gambler’s weeks since the last trip, month, and the room and slot promotional credit values if an offer is available during that week. We first approximate the joint distribution and then use these draws to compute the implied conditional distribution. Formally, for the rth draw with K mixing components:
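presumably the standard finite mixture of multivariate normals,

$$ y_{i}\mid \text {ind}_{i}=k\sim \mathcal {N}\left( \mu _{k}^{r},\Sigma _{k}^{r}\right) ,\qquad \text {ind}_{i}\sim \text {Multinomial}_{K}\left( \pi ^{r}\right) , $$

where the superscript r indexes the rth posterior draw of the mixture parameters.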
Here \(y_{i}\) is a two dimensional vector and \(\pi \) is a vector of K mixture probabilities. Priors for the model are specified in conditionally conjugate forms:
Any functional of the conditional distribution such as the conditional mean can be computed based on the rth draw of the joint distribution. In the policy function, we use the conditional mean in the policy regression. The linear structure of the mixture of normals model can be exploited to facilitate computation of the conditional mean. More details are available in the Web Appendix.
Suppose \(\sigma \left( s,\varepsilon \right) \) is the optimal action given state s and shock \(\varepsilon \) based on the policy function estimated in the previous section. Following Bajari et al. [3], we take advantage of the fact that for a given learning process prior mean and prior variance, the parameters enter the utility linearly.
Exploiting this allows us to forward simulate the data only once (for each prior mean and variance gridpoint). This eases the computational burden significantly, allowing us to use the stored values when searching over the \(\theta \) parameters during MCMC draws.
5.5 Forward Simulation and Parallelization
By taking advantage of the massively parallel structure of the forward simulation, the expected value terms can be computed in a matter of hours with a reasonably sized dataset, so long as many processors are available. With recent advances in online computing, estimating this complex model becomes a relatively inexpensive and fast process. To execute the forward simulation process, we use Amazon’s EC2 service, which rents processors at an hourly rate.5 With small memory loads the cost is very low (less than a penny per hour), so running hundreds of instances simultaneously for a few hours is quite inexpensive.
We simply use each record in the data directly as starting states because creating starting states intended to “cover” the state space of the data is just as complex and would also require interpolation. Note that each record represents one decision period (one week), so there are hundreds of records per gambler. To give some context as to the scale of the parallelization in the empirical estimation, we conduct 100 forward simulations for each decision at each of the 287,205 rows of data over each of the 150 prior mean and variance gridpoints. This implies that theoretically the process can be divided across 8,616,150,000 servers and completed nearly instantaneously. In reality we divide the process over 30,000 servers and the process is done in about 50 hours (the servers are not all initiated at once).
At the end of the forward simulation, we obtain the expected values of returning or not at each record for each learning process prior mean and prior variance gridpoint. To recover the structural parameters, the three-dimensional array (rows of data × discounted basis functions × 150 gridpoints) is then referenced during the MCMC process. We allow proposed prior means and prior learning variances to take any value within the range of the gridpoints and use bi-linear interpolation to estimate the expected value between gridpoints. See the Web Appendix for more details on the bi-linear interpolation process.
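For completeness, the textbook bilinear interpolation over the prior mean and prior variance grid looks like the following sketch (the authors' exact scheme is described in their Web Appendix; the function and argument names here are illustrative):

```python
import numpy as np

def bilinear(A0, s0, grid_A0, grid_s0, values):
    """Interpolate stored expected values at an off-grid (A0, s0) proposal.

    grid_A0, grid_s0 : sorted 1-D arrays of gridpoint locations
    values           : array indexed as values[i, j] for gridpoint (grid_A0[i], grid_s0[j])
    """
    i = np.clip(np.searchsorted(grid_A0, A0) - 1, 0, len(grid_A0) - 2)
    j = np.clip(np.searchsorted(grid_s0, s0) - 1, 0, len(grid_s0) - 2)

    # Fractional position of the proposal inside the bracketing grid cell
    tA = (A0 - grid_A0[i]) / (grid_A0[i + 1] - grid_A0[i])
    ts = (s0 - grid_s0[j]) / (grid_s0[j + 1] - grid_s0[j])

    return ((1 - tA) * (1 - ts) * values[i, j]
            + tA * (1 - ts) * values[i + 1, j]
            + (1 - tA) * ts * values[i, j + 1]
            + tA * ts * values[i + 1, j + 1])
```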
5.6 A Summary of the Estimation Procedure
For clarity, in this section we summarize the estimation procedure. First, we estimate the play style regression coefficients and policy function mixture components for each of the 150 prior mean and variance learning process parameter combinations. We do this because each prior mean and variance determines the evolution of the Bayesian updating process that each gambler experiences. Next, at each starting state and for each of the 150 learning process prior gridpoints, we forward simulate using the play style regression coefficients and policy function parameters specific to that gridpoint. This process mimics a gambler projecting potential outcomes and discounting the resulting values, which leads to a decision to return or not in a particular week. Since this process can be run in parallel across starting states and the learning prior grid, we divide the estimation over many cloud computers using Amazon EC2. Once the average discounted values are obtained for each of the 150 gridpoints and each record of the data, we use standard Bayesian MCMC methods to estimate the structural parameters. As previously noted, the coefficients simply scale the values obtained from the forward simulation, which makes it easy to introduce a hierarchical structure. The MCMC routine then searches for the structural parameters that make the observed data most likely. More details on the entire estimation procedure are available in the Web Appendix.
Table 2: Homogeneous results

Coefficient | Posterior Mean | SE | Coefficient | Posterior Mean | SE
Intercept | –2.8563 | (1.69e–03) | \(\gamma _{1}\) | –0.3543 | (1.88e–03)
\(A_{0}\) | 0.3973 | (1.54e–04) | \(\gamma _{2}\) | –0.1600 | (1.82e–03)
\(\sigma _{0}^{2}\) | 0.0012 | (2.89e–06) | \(\gamma _{3}\) | –0.1878 | (1.73e–03)
Cost | –0.0018 | (1.72e–06) | \(\gamma _{4}\) | –0.6775 | (1.88e–03)
Risk | –1.85e–07 | (1.12e–09) | \(\gamma _{5}\) | –0.0153 | (1.94e–03)
Gaming Offer | 0.0050 | (5.46e–06) | \(\gamma _{6}\) | 0.0484 | (1.77e–03)
Room Offer | 0.0168 | (5.03e–06) | \(\gamma _{7}\) | 0.0775 | (1.77e–03)
\(\omega _{1}\) | 0.0372 | (2.54e–05) | \(\gamma _{8}\) | 0.2299 | (1.78e–03)
\(\omega _{2}\) | –0.0013 | (2.20e–07) | \(\gamma _{9}\) | 0.2059 | (1.76e–03)
\(\omega _{3}\) | 1.31e–05 | (5.78e–10) | \(\gamma _{10}\) | –0.1051 | (1.80e–03)
\(\omega _{4}\) | –5.10e–08 | (3.25e–15) | \(\gamma _{11}\) | –0.0606 | (1.78e–03)
\(\omega _{5}\) | 6.78e–11 | (2.68e–17) | | |
6 Identification
The structural parameters of interest are \(\left\{ \theta _{0i},\theta _{1i},\theta _{2i},r_{i},\Omega ,\Gamma \right\} \) and the priors on each individual’s learning process \(\left\{ A_{0i},\sigma _{0i}^{2}\right\} \). Recall that the choice-specific value functions are as follows:
Suppose that gamblers had complete information about the casino’s house advantage. This would imply that \(A_{it}=A_{i}\) and \(\sigma _{it}^{2}=0\), in which case we would be unable to separately identify \(A_{it}\) and \(\theta _{1i}\). Since gamblers observe the variation in the hold percentages, \(\sigma _{\eta }^{2}\) does not need to be estimated, unlike in Erdem and Keane [10]. Because the variability in hold percentages changes over time, it appears we can identify \(r_{i}\); but since we cannot identify \(\theta _{1i}\) in this complete information scenario, only the product \(\theta _{1i}r_{i}\) is identified. So identification rests upon the assumption that incomplete information exists (an assumption that would be required in a static model as well).
With incomplete information, the gambler’s priors and their hold percentage exposures guide the path of the learning process. Identifying the prior mean separately from the prior variance is challenging in most applications; the common solution is to fix the prior variance at one and estimate the signal variance and prior mean. But since we observe the signal variance, we use the functional form of the Bayesian learning process to enable identification. A similar argument is made in Sriram et al. [28]. The priors determine how \(A_{it}\) and \(\sigma _{it}^{2}\) evolve. Thus these parameters are pinned down by the extent to which new hold percentage signals change the probability of returning (and hence the actual returns observed in the data). The hold percentage exposures vary across gamblers and create variation in the evolution of \(A_{it}\) and \(\sigma _{it}^{2}\). So even if every gambler started with the same learning priors, the variability in outcomes across gamblers allows us to identify \(r_{i}\).
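To make this argument concrete, the learning recursion has the standard normal form sketched below, writing \(H_{i\tau }\) for the hold-percentage signal experienced on trip \(\tau \) and \(\sigma _{\eta ,i\tau }^{2}\) for its observed signal variance (the subscripting here is illustrative notation, not the paper's exact equations):
$$ A_{i,\tau +1}=\frac{\sigma _{\eta ,i\tau }^{2}\,A_{i\tau }+\sigma _{i\tau }^{2}\,H_{i\tau }}{\sigma _{i\tau }^{2}+\sigma _{\eta ,i\tau }^{2}},\qquad \sigma _{i,\tau +1}^{2}=\left( \frac{1}{\sigma _{i\tau }^{2}}+\frac{1}{\sigma _{\eta ,i\tau }^{2}}\right) ^{-1}, $$
so that, starting from \(\left( A_{0i},\sigma _{0i}^{2}\right) \), the entire belief path is a deterministic function of the priors and the observed signal sequence, which is the source of the identification discussed above.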
6.1 Endogeneity Bias Because the Firm’s Marketing Depends on Customer Behavior?
In this paper, we focus primarily on customer behavior conditional on firm activity, as in most empirical papers on customer choice. While we do not model firm behavior, we of course recognize that the firm’s promotional offers may depend on customer behavior. For example, the firm’s room offer in a certain period may depend on the customer’s gambling volume in the previous period. Can the fact that we do not model this relationship create an endogeneity bias? The answer is no, because this relationship, which reflects the realities of loyalty program practice in many industries including the casino industry, is sequential rather than contemporaneous. On the other hand, if the firm’s offer in a certain period were a function of some demand shock that influences customer behavior in the same period, then there would be a clear threat of an endogeneity bias: we would be in a situation akin to the classic bias in regression coefficient estimates caused by the omission of a correlated variable. An instance of such contemporaneous dependence between the casino’s offer and customer behavior arises when a seasonality variable has been omitted from the model. We guard against such an endogeneity or omitted-variable bias by controlling for these temporal effects.
7 Results
The results are estimated on a random subsample of 1,000 gamblers. For each gambler, 100 paths were forward simulated to derive the discounted values.6 To assist with parameter convergence in the hierarchical model, we first estimate a homogeneous model and use those parameters as the starting values in the hierarchical estimation. The parameters are estimated using a random-walk step on each MCMC draw. Since the parameter space is quite large, we partition the estimation into four parameter blocks to make the parameter search easier [6]. The first block contains the learning process prior mean and variance \(\left\{ A_{0},\sigma _{0}^{2}\right\} \), the second contains the cost, risk aversion, offer coefficients, and intercept \(\left\{ \theta _{0},\theta _{1},\theta _{2},\theta _{3},r\right\} \), the third contains the coefficients on the weeks since last trip polynomials \(\left\{ \Omega \right\} \), and the fourth contains the month control variables \(\left\{ \Gamma \right\} \). Details on the estimation procedure are available in the Web Appendix.
Table 3: Hierarchical interactions

Coefficient | Description | Intercept | Age (divided by 10) | Male | Log Distance (miles) | Gold LP Card
\(A_{0}\) | Prior mean | 0.523* | –0.014 | –0.004 | –0.002 | –0.003
\(\sigma _{0}^{2}\) | Prior uncertainty | 0.040* | 0.000 | 0.001 | 0.001 | –0.005
\(\theta _{1}\) | Cost | –0.067* | –0.011 | 0.012 | –0.008 | –0.035
r | Risk | –0.004 | 0.001 | –0.002 | 0.002 | –0.002
\(\theta _{2}\) | Offer promo credits | 0.151* | 0.024 | 0.004 | 0.012 | 0.118
\(\theta _{3}\) | Offer room value | 0.075* | –0.048 | 0.011 | –0.141* | 0.008
\(\theta _{0}\) | Intercept | –2.793* | 0.064 | 0.189* | –0.092* | 0.287*

* = 95% highest posterior density interval does not cover zero
In the homogeneous model, we run 80,000 MCMC draws. We discard the first 60,000 draws and keep only every 10th draw thereafter. We initialize the chain using MLE estimates. The acceptance rates of the four blocks are between 15% and 50% and the log-likelihood is –15,950. Table 2 contains the posterior means of the kept draws. As expected, the coefficient on the gambling expense is negative.
The homogeneous results are used as starting parameters for the hierarchical model. In the hierarchical model, we allow the learning process prior mean and variance, intercept, cost, risk coefficient, and offer coefficients to be functions of individual-level information. The coefficients on the weeks since last trip polynomials \(\left\{ \Omega \right\} \) and the month control variables \(\left\{ \Gamma \right\} \) remain fixed across gamblers. The individual-level covariates are the gambler’s age, sex, distance to the casino, and an indicator for whether the gambler holds the “Gold” loyalty card status. We run 80,000 MCMC draws, discarding the first 60,000 and keeping every 10th draw thereafter. The model’s log-likelihood is –11,358. This is a significant improvement over the homogeneous model and also greater than the log-likelihood from the same model with no forward-looking behavior (–11,401 with discount factor \(\beta =0\) versus \(\beta =.98\)). Details on other model parameters are available in the Web Appendix.
Table 3 displays the estimates for the hierarchical parameters. Recall that each individual level variable influences the coefficient estimate through a multivariate regression. The individual-level variables are demeaned so that the regression intercepts reflect an “average” gambler.
Table 4: Full hierarchical results

Coefficient | Description | Posterior Mean | SE
\(A_{0}\) | Prior mean | 0.5231 | (2.40e–04)
\(\sigma _{0}^{2}\) | Prior uncertainty | 0.0398 | (2.30e–05)
\(\theta _{1}\) | Cost | –0.0668 | (1.24e–04)
r | Risk | –3.76e–03 | (7.84e–05)
\(\theta _{2}\) | Offer promo credits | 0.1510 | (3.88e–04)
\(\theta _{3}\) | Offer room value | 0.0765 | (3.05e–04)
\(\theta _{0}\) | Intercept | –2.7933 | (4.98e–04)
\(\omega _{1}\) | Weeks since last trip\(^{1}\) | 0.0489 | (1.82e–05)
\(\omega _{2}\) | Weeks since last trip\(^{2}\) | –0.0013 | (1.57e–07)
\(\omega _{3}\) | Weeks since last trip\(^{3}\) | 1.31e–05 | (1.22e–10)
\(\omega _{4}\) | Weeks since last trip\(^{4}\) | –5.10e–08 | (1.59e–15)
\(\omega _{5}\) | Weeks since last trip\(^{5}\) | 6.78e–11 | (1.22e–17)
\(\gamma _{1}\) | Jan | –0.6550 | (1.90e–03)
\(\gamma _{2}\) | Feb | –0.4129 | (1.72e–03)
\(\gamma _{3}\) | Mar | –0.5658 | (1.75e–03)
\(\gamma _{4}\) | Apr | –0.7259 | (1.85e–03)
\(\gamma _{5}\) | May | –0.2667 | (1.75e–03)
\(\gamma _{6}\) | Jun | –0.1619 | (1.58e–03)
\(\gamma _{7}\) | Jul | –0.1741 | (1.76e–03)
\(\gamma _{8}\) | Aug | –0.0560 | (1.63e–03)
\(\gamma _{9}\) | Sep | –0.0990 | (1.70e–03)
\(\gamma _{10}\) | Oct | –0.2499 | (1.77e–03)
\(\gamma _{11}\) | Nov | –0.1716 | (1.63e–03)
The average gambler believes that the house advantage is around 52%. While this is much higher than the true house advantage (about 12%), gamblers have substantial uncertainty surrounding this belief, with a standard deviation of about .2. As expected, the cost coefficient is negative: higher house advantage perceptions lower the probability of returning. The average gambler is risk seeking (at least directionally), and the offer values significantly influence the probability of returning. The interactions with the intercept are intuitive: gamblers who live far away are less likely to return while those in the higher loyalty program tier are more likely to return. The posterior means for all of the parameters are presented in Table 4. The results for the fixed parameters are similar to the homogeneous results.
Table 5 displays the variances and correlations across the individual-level coefficient estimates. There is substantial heterogeneity across gamblers’ coefficient estimates. Interestingly, there is a negative correlation between the prior mean and the prior uncertainty: gamblers whose prior beliefs are higher tend to be more certain in their beliefs. There is also a negative correlation between the cost coefficient and the risk coefficient: gamblers who are more sensitive to the cost of gambling tend to be more risk averse, while those who are less sensitive tend to be more risk seeking.
Table 5: Heterogeneity across gamblers (variance on the diagonal, correlations off the diagonal; columns follow the same order as the rows)

Coefficient | Description | \(A_{0}\) | \(\sigma _{0}^{2}\) | \(\theta _{1}\) | r | \(\theta _{2}\) | \(\theta _{3}\) | \(\theta _{0}\)
\(A_{0}\) | Prior mean | .1337
\(\sigma _{0}^{2}\) | Prior uncertainty | –.24 | .0015
\(\theta _{1}\) | Cost | .06 | –.03 | .0563
r | Risk | –.01 | .02 | –.19 | .0142
\(\theta _{2}\) | Offer promo credits | .00 | –.06 | –.05 | –.04 | .7150
\(\theta _{3}\) | Offer room value | –.01 | .05 | .02 | .05 | –.55 | .8034
\(\theta _{0}\) | Intercept | .13 | –.10 | .00 | –.02 | .06 | –.06 | 1.0623
Figure 8 shows the distribution across gamblers of the posterior means of the prior house advantage belief and its uncertainty. Most players tend to overestimate the house advantage, but the distribution is quite dispersed across gamblers. The level of uncertainty is somewhat bi-modal: while there is some mass around low uncertainty estimates, there is also substantial mass around .05. One potential explanation for gamblers overestimating the house advantage (or at least acting “as if” they do) is that most gamblers either do not play long enough or do not have a large enough bankroll for the law of large numbers to bring their experienced hold percentage close to the house advantage.
8 Policy Simulations
The structural parameters are used to simulate six counterfactuals. The first two counterfactuals illustrate how projected casino revenues are quite sensitive to gamblers’ prior beliefs in the house advantage and the volatility of outcomes. While these counterfactuals are informative, they do not provide casino marketers with practical solutions to act upon, for reasons to be discussed. The third and fourth counterfactuals focus on marketing solutions and show that sophisticated targeting strategies should consider how both the outcome sequence and prior beliefs may dictate where targeting is most effective. The remaining two counterfactuals explore belief-based targeting in more depth. The fifth counterfactual uses the model to identify the gamblers that are most responsive to marketing. Finally, the sixth counterfactual does a partial search for an optimal marketing strategy. While a full search is incredibly complex, the partial search still highlights that offer values should vary depending on both the outcome sequence and gambler beliefs.
Table 6: Accurate priors increase casino revenue

 | Current prior | Accurate prior
\(A_{0}\) | .523 | .125
Trips | 2,379 | 10,027
Average weeks to next trip | 24 | 16
Average trip slot theoretical loss | $460 | $162
Total theoretical loss | $1,094,114 | $1,621,907
Increase in gaming revenue | 48.2%
# of gamblers simulated | 1,000
Years simulated | 5
Table 7: Hold percentage volatility impacts casino revenues

Variance multiplier | Variance | 1% LB | 99% UB | Avg. Return Weeks | Avg. Theo. Loss | Total Theo. Loss
0.001 | .0002 | .09 | .16 | 29 | $362 | $2,630,910
0.005 | .0010 | .05 | .20 | 26 | $353 | $2,861,178
0.010 | .0021 | .02 | .23 | 23 | $345 | $3,138,954
0.025 | 0.01 | –.04 | .30 | 16 | $323 | $4,193,657
0.050 | 0.01 | –.11 | .37 | 11 | $299 | $5,925,798
0.200 | 0.04 | –.35 | .60 | 10 | $319 | $5,403,860
0.350 | 0.07 | –.50 | .76 | 15 | $374 | $2,672,789
0.500 | 0.10 | –.62 | .88 | 19 | $410 | $1,626,595
0.650 | 0.14 | –.73 | .99 | 23 | $442 | $1,145,819
0.800 | 0.17 | –.82 | 1.08 | 24 | $446 | $913,859
0.950 | 0.2 | –.91 | 1.17 | 23 | $459 | $761,572
1.100 | 0.23 | –.99 | 1.24 | 28 | $476 | $621,727
1.250 | 0.26 | –1.06 | 1.32 | 25 | $465 | $563,224
1.400 | 0.29 | –1.13 | 1.39 | 26 | $481 | $556,337
8.1 Counterfactual 1: Accurate Prior Beliefs
In this data the average slot machine house advantage is 12.5%. The estimation results therefore suggest that gamblers overestimate the house advantage by a factor of about four prior to their first trip to the casino. Given that the cost coefficient \(\theta _{1}\) is negative, gamblers may be overestimating the cost of a return trip which in turn delays the return time. This counterfactual simulates expected gaming revenues under the assumption that each gambler’s prior belief in the house advantage is accurate. That is, their prior belief equals the true house advantage. The results are shown in Table 6.
As expected, gamblers return at a faster rate if their prior beliefs in the house advantage are lower. With lower cost expectations gamblers no longer need many trips for their beliefs to converge to the true house advantage. Even though gamblers play less on each return trip the impact on the aggregate expected casino revenue is still positive.
If accurate beliefs about the house advantage can potentially increase long-term casino revenue, why do casino marketers not simply advertise the accurate house advantages through direct mail? The primary reason is that this is not practical. Casinos tend to be very cautious about how they advertise slot machines in their direct mail offers. There is a risk that a gambler will interpret the true house advantage as a guaranteed loss limit, and the casino may face backlash from gamblers who lose more than the house advantage suggests they should. The purpose of presenting this counterfactual is simply to highlight that changes in a gambler’s beliefs can have drastic long-term consequences for casino revenues.
8.2 Counterfactual 2: Slot Machine Volatility
Next, we consider the impact of changing the volatility of the slot machine hold. When a casino orders a slot machine from a manufacturer, it specifies the variability in that machine’s outcomes. In this dataset, the slot machine hold is 13.9% with a variance of .05, meaning 98% of the trip-level hold percentages are between –38% and +66%. We simulate 1,000 gamblers over 5 years to measure the revenue impact of lowering and raising the hold variance relative to its current level. The results are presented in Table 7. Figure 9 plots the casino theoretical win against a multiplier on the hold variance; the dashed line at 1 marks the current variance level.
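As a quick check of the reported interval, assuming trip-level holds are approximately normal and using \(z_{0.99}\approx 2.33\) for a 98% two-sided range:
$$ 0.139 \pm 2.33\sqrt{.05} \approx 0.139 \pm 2.33\times 0.224 \approx \left[ -0.38,\ 0.66\right] . $$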
The simulation results show that as the volatility decreases the projected casino win increases. However, when the volatility shrinks to the point where gambler wins become very infrequent, the theoretical win declines. Clearly, the volatility of the outcomes has dramatic impacts on long-term casino revenues. As with the first counterfactual, even though these findings are informative they do not point to any reasonable short-term solution for managers. In order for a casino to change its aggregate slot machine volatility, it would need to order new slot machines and spend time installing them on the gaming floor. These machine, labor, and additional opportunity costs are substantial and not accounted for here.
Table 8: Naive targeting is ineffective

Targeting criteria | Industry standard | Actual ex Wins | Theo. ex Wins
Trips | 3,084 | 2,441 | 2,388
Avg. weeks to return | 19 | 25 | 24
Avg. theoretical win/trip | $472 | $498 | $482
Total theoretical win | $1,456,769 | $1,215,938 | $1,150,119
Total actual win | $1,453,383 | $1,408,149 | $1,266,449
Promotions redeemed | 1,328 | 724 | 697
Room value | $128,509 | $168,937 | $65,902
Promotional credits | $64,254 | $84,469 | $32,951
Room cost ($30 per room night) | $43,380 | $41,460 | $21,840
Promotional credit cost (1 cycle)* | $56,222 | $73,910 | $28,832
Net theoretical win | $1,357,167 | $1,100,568 | $1,099,446
Net actual win | $1,353,780 | $1,292,779 | $1,215,777

*The cost of promotional credits is not a certainty since wins can be cycled back into the machine and generate additional payouts. See the Web Appendix for a discussion.
8.3 Counterfactual 3: Incorporating Gambler Outcomes with Naive Targeting
The first two counterfactuals illustrate that changes in prior beliefs and hold percentage volatility can have substantial impacts on long term casino revenue. However, as discussed the results alone do not lend themselves immediately to practical solutions for managers. The purpose of these remaining four counterfactuals is to show how targeted marketing could be used in conjunction with the outcomes and player beliefs to improve casino profitability.
In this counterfactual we compare three marketing strategies: 1) the industry standard of basing offer values on gamblers’ theoretical losses (“Industry Standard”), 2) basing offer values on actual outcomes but excluding gamblers who won on their last trip (“Actual ex Wins”), and 3) basing offer values on theoretical losses (similar to the industry standard) but again excluding gamblers who won on their last trip (“Theo ex Wins”). The second and third strategies represent naive targeting strategies: gamblers who win are more likely to have low beliefs in the house advantage and therefore should be more likely to return to the casino anyway. Given this, the casino may be able to save on marketing expenses by excluding these players from offers. Furthermore, in the second strategy the casino provides an incentive to return that is directly in line with the loss experienced. We consider these strategies “naive” because they do not consider how each gambler’s beliefs about the house advantage may influence the effectiveness of marketing; only the outcomes are used.
Table 9: Marketing impact depends on prior beliefs

Prior belief in house advantage | Prior uncertainty | Player winning | Player losing | \(\Delta \)
High | High | 3.8 | 1.3 | 2.6
High | Low | 0.0 | 0.0 | 0.0
Low | High | 0.1 | 3.4 | –3.3
Low | Low | 0.3 | 2.4 | –2.1
As in the empirical data, in each decision period there is a 45% chance that the gambler will be exposed to a marketing offer. The offers are valued at 30% of the last trip’s theoretical or actual win. The total offer value is split into a room component and promotional credits, with two-thirds of the total going to the room and one-third going to promotional credits. In the “Industry Standard” simulation all gamblers have an opportunity to obtain an offer, but in the “Actual ex Wins” and “Theo ex Wins” simulations offers are not available to gamblers who won on their last trip. A sketch of this offer rule is given below.
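The sketch uses illustrative names and simply encodes the simulated offer rules just described; it is not the authors' simulation code.

```python
import random

def weekly_offer(last_trip_loss, strategy, won_last_trip):
    """Return (room_value, promo_credits) for one decision week.

    last_trip_loss : theoretical or actual casino win on the last trip,
                     depending on the strategy being simulated
    strategy       : "industry", "actual_ex_wins", or "theo_ex_wins"
    won_last_trip  : True if the gambler came out ahead on the last trip
    """
    if random.random() > 0.45:                  # 45% chance an offer is available
        return 0.0, 0.0
    if strategy in ("actual_ex_wins", "theo_ex_wins") and won_last_trip:
        return 0.0, 0.0                         # naive rules exclude last-trip winners
    total = 0.30 * max(last_trip_loss, 0.0)     # offers valued at 30% of the last trip
    return (2 / 3) * total, (1 / 3) * total     # two-thirds room, one-third credits
```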
The results in Table 8 show that both naive approaches to targeting are less profitable than the current industry standard. In the “Actual ex Wins” scenario, top-line revenue remains relatively constant but the overall promotional costs are higher, even though fewer offers were redeemed. This may seem counterintuitive, but it arises because actual outcomes have much more variability than theoretical outcomes, especially when evaluated at the trip level (as the data are aggregated, actual and theoretical outcomes converge). In the “Theo ex Wins” strategy, promotional costs decrease dramatically but top-line revenue also suffers: the short-term gains from the reduction in promotional costs are offset by longer intervisit times. The results suggest that strategies that appear intuitive at first are not always more profitable in the long term. This counterfactual emphasizes the need for a more sophisticated targeting strategy.
8.4 Counterfactual 4: Marketing Impact by Past Outcomes and Beliefs
This simulation extends the previous one by incorporating prior beliefs into the targeting decision. Table 9 shows the impact of marketing when gamblers’ prior beliefs and uncertainty are high or low and when gamblers are either winning or losing. The impact of marketing is measured by comparing overall expected casino revenue with marketing versus without marketing. For example, an impact of .1 means that there is a 10% increase in revenue across gamblers in the presence of marketing. The marketing rule imposed is the same industry-standard rule described in the previous counterfactual. The “high” and “low” categorizations are set using the 5th and 95th percentiles of the estimated prior means and prior variances.
When a gambler’s prior belief in the house advantage is high and their uncertainty is high, marketing is more impactful if the player is on a winning streak rather than a losing streak. However, for gamblers whose prior beliefs in the house advantage are low, marketing is more impactful when players are on a losing streak. Notice that marketing is ineffective for gamblers whose beliefs in the house advantage are very high and whose uncertainty is very low. This is intuitive: these gamblers are very certain that the cost is very high, and because of this they will not return regardless of marketing offers.
This simulation emphasizes the importance of considering both the prior beliefs and the outcome sequence when designing the targeting strategy. In the previous counterfactual the naive assumption was that only the outcome mattered, but here we see that it is the combination of outcomes and prior beliefs that dictates where marketing is most effective. This insight is useful to managers who need to allocate a limited marketing budget across gamblers.
8.5 Counterfactual 5: Marketing Impact by Gambler
In this counterfactual we analyze the relationship between gamblers’ posterior beliefs and the marketing impact. The posterior beliefs summarize both the prior beliefs and the realized outcome sequences, thereby reducing the number of metrics managers need to consider for targeting. To add more realism to this simulation, we use the 1,000 gamblers from the dataset rather than creating artificial gamblers. We simulate five years’ worth of gambling activity, picking up where the observed data end. Again the focus is on the impact of marketing, meaning the change in expected casino revenue with marketing versus without. The goal of this simulation is to identify the gamblers for whom marketing has the greatest impact and then determine whether the marketing impact is related to the posterior mean and uncertainty of the house advantage belief.
Figure 10 shows the marketing impact represented by a lift chart. If gamblers were randomly targeted, the cumulative impact would be expected to follow the dashed line. However, the simulations allow us to identify the gamblers where marketing will likely have the greatest impact.7 Notice that nearly all of the gains from marketing activity are realized from about one quarter of the gamblers. The other gamblers are not impacted by the marketing activity, or in a few rare cases the marketing actions actually result in declines in gaming revenue.
For gamblers who are most impacted by marketing (those in the front of the curve where the cumulative impact is less than 99%), the posterior belief in the house advantage tends to be higher and the uncertainty much lower.
 | # of Gamblers | Posterior mean | Posterior uncertainty
99% of cumulative marketing impact | 242 | .227 | .0036
Remaining 1% | 758 | .196 | .0076
Figure 11 illustrates the differences across gamblers. For each of the 1,000 gamblers, the marketing impact is plotted against the posterior mean and posterior uncertainty averaged across all of their realized return trips. Marketing has a greater impact on gamblers with higher posterior means and lower uncertainty. The correlation between the posterior mean and posterior variance across gamblers is –.128: gamblers with higher beliefs in the house advantage tend to have less uncertainty. While this may seem to contradict the findings from the previous counterfactual, it is important to note that the previous counterfactual examined the extremes of beliefs, at the 5th and 95th percentiles, and used simulated rather than actual gamblers. In both cases the fact remains that there is a strong relationship between the posterior beliefs and the impact of marketing.
8.6 Counterfactual 6: Optimal Marketing Offers
The previous counterfactual provides evidence that posterior beliefs influence the impact of marketing. A natural extension is to search for the optimal marketing strategy. That is, for each gambler and each outcome experience, which offer strategy will lead to the highest long-term expected revenue? Finding the global optimum is very difficult (at least in this casino example) because each combination of room value and slot promotional credits would need to be evaluated for each gambler, at each decision period, for every potential outcome sequence. Even so, this counterfactual shows that even a relatively simple constrained search can lead to substantial improvements in projected revenue.
In this constrained search we vary the slot promotional credits and bin the posterior beliefs into four categories. The goal is to determine how much each of the four posterior belief categories should receive in slot promotional credits. In this dataset the promotional credit value is typically set at 10% of the past theoretical loss level. We simulate this baseline percentage and four alternatives: 0%, 5%, 15%, and 20%. The belief and uncertainty levels are grouped into four categories: high/low belief in the house advantage and high/low uncertainty. The cutoff for the belief in the house advantage is the casino’s true house advantage, and the cutoff for the uncertainty is based on a median split of the observed gamblers’ posterior variances.
Category | Belief in house advantage | Uncertainty in belief
Low | <12.5% | <.0029
High | >=12.5% | >=.0029
Another challenge in searching for the optimal marketing offer is that gamblers can switch categories over time depending on their outcomes. That is, a gambler may start in a high belief/high uncertainty state, move to a high belief/low uncertainty state, and then end in a low belief/low uncertainty state. Because each state has its own marketing strategy, all 625 combinations of offers need to be considered: five promotional credit percentages in each of the four offer states.
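The enumeration itself is straightforward; a small sketch with illustrative names:

```python
from itertools import product

credit_pcts = [0.00, 0.05, 0.10, 0.15, 0.20]          # share of past theoretical loss
states = ["high_belief/high_unc", "high_belief/low_unc",
          "low_belief/high_unc", "low_belief/low_unc"]

# One strategy assigns a promotional credit percentage to each belief state:
# 5 choices in each of 4 states = 5**4 = 625 candidate strategies.
strategies = [dict(zip(states, combo)) for combo in product(credit_pcts, repeat=len(states))]
assert len(strategies) == 625

# Each strategy is then evaluated by simulating gamblers under that rule and
# computing projected revenue net of room and promotional credit costs.
```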
We simulate one hundred gamblers for two years in each of the 625 offer value combinations. Each gambler starts with the same prior beliefs and uncertainty on the house advantage (based on the hierarchical results for the “average” gambler). Profit is obtained by subtracting room and promotional credit costs from the projected casino revenue.
Figure 12 shows the sorted profit across all 625 simulations. The dashed line shows baseline profitability where the four belief categories each receive promotional credits valued at 10% of theoretical losses. The range in the profit is substantial: the top strategies generate over $55,000 in profit while the worst strategies generate around $25,000.
Rather than evaluate each of the 625 simulations individually, we compare the differences between the most and least profitable strategies, shown in Fig. 13. The figure shows which promotional credit percentage is associated with the most and the least profitable strategies in each of the four belief/uncertainty categories. Notice that the most profitable strategy does not use the baseline percentage of 10% in any of the four belief categories: when the belief in the house advantage is below the actual house advantage, a higher percentage is recommended, whereas when the belief in the house advantage is high, the policy depends on the uncertainty. It is also interesting that the most profitable strategy neither maxes out nor eliminates the promotional credit amount in any of the four belief bins, suggesting that the solution is contained within the boundaries of the simulation. The most profitable strategy generated $62,290 in profit, compared to $36,479 in the baseline scenario where all gamblers receive the same promotional credit percentage regardless of their beliefs in the house advantage, an increase of 85.3%. For a more conservative (and realistic) measure of success, the top half of the strategies still increased baseline profit by an average of 19.7%.
The model presented provides a framework for managers to use in order to target gamblers based on their beliefs and outcome sequences. This simulation shows that the gains from doing so can be significant, even when the strategy employed is the result of a heavily constrained search.
9 Discussion and Conclusion
Recent improvements in interaction-logging technologies in delivery systems, online channels, call centers and mobile devices have made it increasingly common for managers to have access to data on outcomes at the customer- and occasion-specific level. When outcomes vary randomly and independently from occasion to occasion, it is reasonable for a customer’s beliefs about the average outcome with the firm to evolve over time. A primary contribution of this paper is a framework and methodology to use data on customer outcomes to model such evolving beliefs and how they combine with the firm’s marketing to influence purchase behavior. Our modeling allows the firm to estimate the likely marketing response of a customer with any specific experience and behavior history and to use this for an optimal across-customer allocation of targeting resources. The methodological framework is applicable in many industries, especially service-based ones where across-occasion variation in outcomes can be high and is oftentimes unavoidable. Depending on the signal variability, it may take many experiences for the customer to learn the true distribution of the outcome. Until the true distribution is learned, the customer will likely have biased perceptions. If a customer’s initial experiences are likely to lead to an inference that future outcomes will be lower than the truth, which could result in a cessation of interaction with the firm, the situation may warrant targeted marketing as an offsetting influence.
We illustrated our proposed methodology in the context of casino customers. We use gambling payoffs from slot machines as the outcome, which is mediated by beliefs about the house advantage of the machines. In our conceptualization, gamblers develop evolving beliefs about the average slot machine house advantage based on their experienced payoff hold values. Gamblers use their beliefs on the house advantage to project future trip utilities which in turn influence when they return to the casino and how they play on a return trip. The gaming industry offers an attractive setting to illustrate this methodology, for a variety of reasons, one of which is that exogenous gambling outcomes provide many distinct and unique experience sequences at the gambler level.
The results and the six counterfactuals highlight the potential benefits of incorporating gambler outcomes into the targeting decision. The first two counterfactuals illustrate how revenues are very sensitive to gambler beliefs and the variability in their signals. These findings suggest that an individual’s beliefs can have managerially significant impacts on a firm’s profitability. As discussed, since changing the beliefs and variability directly can be difficult (especially in the casino industry), a natural alternative is to develop marketing strategies that incorporate beliefs directly into the targeting decision as a way to influence purchase behavior. The third counterfactual motivates this notion by comparing a common industry targeting strategy with two naive alternatives that incorporate past gambler outcomes (but not beliefs) into the targeting decision. The simulation is meant to show that strategies that are intuitively appealing are not necessarily effective. In this example, the intuition is that winning players do not need incentives to return to the casino, a sentiment currently voiced in the industry. Interestingly, the naive strategies do not outperform the industry standard strategy, thereby providing face validity as to why many casinos still segment customers using past expected losses and do not incorporate outcomes into the targeting decision. These results also motivate the need for a more sophisticated targeting strategy, one that incorporates experienced outcomes and gamblers’ beliefs about future outcomes into the targeting decision.
The fourth counterfactual provides evidence that it is not only the outcome that matters to a customer, but rather how that customer’s past experiences juxtapose against their beliefs about future outcomes. In our empirical example, a gambler’s marketing response depends on both their belief in the house advantage and the certainty the gambler has in this belief. This finding motivates the primary contribution of our paper: the beliefs of the customers should inform the design of the targeted marketing strategy. The fifth counterfactual extends this insight by analyzing how posterior beliefs about future outcomes (and the uncertainty in these beliefs) influence the impact of marketing, the idea being that the posterior beliefs capture the prior beliefs and the experienced signals in a single metric. This simulation shows how the model can identify gamblers for whom marketing is likely to be most effective, where the effectiveness depends on the gamblers’ own string of experiences, estimated beliefs, and baseline offer responsiveness. Finally, the sixth counterfactual demonstrates how the model can be used to search for optimal marketing strategies that incorporate beliefs into the targeting decision. Given the sheer complexity of the possible combinations of offers, experienced outcome sequences, and beliefs, the model provides a relatively fast way to home in on strategies that can be compared with the current strategy. Of course, once the model identifies plausible marketing strategies, small-scale field experiments can further validate the model simulations.
In sum, the counterfactual results provide evidence for the value of the primary contribution of the paper: a model that uses customer outcome data to target customers and thereby improves firm profits significantly. The firm can benefit from using the data to model customers’ beliefs and incorporating these beliefs into the targeting framework, especially in the presence of random fluctuations in outcomes across transactions.
We close by identifying some extensions to our current work on how customers’ beliefs combine with marketing to influence purchase behavior, extensions which are likely to be important and interesting. One feature of our model is that we assume customers update beliefs using the classical Bayesian updating mechanism. There is an ongoing debate as to whether humans truly update beliefs in this manner, as some view the Bayesian updating process as mathematically tractable but unrealistic. Even if this updating method replicates the customer’s learning process well, it does not mean that customers are in fact integrating new information in the way the equations suggest. Further examination is required to determine how this difference can affect marketers. In addition, there are many other factors that can influence the learning process besides the magnitude and order of the experienced outcomes. Some of these factors include the timing between signals, whether the beliefs influence how new signals are integrated, and whether the signals are weighted in some predictable fashion (rather than each signal receiving equal importance). By relaxing the assumption that gamblers are forward looking, many of these extensions become much more tractable to analyze and measure.
In our modeling framework and illustration, the customer is uncertain about, and is learning about, only one attribute. In many cases, the customer is attempting to learn about multiple aspects of the firm’s offering. In our casino example, the customer may be trying to assess the mean outcome level related to not just the house advantage but also factors like room service and dining. Our modeling framework can be readily extended to multiple-feature learning, not just for the simple case where the experience signals for the different features are independent, but also for the case where a single experience is simultaneously informative about multiple features or where the belief distributions are not independent across features.
Our paper assumes that the customer learns about the mean experience outcomes only through his/her own experiences with the firm. In today’s world of widespread social media use, customers can learn about the firm’s outcomes also through information gathering from other customers. A customer is more likely to seek out information on others’ experiences with the firm if he/she believes others’ valuations are likely to be similar to his/her own, suggesting that such information gathering may be more likely for objective performance measures like response time and less likely for performance measures like food quality where there is higher across-consumer variance in assessment. Our model can be extended to handle the case where there is inter-person information sharing by modifying the customer’s Bayesian belief evolution process to depend on outcome data not just of the focal customer but also of other customers that he/she is connected to. This extension requires data on inter-customer relationship strengths, which can be constructed by partnering with social media platforms or looking at customer referral histories or co-location or co-habitation.
We should recognize that a customer’s beliefs may be formed not just using experience outcome data from the customer and his/her friends, but also from the firm’s marketing communications, though admittedly this latter factor may have less of an impact because of credibility concerns in the mind of the consumer. To this end, the Bayesian belief updating process of this paper could be extended to include the effects of advertising. The policy simulations will then offer the possibility of identifying optimal targeting strategies for the firm on which advertising content should be directed at which customers. For example, an airline may direct advertising featuring third-party audit reports on flight punctuality just to those customers who have experienced atypically high flight delays.
In addition, we assumed that gamblers do not learn from the marketing strategy itself. Due to the relative infrequency of interactions with the casino, this assumption is reasonable in our empirical example, but in other situations it might be a legitimate concern. For example, if there are many customer interactions such that beliefs (and the targeting strategy) evolve relatively quickly, the customer may anticipate how the firm will respond based on the outcome of the experience. Finally, it is possible that the targeted offer itself can influence beliefs, which we did not account for in our model. We leave these topics for future research.
Declarations
Conflicts of Interest
There are no known potential conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Appendix A: Conjugate Prior in Truncated Normal Distribution
If the prior is a truncated normal and the signal is an unbounded normal, then the corresponding posterior is also a truncated normal. In other words, the truncated normal distribution is a conjugate prior for a normal signal likelihood. This proof is similar to the one in Li [15].
Theorem 1. Suppose the parameter of interest \(\theta \) follows a normal distribution truncated at 0 and 1, i.e., \(\theta \sim \mathcal{T}\mathcal{N}\left( \mu _{0},\sigma _{0}^{2}=\lambda _{0}^{-1},0,1\right) \), and the signal is generated as
$$ x=\theta +\xi $$
where \(\xi \sim \mathcal {N}\left( 0,\sigma _{\xi }^{2}=\lambda _{\xi }^{-1}\right) \). Then the posterior distribution of \(\theta \) given \(x\) is also a normal distribution truncated at 0 and 1.
Let \(\phi \left( t,\mu ,\sigma ^{2}\right) \) be the normal pdf with mean \(\mu \) and variance \(\sigma ^{2}\), and \(\Phi \left( t,\mu ,\sigma ^{2}\right) =\int _{-\infty }^{t}\phi \left( s,\mu ,\sigma ^{2}\right) ds\) be the CDF. We know that
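As a sketch of the claimed result (the full derivation multiplies the truncated prior by the normal likelihood and renormalizes over \([0,1]\)), the posterior takes the standard conjugate form
$$ \theta \mid x\sim \mathcal{T}\mathcal{N}\!\left( \mu _{1},\lambda _{1}^{-1},0,1\right) ,\qquad \lambda _{1}=\lambda _{0}+\lambda _{\xi },\qquad \mu _{1}=\frac{\lambda _{0}\mu _{0}+\lambda _{\xi }x}{\lambda _{0}+\lambda _{\xi }}, $$
with density proportional to \(\phi \left( \theta ,\mu _{1},\lambda _{1}^{-1}\right) \big/\left[ \Phi \left( 1,\mu _{1},\lambda _{1}^{-1}\right) -\Phi \left( 0,\mu _{1},\lambda _{1}^{-1}\right) \right] \) on \(0\le \theta \le 1\).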
Footnote: Since the impact of marketing depends on the outcome sequence, a more thorough analysis would simulate over many potential outcome paths. We conducted a simulation setting the hold percentage to a constant (the mean) and the interpretations are the same.