Introduction
Literature review
Summary and contributions of this paper
-
The paper proposes a novel approach for modelling people’s behaviour through a discrete choice process considering the existence of hard constraints. Specifically, we propose a mathematical formulation of the nested logit model to cope with hard constraints. We provide a Monte Carlo simulation study on an application with hard constraints to show that the demand estimations performed by the logit-type models are in error when they predict new scenarios in which the current hard constraints in the base scenario of the estimation are modified. This disadvantage does not show up in the constrained models.
-
Secondly, we address the building of a general framework to specify dynamic non-linear utilities. The main feature of this approach is that the specification of the utility function is not centered on a specific functional form (linear, polynomial, sinusoidal, etc) but on belonging to a specific space of functions, the so-called Reproducing Kernel Hilbert Spaces (RKHS). The essential advantage of this design is the search for the most suitable shape for the utility within the set of functions for the problem at hand. Furthermore, we propose a method for estimating the constrained nested model with general utilities based on the novel point of view of considering a subset of utilities as parameters for the estimation instead of the classical weightings of the attributes. This approach has been illustrated with the modelling of the selection of railway services.
CNL
model" section discusses the procedure followed to estimate the constrained nested logit model with this type of non-linear utility. Fifth section analyses numerically the logit-type models with the constrained counterparts and it illustrates the methodology by numerically solving a railway service selection problem. Finally, last section concludes with a discussion of our findings and future work.The constrained nested logit model
Formulation of the constrained nested logit
CNL
). These constraints may introduce new factors in the forecasting process which could influence the behaviour of the users. Therefore, the CNL
model is formulated as follows:CNL
is separable for each individual \(\ell \), leading to n independent problems in which there is no interaction between individuals.CNL
. Suppose the variable \(g^{m\ell _{k}}\) with \(k=1,\ldots ,j\), is known, and we calculate the right side of Eq. (3) and solve the CNL
problem to calculate the probabilities of choice \(g^{m\ell _{j+1}}_s\) of the consumer \(\ell _{j+1}\), subsequently iterating the process again.CNL
the constraintsCNL
model.Equilibrium issues
CNL
model represents an equilibrium between users. They compete for the existent resources (alternatives) considering their preferences and the imposed constraints which may affect their choices, like the capacity of the system. This fact leads to an equilibrium situation in which each user cannot improve their utility by selecting a different alternative.CNL
satisfies the classic Nested Logit probability equations, but show that the constraints introduced in the CNL
model penalize the utilities of the lower level, producing a modification to the forecasting of demand depending on the active constraints. A function of the multipliers, the named \(W^{m\ell '}_s\), can be interpreted as the shadow price that the type of user \(\ell '\) must pay for choosing the alternative s. Alternative s consumes scarce resources that a set of users must compete for, and this is the price, expressed in terms of utility, that users are prepared to pay to choose that alternative.Non-linear utility specifications using Reproducing Kernel Hilbert Spaces
-
Step (i) know the attributes values of the alternatives \(x_s\) with \(s\in \cup _{m} {{\mathcal {S}}}_m\),
-
Step (ii) estimate the utilities \(V^m_s\) on a subset of alternatives \({{\mathcal {D}}}_1 \subset \cup _{m} {{\mathcal {S}}}_m \), and
-
Step (iii) calculate the parameters \((\alpha _0, \alpha ^T)\) from the estimated utilities.
CNL
requires the parameters \(V^m_s\) in order to be used focuses the estimation process on these parameters, and so for the alternatives \({{\mathcal {D}}}_1\) it is not necessary to know the functional form and the attributes which are relevant for the decision maker. Moreover, the parameters \(V^m_s\) have a clear interpretation but the interpretation of the weightings \(\alpha _y\) is unclear.Set of alternatives \({s\in {{\mathcal {D}}}_1}\)
| ||||
---|---|---|---|---|
\(s_1\)
|
\(s_2\)
|
\(s_3\)
|
\(s_4\)
| |
\(t_s\)
| 9 | 10 | 12 | 16 |
\(U_s\)
| 3 | 5 | 2 | 4 |
Gaussian coefficients | Multi-quadratic coefficients | |
---|---|---|
(\(i=1\)) | (\(i=2\)) | |
\(\alpha ^i_1\)
| 1.3582 | 1.4401 |
\(\alpha ^i_2\)
| 4.4476 | −1.1676 |
\(\alpha ^i_3\)
| 1.9107 | 1.5238 |
\(\alpha ^i_4\)
| 3.9841 | 0.5754 |
-
Remark 1. Functional expression It is worth noting that calculating the vector of parameters \(\alpha \) by solving the system of equations (15) allows the analytical expression of V(x), Eq. (14), to be known, and it is possible to calculate the marginal utilities \(\frac{\partial V^*}{\partial x_s}\). This allows the subjective values of travel time (SVT) to be calculated (see Jara-Díaz 2000; Amador et al. 2005). Moreover, it allows any utility \(V(x_s)\) to be estimated if the vector of attributes of the alternative s \(x_s\) is known.
-
Remark 2. Various user types If there were several users \(\ell \) it could happen that one alternative s were common to some of them. This may introduce ambiguities. We modify the definition of set \({{\mathcal {D}}}_1\) to avoid this, considering that its elements are pairs of the form \((s,\ell )\), thus leaving nothing undefined. In the case where there are various user types we would estimate a utility function for each of them in this way:$$\begin{aligned} V^{*\ell }(x) =\sum _{(s,\ell ) \in {{\mathcal {D}}}_1} \alpha ^\ell _s K(x,x^\ell _s), \quad \hbox { for all } x \in X, \end{aligned}$$(19)
Estimation of the CNL
model
CNL
parameters are \(V,\lambda =(\cdots ,\lambda ^\ell _1,\cdots ,\lambda ^{m\ell }_2,\cdots ), b_r, {\widehat{g}}^\ell \). It is assumed that the upper bounds \(b_r\) of the constraints and \({\widehat{g}}^\ell \) are known. The remaining parameters \((V,\lambda )\) must be estimated. In this section a generic estimation methodology is described. As CNL
is a strictly convex program (assuming that the functions \({\widehat{h}}_r\) are convex) it poses a single optimum and CNL
implicitly defines a function, which obtains the disaggregation of the demand by alternatives for each pair \((V,\lambda )\). This is depicted thus:CNL
model is to select and estimate a subset of utilities \(\widehat{V}_1\) which will be used for calculating the other utilities \(\widehat{V}_2\), repeating this process iteratively, changing the values of \(\widehat{V}_1\) with the objective of approximating the demand estimated by the CNL
to the real known values.CNL
model, g, and the observed values, \({\mathbf {N}}\). It is worth noting that the parameters to be estimated are the utility vector \({\widehat{V}}_1\) and the vector of scale parameters \(\lambda \).CNL
is formulated as a bi-level optimization model and Fig. 3 shows the application of free-derivative optimization methods to this problem. This calculation scheme is conceptual and susceptible of many implementations. The convergence to the optimal parameters is not guaranteed and it will depend on the free-derivative optimization method applied.
CNL
proposed are explained in “Appendix 2”.Numerical analysis
CNL
model must be solved. The proposed method is conceptual and does not specify any given optimization algorithm. The only assumption about the algorithm is that it be derivative-free, as the problem is bi-level and the so there is no guarantee of convergence. Moreover, the bi-level nature presents the challenge of addressing the computational burden in real applications.-
Experiment 1 This experiment was carried out on synthetic data. The objective is to compare the logit model and the nested logit model with its constrained counterparts. As well as checking numerically that the constrained models display better fitting to the data, we show that the predictions of the unconstrained models may not be reliable in scenarios where the constraints are changed.
-
Experiment 2 This experiment is a real application consisting of fitting the
CNL
model to the rail services choice problem. In this experiment we describe a metaheuristic methodology based on hybridization of the Particle Swarm Optimization with the Nelder–Mead method. The main issue analysed in this experiment is the applicability of the proposed method and its convergence.
A simulation study (Experiment 1)
CNL
models.
fminsearch
function in MATLAB) and limiting the maximum number of iterations 2000. The constrained MNL and NL models were solved using the Sequential Quadratic Programming algorithm implemented in MATLAB (fmincon
function in MATLAB).CNL
modelsUtility | MNL | Constrained MNL | ||||||
---|---|---|---|---|---|---|---|---|
\(L^*\)
|
\(\rho ^2\)
|
\({\bar{\rho }}^2\)
|
CPU(s.)
|
\(L^*\)
|
\(\rho ^2\)
|
\( {\bar{\rho }}^2\)
|
CPU(s.)
| |
Constant | −2024.12 | 0.0676 | 0.0662 | 2.76 | −1147.01 | 0.4716 | 0.4703 | 531.67 |
Linear | −1438.96 | 0.3371 | 0.3344 | 10.17 | −1074.19 | 0.5052 | 0.5024 | 1593.20 |
Quadratic | −1334.08 | 0.3855 | 0.3813 | 36.17 | −1060.68 | 0.5114 | 0.5073 | 1935.74 |
Utility | NL | Constrained NL | ||||||
---|---|---|---|---|---|---|---|---|
\(L^*\)
|
\(\rho ^2\)
|
\({\bar{\rho }}^2\)
|
CPU(s.)
|
\(L^*\)
|
\(\rho ^2\)
|
\( {\bar{\rho }}^2\)
|
CPU(s.)
| |
Constant | −2516.61 | 0.3188 | 0.3172 | 61.41 | −1419.18 | 0.6159 | 0.6143 | 13,448.42 |
Linear | −1705.44 | 0.5384 | 0.5352 | 133.75 | −1362.27 | 0.6313 | 0.6280 | 64,126.07 |
Quadratic | −1694.24 | 0.5414 | 0.5366 | 244.18 | −1518.24 | 0.6323 | 0.6274 | 87,688.47 |
Number of observations = 1976 |
CNL
modelsUtility | MNL | Constrained MNL | ||
---|---|---|---|---|
\(L^*\)
|
\(L^*\)
|
\(\chi ^2\)
|
p-value | |
Constant | −2024.12 | −1281.1 | 1486.0 | 0.0124(*)
|
Linear | −1438.96 | −1196.6 | 484.7 | 0.9999 |
Quadratic | −1334.08 | −1176.2 | 315.8 | 0.9999 |
Utility | NL | Constrained NL | ||
---|---|---|---|---|
\(L^*\)
|
\(L^*\)
| |||
Constant | −2516.61 | −1634.51 | 1764.2 | 0.0000(*)
|
Linear | −1705.44 | −1534.94 | 341.0 | 0.9999 |
Quadratic | −1694.24 | −1516.92 | 354.6 | 0.9999 |
Scenario | Simultation | Utility | Constrained MNL | Constrained NL | |||
---|---|---|---|---|---|---|---|
Average
|
\(\sigma \)
|
Average
|
\(\sigma \)
|
Average
|
\(\sigma \)
| ||
Original scenario | Constant |
\(g_1=22.97\)
| 3.79 |
\(g_1=21.92\)
| 4.07 | ||
\(g_2=40.01\)
| 2.88 |
\(g_2=41.99\)
| 3.02 | ||||
\({K_1=25}\), |
\(g_1=22.71\)
| 3.24 | Linear |
\(g_1=23.63\)
| 4.31 |
\(g_1=23.08\)
| 3.27 |
\({K_2=40}\)
|
\(g_2=39.99\)
| 0.12 |
\(g_2=40.00\)
| 2.90 |
\(g_2=40.62\)
| 2.98 | |
Quadratic |
\(g_1=23.14\)
| 4.48 |
\(g_1=23.28\)
| 3.81 | |||
\(g_2=40.00\)
| 2.85 |
\(g_2=40.45\)
| 2.94 | ||||
\({K_1=25}\), |
\(g_1=16.42\)
| 4.30 | Constant |
\(g_1=14.16\)
| 3.66 |
\(g_1=14.42\)
| 3.44 |
\({K_2=60}\)
|
\(g_2=56.88\)
| 4.45 |
\(g_2=58.11\)
| 3.91 |
\(g_2=60.99\)
| 4.11 | |
Linear |
\(g_1=14.66\)
| 2.95 |
\(g_1=15.33\)
| 2.89 | |||
\(g_2=57.99\)
| 3.96 |
\(g_2=59.35\)
| 4.05 | ||||
Quadratic |
\(g_1=15.76\)
| 3.21 |
\(g_1=15.97\)
| 2.92 | |||
\(g_2=57.37\)
| 3.96 |
\(g_2=58.40\)
| 4.06 | ||||
\({K_1=10}\), |
\(g_1=9.86\)
| 0.58 | Constant |
\(g_1=7.01\)
| 1.99 |
\(g_1=7.33\)
| 2.08 |
\({K_2=70}\)
|
\(g_2=61.83\)
| 6.76 |
\(g_2=65.75\)
| 6.17 |
\(g_2=68.62\)
| 6.33 | |
Linear |
\(g_1=8.99\)
| 1.91 |
\(g_1=9.17\)
| 2.01 | |||
\(g_2=65.23\)
| 6.57 |
\(g_2=66.52\)
| 6.44 | ||||
Quadratic |
\(g_1=9.29\)
| 2.03 |
\(g_1=9.43\)
| 2.20 | |||
\(g_2=64.49\)
| 5.99 |
\(g_2=65.81\)
| 6.69 | ||||
\({K_1=+\infty }\), |
\(g_1=15.05\)
| 3.91 | Constant |
\(g_1=11.16\)
| 1.12 |
\(g_1=11.67\)
| 1.18 |
\({K_2=+\infty }\)
|
\(g_2=59.99\)
| 7.70 |
\(g_2=63.09\)
| 6.37 |
\(g_2=66.21\)
| 6.69 | |
Linear |
\(g_1=12.77\)
| 1.39 |
\(g_1=13.33\)
| 1.42 | |||
\(g_2=62.75\)
| 3.37 |
\(g_2=64.22\)
| 6.49 | ||||
Quadratic |
\(g_1=13.47\)
| 1.44 |
\(g_1=14.14\)
| 1.47 | |||
\(g_2=62.07\)
| 6.34 |
\(g_2=62.98\)
| 6.42 |
Scenario | Utility | MNL | NL | ||
---|---|---|---|---|---|
Average
|
\(\sigma \)
|
Average
|
\(\sigma \)
| ||
ALL | Constant |
\(g_1=23.64\)
| 2.39 |
\(g_1=23.63\)
| 2.39 |
\(g_2=40.00\)
| 4.04 |
\(g_2=40.00\)
| 4.04 | ||
Linear |
\(g_1=23.63\)
| 2.49 |
\(g_1=23.90\)
| 2.40 | |
\(g_2=40.00\)
| 5.27 |
\(g_2=39.60\)
| 5.41 | ||
Quadratic |
\(g_1=23.64\)
| 2.35 |
\(g_1=24.22\)
| 2.38 | |
\(g_2=40.00\)
| 5.51 |
\(g_2=39.56\)
| 5.44 |
Application of the CNL
model for railway service choice modelling (Experiment 2)
Case study
Type | Route | Amount | Type |
---|---|---|---|
of service | of services | of train | |
1 | MAD \(\rightarrow \) CR \(\rightarrow \) PU | 11 | AVANT |
2 | MAD \(\rightarrow \) CR \(\rightarrow \) PU \(\rightarrow \) COR | 3 | AVE |
3 | MAD \(\rightarrow \) CR \(\rightarrow \) PU \(\rightarrow \) COR \(\rightarrow \) SEV | 5 | AVE |
4 | MAD \(\rightarrow \) COR | 4 | AVE |
5 | MAD \(\rightarrow \) COR \(\rightarrow \) SEV | 9 | AVE |
6 | MAD \(\rightarrow \) SEV | 3 | AVE |
7 | COR \(\rightarrow \) SEV | 6 | MD |
8 | COR \(\rightarrow \) SEV | 9 | AVANT |
9 | SEV \(\rightarrow \) COR | 6 | MD |
10 | SEV \(\rightarrow \) COR | 9 | AVANT |
11 | SEV \(\rightarrow \) COR \(\rightarrow \) PU \(\rightarrow \) CR \(\rightarrow \) MAD | 5 | AVE |
12 | SEV \(\rightarrow \) COR \(\rightarrow \) MAD | 8 | AVE |
13 | SEV \(\rightarrow \) MAD | 3 | AVE |
14 | COR \(\rightarrow \) PU \(\rightarrow \) CR \(\rightarrow \) MAD | 3 | AVE |
15 | COR \(\rightarrow \) MAD | 5 | AVE |
16 | PU \(\rightarrow \) MAD \(\rightarrow \) CR | 11 | AVANT |
Type | Train capacity |
---|---|
\(K_s\) (passengers) | |
AVE | 308 |
AVANT | 237 |
MD | 190 |
Estimation methods
CHL
model. In this example the values of \(\lambda _1\) and \(\lambda _2\) have also been estimated.CNL
model has been GAMS 24 with solver CONOPT, showing that in an Intel I7 4 Cores 3.2 GHZ with 16 GB RAM computer the CPU time for each problem is around 0.12 seconds.CNL
problem. The stopping criterion is based on the total number of solved CNL
models. The SPSO algorithm was run for 50,000 objective function evaluations. A random start on an interval defined by Eq. (33) was used. The size of the swarm was 40 particles. The PSO-parameters w, \(c_1\) and \(c_2\) for updating velocity are defined as \(w = 1/(2\ln (2))\), \(c_1 = 0.5 + \ln (2)\) and \(c_2 = c_1\) (Zambrano-Bigiarini et al. 2013). NM is run for 50,000 function evaluations starting from the best solution found by the SPSO algorithm to improve the solution.CNL
model can be computed in an affordable time.Case | −B | B | SPSO |
\(\hbox {SPSO}+\hbox {NM}\)
|
---|---|---|---|---|
1 | −0.5 | 0.5 | 4.7276E\(+\)05 | 2.5866E\(+\)05 |
2 | −1 | 1 | 3.2272E\(+\)05 | 2.5248E\(+\)05 |
3 | −2 | 2 | 3.2828E\(+\)05 | 2.5988E\(+\)05 |
4 | −5 | 5 | 4.8158E\(+\)05 | 2.7113E\(+\)05 |
5 | −10 | 10 | 5.6018E\(+\)05 | 2.6979E\(+\)05 |
6 | −30 | 30 | 1.1273E\(+\)06 | 1.1273E\(+\)06 |
7 | −50 | 50 | 1.2238E\(+\)06 | 1.2238E\(+\)06 |
8 | −100 | 100 | 1.4023E\(+\)06 | 1.4023E\(+\)06 |
9 | −500 | 500 | 1.1742E\(+\)06 | 1.1742E\(+\)06 |
10 | −1000 | 1000 | 1.0889E\(+\)06 | 1.0628E\(+\)06 |
11 | −5E\(+\)03 | 5E\(+\)03 | 1.2428E\(+\)06 | 1.2428E\(+\)06 |
12 | −1E\(+\)04 | −1E\(+\)04 | 1.0640E\(+\)06 | 1.0640E\(+\)06 |
13 | −5E\(+\)04 | 5E\(+\)04 | 1.3045E\(+\)06 | 1.3045E\(+\)06 |
14 | −5E\(+\)05 | 5E\(+\)05 | 1.2318E\(+\)06 | 1.2318E\(+\)06 |
15 | −1E\(+\)07 | 1E\(+\)07 | 2.2174E\(+\)06 | 2.2113E\(+\)06 |
Study of the best solution obtained
Conclusions
CNL
to model both the dynamic and constrained decision spaces in discrete choice contexts. This type of approach is suited to modelling problems in which exogenous and endogenous factors limit the universal choice set of the decision-makers. Applying the model requires additional data to specify the constraints and derivative-free optimization methods for solving the estimation problem.CNL
formulation are also discussed. The introduction of the estimation method requires testing its capacity to infer unbiased estimates of the true parameters. Further research into this subject, and how to specify the type of kernels and its parametrization is necessary to assess whether this methodology can give a perfect reconstruction of the original utility.CNL
. The computational cost of solving the bi-level model is 4.4 h. The importance of eliminating over-specification of the model can be seen in the quality of the results obtained. The results show how the CNL
model could represent the behaviour of users of the railway network.