Skip to main content
Log in

Optimal balanced control for call centers

  • Published:
Annals of Operations Research Aims and scope Submit manuscript

Abstract

In this paper we study the optimal assignment of tasks to agents in a call center. For this type of problem, typically, no single deterministic and stationary (i.e., state independent and easily implementable) policy yields the optimal control, and mixed strategies are used. Other than finding the optimal mixed strategy, we propose to optimize the performance over the set of “balanced” deterministic periodic non-stationary policies. We provide a stochastic approximation algorithm that allows to find the optimal balanced policy by means of Monte Carlo simulation. As illustrated by numerical examples, the optimal balanced policy outperforms the optimal mixed strategy.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

References

  • Altman, E. (1999). Constrained Markov decision processes. London: Chapman & Hall.

    Google Scholar 

  • Altman, E., Gaujal, B., & Hordijk, A. (2000a). Balanced sequences and optimal routing. Journal of the ACM, 47, 4.

    Article  Google Scholar 

  • Altman, E., Gaujal, B., & Hordijk, A. (2000b). Multimodularity, convexity and optimization properties. Mathematics of Operations Research, 25(2), 324–347.

    Article  Google Scholar 

  • Altman, E., Gaujal, B., & Hordijk, A. (2002). Regular ordering and applications in control policies. Discrete Event Dynamic Systems, 12(2), 187–210.

    Article  Google Scholar 

  • Altman, E., Gaujal, B., & Hordijk, A. (2003). Discrete-event control of stochastic networks: multimodularity and regularity. New York: Springer.

    Book  Google Scholar 

  • Altman, E., Gaujal, B., Hordijk, A., & Koole, G. (1998). Optimal admission, routing and service assignment control: the case of single buffer queues. In 37th IEEE conference on decision and control, Tampa, FL, USA (Vol. 2, pp. 2119–2124).

    Google Scholar 

  • Altman, E., & Shwartz, A. (1993). Time-sharing policies for controlled Markov chains. Operations Research, 41(6), 1116–1124.

    Article  Google Scholar 

  • Armony, M., & Maglaras, C. (2004). On customer contact centers with a call-back option: customer decisions, routing rules and system design. Operations Research, 52, 271–292.

    Article  Google Scholar 

  • Bhulai, S., & Koole, G. (2003). A queueing model for call blending in call centers. IEEE Transactions on Automatic Control, 48, 1434–1438.

    Article  Google Scholar 

  • Cao, X. (2007). Stochastic learning and optimization: a sensitivity-based approach. New York: Springer.

    Google Scholar 

  • Dai, J., & Tezcan, T. (2008). Optimal control of parallel server systems with many servers in heavy traffic. Queueing Systems, 59, 95–134.

    Article  Google Scholar 

  • Feinberg, E. A., & Shwartz, A. (Eds.) (2002). Handbook of Markov decision processes. Norwell: Kluwer Academic.

    Google Scholar 

  • Franx, G., Koole, G., & Pot, S. (2006). Approximating multi-skill blocking systems by hyperexponential decomposition. Performance Evaluation, 63(8), 799–824.

    Article  Google Scholar 

  • Gans, N., Koole, G., & Mandelbaum, A. (2003). Telephone call centers: tutorial, review, and research prospects. Manufacturing & Service Operations Management, 5, 79–141.

    Article  Google Scholar 

  • Gans, N., & Zhou, Y. (2003). A call-routing problem with service-level constraints. Operations Research, 51, 255–271.

    Article  Google Scholar 

  • Glynn, P. W., & Whitt, W. (1992). The asymptotic efficiency of simulation estimators. Operations Research, 40(3), 505–520.

    Article  Google Scholar 

  • Gurvich, I., Armony, M., & Mandelbaum, A. (2008). Service level differentiation in call centers with fully flexible servers. Management Science, 54(2), 279–294.

    Article  Google Scholar 

  • Hajek, B. (1985). Extremal splittings of point processes. Mathematics of Operations Research, 10(4), 543–556.

    Article  Google Scholar 

  • Hardy, G. H., & Wright, E. M. (1960). An introduction to the theory of numbers (4th ed.). Oxford: Clarendon Press.

    Google Scholar 

  • Heidergott, B., & Cao, X. R. (2002). A note on the relation between weak derivatives and perturbation realization. IEEE Transactions on Automatic Control, 47(7), 1112–1115.

    Google Scholar 

  • Heidergott, B., & Hordijk, A. (2003). Taylor series expansions for stationary Markov chains. Advances in Applied Probability, 23, 1046–1070.

    Article  Google Scholar 

  • Heidergott, B., Hordijk, A., & Weisshaupt, H. (2006). Measure-valued differentiation for stationary Markov chains. Mathematics of Operations Research, 31(1), 154–172.

    Article  Google Scholar 

  • Heidergott, B., & Vázquez-Abad, F. (2006). Measure-valued differentiation for random horizon problems. Markov Processes and Related Fields, 12, 509–536.

    Google Scholar 

  • Heidergott, B., & Vázquez-Abad, F. (2008). Measure-valued differentiation for Markov chains. Journal of Optimization and Applications, 136(2), 187–209.

    Article  Google Scholar 

  • Koole, G., & Mandelbaum, A. (2002). Queueing models of call centers: an introduction. Annals of Operations Research, 113, 41–59.

    Article  Google Scholar 

  • Koole, G., & Talim, J. (2000). Exponential approximation of multi-skill call centers architecture. In QNETs 2000: fourth international workshop on queueing networks with finite capacity, Craiglands Hotel, Ilkley, West Yorkshire, UK (pp. 23/1–23/10).

    Google Scholar 

  • Lothaire, M. (2002). Algebraic combinatorics on words. Cambridge: Cambridge University Press.

    Google Scholar 

  • Morse, M., & Hedlund, G. (1940). Symbolic dynamics II: Sturmian trajectories. American Journal of Mathematics, 62, 1–42.

    Article  Google Scholar 

  • Perry, M., & Nilsson, A. (1992). Performance modeling of automatic call distributors: assignable grade of service staffing. In 14th international switching symposium, Yokohama (pp. 294–298).

    Google Scholar 

  • Pflug, G. (1996). Optimization of stochastic models. Boston: Kluwer Academic.

    Book  Google Scholar 

  • Puterman, M. L. (1994). Markov decision processes: discrete stochastic dynamic programming. New York: Wiley.

    Google Scholar 

  • Shinya, S., Naoto, M., & Ryohei, K. (2004). m-Balanced words: a generalization of balanced words. Theoretical Computer Science, 314(1), 97–120.

    Article  Google Scholar 

  • Shumsky, R. (2004). Approximation and analysis of a queueing system with flexible and specialized servers. OR Spektrum, 26, 307–330.

    Article  Google Scholar 

  • Stanford, D., & Grassmann, W. (2000). Bilingual server call centers. In D. McDonald & S. Turner (Eds.), Fields institute communications: Vol. 208. Analysis of communication networks: call centers, traffic and performance (pp. 31–47).

    Google Scholar 

  • Wallace, R., & Whitt, W. (2005). A staffing algorithm for call centers with skill-based routing. Manufacturing & Service Operations Management, 7(4), 276–294.

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bernd Heidergott.

Appendices

Appendix A: Implementation of the phantom estimator

In the following we discuss the implementation of the phantom estimator. Denote the waiting time difference between two phantoms by Δ w (s):

(17)

Note that there are only two different choices for the Markov kernel (P or Q) at the splitting point. Therefore, one of the two phantoms is equal to the nominal path. Specifically, if the decision variable (either zero or one) for the j-th choice is u(j)=1, then the “plus” phantom is equal to the nominal path, which yields \(W_{1}^{+} (s, j ) = W_{1}(\theta) \). In this case, using the difference variable Δ w (s), then average waiting time of the “minus” phantom is obtained by

$$ W_1^- ( s, j ) = W_1^+ (s, j ) - \Delta_w ( s, j ) = W_1(\theta) - \Delta_w ( s, j ). $$
(18)

On the other hand, if u(j)=0, then average waiting time of the “minus” phantom is equal to the nominal path, which yields \(W_{1}^{-} X_{\theta}( s, j ) =W_{1}(\theta) \) and

$$ W_1^+ (s, j ) = W_1^- ( s, j ) + \Delta_w ( s, j ) = W_1(\theta) + \Delta_w ( s, j ). $$
(19)

Let \(\mathcal{I}(j)\) be the indicator whether the transition kernel of the nominal path is equal to P or Q:

The nominal path coincides with either of the phantoms and the average waiting time of the phantom that has to be additionally simulated is obtained from combining (18) with (19):

$$W_1 ( \theta) - \mathcal{I}(j) \Delta_w(s, j) . $$

Following this train of thoughts, the difference between the cumulative costs in Eq. (15) can be written as

$$ D(s, j) = A \mathcal{I}(j) \bigl( \bigl(W_1 ( \theta) - \alpha \bigr)^2 - \bigl( W_1 ( \theta) - \mathcal{I}(j) \cdot \Delta_w(s,j) - \alpha\bigr)^2 \bigr) $$
(20)

and the resulting gradient estimator is

(21)

In the following we discuss implementation issues for the phantom estimator when the balanced control policy is used. Let θ be a rational number. In this case θ can be written as a fraction of two division-free integers, that is, θ=p/q, for p,q∈ℕ. In the balanced sequence for θ, there will be p 1s in any subsequence of length q of u θ,ϕ (n), and {u θ,ϕ (k)} will be periodic with the period length q. For example, the period of the balanced sequence generated for θ=0.3=3/10 and ϕ=0 is: (0,0,0,1,0,0,1,0,0,1) with period length q=10. Let n%q denote the position in the period of the nth element of {u θ,ϕ (k)}. It is easy to see that if m%q=n%q, then for k≥1, it holds that

$$u_{\theta,\phi}(m+k) = u_{\theta,\phi}(n+k). $$

We extend the state-space of the phantom process \(X^{\pm}_{\theta}( n ) \) with the auxiliary variable \(u_{\theta,\phi}^{\pm}(n) \) denoting at which position in the period of the θ-balanced sequence the process is. For irrational θ, using the fact that, on a computer, θ is given in finite decimal notation, we may obtain a rational number equal or arbitrarily close to θ by applying the continued fraction algorithm on θ (see, for example, Chap. 10 in Hardy and Wright 1960). Then in the coupling schemes we extend the physical state by \(u_{\theta,\phi}^{\pm}(n)\). In words, the “plus” and “minus” version couple as soon as the following variables are identical: the number of jobs waiting, the number of type 1 jobs in service, the number of type 2 jobs in service, and the balanced sequences driving the policy have the same position in the period.

Appendix B: Balanced sequences

For an infinite sequence U=(u 1,u 2,…) of zeros and ones,

$$s(k,n) = \sum_{j = k}^{k+n-1} u_j , \quad k \leq n , $$

denotes the numbers of ones in the subsequence of length n beginning at the k-th element of U. With this notation s(n):=s(1,n) is the number of ones in the prefix of length n of U. We say that U=(u 1,u 2,…) has density θ∈[0,1] if lim n→∞ s(n)/n exists and is equal to θ. Consider, for example, the case θ=1/2. Then the sequence (0,1,0,1,0,1,…) has rate 1/2 and could be used for modeling a mixing policy at θ=1/2. However, (0,1,0,0,1,1,0,0,0,1,1,1,…) has also rate 1/2. The latter sequence is probably not a good choice as the sequences of consecutive ones and zeros lead to a rater unbalanced system behavior.

In the following we introduce the concept of uniformly recurrent sequences. An infinite sequence U=(u 1,u 2,…) is uniformly recurrent if for every finite subsequence W of U, an integer k exists such that W is a subsequence of every subsequence of U of length k. Note that any periodic sequence is uniformly recurrent. For example, the sequence (0,1,0,1,…) is uniformly recurrent as any subsequence of length l is contained in any other finite subsequence of length l+1.

Note that if U has some density θ, it is the asymptotic frequency of the ones in U. If such U is applied as control sequence then policy P is applied with fraction θ of all decision events while Q is applied with remaining fraction 1−θ. Moreover, it follows for k=0,1,… that

$$\lim_{n \rightarrow\infty} \frac{s(k,n)-n \theta}{n} = 0 . $$

Intuitively a small maximal absolute deviation between s(k,n) and means that the ones and zeros are regularly distributed with density θ. Thus U should be such that the maximal absolute deviation is small. It can be shown that for every density θ∈[0,1] there exists U for which the maximal deviation of |s(k,n)−| is smaller than one. Such sequences are called regular. Following, for example, Altman et al. (2000a), a sequence of zeros and ones is called balanced if the difference in number of ones (or zeros which is equivalent) for subsequences of the same length is at most one.

Obviously any regular sequences is a balanced sequence. Also it can be proven (see, for example, Lothaire 2002 Chaps. 1 and 2) that regular sequences are uniformly recurrent.

Example 4

An example of a regular sequence with density θ=2/5 is the periodic sequence (1,0,1,0,0,1,0,1,0,0,1,0,1,0,0,…). If θ is rational then corresponding regular sequences are always periodic with period equal to the denominator of θ. Thus 5 is the period of the sequence in this example. Another example of a regular sequence is the well-known (see Lothaire 2002) Fibonacci sequence (1,0,1,1,0,1,0,1,1,0,1,1,0,…) with density θ= \((\sqrt {5}-1)/2 \) which can not be periodic since its density θ is irrational. However, also the Fibonacci sequence is uniformly recurrent as any regular sequence is.

By the classification of balanced sequences (see Morse and Hedlund 1940) it follows that any sequence which is both uniformly recurrent and balanced is in fact a regular sequence. Thus within the subset of uniformly recurrent control sequences to which we restricted earlier regular is equivalent to balanced. Therefore in this paper we use the term “balanced control policy” in the following sense:

A balanced control policy of rate θ∈[0,1] is a deterministic policy mixing P and Q according to an infinite control sequence U=(u 1,u 2,…) of zeros and ones where U is regular of density θ. This implies that U is both uniformly recurrent and balanced. Moreover, on average over all decision events P is applied with fraction θ and Q with fraction 1−θ.

Another useful fact is that regular sequences of given density θ have the same finite subsequences (see Lothaire 2002) which implies that regularity of a given density θ uniquely determines the sequence (and thus also the policy) modulo a shift ϕ. Therefore balanced control policies of the same density θ have the same Césaro average performance and thus we may speak of the balanced policy of rate θ instead of a balanced policy of rate θ.

In this paper we compare while varying over θ∈[0,1] the performance of the balanced control policy of rate θ with the Bernoulli policy of rate θ while varying over θ∈[0,1]. For simulation a practical implementation of the balanced policy of rate θ is necessary, several ways to construct regular sequences are known. In this paper we use the one given by (6).

Appendix C: Proof of Lemma 2

Lemma 2

Let \(\{ R(U_{\theta,\varPhi}(j))\}_{j=1}^{\infty} \) be a randomized balanced (P,Q)-policy of rate θ. Then for j=1,2,… we have that

$$\mathbb{E}_\varPhi\bigl[R\bigl(U_{\theta,\varPhi}(j)\bigr)\bigr] = \theta P + (1-\theta) Q. $$

Proof

From 0≤θ≤1 it is easily seen that U θ,Φ (j)∈{0,1} for j=1,2,… . Thus the random variable U θ,Φ (j) has a Bernoulli distribution. Also for any x∈ℝ we trivially have that

$$ \int_{0}^1 \lfloor x + \phi\rfloor d \phi= x. $$
(22)

By (22) it follows that

Hence

Thus

 □

Rights and permissions

Reprints and permissions

About this article

Cite this article

Bhulai, S., Farenhorst-Yuan, T., Heidergott, B. et al. Optimal balanced control for call centers. Ann Oper Res 201, 39–62 (2012). https://doi.org/10.1007/s10479-012-1215-1

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10479-012-1215-1

Keywords

Navigation