Published in: OR Spectrum 3/2021

25 February 2021 | Regular Article

A column-oriented optimization approach for the generation of correlated random vectors

Authors: Jorge A. Sefair, Oscar Guaje, Andrés L. Medaglia


Abstract

To induce a desired correlation structure among random variables, widely popular simulation software relies on the method of Iman and Conover (IC). The underlying premise is that the induced Spearman rank correlation is a meaningful way to approximate other correlation measures among the random variables (e.g., Pearson’s correlation). However, as expected, the desired a posteriori correlation structure often deviates from the Spearman correlation structure. Rooted in the same principle as IC, we propose an alternative distribution-free method based on mixed-integer programming to induce a Pearson correlation structure in bivariate or multivariate random vectors. We also extend our distribution-free method to other correlation measures such as Kendall’s coefficient of concordance, the Phi correlation coefficient, and relative risk. We illustrate our method in four different contexts: (1) the simulation of a healthcare facility, (2) the analysis of a manufacturing tandem queue, (3) the imputation of correlated missing data in statistical analysis, and (4) the estimation of the budget overrun risk in a construction project. We also explore the limits of our algorithms by conducting extensive experiments using randomly generated data from multiple distributions.
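
The gap between an induced rank correlation and the resulting Pearson correlation can be illustrated with a small sketch. The Python code below is not the mixed-integer programming method of the article, and it is only a simplified bivariate variant of the rank-reordering idea behind IC (the full IC procedure uses score matrices and a Cholesky-based correction). The marginals (exponential and lognormal), the target value 0.8, the sample size, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random


def ranks(values):
    """Return the 0-based rank of each entry (ties are not expected here)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def reorder_like(sample, reference):
    """Permute `sample` so that its ranks coincide with those of `reference`."""
    sorted_sample = sorted(sample)
    return [sorted_sample[r] for r in ranks(reference)]


def induce_rank_correlation(x, y, rho, rng):
    """Reorder x and y to follow the ranks of a correlated normal pair."""
    n = len(x)
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # z2 has correlation rho with z1 by construction.
    z2 = [rho * a + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for a in z1]
    return reorder_like(x, z1), reorder_like(y, z2)


rng = random.Random(7)
n = 2000
target = 0.8
x = [rng.expovariate(1.0) for _ in range(n)]        # assumed marginal 1
y = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]  # assumed marginal 2

x_new, y_new = induce_rank_correlation(x, y, target, rng)
rho_s = spearman(x_new, y_new)
rho_p = pearson(x_new, y_new)
print(f"target={target:.2f}  Spearman={rho_s:.3f}  Pearson={rho_p:.3f}")
```

Reordering only permutes the sampled values, so both marginals are preserved exactly; this is the distribution-free property. The printed Spearman and Pearson values generally differ for skewed marginals, which is the deviation the abstract refers to.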

References
Abdella M, Marwala T (2005) The use of genetic algorithms and neural networks to approximate missing data in database. In: IEEE 3rd international conference on computational cybernetics, 2005 (ICCC 2005). IEEE, pp 207–212
Altiok T, Melamed B (2001) The case for modeling correlation in manufacturing systems. IIE Trans 33(9):779–791
Batista G, Monard MC (2003) An analysis of four missing data treatment methods for supervised learning. Appl Artif Intell 17(5–6):519–533
Biswas A (2004) Generating correlated ordinal categorical random samples. Stat Probab Lett 70(1):25–35
Cahen EJ, Mandjes M, Zwart B (2018) Estimating large delay probabilities in two correlated queues. ACM Trans Model Comput Simul 28(1):2
Cario MC, Nelson BL (1997) Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix. Technical report, Department of Industrial Engineering and Management Science, Northwestern University
Chakraborty A (2006) Generating multivariate correlated samples. Comput Stat 21(1):103–119
Charmpis DC, Panteli PL (2004) A heuristic approach for the generation of multivariate random samples with specified marginal distributions and correlation matrix. Comput Stat 19(2):283
Clark DE, El-Taha M (1998) Generation of correlated logistic-normal random variates for medical decision trees. Methods Inf Med 37(03):235–238
Cornfield J (1951) A method of estimating comparative rates from clinical data: applications to cancer of the lung, breast, and cervix. J Natl Cancer Inst 11(6):1269–1275
Corredor D, Cabrera N, Medaglia AL, Akhavan-Tabatabaei R (2020) Data-driven approach for the shortest \(\alpha\)-reliable path problem. COPA working paper
Dai YS, Xie M, Poh KL, Ng SH (2004) A model for correlated failures in n-version programming. IIE Trans 36(12):1183–1192
Deb R, Liew AW-C (2016) Missing value imputation for the analysis of incomplete traffic accident data. Inf Sci 339:274–289
Desaulniers G, Desrosiers J, Solomon MM (2006) Column generation, vol 5. Springer, New York
Dias CTDS, Samaranayaka A, Manly B (2008) On the use of correlated beta random variables with animal population modeling. Ecol Model 215:293–300
Ghosh S, Henderson SG (2003) Behavior of the NORTA method for correlated random vector generation as the dimension increases. ACM Trans Model Comput Simul 13(3):276–294
Gross D, Harris CM (1985) Fundamentals of queueing theory. Wiley, New York
Haas CN (1999) On modeling correlated random variables in risk assessment. Risk Anal 19(6):1205–1214
Harris CM, Hoffman KL, Yarrow L-A (1995a) Obtaining minimum-correlation Latin hypercube sampling plans using an IP-based heuristic. OR Spektrum 17(2–3):139–148
Harris CM, Hoffman KL, Yarrow L-A (1995b) Using integer programming techniques for the solution of an experimental design problem. Ann Oper Res 58(3):243–260
Hill RR, Reilly CH (1994) Composition for multivariate random variables. In: Proceedings of winter simulation conference. IEEE, pp 332–339
Hill RR, Reilly CH (2000) The effects of coefficient correlation structure in two-dimensional knapsack problems on solution procedure performance. Manag Sci 46(2):302–317
Iman RL, Conover W-J (1982) A distribution-free approach to inducing rank correlation among input variables. Commun Stat Simul Comput 11(3):311–334
Kendall MG, Babington-Smith B (1939) The problem of m rankings. Ann Math Stat 10(3):275–287
Kolev N, Paiva D (2008) Random sums of exchangeable variables and actuarial applications. Insur Math Econ 42(1):147–153
Law AM, Kelton WD (2000) Simulation modeling and analysis, 3rd edn. McGraw-Hill, New York
L’Ecuyer P, Meliani L, Vaucher J (2002) SSJ: a framework for stochastic simulation in Java. In: Proceedings of the 2002 winter simulation conference. IEEE, Piscataway, NJ, pp 234–242
Legendre P (2005) Species associations: the Kendall coefficient of concordance revisited. J Agric Biol Environ Stat 10(2):226–245
Leschied JR, Mazza MB, Davenport MS, Chong ST, Smith EA, Hoff CN, Ladino-Torres MF, Khalatbari S, Ehrlich PF, Dillman JR (2016) Inter-radiologist agreement for CT scoring of pediatric splenic injuries and effect on an established clinical practice guideline. Pediatr Radiol 46(2):229–236
Levitin G, Xie M (2006) Performance distribution of a fault-tolerant system in the presence of failure correlation. IIE Trans 38(6):499–509
Li ST, Hammond JL (1975) Generation of pseudorandom numbers with specified univariate distributions and correlation coefficients. IEEE Trans Syst Man Cybern 5:557–561
Little RJA, Rubin DB (2019) Statistical analysis with missing data, vol 793. Wiley, New York
Lübbecke ME, Desrosiers J (2005) Selected topics in column generation. Oper Res 53(6):1007–1023
Lurie PM, Goldberg MS (1998) An approximate method for sampling correlated random variables from partially-specified distributions. Manag Sci 44(2):203–218
Medaglia AL, Sefair JA (2009) Generating correlated random vectors using mixed-integer programming. In: Proceedings of the IIE annual conference. Institute of Industrial and Systems Engineers (IISE), 1759
Mitchell CR, Paulson AS, Beswick CA (1977) Effect of correlated exponential service times on single server tandem queues. Naval Res Logist 24(1):95–112
Moorthy K, Saberi Mohamad M, Deris S (2014) A review on missing value imputation algorithms for microarray gene expression data. Curr Bioinform 9(1):18–22
Morris JA, Gardner MJ (1988) Calculating confidence intervals for relative risk (odds ratios) and standardised ratios and rates. Br Med J 296(6632):1313–1316
Nasr WW, Maddah B (2015) Continuous (s, S) policy with MMPP correlated demand. Eur J Oper Res 246(3):874–885
Park CG, Dong WS (1998) An algorithm for generating correlated random variables in a class of infinitely divisible distributions. J Stat Comput Simul 61(1–2):127–139
Park CG, Park T, Shin DW (1996) A simple method for generating correlated binary variates. Am Stat 50(4):306–310
Patuwo BE, Disney RL, McNickle DC (1993) The effect of correlated arrivals on queues. IIE Trans 25(3):105–110
Polge RJ, Holliday EM, Bhagavan BK (1973) Generation of a pseudo-random set with desired correlation and probability distribution. Simulation 20(5):153–158
Pouillot R, Delignette-Muller M-L (2010) Evaluating variability and uncertainty in microbial quantitative risk assessment using two R packages. Int J Food Microbiol 142(3):330–340
Qaqish BF (2003) A family of multivariate binary distributions for simulating correlated binary variables with specified marginal means and correlations. Biometrika 90(2):455–463
Reilly CH (2009) Synthetic optimization problem generation: show us the correlations! INFORMS J Comput 21(3):458–467
Rosenfeld S (2008) Approximate bivariate gamma generator with prespecified correlation and different marginal shapes. ACM Trans Model Comput Simul 18(4):16
Schmeiser BW, Lal R (1982) Bivariate gamma random vectors. Oper Res 30(2):355–374
Sefair JA, Méndez CY, Babat O, Medaglia AL, Zuluaga LF (2017) Linear solution schemes for mean-semivariance project portfolio selection problems: an application in the oil and gas industry. Omega 68:39–48
Sheskin DJ (2000) Handbook of parametric and nonparametric statistical procedures, 3rd edn. Chapman and Hall-CRC, Boca Raton
Shin K, Pasupathy R (2010) An algorithm for fast generation of bivariate Poisson random vectors. INFORMS J Comput 22(1):81–92
Shults J (2017) Simulating longer vectors of correlated binary random variables via multinomial sampling. Comput Stat Data Anal 114:1–11
Sigler EA, Tallent-Runnels MK (2006) Examining the validity of scores from an instrument designed to measure metacognition of problem solving. J Gener Psychol 133(2):257–276
Stanfield PM, Wilson JR, King RE (2004) Flexible modelling of correlated operation times with application in product-reuse facilities. Int J Prod Res 42(11):2179–2196
Todd CR, Ng MP (2001) Generating unbiased correlated random survival rates for stochastic population models. Ecol Model 144(1):1–11
Touran A (1993) Probabilistic cost estimating with subjective correlations. J Constr Eng Manag 119(1):58–71
Touran A, Suphot L (1997) Rank correlations in simulating construction cost. J Constr Eng Manag 123(3):297–301
Van der Geest PAG (1998) An algorithm to generate samples of multi-variate distributions with correlated marginals. Comput Stat Data Anal 27(3):271–289
Wallis WA (1939) The correlation ratio for ranked data. J Am Stat Assoc 34(207):533–538
Xiao Q (2017) Generating correlated random vector involving discrete variables. Commun Stat Theory Methods 46(4):1594–1605
Yan C, Kung J (2016) Robust aircraft routing. Transp Sci 52(1):118–133
Young DJ, Beaulieu NC (2000) The generation of correlated Rayleigh random variates by inverse discrete Fourier transform. IEEE Trans Commun 48(7):1114–1127
Metadata
Title
A column-oriented optimization approach for the generation of correlated random vectors
Authors
Jorge A. Sefair
Oscar Guaje
Andrés L. Medaglia
Publication date
25 February 2021
Publisher
Springer Berlin Heidelberg
Published in
OR Spectrum / Issue 3/2021
Print ISSN: 0171-6468
Electronic ISSN: 1436-6304
DOI
https://doi.org/10.1007/s00291-021-00620-5
