ABSTRACT
The "wisdom of crowds" holds that aggregate predictions from a large crowd can be surprisingly accurate, rivaling predictions by experts. Crowds, meanwhile, are highly heterogeneous in their expertise. In this work, we study how the heterogeneous uncertainty of a crowd can be directly elicited and harnessed to produce more efficient aggregations from a crowd, or to provide the same efficiency from smaller crowds. We present and evaluate a novel strategy for eliciting sufficient information about an individual's uncertainty: allow individuals to make multiple simultaneous guesses, and reward them based on the accuracy of their closest guess. We show that our multiple-guesses scoring rule is an incentive-compatible elicitation strategy for aggregations across populations under the reasonable technical assumption that the individuals all hold symmetric log-concave belief distributions that come from the same location-scale family. The argument has two steps. First, we show that the scoring rule is strictly proper for a fixed set of quantiles of any log-concave belief distribution. Second, with properly elicited quantiles in hand, we show that when the belief distributions are also symmetric and all belong to a single location-scale family, interquantile ranges can furnish weights for certainty-weighted crowd aggregation. We evaluate our multiple-guesses framework empirically through a series of incentivized guessing experiments on Amazon Mechanical Turk, and find that certainty-weighted crowd aggregations using multiple guesses outperform aggregations using single guesses without certainty weights.
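As a rough illustration of certainty-weighted aggregation, the sketch below combines two-guess reports in Python. It is not the paper's estimator: the abstract does not state the weighting formula, so the code assumes that each person's midpoint estimates the location of their symmetric belief, that the gap between their two guesses is proportional to the scale of that belief, and that weights are inverse-squared spreads (ordinary inverse-variance weighting). The function name `certainty_weighted_estimate` is hypothetical.

```python
from typing import Sequence, Tuple


def certainty_weighted_estimate(guesses: Sequence[Tuple[float, float]]) -> float:
    """Aggregate (low, high) guess pairs, one pair per person.

    Assumptions (not taken from the paper):
      * the midpoint of a pair estimates that person's belief location;
      * the gap between the two guesses is proportional to belief scale;
      * weights are 1 / gap**2, i.e. inverse-variance weighting.
    """
    if not guesses:
        raise ValueError("need at least one pair of guesses")

    weighted_sum = 0.0
    total_weight = 0.0
    for first, second in guesses:
        spread = abs(second - first)
        if spread == 0:
            # A zero gap would give infinite weight; reject it explicitly.
            raise ValueError("the two guesses in a pair must differ")
        center = (first + second) / 2.0
        weight = 1.0 / spread ** 2
        weighted_sum += weight * center
        total_weight += weight
    return weighted_sum / total_weight


if __name__ == "__main__":
    # One confident person (narrow gap) and one uncertain person (wide gap).
    reports = [(95.0, 105.0), (100.0, 200.0)]
    print(certainty_weighted_estimate(reports))
```

With these inputs the unweighted mean of the two midpoints is 125, while the weighted estimate stays close to the confident person's midpoint of 100, which is the intended effect of weighting by certainty.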