2011 | OriginalPaper | Book chapter

24. Information Quality and Relevance in Large-Scale Social Information Systems

Written by: Munmun De Choudhury

Published in: Handbook of Data Intensive Computing

Publisher: Springer New York

Abstract

As today’s pervasive social applications continue to surge unabated, they have greatly expanded the ways in which shared information artifacts can be put to good use. On one hand, they have enabled researchers to study social processes on these systems at extremely large scales, something almost inconceivable scarcely a decade ago. On the other, they have streamlined the end-user experience of exploring real-time, event-based information ubiquitously via a variety of devices, almost anytime, anywhere. However, with several terabytes of such information generated every day, we are presented with a daunting question: how do we identify those pieces of information that are relevant and interesting? This book chapter sheds light on the significance of this problem domain and the challenges associated with it, and presents a case study geared towards addressing these challenges. Finally, it identifies the impact of this vision on the larger data-intensive computing community.

Footnotes
10
The diversity index of a sample population has been widely used by researchers in areas such as economics, ecology, and statistics to measure the differences among members of a population consisting of various types of objects. Although there are a host of measures for estimating such diversity (e.g., species richness, concentration ratio), the most popular and robust measure by far is the quantification based on Shannon’s entropy [16]. This motivated us to use an information-theoretic formulation to represent the diversity existing in social information spaces.
 
11
Note that we make no a priori assumptions about which value of the diversity parameter is more desirable for the content selection task. Instead, diversity is a parameter in our experimental design, and we discuss how the choice of its value affects the end user’s perception of information consumption.
 
12
Although our proposed content selection technique can generate tweet sets of any given size, we considered sets of a reasonably small size (ten items) in our experimental design. The goal was to ensure that while going through the user study and evaluating different sets, the end-user participant was not overwhelmed by the quantity of information presented.
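
As a rough illustration of the entropy-based diversity index mentioned in footnote 10, the sketch below computes Shannon entropy over a list of category labels. It is not the chapter's own formulation, which is not reproduced on this page. The function names `shannon_diversity` and `normalized_diversity` and the example topic labels are hypothetical, and the normalization by the logarithm of the number of observed types is an added assumption.

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sample has no diversity
    # H = -sum_i p_i * log2(p_i), where p_i is the share of type i
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_diversity(labels):
    """Entropy divided by its maximum, log2(number of observed types).

    Assumption for illustration: values lie in [0, 1], where 0 means a
    single type and 1 means all observed types are equally frequent.
    """
    num_types = len(set(labels))
    if num_types <= 1:
        return 0.0
    return shannon_diversity(labels) / math.log2(num_types)


if __name__ == "__main__":
    # Hypothetical topic labels for a set of ten items
    topics = ["sports"] * 5 + ["politics"] * 3 + ["tech"] * 2
    print(round(shannon_diversity(topics), 3))
    print(round(normalized_diversity(topics), 3))
```

A set dominated by one type scores near 0 and an evenly mixed set scores near the maximum, which is what makes entropy usable as a tunable diversity parameter.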
 
References
1.
Dimitris Achlioptas, Aaron Clauset, David Kempe, and Cristopher Moore. On the bias of traceroute sampling: or, power-law degree distributions in regular graphs. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, STOC ’05, pages 694–703, New York, NY, USA, 2005. ACM.
2.
Eugene Agichtein, Carlos Castillo, Debora Donato, Aristides Gionis, and Gilad Mishne. Finding high-quality content in social media. In Proceedings of the international conference on Web search and web data mining, WSDM ’08, pages 183–194, New York, NY, USA, 2008. ACM.
3.
Ricardo A. Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.
4.
M. Bernstein, B. Suh, L. Hong, J. Chen, S. Kairam, and E.H. Chi. Eddi: Interactive topic-based browsing of social status streams. In ACM User Interface Software and Technology (UIST) conference, 2010. To appear.
5.
Georg Buscher, Andreas Dengel, and Ludger van Elst. Query expansion using gaze-based feedback on the subdocument level. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’08, pages 387–394, New York, NY, USA, 2008. ACM.
6.
Jilin Chen, Rowan Nairn, Les Nelson, Michael Bernstein, and Ed Chi. Short and tweet: experiments on recommending content from information streams. In CHI ’10: Proceedings of the 28th international conference on Human factors in computing systems, pages 1185–1194, New York, NY, USA, 2010. ACM.
7.
Thomas M. Cover and Joy A. Thomas. Elements of information theory. Wiley-Interscience, New York, NY, USA, 1991.
8.
M. Czerwinski, E. Horvitz, and E. Cutrell. Subjective duration assessment: An implicit probe for software usability. In Proceedings of IHM-HCI, pages 167–170, September 2001.
9.
P. J. Daniels. Cognitive models in information retrieval: an evaluative review. J. Doc., 42:272–304, December 1986.
10.
Gautam Das, Nick Koudas, Manos Papagelis, and Sushruth Puttaswamy. Efficient sampling of information in social networks. In SSM, pages 67–74, 2008.
11.
Anish Das Sarma, Atish Das Sarma, Sreenivas Gollapudi, and Rina Panigrahy. Ranking mechanisms in Twitter-like forums. In Proceedings of the third ACM international conference on Web search and data mining, WSDM ’10, pages 21–30, New York, NY, USA, 2010. ACM.
12.
Munmun De Choudhury, Scott Counts, and Mary Czerwinski. Identifying relevant social media content: leveraging information diversity and user cognition. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia, HT ’11, pages 161–170, New York, NY, USA, 2011. ACM.
13.
Munmun De Choudhury, Y-R Lin, Hari Sundaram, K.S. Candan, Lexing Xie, and Aisling Kelliher. How does the data sampling strategy impact the discovery of information diffusion in social media? In ICWSM ’10: Proceedings of the 4th International Conference on Weblogs and Social Media, Washington D.C., May 2010. AAAI Press.
14.
Nigel Ford. Modeling cognitive processes in information seeking: from Popper to Pask. J. Am. Soc. Inf. Sci. Technol., 55:769–782, July 2004.
15.
O. Frank. Sampling and estimation in large social networks. Social Networks, 1:91–101, 1978.
17.
W. Kellogg. Information rates in sampling and quantization. IEEE Transactions on Information Theory, 13(3):506–511, July 1967.
18.
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 591–600, New York, NY, USA, 2010. ACM.
19.
J. Leskovec and C. Faloutsos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, page 636. ACM, 2006.
20.
Arun S. Maiya and Tanya Y. Berger-Wolf. Sampling community structure. In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 701–710, New York, NY, USA, 2010. ACM.
21.
Qiaozhu Mei, Jian Guo, and Dragomir Radev. DivRank: the interplay of prestige and diversity in information networks. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’10, pages 1009–1018, New York, NY, USA, 2010. ACM.
22.
Owen Phelan, Kevin McCarthy, and Barry Smyth. Using Twitter to recommend real-time topical news. In Proceedings of the third ACM conference on Recommender systems, RecSys ’09, pages 385–388, New York, NY, USA, 2009. ACM.
23.
P. Rusmevichientong, D.M. Pennock, S. Lawrence, and C.L. Giles. Methods for sampling pages uniformly from the world wide web. In AAAI Fall Symposium on Using Uncertainty Within Computation, pages 121–128, 2001.
24.
Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 851–860, New York, NY, USA, 2010. ACM.
25.
Marc Smith, Vladimir Barash, Lise Getoor, and Hady W. Lauw. Leveraging social context for searching social media. In Proceeding of the 2008 ACM workshop on Search in social media, SSM ’08, pages 91–94, New York, NY, USA, 2008. ACM.
26.
S. M. Smith. Remembering in and out of context. Journal of Experimental Psychology: Human Learning and Memory, 5(5):460–471, 1979.
27.
G. Sperling. A model for visual memory tasks. Human Factors, 5:19–31, 1963.
28.
D. Stutzbach, R. Rejaie, N. Duffield, S. Sen, and W. Willinger. Sampling techniques for large, dynamic graphs. In INFOCOM 2006: Proceedings of the 25th IEEE International Conference on Computer Communications, pages 1–6. IEEE, April 2006.
29.
Pertti Vakkari. Relevance and contributing information types of searched documents in task performance. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’00, pages 2–9, New York, NY, USA, 2000. ACM.
30.
Steve Whittaker and Candace Sidner. Email overload: exploring personal information management of email. In CHI ’96: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 276–283, New York, NY, USA, 1996. ACM.
31.
Yunjie (Calvin) Xu and Zhiwei Chen. Relevance judgment: What do information users consider beyond topicality? J. Am. Soc. Inf. Sci. Technol., 57:961–973, May 2006.
32.
Judith Lynne Zaichkowsky. Measuring the involvement construct. Journal of Consumer Research: An Interdisciplinary Quarterly, 12(3):341–52, 1985.
Metadata
Title
Information Quality and Relevance in Large-Scale Social Information Systems
Written by
Munmun De Choudhury
Copyright year
2011
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-1415-5_24