
2011 | OriginalPaper | Chapter

24. Information Quality and Relevance in Large-Scale Social Information Systems

Author: Munmun De Choudhury

Published in: Handbook of Data Intensive Computing

Publisher: Springer New York


Abstract

As the surge of today’s pervasive social applications continues unabated, it has greatly expanded our horizons for putting shared information artifacts to good use. On one hand, it has enabled researchers to study social processes on these systems at extremely large scales, something almost inconceivable scarcely a decade ago. On the other, it has streamlined the end-user experience of exploring real-time, event-based information ubiquitously via a variety of devices, almost anytime, anywhere. However, with several terabytes of such information generated every day, we are presented with a daunting question: how do we identify those pieces of information that are relevant and interesting? This book chapter sheds light on the significance of, and the challenges associated with, this problem domain and presents a case study geared towards addressing these challenges. Finally, it identifies the impact of this vision on the larger data-intensive computing community.


Footnotes
10
The diversity index of a sample population has been widely used by researchers in areas ranging from economics and ecology to statistics, to measure the differences among members of a population consisting of various types of objects. Although there are a host of measures to estimate such diversity (e.g., species richness, concentration ratio, etc.), the most popular and robust measure by far is Shannon’s entropy-based quantification [16]. This motivated us to utilize an information-theoretic formulation to represent the diversity existing in social information spaces.
 
11
Note that we do not make a priori assumptions about what value of the diversity parameter is more desirable for the content selection task. Instead, diversity is a parameter in our experimental design, and we discuss how the choice of its value affects the end-user’s perception of information consumption.
 
12
Although our proposed content selection technique can generate tweet sets of any given size, we considered sets of a reasonably small size (ten items) in our experimental design. The goal was to ensure that while going through the user study and evaluating different sets, the end-user participant was not overwhelmed by the quantity of information presented.
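
As a rough illustration of the entropy-based diversity measure mentioned in footnote 10, the sketch below computes the textbook Shannon entropy of a set of categorical labels. The chapter's exact formulation is not reproduced on this page, so this should not be read as the author's method: the function names (`shannon_diversity`, `normalized_diversity`) and the idea of labelling each item with a single topic are assumptions made purely for illustration.

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`.

    Hypothetical helper: `labels` is any iterable of hashable category
    labels (for example, one assumed topic label per item in a set).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        # p * log2(1/p) is the contribution of one category to the entropy.
        entropy += p * math.log2(1.0 / p)
    return entropy


def normalized_diversity(labels):
    """Entropy divided by its maximum, log2(number of distinct labels).

    Returns a value in [0, 1]; defined as 0.0 when there are fewer than
    two distinct labels (an illustrative convention, not from the source).
    """
    distinct = len(set(labels))
    if distinct <= 1:
        return 0.0
    return shannon_diversity(labels) / math.log2(distinct)


if __name__ == "__main__":
    topics = ["sports", "sports", "politics", "music"]
    print(shannon_diversity(topics))
    print(normalized_diversity(topics))
```

Under these assumptions, a set whose items all share one label has diversity 0, while a set spread evenly over its labels reaches the maximum, which is why a normalized variant can be convenient when comparing sets with different numbers of categories.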
 
Literature
1.
Dimitris Achlioptas, Aaron Clauset, David Kempe, and Cristopher Moore. On the bias of traceroute sampling: or, power-law degree distributions in regular graphs. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, STOC ’05, pages 694–703, New York, NY, USA, 2005. ACM.
2.
Eugene Agichtein, Carlos Castillo, Debora Donato, Aristides Gionis, and Gilad Mishne. Finding high-quality content in social media. In Proceedings of the international conference on Web search and web data mining, WSDM ’08, pages 183–194, New York, NY, USA, 2008. ACM.
3.
Ricardo A. Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.
4.
M. Bernstein, B. Suh, L. Hong, J. Chen, S. Kairam, and E.H. Chi. Eddi: Interactive topic-based browsing of social status streams. In ACM User Interface Software and Technology (UIST) conference, 2010. To appear.
5.
Georg Buscher, Andreas Dengel, and Ludger van Elst. Query expansion using gaze-based feedback on the subdocument level. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’08, pages 387–394, New York, NY, USA, 2008. ACM.
6.
Jilin Chen, Rowan Nairn, Les Nelson, Michael Bernstein, and Ed Chi. Short and tweet: experiments on recommending content from information streams. In CHI ’10: Proceedings of the 28th international conference on Human factors in computing systems, pages 1185–1194, New York, NY, USA, 2010. ACM.
7.
Thomas M. Cover and Joy A. Thomas. Elements of information theory. Wiley-Interscience, New York, NY, USA, 1991.
8.
M. Czerwinski, E. Horvitz, and E. Cutrell. Subjective duration assessment: An implicit probe for software usability. In Proceedings of IHM-HCI, pages 167–170, September 2001.
9.
P. J. Daniels. Cognitive models in information retrieval: an evaluative review. J. Doc., 42:272–304, December 1986.
10.
Gautam Das, Nick Koudas, Manos Papagelis, and Sushruth Puttaswamy. Efficient sampling of information in social networks. In SSM, pages 67–74, 2008.
11.
Anish Das Sarma, Atish Das Sarma, Sreenivas Gollapudi, and Rina Panigrahy. Ranking mechanisms in twitter-like forums. In Proceedings of the third ACM international conference on Web search and data mining, WSDM ’10, pages 21–30, New York, NY, USA, 2010. ACM.
12.
Munmun De Choudhury, Scott Counts, and Mary Czerwinski. Identifying relevant social media content: leveraging information diversity and user cognition. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia, HT ’11, pages 161–170, New York, NY, USA, 2011. ACM.
13.
Munmun De Choudhury, Y-R Lin, Hari Sundaram, K.S. Candan, Lexing Xie, and Aisling Kelliher. How does the data sampling strategy impact the discovery of information diffusion in social media? In ICWSM ’10: Proceedings of the 4th International Conference on Weblogs and Social Media, Washington D.C., May 2010. AAAI Press.
14.
Nigel Ford. Modeling cognitive processes in information seeking: from Popper to Pask. J. Am. Soc. Inf. Sci. Technol., 55:769–782, July 2004.
15.
O. Frank. Sampling and estimation in large social networks. Social Networks, 1(1):91–101, 1978.
17.
W. Kellogg. Information rates in sampling and quantization. IEEE Transactions on Information Theory, 13(3):506–511, July 1967.
18.
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. What is twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 591–600, New York, NY, USA, 2010. ACM.
19.
J. Leskovec and C. Faloutsos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, page 636. ACM, 2006.
20.
Arun S. Maiya and Tanya Y. Berger-Wolf. Sampling community structure. In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 701–710, New York, NY, USA, 2010. ACM.
21.
Qiaozhu Mei, Jian Guo, and Dragomir Radev. Divrank: the interplay of prestige and diversity in information networks. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’10, pages 1009–1018, New York, NY, USA, 2010. ACM.
22.
Owen Phelan, Kevin McCarthy, and Barry Smyth. Using twitter to recommend real-time topical news. In Proceedings of the third ACM conference on Recommender systems, RecSys ’09, pages 385–388, New York, NY, USA, 2009. ACM.
23.
P. Rusmevichientong, D.M. Pennock, S. Lawrence, and C.L. Giles. Methods for sampling pages uniformly from the world wide web. In AAAI Fall Symposium on Using Uncertainty Within Computation, pages 121–128, 2001.
24.
Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 851–860, New York, NY, USA, 2010. ACM.
25.
Marc Smith, Vladimir Barash, Lise Getoor, and Hady W. Lauw. Leveraging social context for searching social media. In Proceeding of the 2008 ACM workshop on Search in social media, SSM ’08, pages 91–94, New York, NY, USA, 2008. ACM.
26.
S. M. Smith. Remembering in and out of context. Journal of Experimental Psychology: Human Learning and Memory, 5(5):460–471, 1979.
27.
G. Sperling. A model for visual memory tasks. Human Factors, 5:19–31, 1963.
28.
D. Stutzbach, R. Rejaie, N. Duffield, S. Sen, and W. Willinger. Sampling techniques for large, dynamic graphs. In INFOCOM 2006: Proceedings of the 25th IEEE International Conference on Computer Communications, pages 1–6. IEEE, April 2006.
29.
Pertti Vakkari. Relevance and contributing information types of searched documents in task performance. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’00, pages 2–9, New York, NY, USA, 2000. ACM.
30.
Steve Whittaker and Candace Sidner. Email overload: exploring personal information management of email. In CHI ’96: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 276–283, New York, NY, USA, 1996. ACM.
31.
Yunjie (Calvin) Xu and Zhiwei Chen. Relevance judgment: What do information users consider beyond topicality? J. Am. Soc. Inf. Sci. Technol., 57:961–973, May 2006.
32.
Judith Lynne Zaichkowsky. Measuring the involvement construct. Journal of Consumer Research: An Interdisciplinary Quarterly, 12(3):341–52, 1985.
Metadata
Title
Information Quality and Relevance in Large-Scale Social Information Systems
Author
Munmun De Choudhury
Copyright Year
2011
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-1415-5_24
