Published in: International Journal on Digital Libraries 1/2020

26-10-2018

Tracking the history and evolution of entities: entity-centric temporal analysis of large social media archives

Authors: Pavlos Fafalios, Vasileios Iosifidis, Kostas Stefanidis, Eirini Ntoutsi


Abstract

How did the popularity of the Greek Prime Minister evolve in 2015? How did the predominant sentiment about him vary during that period? Were there any controversial sub-periods? What other entities were related to him during these periods? To answer these questions, one needs to analyze archived documents and data about the query entities, such as old news articles or social media archives. In particular, user-generated content posted on social networks, like Twitter and Facebook, can be seen as a comprehensive documentation of our society, and thus meaningful analysis methods over such archived data are of immense value for sociologists, historians, and other interested parties who want to study the history and evolution of entities and events. To this end, we propose an entity-centric approach to analyzing social media archives, and we define measures that allow studying how entities were reflected in social media in different time periods and under different aspects, like popularity, attitude, controversiality, and connectedness with other entities. A case study using a large Twitter archive spanning 4 years illustrates the insights that can be gained by such an entity-centric and multi-aspect analysis.

Metadata
Title
Tracking the history and evolution of entities: entity-centric temporal analysis of large social media archives
Authors
Pavlos Fafalios
Vasileios Iosifidis
Kostas Stefanidis
Eirini Ntoutsi
Publication date
26-10-2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Digital Libraries / Issue 1/2020
Print ISSN: 1432-5012
Electronic ISSN: 1432-1300
DOI
https://doi.org/10.1007/s00799-018-0257-7
