2017 | OriginalPaper | Book chapter

Continuous Summarization over Microblog Threads

Written by: Liangjun Song, Ping Zhang, Zhifeng Bao, Timos Sellis

Published in: Database Systems for Advanced Applications

Publisher: Springer International Publishing

Abstract

With the dramatic growth in social media users, microblogs are created and shared at an unprecedented rate. The high velocity and large volume of short text posts (microblogs) bring redundancy and noise, making it hard for users and analysts to extract useful information. In this paper, we formalize the problem from a summarization angle as Continuous Summarization over Microblog Threads (CSMT), which considers three facets: information gain of the microblog dialogue, diversity, and temporal information. This summarization problem differs from classic ones in two respects: (i) it is defined over large-scale, dynamic data with a high update frequency; (ii) the context between microblogs is taken into account. We first prove that the CSMT problem is NP-hard. We then propose a greedy algorithm with a \((1-1/\mathrm{e})\) performance guarantee. Finally, we extend the greedy algorithm to a sliding window to continuously summarize microblogs for threads. Our experimental results on large-scale datasets show that our method outperforms two baselines in terms of summary diversity and information gain, with a time cost close to that of the best-performing baseline.
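
To make the greedy and sliding-window ideas concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not give the CSMT objective, so the sketch substitutes a simple term-coverage score, which is monotone and submodular; for such functions under a size limit k, greedy selection is known to reach at least a (1-1/e) fraction of the optimum. The names Post, coverage, greedy_summary and sliding_window_summaries are hypothetical, and the window version simply recomputes the summary from scratch on every arrival, which may differ from how the paper maintains its summaries.

```python
from collections import deque
from typing import Deque, FrozenSet, Iterable, Iterator, List, Sequence, Set, Tuple

# Hypothetical post representation: (timestamp, set of terms in the post).
Post = Tuple[int, FrozenSet[str]]


def coverage(summary: Sequence[Post]) -> int:
    """Stand-in objective (an assumption): number of distinct terms covered."""
    covered: Set[str] = set()
    for _, terms in summary:
        covered |= terms
    return len(covered)


def greedy_summary(posts: Sequence[Post], k: int) -> List[Post]:
    """Pick up to k posts, each time adding the one with the largest marginal gain."""
    chosen: List[Post] = []
    remaining = list(posts)
    for _ in range(min(k, len(remaining))):
        base = coverage(chosen)
        best = max(remaining, key=lambda p: coverage(chosen + [p]) - base)
        if coverage(chosen + [best]) == base:
            break  # no remaining post contributes a new term
        chosen.append(best)
        remaining.remove(best)
    return chosen


def sliding_window_summaries(
    stream: Iterable[Post], window: int, k: int
) -> Iterator[List[Post]]:
    """After each arrival, summarize the posts newer than (now - window)."""
    buf: Deque[Post] = deque()
    for post in stream:
        buf.append(post)
        # Evict posts that have fallen out of the time window.
        while buf and buf[0][0] <= post[0] - window:
            buf.popleft()
        yield greedy_summary(list(buf), k)


if __name__ == "__main__":
    posts: List[Post] = [
        (1, frozenset({"a", "b"})),
        (2, frozenset({"b"})),
        (3, frozenset({"c", "d", "e"})),
        (4, frozenset({"a", "c"})),
    ]
    print(greedy_summary(posts, k=2))
    for summary in sliding_window_summaries(posts, window=2, k=1):
        print(summary)
```

A real objective would also have to weigh diversity and temporal information, as the abstract describes; the coverage score here only illustrates the selection loop.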

Metadata
Title
Continuous Summarization over Microblog Threads
Written by
Liangjun Song
Ping Zhang
Zhifeng Bao
Timos Sellis
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-55699-4_31