Published in: Cluster Computing, Issue 3/2016

1 September 2016

Building associated semantic representation model for the ultra-short microblog text jumping in big data

Authors: Shunxiang Zhang, Yin Wang, Shiyao Zhang, Guangli Zhu

Abstract

Among the massive volume of microblog texts, an ultra-short microblog text is difficult to understand on its own because of characteristics such as data sparseness and content fragmentation. To solve this problem, this paper presents an associated semantic representation model for the ultra-short microblog text (ASRM-UMT) that helps users understand such texts better. First, multi-layer associated semantic views of the ultra-short microblog text are built. The ICTCLAS system is used to extract feature keywords from the microblog texts, and an algorithm for mining associated semantics over a dynamic time window is proposed to mine the associated semantic relations among these keywords. The mining process takes three aspects of a microblog text into account: its context, its comments and its transmissions. Then, the multi-layer associated semantic views are optimized: the clustering coefficients of the candidate views are compared in order to select the optimal associated semantic view. Experimental results show that the proposed model can represent the ultra-short microblog text accurately and effectively.
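The view-selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that a view is an undirected keyword co-occurrence graph, that the "optimal" view is the one with the highest average clustering coefficient (the abstract does not state the selection criterion), and that keywords have already been extracted. The function names (`build_view`, `average_clustering`, `select_view`) and the sample keywords are hypothetical, and the dynamic time window is not modelled here.

```python
from collections import defaultdict
from itertools import combinations


def build_view(keyword_sets, min_cooccurrence=1):
    """Link two keywords when they co-occur in at least `min_cooccurrence` texts."""
    counts = defaultdict(int)
    for keywords in keyword_sets:
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    graph = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_cooccurrence:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with fewer than two neighbours count as 0."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count pairs of neighbours that are themselves connected.
        linked = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
        total += 2.0 * linked / (k * (k - 1))
    return total / len(graph)


def select_view(views):
    """Return the name of the view with the highest average clustering coefficient (assumed criterion)."""
    return max(views, key=lambda name: average_clustering(views[name]))


# Hypothetical keyword sets standing in for the output of a word segmenter.
text = [["haze", "beijing"]]
comments = [["haze", "mask"], ["beijing", "mask"]]
reposts = [["haze", "traffic"]]

# Three candidate layers: the text alone, plus comments, plus reposts.
views = {
    "text_only": build_view(text),
    "text_comments": build_view(text + comments),
    "text_comments_reposts": build_view(text + comments + reposts),
}
best = select_view(views)
```

In this toy data the text-plus-comments layer forms a closed triangle of keywords, so it scores highest under the assumed criterion. Adding the repost layer attaches a loosely connected keyword and lowers the average.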

Metadata
Title
Building associated semantic representation model for the ultra-short microblog text jumping in big data
Authors
Shunxiang Zhang
Yin Wang
Shiyao Zhang
Guangli Zhu
Publication date
1 September 2016
Publisher
Springer US
Published in
Cluster Computing / Issue 3/2016
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-016-0602-9
