Published in: Cluster Computing 3/2016

01-09-2016

Building associated semantic representation model for the ultra-short microblog text jumping in big data

Authors: Shunxiang Zhang, Yin Wang, Shiyao Zhang, Guangli Zhu

Abstract

Among massive volumes of microblog text, an ultra-short microblog text is difficult to understand on its own because of characteristics such as data sparseness and content fragmentation. To address this problem, this paper presents an associated semantic representation model for the ultra-short microblog text (ASRM-UMT) that helps users understand such texts better. First, multi-layer associated semantic views of the ultra-short microblog text are built. The ICTCLAS system is used to extract feature keywords from the microblog texts, and a mining algorithm operating on a dynamic time window is proposed to mine the associated semantic relations among those keywords. The mining process takes into account three aspects of microblog texts: context, comments, and transmissions. Then the multi-layer associated semantic views are optimized: the clustering coefficients of the candidate views are compared to select the optimal associated semantic view. Experimental results show that the proposed model represents the ultra-short microblog text accurately and effectively.
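The two steps named in the abstract, mining keyword relations inside a time window and comparing clustering coefficients across candidate views, can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes that keywords have already been extracted (ICTCLAS is not called), that an "associated semantic relation" can be approximated by keyword co-occurrence with a minimum support count, and that the "optimal" view is the one with the highest average clustering coefficient. The function names `build_view`, `average_clustering`, and `select_view`, and the `min_support` parameter, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_view(posts, window, min_support=2):
    """Build an undirected keyword graph from posts inside a time window.

    posts: list of (timestamp, set_of_keywords) pairs (keywords assumed pre-extracted).
    window: (start, end) timestamps, both inclusive.
    Assumption: two keywords are related if they co-occur in at least
    min_support posts within the window.
    """
    start, end = window
    pair_counts = Counter()
    for ts, keywords in posts:
        if start <= ts <= end:
            for a, b in combinations(sorted(keywords), 2):
                pair_counts[(a, b)] += 1
    graph = {}
    for (a, b), n in pair_counts.items():
        if n >= min_support:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    if not graph:
        return 0.0
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def select_view(posts, windows, min_support=2):
    """Return (window, graph, coefficient) for the view with the highest
    average clustering coefficient (an assumed selection criterion)."""
    best = None
    for w in windows:
        g = build_view(posts, w, min_support)
        c = average_clustering(g)
        if best is None or c > best[2]:
            best = (w, g, c)
    return best
```

Calling `select_view(posts, windows)` with several candidate windows mimics widening or narrowing the time window and keeping the most tightly clustered keyword view. The context, comment, and transmission layers described in the abstract are not modelled here.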

Metadata
Title
Building associated semantic representation model for the ultra-short microblog text jumping in big data
Authors
Shunxiang Zhang
Yin Wang
Shiyao Zhang
Guangli Zhu
Publication date
01-09-2016
Publisher
Springer US
Published in
Cluster Computing / Issue 3/2016
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-016-0602-9
