Published in: Cluster Computing 4/2019

20-10-2017

Public opinion classification and text alignment based on Chinese and Tibetan corpus

Authors: Guixian Xu, Haishen Yao, Dongming Wu, Yuan Li, Deguang Ouyang, Gaofeng Chen

Published in: Cluster Computing | Special Issue 4/2019

Abstract

To support research on the security of Chinese and Tibetan networks, this paper proposes a method for classifying public opinion in Chinese and Tibetan texts. First, web pages are collected. Second, the pages are preprocessed to extract useful information. Third, a table of Chinese and Tibetan public opinion keywords is built. Finally, a text similarity calculation classifies each text according to the keyword table. A Chinese–Tibetan text-level alignment approach is also proposed; it uses a Chinese–Tibetan translation dictionary to match word frequency and position. A sentence-level alignment algorithm is studied as well. Alignment performance depends on the Chinese–Tibetan translation dictionary. A system for public opinion text classification and Chinese–Tibetan text alignment is developed. After a Chinese text has been classified, the alignment software finds the most similar Tibetan text and presents it to the user. This research can help identify Chinese and Tibetan public opinion texts and is relevant to information retrieval, text clustering, and Chinese–Tibetan machine translation.
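The abstract does not state the similarity formula or the alignment scoring, so the following is only a minimal sketch of the two ideas under stated assumptions: classification is modeled as cosine similarity between a text's word counts and each category's keyword list, and text-level alignment is modeled as dictionary-translated word frequency overlap (the position matching mentioned in the abstract is not covered here). The keyword table, the dictionary entries, and the function names `classify` and `alignment_score` are all hypothetical, and texts are assumed to be already segmented into tokens.

```python
from collections import Counter
from math import sqrt

# Hypothetical keyword table (category -> keywords); the paper's real table
# of Chinese and Tibetan public opinion keywords is not given in the abstract.
KEYWORD_TABLE = {
    "security": ["attack", "riot", "police"],
    "economy": ["market", "price", "trade"],
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tokens, table=KEYWORD_TABLE):
    """Return (category, score) for a segmented text, or (None, 0.0)
    when the text shares no word with any keyword list."""
    counts = Counter(tokens)
    scores = {cat: cosine(counts, Counter(words)) for cat, words in table.items()}
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else (None, 0.0)


def alignment_score(zh_tokens, bo_tokens, dictionary):
    """Assumed frequency-only alignment score: the share of dictionary
    translations of the Chinese tokens that also occur in the Tibetan text."""
    translated = [dictionary[t] for t in zh_tokens if t in dictionary]
    if not translated or not bo_tokens:
        return 0.0
    want, have = Counter(translated), Counter(bo_tokens)
    matched = sum(min(n, have[w]) for w, n in want.items())
    return matched / len(translated)


# Usage with placeholder tokens (ASCII stand-ins for Chinese/Tibetan words).
category, score = classify(["police", "attack", "market", "attack"])
toy_dictionary = {"zh_police": "bo_police", "zh_market": "bo_market"}
pair_score = alignment_score(
    ["zh_police", "zh_market", "zh_police"],
    ["bo_police", "bo_other", "bo_police"],
    toy_dictionary,
)
```

In this sketch a Chinese text would first be classified, and `alignment_score` would then be evaluated against each candidate Tibetan text to pick the highest-scoring one. Because every translated word must be looked up in the dictionary, the score degrades as dictionary coverage drops, which is consistent with the abstract's remark that alignment performance is tied to the translation dictionary.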


Metadata
Title
Public opinion classification and text alignment based on Chinese and Tibetan corpus
Authors
Guixian Xu
Haishen Yao
Dongming Wu
Yuan Li
Deguang Ouyang
Gaofeng Chen
Publication date
20-10-2017
Publisher
Springer US
Published in
Cluster Computing / Special Issue 4/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-017-1267-8
