
2018 | Original Paper | Book chapter

Extraction Method of Micro-Blog New Login Word Based on Improved Position-Word Probability


Abstract

Traditional methods for discovering new login words in micro-blogs have difficulty extracting compound words effectively. To address this problem, this paper proposes an extraction method for micro-blog new login words based on an improved Position-Word Probability (PWP) and the N-increment algorithm. First, all micro-blogs on a single topic within a given time period are combined into one micro-blog long text, which is then pre-processed. Then, during the query process of the N-increment algorithm, the improved position-word probability is used to judge the direction in which frequent strings are extended. Finally, redundant strings are reduced by pruning the set of frequent strings. The experimental results show that the proposed algorithm can effectively extract the compound words among micro-blog new login words.
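
The three steps named in the abstract (growing frequent strings one character at a time, filtering each extension with a position-based probability, and pruning redundant strings) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the improved PWP formula, so the estimator `position_probs`, the thresholds `min_freq`, `max_len`, and `min_prob`, the default probability of 0.5 for unseen characters, and the pruning rule are all hypothetical stand-ins. The input is assumed to be the already pre-processed long text of one topic.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every contiguous character string of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def position_probs(lexicon):
    """Per character, estimate how often it starts or ends a known word.

    Hypothetical estimator; the paper's improved PWP is not given in the abstract.
    """
    total, head, tail = Counter(), Counter(), Counter()
    for word in lexicon:
        if not word:
            continue
        total.update(word)
        head[word[0]] += 1
        tail[word[-1]] += 1
    p_head = {ch: head[ch] / total[ch] for ch in total}
    p_tail = {ch: tail[ch] / total[ch] for ch in total}
    return p_head, p_tail


def prune(frequent):
    """Drop a string if a longer frequent string contains it at least as often."""
    kept = {}
    for s, c in frequent.items():
        covered = any(
            len(t) > len(s) and s in t and frequent[t] >= c for t in frequent
        )
        if not covered:
            kept[s] = c
    return kept


def extract_candidates(text, lexicon, min_freq=2, max_len=4, min_prob=0.3):
    """Grow frequent strings by one character per round, then prune.

    An extension to the right is accepted only if the new last character is a
    plausible word ending; an extension to the left only if the new first
    character is a plausible word beginning (assumed rule, for illustration).
    """
    p_head, p_tail = position_probs(lexicon)
    default = 0.5  # assumed probability for characters absent from the lexicon

    frequent = {}
    n = 2
    current = {s: c for s, c in ngram_counts(text, n).items() if c >= min_freq}
    while current:
        frequent.update(current)
        if n >= max_len:
            break
        n += 1
        previous, current = current, {}
        for s, c in ngram_counts(text, n).items():
            if c < min_freq:
                continue
            right_ok = s[:-1] in previous and p_tail.get(s[-1], default) >= min_prob
            left_ok = s[1:] in previous and p_head.get(s[0], default) >= min_prob
            if right_ok or left_ok:
                current[s] = c
    return prune(frequent)


# Toy example with Latin characters standing in for micro-blog text.
print(extract_candidates("abcd1abcd2abcd3", ["ab", "cd"]))  # {'abcd': 3}
```

In the toy example the lexicon knows only "ab" and "cd", yet the compound "abcd" survives because each one-character extension passes the position check, and the shorter strings "ab", "bc", "cd", "abc", and "bcd" are pruned because "abcd" covers them with the same frequency.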

Metadata
Title
Extraction Method of Micro-Blog New Login Word Based on Improved Position-Word Probability
Authors
Hongze Zhu
Shunxiang Zhang
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-67071-3_45