Published in: Neural Computing and Applications 1/2017

23.05.2016 | Original Article

Research of network data mining based on reliability source under big data environment

Authors: Jinhai Li, Youshi He, Yunlei Ma

Abstract

In the era of big data, researchers face vast amounts of network data and can extract original data usable in scientific research only by first identifying reliable data sources. This paper builds a reliable network data mining model based on improvements to the PageRank algorithm, applying each improved algorithm in turn. The model is divided into three modules. The first uses PageRank and TrustRank to eliminate cheating webpages. The second uses TC-PageRank, which combines the topic relevancy between webpages with a time-difference weight, to narrow the set down to webpages highly related to the research topic. The third determines the authoritative webpages of the original data source with an improved HITS algorithm, which takes into account the similarity between a webpage and the research topic and the amplification of webpage links to the authoritative webpages. In addition, partitioning the matrix operations with MapReduce reduces the time and space complexity of the algorithms. The feasibility and accuracy of the method are verified by a comparative analysis of the algorithms.
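
The first module can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the textbook formulations: PageRank as a power iteration with a uniform teleport vector, and TrustRank as the same iteration with teleportation restricted to a hand-picked set of trusted seed pages. The toy link graph, the page names, the function names, and the 0.1 ratio used to flag suspects are all hypothetical. The abstract does not give the TC-PageRank or improved HITS weighting formulas, so they are not reproduced here.

```python
def pagerank(links, teleport=None, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    teleport maps pages to non-negative jump weights (uniform if None).
    Returns a dict page -> score; the scores sum to 1.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    if teleport is None:
        teleport = {p: 1.0 for p in pages}
    total = sum(teleport.get(p, 0.0) for p in pages)
    jump = {p: teleport.get(p, 0.0) / total for p in pages}

    rank = dict(jump)
    for _ in range(max_iter):
        new_rank = {p: 0.0 for p in pages}
        dangling = 0.0
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its damped score evenly over its out-links.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Pages without out-links hand their score to the teleport vector.
                dangling += rank[p]
        for p in pages:
            new_rank[p] += (damping * dangling + 1.0 - damping) * jump[p]
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


def trustrank(links, trusted_seeds, **kwargs):
    """TrustRank as PageRank whose teleport vector covers only trusted seeds."""
    return pagerank(links, teleport={p: 1.0 for p in trusted_seeds}, **kwargs)


# Hypothetical toy graph: pages a, b, c link among themselves, while
# farm1 and farm2 only link to each other and receive no trusted in-links.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "farm1": ["farm2"],
    "farm2": ["farm1"],
}

pr = pagerank(links)
trust = trustrank(links, trusted_seeds={"a"})

# Assumed heuristic: flag pages whose trust is small relative to their PageRank.
suspects = sorted(p for p in pr if trust[p] < 0.1 * pr[p])
print(suspects)
```

In this toy graph the two farm pages keep a non-zero PageRank because uniform teleportation reaches them, but they receive no trust because no path leads to them from the seed page. That gap is what a filter of this kind would exploit.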

Metadata
Title
Research of network data mining based on reliability source under big data environment
Authors
Jinhai Li
Youshi He
Yunlei Ma
Publication date
23.05.2016
Publisher
Springer London
Published in
Neural Computing and Applications / Special Issue 1/2017
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-016-2349-x
