2018 | OriginalPaper | Chapter

Web Documents Prioritization Using Iterative Improvement

Authors: Kamika Chaudhary, Neena Gupta, Santosh Kumar

Published in: Smart and Innovative Trends in Next Generation Computing Technologies

Publisher: Springer Singapore

Abstract

The amount of information on the World Wide Web is growing exponentially. This growth makes relevant information harder to reach, because users struggle to find what they need in a minimal amount of time. A single query submitted to a search engine returns a large number of results, and digging out the most relevant web link becomes a cumbersome task that can erode the user's trust in the search engine. This paper proposes an approach that combines web structure mining and web usage mining using an iterative improvement algorithm. Iterative improvement is a randomized algorithm for solving combinatorial optimization problems. The technique selects the top T web pages and prioritizes them in order of relevance. An experimental evaluation shows a significant improvement in performance. The parameters used are access frequency, time duration, number of visitors, hubs, and authorities; together they cover both web structure and web usage mining.
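To make the idea of selecting and ordering the top T pages concrete, the following is a minimal Python sketch of randomized iterative improvement. It is not the authors' implementation: the abstract gives neither the objective function nor the neighbourhood structure, so the weighted-sum score, the single-swap move, the restart count, and all names (`FEATURES`, `page_score`, `iterative_improvement`) are assumptions made for illustration.

```python
import random

# Hypothetical feature names, taken from the parameters listed in the abstract.
FEATURES = ("access_frequency", "time_duration", "visitors", "hub", "authority")


def page_score(page, weights):
    """Weighted sum of a page's features (an assumed scoring rule)."""
    return sum(weights[f] * page[f] for f in FEATURES)


def iterative_improvement(pages, t, weights, restarts=20, seed=0):
    """Pick T page indices with a high total score, ordered by score."""
    rng = random.Random(seed)
    scores = [page_score(p, weights) for p in pages]
    n = len(pages)
    best, best_value = None, float("-inf")
    for _ in range(restarts):
        # Start each run from a random subset of T page indices.
        current = set(rng.sample(range(n), t))
        improved = True
        while improved:
            improved = False
            outside = [i for i in range(n) if i not in current]
            rng.shuffle(outside)
            for cand in outside:
                worst = min(current, key=lambda i: scores[i])
                # Accept a swap only if it strictly raises the total score.
                if scores[cand] > scores[worst]:
                    current.remove(worst)
                    current.add(cand)
                    improved = True
                    break
        value = sum(scores[i] for i in current)
        if value > best_value:
            best, best_value = current, value
    # Prioritize the selected pages in descending score order.
    return sorted(best, key=lambda i: scores[i], reverse=True)
```

Calling `iterative_improvement(pages, 3, weights)` on a list of feature dictionaries returns the indices of three pages in priority order. With the purely additive score assumed here, every local optimum is also a global one, so the random restarts are redundant; they are kept only because they would matter for a non-additive objective, where a single run can stall in a poor local optimum.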

Metadata
Title: Web Documents Prioritization Using Iterative Improvement
Authors: Kamika Chaudhary, Neena Gupta, Santosh Kumar
Copyright Year: 2018
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-10-8657-1_35
