A web robot is a program that downloads and stores web pages. The implementation issues of web robots have been studied widely, and various web statistics have been reported in the literature. First, this paper describes the overall architecture of our robot and our implementation decisions on several important issues. Second, we present empirical statistics on approximately 73 million Korean web pages. We also identify which factors of web pages may affect page changes. These factors may be used to select the web pages to be updated incrementally.
- Implementation of a Web Robot and Statistics on the Korean Web
Sung Jin Kim
Sang Ho Lee
- Springer Berlin Heidelberg