- 1.Chang, C.H.; Lui, S.C.; and Wu, Y.C. Applying pattern mining to Web information extraction. In Proceedings of the Fifth Pacific Asia Conference on Knowledge Discovery and Data Mining, Apr. 2001, Hong Kong.]] Google ScholarDigital Library
- 2.Chien, L.F. PAT-tree-based keyword extraction for Chinese information retrieval. In Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 50-58. 1997.]] Google ScholarDigital Library
- 3.Doorenbos, R.B.; Etzioni, O.; and Weld, D. S. A scalable comparison-shopping agent for the World Wide Web. In Proceedings of the first international conference on Autonomous Agents. pp. 39-48, NewYork, NY, 1997, ACM Press.]] Google ScholarDigital Library
- 4.Embley, D.; Jiang, Y.; and Ng, Y. -K. 1999. Recordboundary discovery in Web documents. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data (SIGMOD'99)}. pp. 467-478, Philadelphia, Pennsylvania.]] Google ScholarDigital Library
- 5.Gonnet, G.H.; Baeza-yates, R.A.; and Snider, T. 1992. New Indices for Text: Pat trees and Pat Arrays. Information Retrieval: Data Structures and Algorithms, Prentice Hall.]] Google ScholarDigital Library
- 6.Gusfield, D. 1997. Algorithms on strings, tree, and sequence, Cambridge. 1997.]] Google ScholarDigital Library
- 7.Hsu, C.-N., and Dung, M.-T. 1998. Generating finite-state transducers for semi-structured data extraction from the Web. Information Systems. 23(8): 521-538.]] Google ScholarDigital Library
- 8.Knoblock, A. et al., Eds. 1998. In Proceedings of the 1998 Workshop on AI and Information Integration, Menlo Park, California. AAAI Press.]]Google Scholar
- 9.Kurtz, S., and Schleiermacher, C. 1999. REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics 15(5): 426-427.]]Google ScholarCross Ref
- 10.Kushmerick, N. 1999. Gleaning the Web. IEEE Intelligent Systems 14(2): 20-22.]] Google ScholarDigital Library
- 11.Kushmerick, N.; Weld, D.; and Doorenbos, R. 1997. Wrapper induction for information extraction. In Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI).]]Google Scholar
- 12.Morrison, D. R. Journal of ACM, 15, pp. 514-534, 1968.]] Google ScholarDigital Library
- 13.Muslea, I.; Minton, S.; and Knoblock, C. 1999. A hierarchical approach to wrapper induction. In Proceedings of the 3rd International Conference on Autonomous Agents (Agents '99), Seattle, WA.]] Google ScholarDigital Library
- 14.Muslea, I. 1999. Extraction patterns for information extraction tasks: a survey. In Proceedings of AAAI '99: Workshop on Machine Learning for Information Extraction]]Google Scholar
- 15.Sedgewick, R. Algorithms in C, Addison Wesley, 1990.]] Google ScholarDigital Library
Index Terms
- IEPAD: information extraction based on pattern discovery
Recommendations
Automatic information extraction from semi-structured Web pages by pattern discovery
Web retrieval and miningThe World Wide Web is now undeniably the richest and most dense source of information; yet, its structure makes it difficult to make use of that information in a systematic way. This paper proposes a pattern discovery approach to the rapid generation of ...
Heuristic learning of rules for information extraction from web documents
InfoScale '07: Proceedings of the 2nd international conference on Scalable information systemsThe efficacy of an information extraction system is mostly determined by the quality of the extraction rules. Building these extraction rules is time-consuming and difficult to implement by hand. Hence, we propose a Heuristic Rule Learning (HRL) ...
OLERA: Semisupervised Web-Data Extraction with Visual Support
Extracting information from semistructured Web documents is an important task for many information agents. Over the past few years, researchers have developed an extensive family of generic information extraction techniques based on supervised ...
Comments