skip to main content
10.1145/564691.564702acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Efficient evaluation of queries in a mediator for WebSources

Published:03 June 2002Publication History

ABSTRACT

We consider an architecture of mediators and wrappers for Internet accessible WebSources of limited query capability. Each call to a source is a WebSource Implementation (WSI) and it is associated with both a capability and (a possibly dynamic) cost. The multiplicity of WSIs with varying costs and capabilities increases the complexity of a traditional optimizer that must assign WSIs for each remote relation in the query while generating an (optimal) plan. We present a two-phase Web Query Optimizer (WQO). In a pre-optimization phase, the WQO selects one or more WSIs for a pre-plan; a pre-plan represents a space of query evaluation plans (plans) based on this choice of WSIs. The WQO uses cost-based heuristics to evaluate the choice of WSI assignment in the pre-plan and to choose a good pre-plan. The WQO uses the pre-plan to drive the extended relational optimizer to obtain the best plan for a pre-plan. A prototype of the WQO has been developed. We compare the effectiveness of the WQO, i.e., its ability to efficiently search a large space of plans and obtain a low cost plan, in comparison to a traditional optimizer. We also validate the cost-based heuristics by experimental evaluation of queries in the noisy Internet environment.

References

  1. S. Adali, K. S. Candan, Y. Papakonstantinou, and V. S. Subrahmanian. Query caching and optimization in distributed mediator systems. Proceedings of the ACM SIGMOD Conference, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. L. Amsaleg, M. Franklin, A. Tomasic, and T. Urhan. Dynamic query operator scheduling for wide-area remote access. Journal of Distributed and Parallel Databases, 6(3), 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. Avnur and J. Hellerstein. Eddies: Continuously adaptive query processing. Proceedings of the ACM Sigmod Conference, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. L. Bright, J.-R. Gruser, L. Raschid, and M. E. Vidal. A wrapper generation toolkit to specify and construct wrappers for webaccesible data sources (websources). Journal of Computer Systems Special Issue on Semantics in the WWW, 14(2):83-98, 1999.]]Google ScholarGoogle Scholar
  5. L. Bright and L. Raschid. Cost modeling of wrappers for web accesible data sources (websources). http://www.umiacs.umd.edu/labs/CLIP/DARPA/ww97.html. (under review), 1998.]]Google ScholarGoogle Scholar
  6. S. Chaudhuri and K. Shim. Query optimization in the presence of foreing fucntions. Proceedings of the VLDB Conference, 1993.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J. Chen, D. DeWitt, F. Tian, and Y. Wang. Niagaracq: A scalable continuous query system for internet databases. Proceedings of the ACM SIGMOD Conference, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. W. Du, R. Krishnamurthy, and M. C. Shan. Query optimization in a heterogeneous dbms. Proceedings of the VLDB Conference, 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. B. Eckman, A. Kosky, and L. Laroco. Extending traditional query-based integration approaches for functional characterization of post-genomic data. BioInformatics, 17(7):587-601, 2001.]]Google ScholarGoogle ScholarCross RefCross Ref
  10. B. Eckman, Z. Lacroix, and L. Raschid. Optimized seamless integration of biomolecular data. Proceedings of the IEEE International Symposium on Bio-Informatics and Biomedical Engineering (BIBE), 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Florescu, A. Levy, I. Manolescu, and D. Suciu. Query optimization in the presence of limited access patterns. Proceedings of the ACM SIGMOD Conference, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. H. Garcia-Molina, Labio W., and Yerneni R. Capability-sensitive query processing on internet sources. Proceedings of the International Conference on Database Engineering, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. G. Gardarin, B. Finance, and P. Fankhauser. IRO-DB: A Distributed System Federating Object and Relational Databases, In Object-Oriented Multidatabase Systems : A solution for Advanced Applications, Bukhres, O. and Elmagarmid, A. Prentice Hall, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. Goldman and J. Widom. Wsq/dsq: A practical approach for combined querying of databases and the web. Proceedings of the ACM SIGMOD Conference, pages 285-296, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J.-R. Gruser, L. Raschid, V. Zadorozhny, and T. Zhan. Learning response time for websources using query feedback and application in query optimization. VLDB Journal, Special Issue on Databases and the Web, Mendelzon, A. and Atzeni, P., editors., 9(1):18-37, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. L. Haas, D. Kossmann, E. Wimmers, and J. Yang. Optimizing queries across diverse data sources. Proceedings of the VLDB Conference, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. L. Haas, P. Schwarz, P. Kodali, E. Kotlar, J. Rice, and W. Swope. Discoverylink: A system for integrating life sciences data. IBM Systems Journal, 40(2), 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Haas and J. Hellerstein. Ripple joins for online aggregation. Proceedings of the ACM Sigmod Conference, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. J. Hellerstein et al. Adaptive query processing: Technology in evolution. IEEE Data Engineering Bulletin, 2000.]]Google ScholarGoogle Scholar
  20. Y. Ioanidis and Y. Kang. Randomized algorithms for optimizing large join queries. Proceedings of the ACM SIGMOD Conference, 1990.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Z. Ives et al. An adaptive query execution system for data integration. Proceedings of the ACM SIGMOD Conference, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Z. Ives, A. Levy, D. Weld, D. Florescu, and M. Friedman. Adaptive query processing for internet applications. IEEE Data Engineering Bulletin, 23(2):19-26, 2000.]]Google ScholarGoogle Scholar
  23. A. Y. Levy et al. Querying heterogeneous information sources using source descriptions. Proceedings of VLDB, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. C. Li and E. Chang. Query planning with limited source capabilities. Proceedings of ICDE, 2000.]]Google ScholarGoogle ScholarCross RefCross Ref
  25. C. Li and E. Chang. On answering queries in the presence of limited access patterns. Proceedings of the International Conference on Database Theory, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. ACM Digital Library. http://www.acm.org/dl/Search.html.]]Google ScholarGoogle Scholar
  27. L. Haas M. Tork Roth, F. Ozcan. Cost models do matter: Providing cost information for diverse data sources in a federated system. Proceedings of the VLDB Conference, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. H. Naacke, G. Gardarin, and A. Tomasic. Leveraging mediator cost models with heterogeneous data sources. Proceedings of the International Conference on Data Engineering, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Bureau of Labor Statistics. http://stats.bls.gov.]]Google ScholarGoogle Scholar
  30. Y. Papakonstantinou, A. Gupta, and L. Haas. Capabilities-based query rewriting in mediator systems. Proceedings of the International Conference on Parallel and Distributed Information Systems, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. R. Ramakrishnan P. Seshadri, M. Livny. The case for enhanced abstract data types. Proceedings of the VLDB Conference, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, and T. Price. Access path selection in a relational database management system. 1979.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. A. Tomasic et al. Scaling heterogeneous databases and the design of disco. Proceedings of the International Conference on Distributed Computing Systems, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. J. Ullman. Principles of DataBase and Knowledge-Base Systems, volume I. Computer Science Press, 1988.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. J. Ullman. Principles of DataBase and Knowledge-Base Systems, volume II. Computer Science Press, 1989.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. T. Urhan and M. Franklin. Xjoin: A reactively-scheduled pipelined join operator. IEEE Data Engineering Bulletin, 23(2):27-33, 2000.]]Google ScholarGoogle Scholar
  37. T. Urhan and M. Franklin. Dynamic pipeline scheduling for improving interactive performance of online queries. Proceedings of the VLDB Conference, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. T. Urhan, M. Franklin, and L. Amsaleg. Cost-based query scrambling for initial delays. Proceedings of the ACM Sigmod Conference, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. V. Vassalos and Y. Papakonstantinou. Describing and using query capabilities of heterogeneos sources. In Proceedings of the VLDB Conference, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. V. Vassalos and Y. Papakonstantinou. Using knowledge of redundancy for query optimization in mediators. Proceedings of the AAAI Symposium on AI and Data Integration, 1998.]]Google ScholarGoogle Scholar
  41. M. E. Vidal. A Mediator for Scaling up to Multiple WebSources. PhD thesis, University Simon Bolivar, 2000.]]Google ScholarGoogle Scholar
  42. M. E. Vidal, L. Raschid, and V. Zadorozhny. Decision support model for pre-plans. In preparation, 2001.]]Google ScholarGoogle Scholar
  43. R. Yerneni, Chen Li, J. Ullman, and H. Garcia-Molina. Optimizing large join queries in mediation systems. Proceedings of the International Conference on Database Theory, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. V. Zadorozhny, L. Raschid, T. Zhan, and L Bright. Validating a cost model for wide area applications. Proceedings of the International Conference on Cooperating Information Systems, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Efficient evaluation of queries in a mediator for WebSources

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in
              • Published in

                cover image ACM Conferences
                SIGMOD '02: Proceedings of the 2002 ACM SIGMOD international conference on Management of data
                June 2002
                654 pages
                ISBN:1581134975
                DOI:10.1145/564691

                Copyright © 2002 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 3 June 2002

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • Article

                Acceptance Rates

                SIGMOD '02 Paper Acceptance Rate42of240submissions,18%Overall Acceptance Rate785of4,003submissions,20%

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader