Abstract
Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in order to stay competitive. While much of the infrastructure for distributed data processing is already there (e.g., modern network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated—such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery paralleli sm, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems, and shows how query processing works in these systems.
- ABITEBOUL, S. 1997. Querying semi-structured data. In Proceedings of the International Conference on Database Theory (ICDT) (Delphi, Greece, Jan.).]] Google Scholar
- ABITEBOUL, S., BUNEMAN,P.,AND SUCIU, D. 1999. Data on the Web, from Relations to Semistructured Data and XML. MORKAU, MKADDR.]] Google Scholar
- ACHARYA, S., ALONSO, R., FRANKLIN, M., AND ZDONIK, S. 1995. Broadcast disks: Data management for asymmetric communication environments. In Proceedings of the ACM SIGMOD Conference on Management of Data (San Jose, CA, May), 199-210.]] Google Scholar
- ACHARYA, S., FRANKLIN, M., AND ZDONIK, S. 1996. Prefetching from a broadcast disk. In Proceedings IEEE Conference on Data Engineering (New Orleans, LA, Feb. 1996), 276-285.]] Google Scholar
- ACHARYA, S., FRANKLIN, M., AND ZDONIK, S. 1997. Balancing push and pull for data broadcast. In Proceedings of the ACM SIGMOD Conference on Management of Data (Tucson, AZ, May), 183- 194.]] Google Scholar
- ACM Computing Surveys. 1990. Special issue on heterogeneous databases. ACM Computing Surveys, 22, 13.]]Google Scholar
- ADALI, S., CANDAN, K., PAPAKONSTANTINOU,Y.,AND SUBRAHMANIAN, V. S. 1996. Query caching and optimization in distributed mediator systems. In Proceedings of the ACM SIGMOD Conference on Management of Data (Montreal, Canada, June), 137-148.]] Google Scholar
- AHO, A., SETHI, R., AND ULLMAN, J. 1987. Compilers: Principles, Techniques and Tools. Addison-Wesley.]] Google Scholar
- AKSOY,D.AND FRANKLIN, M. 1998. Scheduling for large-scale on-demand data broadcasting. In Proceedings IEEE INFOCOM Conference (San Francisco, CA, March).]]Google Scholar
- APERS, P. 1988. Data allocation in distributed DBMS. ACM Transactions on Database Systems 13, 3 (Sept.), 263-304.]] Google Scholar
- BABB, E. 1979. Implementing a relational database by means of specialized hardware. ACM Transactions on Database Systems 4, 1 (March), 1-29.]] Google Scholar
- BELLO,R.G.,DIAS, K., DOWNING, A., JR., J. F., NOR- COTT, W.D.,SUN, H., WITKOWSKI, A., AND ZIAUDDIN, M. 1998. Materialized views in oracle. In Proceedings of the Conferencce on Very Large Data Bases (VLDB) (New York, Aug.), 659-664.]] Google Scholar
- BERNSTEIN, P., GOODMAN, N., WONG, E., REEVE,C.,AND ROTHNIE, J. 1981. Query processing in a system for distributed databases (SDD-1). ACM Transactions on Database Systems 6, 4 (Dec.), 602- 625.]] Google Scholar
- BESTAVROS,A.AND CUNHA, C. 1996. Server-initiated document dissemination for the WWW. IEEE Data Engeneering Bulletin 19, 3 (Sept.), 3- 11.]]Google Scholar
- BOGLE,P.AND LISKOV, B. 1994. Reducing cross domain call overhead using batched futures. In Proceedings of the ACM Conference on Object-Oriented Programming Systems and Lan-guages (OOPSLA) (Portland, OR, Oct.), 341- 354.]] Google Scholar
- BRAUMANDL, R., CLAUSSEN, J., KEMPER, A., AND KOSS- MANN, D. 2000. Functional join processing. The VLDB Journal. 8, 3-4 (Feb.), 156-177.]] Google Scholar
- BRAUMANDL, R., KEMPER, A., AND KOSSMANN, D. 1999. Database patchwork on the internet (project demo description). In Proceedings of the ACM SIGMOD Conference on Management of Data (Philadelphia, PA, June), 550-552.]] Google Scholar
- BREITBART, Y., GARCIA-MOLINA, H., AND SILBERSCHATZ, A. 1992. Overview of multidatabase transaction management. The VLDB Journal 1, 2 (July), 181-293.]] Google Scholar
- BUCK-EMDEN,R.AND GALIMOW, J. 1996. SAP R/3 System, A Client/Server Technology. Addison-Wesley, Reading, MA.]] Google Scholar
- BUNEMAN, P. 1997. Semistructured data. In Proceedings of the ACM SIGMOD/SIGACT Conference on Principle of Database Systems (PODS) (Tuc-son, AZ, May), 117-121.]] Google Scholar
- CAREY, M., HAAS, L., SCHWARTZ, P., ANYA, M., CODY, W., FAGIN, R., FLICKNER, M., LUNIEWSKI, A., NIBLACK, W., PETKOVIC, V., THOMAS, J., WILLIAMS, J., AND WIMMERS, E. 1995. Towards heterogeneous multimedia information systems. In Proceedings of the International Workshop on Research Issues in Data Engineering (March), 124- 131.]] Google Scholar
- CAREY, M., DEWITT, D., FRANKLIN, M., HALL,N., MCAULIFFE, M., NAUGHTON, J., SCHUH,D., SOLOMON, M., TAN, C., TSATALOS, O., WHITE,S., AND ZWILLING, M. 1994. Shoring up persistent applications. In Proceedings of the ACM SIGMOD Conference on Management of Data (Minneapolis, MI, May), 383-394.]] Google Scholar
- CAREY,M.AND KOSSMANN, D. 1997. On saying "enough already!" in SQL. In Proceedings of the ACM SIGMOD Conference on Management of Data (Tucson, AZ, May), 219-230.]] Google Scholar
- CAREY,M.AND KOSSMANN, D. 1998. Reducing the braking distance of an SQL query engine. In Proceeding of the Conference on Very Large Data Bases (VLDB) (New York, Aug.), 158- 169.]] Google Scholar
- CAREY,M.AND LU, H. 1986. Load balancing in a locally distributed database system. In Proceedings of the ACM SIGMOD Conference on Management of Data (Washington, DC, June), 108-119.]] Google Scholar
- CATTELL, R., BARRY, D., BARTELS, D., BERLER, M., EAST- MAN, J., GAMERMAN, S., JORDAN, D., SPRINGER, A., STRICKLAND, H., AND WADE, D. 1997. The Object Database Standard: ODMG 2.0. The Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann Publishers, San Mateo, CA.]] Google Scholar
- CERI,S.AND PELAGATTI, G. 1984. Distributed Databases-Principles and Systems. McGraw-Hill Inc., New York, San Francisco, Washington, D.C.]] Google Scholar
- CHAMBERLIN, D., ASTRAHAN, M., KING, W., LORIE, R., MEHL, J., PRICE, T., SCHKOLNIK, M., SELINGER,P., SLUTZ, D., WADE,B.,AND YOST, R. 1981. Support for repetitve transactions and ad hoc queries in System R. ACM Transactions on Database Systems 6, 1 (March), 70-94.]] Google Scholar
- CHAUDHURI,S.AND GRAVANO, L. 1996. Optimizing queries over mulitmedia repositories. In Proceedings of the ACM SIGMOD Conference on Management of Data (Montreal, Canada, June), 91-102.]] Google Scholar
- CHEN,C.AND ROUSSOPOULOS, N. 1994. Adaptive selectivity estimation using query feedback. In Proceedings of the ACM SIGMOD Conference on Management of Data (Minneapolis, MI, May), 161-172.]] Google Scholar
- COLE,R.AND GRAEFE, G. 1994. Optimization of dynamic query evaluation plans. In Proceedings of the ACM SIGMOD Conference on Management of Data (Minneapolis, MI, May), 150-160.]] Google Scholar
- COPELAND, G., ALEXANDER, W., BOUGHTER, E., AND KELLER, T. 1988. Data placement in bubba. In Proceedings of the ACM SIGMOD Conference on Management of Data (Chicago, IL, May), 99-108.]] Google Scholar
- D'ANDREA,A.AND JANUS, P. 1996. UniSQL's nextgeneration object-relational database management system. ACM SIGMOD Record 25,3 (Sept.), 70-76.]] Google Scholar
- DAR, S., FRANKLIN, M., JONSSON, B., SRIVASTAVA,D.,AND TAN, M. 1996. Semantic data caching and replacement. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Bombay, India, Sept.), 330-341.]] Google Scholar
- DAVIDSON, S., GARCIA-MOLINA, H., AND SKEEN, D. 1985. Consistency in partitioned networks. ACM Computing Surveys 17, 2 (Sept.), 341-370.]] Google Scholar
- DAYAL, U. 1983. Processing queries over generalization hierarchies in a multidatabase system. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Florence, Italy, Oct.), 342- 353.]] Google Scholar
- DESHPANDE, P., RAMASAMY, K., SHUKLA, A., AND NAUGHTON, J. 1998. Caching multidimensional queries using chunks. In Proceedings of the ACM SIGMOD Conference on Management of Data (Seattle, WA, June), 259-270.]] Google Scholar
- DESSLOCH, S., H ARDER, T., MATTOS, N., MITSCHANG, B., AND THOMAS, J. 1998. KRISYS: Modeling concepts, implementation techniques, and client/server issues. The VLDB Journal 7,2 (April), 79-95.]] Google Scholar
- DEWITT, D., FUTTERSACK, P., MAIER,D.,AND VELEZ,F. 1990. A study of three alternative workstation server architectures for object-oriented database systems. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Brisbane, Australia, Aug.), 107-121.]] Google Scholar
- DEWITT,D.AND GRAY, J. 1992. Parallel database systems: The future of high performance database systems. Communications of the ACM 35,6 (June), 85-98.]] Google Scholar
- DEWITT, D., LIEUWEN,D.,AND MEHTA, M. 1993. Parallel pointer-based join techniques for objectoriented databases. In Proceedings of the International IEEE Conference on Parallel and Distributed Information Systems (San Diego, CA, Jan.).]] Google Scholar
- DOGAC, A., HALICI, U., KILIC, E., OZHAN, G., OZCAN,F., NURAL, S., DENGI, C., MANCUHAN, S., ARPINAR,B., KOKASL,P.,AND EVRENDILEK, C. 1996. METU interoperable database system. In Proceedings of the ACM SIGMOD Conference on Management of Data (Montreal, Canada, June), 552.]] Google Scholar
- DOPPELHAMMER, J., H OPPLER, T., KEMPER, A., AND KOSSMANN, D. 1997. Database performance in the real world: TPC-D and SAP R/3. In Proceedings of the ACM SIGMOD Conference on Management of Data (Tucson, AZ, May), 123- 134.]] Google Scholar
- DU, W., KRISHNAMURTHY, R., AND SHAN, M.-C. 1992. Query optimization in heterogeneous DBMS. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Vancouver, Canada, Aug.), 277- 291.]] Google Scholar
- DU, W., SHAN, M.-C., AND DAYAL, U. 1995. Reducing multidatabase query response time by tree balancing. In Proceedings of the ACM SIGMOD Conference on Management of Data (San Jose, CA, May), 293-303.]] Google Scholar
- EICKLER, A., GERLHOF,C.,AND KOSSMANN, D. 1995. A performance evaluation of OID mapping techniques. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Zurich, Switzerland, Sept.), 18-29.]] Google Scholar
- EICKLER, A., KEMPER, A., AND KOSSMANN, D. 1997. Finding data in the neighborhood. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Athens, Greece, Aug.), 336- 345.]] Google Scholar
- EPSTEIN, R., STONEBRAKER, M., AND WONG, E. 1978. Query processing in a distributed relational database system. In Proceedings of the ACM SIGMOD Conference on Management of Data (Austin, TX, June), 169-180.]] Google Scholar
- EVRENDILEK, C., DOGAC, A., NURAL,S.,AND OZCAN,F. 1997. Multidatabase query optimization. Distributed and Parallel Databases 5, 1 (Jan.), 77- 114.]] Google Scholar
- FAGIN, R. 1996. Combining fuzzy information from multiple systems. In Proceedings of the ACM SIGMOD/SIGACT Conference on Principle of Database Systems (PODS) (Montreal, Canada, June), 216-226.]] Google Scholar
- FAGIN,R.AND WIMMERS, E. 1997. Incorporating user preferences in multimedia queries. In Proceedings of the International Conference on Database Theory (ICDT), Volume 1186 of Lecture Notes in Computer Science (LNCS) (Jan.), 247-261. Springer-Verlag.]] Google Scholar
- FERGUSON, D., NIKOLAOU, C., SAIRAMESH,J.,AND YEM- INI, Y. 1996. Economic models for allocating resources in computer systems. In S. CLEARWATER ED., Market based Control of Distributed Systems. World Scientific Press.]] Google Scholar
- FERGUSON, D., NIKOLAOU,C.,AND YEMINI, Y. 1993. An economy for managing replicated data in autonomous decentralized systems. In Proceedings International Symposium on Autonomous and Decentralized Systems (Kawasaki, Japan).]]Google Scholar
- FLORESCU, D., KOSSMANN,D.,AND MANOLESCU, I. 2000. Integrating keyword search into XMLquery processing, In Proceedings of the WWW Conference (WWW9) (Amsterdam, The Netherlands, May).]] Google Scholar
- FLORESCU, D., LEVY, A., AND MENDELZON, A. 1998. Database techniques for the worldwide web: A survey. ACM SIGMOD Record 27, 3 (Sept.), 59- 74.]] Google Scholar
- FRANKLIN, M., CAREY, M., AND LIVNY, M. 1993. Local disk caching for client-server database systems. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Dublin, Ireland, Aug.), 543- 554.]] Google Scholar
- FRANKLIN, M., CAREY, M., AND LIVNY, M. 1997. Transactional client-server cache consistency: Alternatives and performance. ACM Transactions on Database Systems 22, 3 (Sept.), 315-363.]] Google Scholar
- FRANKLIN, M., JONSSON,B.,AND KOSSMANN, D. 1996. Performance tradeoffs for client-server query processing. In Proceedings of the ACM SIGMOD Conference on Management of Data (Montreal, Canada, June), 149-160.]] Google Scholar
- FRANKLIN,M.AND ZDONIK, S. 1998. Data in your face: Push technology in perspective. In Proceedings of the ACM SIGMOD Conference on Management of Data (Seattle, Wash, June), 516-519.]] Google Scholar
- GANGULY, S., GOEL, A., AND SILBERSCHATZ, A. 1996. Efficient and accurate cost models for parallel query optimizaton. In Proceedings of the ACM SIGMOD/SIGACT Conference on Principles of Database Systems (PODS) (Montreal, Canada, June), 172-181.]] Google Scholar
- GANGULY, S., HASAN,W.,AND KRISHNAMURTHY, R. 1992. Query optimization for parallel execution. In Proceedings of the ACM SIGMOD Conference on Management of Data (San Diego, CA, June), 9- 18.]] Google Scholar
- GARDARIN, G., GRUSER, J.-R., AND TANG, Z.-H. 1996. Cost-based selection of path expression processing algorithms in object-oriented databases. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Bombay, India, Sept.), 390- 401.]] Google Scholar
- GRAEFE, G. 1990. Encapsulation of parallelism in the volcano query processing system. In Proceedings of the ACM SIGMOD Conference on Management of Data (Atlantic City, NJ, June), 102- 111.]] Google Scholar
- GRAEFE, G. 1993. Query evaluation techniques for large databases. ACM Computing Surveys 25,2 (June), 73-170.]] Google Scholar
- GRAEFE, G. 1995. The cascades framework for query optimization. IEEE Data Engeneering Bulletin 18, 3 (Sept.), 19-29.]]Google Scholar
- GRAEFE, G. 1996. Iterators, schedulers, and distributed-memory parallelism. Software Practice and Experience 26, 4 (April), 427- 452.]] Google Scholar
- GRAEFE,G.AND DEWITT, D. 1987. The EXODUS optimizer generator. In Proceedings of the ACMSIG- MOD Conference on Management of Data (San Francisco, CA, May), 160-172.]] Google Scholar
- GRAEFE,G.AND MCKENNA, W. 1993. The Volcano optimizer generator: Extensibility and efficient search. In Proceedings of the IEEE Conference on Data Engineering (Vienna, Austria, April), 209- 218.]] Google Scholar
- GRAEFE,G.AND WARD, K. 1989. Dynamic query evaluation plans. In Proceedings of the ACM SIG- MOD Conference on Management of Data (Portland, OR, May), 358-366.]] Google Scholar
- GRAVANO, L., CHANG, C.-C., GARCIA-MOLINA, H., AND PAEPCKE, A. 1997. STARTS: stanford proposal for internet meta-searching. In Proceedings of the ACM SIGMOD Conference on Management of Data (Tucson, AZ, May), 207- 218.]] Google Scholar
- GRAVANO,L.AND GARCIA-MOLINA, H. 1997. Merging ranks from heterogeneous internet sources. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Athens, Greece, Aug.), 196- 205.]] Google Scholar
- GRAY, J., BOSWORTH, A., LAYMAN, A., AND PIRAHESH,H. 1996. Data cube: a relational aggregation operator generalizing group-by, cross-tab, and subtotal. In Proceedings of the IEEE Conference on Data Engineering (New Orleans, LA, Feb.), 152- 159.]] Google Scholar
- GRAY,J.AND REUTER, A. 1993. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers, San Mateo, CA.]] Google Scholar
- GUPTA, A., HARINARAYAN,V.,AND RAJARAMAN,A. 1997. Virtual data technology. ACM SIGMOD Record 26, 4 (Dec.), 57-61.]] Google Scholar
- GWERTZMAN,J.AND SELTZER, M. 1994. The Case for Geographical Push-Caching. Technical Report HU TR-34-94, Harvard University, Cambridge, MA.]]Google Scholar
- HAAS, L., FREYTAG,J.C.,LOHMAN,G.,AND PIRAHESH, H. 1989. Extensible query processing in starburst. In Proceedings of the ACMSIGMOD Conference on Management of Data (Portland, OR, USA, May), 377-388.]] Google Scholar
- HAAS, L., KOSSMANN, D., WIMMERS, E., AND YANG,J. 1997. Optimizing queries across diverse data sources. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Athens, Greece, Aug.), 276-285.]] Google Scholar
- HAGMANN,R.AND FERRARI, D. 1986. Performance analysis of several back-end database architectures. ACM Transactions on Database Systems 11, 1 (March), 1-26.]] Google Scholar
- HAMILTON, G., CATTELL, R., AND FISHER, M. 1997. JDBC Database Access with Java: A Tutorial and Annotated Reference. Addison- Wesley, Reading, MA.]] Google Scholar
- HARDER, T., MITSCHANG, B., NINK,U.,AND RITTER,N. 1995. Workstation/server-architekturen fur datenbankbasierte ingenieuranwendungen. Informatik-Forschung und Entwicklung 10,2 (May), 55-72.]]Google Scholar
- HARINARAYAN, V., RAJARAMAN, A., AND ULLMAN, J. 1996. Implementing data cubes efficiently. In Proceedings of the ACM SIGMOD Conference on Management of Data (Montreal, Canada, June), 205- 216.]] Google Scholar
- HASAN,W.AND MOTWANI, R. 1995. Coloring away communication in parallel query optimization. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Zurich, Switzerland, Sept.), 239-250.]] Google Scholar
- HONG,W.AND STONEBRAKER, M. 1990. Parallel Query Processing in XPRS. Technical report UCB/ERL M90/47 (May), Department of Industrial Engineering and Operations Research and School of Business Administration, University of California, Berkeley, CA.]]Google Scholar
- IEEE Data Engineering Bulletin. 1998. Special issue on interoperability. IEEE Data Engineering Bulleting, 21, 3.]]Google Scholar
- IOANNIDIS,Y.AND KANG, Y. 1991. Left-deep vs. bushy trees: an analysis of strategy spaces and its implications for query optimization. In Proceedings of the ACM SIGMOD Conference on Management of Data (Denver, CO, May), 168- 177.]] Google Scholar
- IOANNIDIS, Y., NG, R., SHIM, K., AND SELLIS,T. 1992. Parametric query optimization. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Vancouver, Canada, Aug.), 103- 114.]] Google Scholar
- IVES, Z., FLORESCU, D., FRIEDMAN, M., LEVY, A., AND WELD, D. 1999. An adaptive query execution engine for data integration. In Proceedings of the ACM SIGMOD Conference on Management of Data (Philadelphia, PA, USA, June), 299- 310.]] Google Scholar
- JENQ, B., WOELK, D., KIM,W.,AND LEE, W. 1990. Query processing in distributed ORION. In Proceedings of the International Conference on Extending Database Technology (EDBT) (Venice, Italy, March), 169-187.]] Google Scholar
- KABRA,N.AND DEWITT, D. 1998. Efficient midquery re-optimization for sub-optimal query execution plans. In Proceedings of the ACMSIGMOD Conference on Management of Data (Seattle, WA, June), 106-117.]] Google Scholar
- KELLER,A.AND BASU, J. 1994. A predicate-based caching scheme for client-server database architectures. In Proceedings of the International IEEE Conference on Parallel and Distributed Information Systems (Austin, TX, Sept.), 229-238.]] Google Scholar
- KELLER, A., JENSEN, R., AND AGRAWAL, S. 1993. Persistence software: Bridging object-oriented programming and relational databases. In Proceedings of the ACM SIGMOD Conference on Management of Data (Washington, DC, May), 523- 528.]] Google Scholar
- KELLER, T., GRAEFE,G.,AND MAIER, D. 1991. Efficient assembly of complex objects. In Proceedings of the ACM SIGMOD Conference on Management of Data (Denver, CO, May), 148-158.]] Google Scholar
- KEMPER,A.AND KOSSMANN, D. 1994. Dual-buffering strategies in object bases. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Santiago, Chile, Sept.), 427-438.]] Google Scholar
- KEMPER, A., KOSSMANN,D.,AND MATTHES, F. 1998. SAP R/3: A Database Application System. Tutorial handouts for the ACM SIGMOD Conference, Seattle, WA. http://www.db.fmi.unipassau. de/publications/tutorials/.]] Google Scholar
- KIM, W., GARZA, J., BALLOU,N.,AND WOELK, D. 1990. Architecture of the ORION next-generation database system. IEEE Transactions on Knowledge and Data Engineering 2, 1 (March), 109- 124.]] Google Scholar
- KIMBALL,R.AND STREHLO, K. 1995. Why decision support fails and how to fix it. ACM SIGMOD Record 24, 3 (Sept.), 92-97.]] Google Scholar
- KOSSMANN, D., FRANKLIN, M., AND DRASCH, G. 2000. Cache Investment: Integrating query optimization and dynamic data placement. ACM Trans. Data Syst.]] Google Scholar
- KOSSMANN,D.AND STOCKER, K. 2000. Iterative dynamic programming: A new class of query optimization algorithms. ACM Transactions on Database Systems 25, 1 (March).]] Google Scholar
- LAHIRI, T., JOSHI, A., JASUJA, A., AND CHATTERJEE,S. 1998. 50,000 users on an Oracle8 Universal Server database. In Proceedings of the ACMSIG- MOD Conference on Management of Data (Seattle, WA, June), 528-530.]] Google Scholar
- LANZELOTTE, R., VALDURIEZ,P.,AND ZAIT, M. 1993. On the effectiveness of optimization search strategies for parallel execution spaces. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Dublin, Ireland, Aug.), 493- 504.]] Google Scholar
- LEVY, A. 1999. Answering Queries Using Views: A Survey. In preparation.]]Google Scholar
- LEVY, A., RAJARAMAN, A., AND ORDILLE, J. 1996. Querying heterogeneous information sources using source descriptions. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Bombay, India, Sept.), 251-262.]] Google Scholar
- LOHMAN, G. 1988. Grammar-like functional rules for representing query optimization alternatives. In Proceedings of the ACM SIGMOD Conference on Management of Data (Chicago, IL, May), 18-27.]] Google Scholar
- LOMET, D. 1996. Replicated indexes for distributed data. In Proceedings of the International IEEE Conference on Parallel and Distributed Information Systems (Miami Beach, FL, Dec.).]] Google Scholar
- LORIE,R.AND WADE, B. 1979. The Compilation of a High Level Data Language. Technical Report RJ 2598, IBM Research, San Jose, CA.]]Google Scholar
- LU,H.AND CAREY, M. 1985. Some experimental results on distributed join algorithms in a local network. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Stockholm, Sweden), 229-304.]]Google Scholar
- LUOTONEN,A.AND ALTIS, K. 1994. World-Wide Web Proxies. Technical report (April), CERN, Geneva, Switzerland.]]Google Scholar
- MACKERT,L.AND LOHMAN, G. 1986. R optimizer validation and performance evaluation for distributed queries. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Kyoto, Japan), 149-159.]] Google Scholar
- MAIER, D., GRAEFE, G., SHAPIRO, L., DANIELS, S., KELLER, T. , AND VANCE, B. 1994. Issues in distributed object assembly. In T. OZSU,U.DAYAL, AND P. VAL- DURIEZ EDS., Distributed Object Management (San Mateo, CA, May 1994), 165-181. Morgan Kaufmann Publishers. International Workshop on Distributed Object Management.]]Google Scholar
- MCHUGH,J.AND WIDOM, J. 1999. Query optimization for XML. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Edinburgh, GB, Sept.), 315-326.]] Google Scholar
- MELTON,J.AND SIMON, A. 1993. Understanding the New SQL: AComplete Guide. Morgan Kaufmann Publishers, San Mateo, CA.]] Google Scholar
- MISHRA,P.AND EICH, M. 1992. Join processing in relational databases. ACM Computing Surveys 24, 1 (March), 63-113.]] Google Scholar
- NAACKE, H., GARDARIN,G.,AND TOMASIC, A. 1998. Leveraging mediator cost models with heterogeneous data sources. In Proceedings IEEE Conference on Data Engineering (Orlando, FL).]] Google Scholar
- O'TOOLE,J.AND SHRIRA, L. 1994. Opportunisic Log: Efficient Reads in a Reliable Object Server. Technical Report MIT/LCS-TM-506 (March), Massachusetts Institute of Technology, Cambridge, MA 02139.]] Google Scholar
- OZCAN, F., NURAL, S., KOKSAL, P., EVRENDILEK,C.,AND DOGAC, A. 1996. Dynamic query optimization on a distributed object management platform. In Proceedings of the International Conference on Information and Knowledge Management (Rockville, MD, Nov.), 117-124.]] Google Scholar
- OZCAN, F., NURAL, S., KOKSAL, P., EVRENDILEK,C.,AND DOGAC, A. 1997. Dynamic query optimization in multidatabases. IEEE Data Engineering Bulletin 20, 3 (Sept.), 38-45.]]Google Scholar
- OZSU,T.AND VALDURIEZ, P. 1999. Principles of Distributed Database Systems (second ed.). Prentice Hall, Englewood Cliffs, NJ.]] Google Scholar
- PAPAKONSTANTINOU, Y., GARCIA-MOLINA, H., AND WIDOM, J. 1995a. Object exchange across heterogeneous information sources. In Proceedings of the IEEE Conference on Data Engineering (Taipeh, Taiwan, 1995), 251-260.]] Google Scholar
- PAPAKONSTANTINOU, Y., GUPTA, A., GARCIA-MOLINA, H., AND ULLMAN, J. 1995b. A query tranlation scheme for rapid implementation of wrappers. In Proceedings of the Conference on Deductive and Object-Oriented Databases (DOOD) (Dec.), 161-186.]] Google Scholar
- PAPAKONSTANTINOU, Y., GUPTA, A., AND HAAS, L. 1996. Capabilities-based query rewriting in mediator systems. In Proceedings of the International IEEE Conference on Parallel and Distributed Information Systems (Miami Beach, FL, Dec.).]] Google Scholar
- PIRAHESH, H., HELLERSTEIN,J.,AND HASAN, W. 1992. Extensible/rule based query rewrite optimization in starburst. In Proceedings of the ACMSIG- MOD Conference on Management of Data (San Diego, CA, June), 39-48.]] Google Scholar
- QUASS,D.AND WIDOM, J. 1997. On-line warehouse view maintenance. In Proceedings of the ACM SIGMOD Conference on Management of Data (Tucson, AZ, May), 393-404.]] Google Scholar
- RAMAKRISHNAN, R. 1997. Database Management Systems. McGraw-Hill, Inc., New York, San Francisco, Washington, DC.]] Google Scholar
- RELLY, L., SCHULDT, H., AND SCHEK, H.-J. 1998. Exporting database functionality-the concert way. IEEE Data Engeneering Bulletin 21,3 (Sept.), 40-48.]]Google Scholar
- ROTH,M.T.,OZCAN,F.,AND HAAS, L. 1999. Cost models DO matter: Providing cost information for diverse data sources in a federated system. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Edinburgh, GB, Sept.), 599- 610.]] Google Scholar
- ROTH,M.T.AND SCHWARZ, P. 1997. Don't scrap it, wrap it! A wrapper architecture for legacy data sources. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Athens, Greece, Aug.), 266-275.]] Google Scholar
- ROUSSOPOULOS, N. 1991. The incremental access method of view cache: Concepts, algorithms, and cost analysis. ACM Transactions on Database Systems 16, 3 (Sept.), 535-563.]] Google Scholar
- ROUSSOPOULOS, N., CHEN, C., KELLEY, S., DELIS, A., AND PAPAKONSTANTINOU, Y. 1995. The adms project: views r us. IEEE Data Engeneering Bulletin 18,2 (June), 19-28.]]Google Scholar
- SCHEUERMANN, P., SHIM,J.,AND VINGRALEK, R. 1996. WATCHMAN: a data warehouse intelligent cache manager. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Bombay, In-dia, Sept.), 51-62.]] Google Scholar
- SCHNEIDER,D.AND DEWITT, D. 1990. Tradeoffs in processing complex join queries via hashing in multiprocessor database machines. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Brisbane, Australia, Aug.), 469- 480.]] Google Scholar
- SELINGER, P., ASTRAHAN, M., CHAMBERLIN, D., LORIE, R., AND PRICE, T. 1979. Access path selection in a relational database management system. In Proceedings of the ACM SIGMOD Conference on Management of Data (Boston, MA, May), 23-34.]] Google Scholar
- SELLIS, T. 1988. Multiple-query optimization. ACM Transactions on Database Systems 13, 1 (March), 23-52.]] Google Scholar
- SHAN, M.-C., AHMED, R., DAVIS, J., DU,W.,AND KENT, W. 1994. Pegasus: A heterogeneous information management system. In W. KIM ED., Modern Database Systems, Chapter 32. Reading, MA. ACM Press (Addison-Wesley publishers).]] Google Scholar
- SHEKITA,E.AND CAREY, M. 1990. A performance evaluation of pointer-based joins. In Proceedings of the ACM SIGMOD Conference on Management of Data (Atlantic City, NJ, May), 300- 311.]] Google Scholar
- SHETH,A.AND LARSON, J. 1990. Federated database systems for managing distributed, heterogeneous, and autonmous databases. ACM Computing Surveys 22, 3 (Sept.), 183-236.]] Google Scholar
- SIDELL, J., AOKI, P., BARR, S., SAH, A., STAELIN,C., STONEBRAKER, M., AND YU, A. 1996. Data replication in Mariposa. In Proceedings IEEE Conference on Data Engineering (New Orleans, LA, Feb.), 485-494.]] Google Scholar
- SILBERSCHATZ, A., KORTH, H., AND SUDARSHAN, S. 1997. Database System Concepts (third ed.). McGraw-Hill, Inc., New York, San Francisco, Washington, DC.]] Google Scholar
- SRINIVANSAN,V.AND CAREY, M. 1992. Compensation-based on-line query processing. In Proceedings of the ACM SIGMOD Conference on Management of Data (San Diego, CA, June), 331-340.]] Google Scholar
- STEINBRUNN, M., MOERKOTTE,G.,AND KEMPER, A. 1997. Heuristic and randomized optimization for the join ordering problem. The VLDB Journal 6,3 (Aug.), 191-208.]] Google Scholar
- STOCKER, K., KOSSMANN, D., BRAUMANDL, R., AND KEMPER, A. 2001. Integrating semijoin reducers into state-of-the-art query processors. In Proceedings of the IEEE Conference on Data Engineering (Heidelberg, Germany, April.). IEEE Computer Society Press, Los Angeles, Calif.]] Google Scholar
- STONEBRAKER, M. 1985. The design and implementation of distributed INGRES. Reading, MA. Addison-Wesley.]]Google Scholar
- STONEBRAKER, M. 1986. The case for shared nothing. IEEE Data Engeneering Bulletin 9, 1 (March), 4-9.]]Google Scholar
- STONEBRAKER, M. 1994. Readings in Database Systems (second ed.). Morgan Kaufmann Publishers, San Mateo, CA.]] Google Scholar
- STONEBRAKER, M., AOKI, P., LITWIN, W., PFEFFER, A., SAH, A., SIDELL, J., STAELIN,C.,AND YU, A. 1996. Mariposa: a wide-area distributed database system. The VLDB Journal 5, 1 (Jan.), 48-63.]] Google Scholar
- TANENBAUM, A. 1989. Computer Networks. Prentice Hall, Englewood Cliffs, NJ.]] Google Scholar
- TANENBAUM, A. 1992. Modern Operating Systems. Prentice Hall, Englewood Cliffs, NJ.]] Google Scholar
- THOMAS, J., GERBES, T., H ARDER,T.,AND MITSCHANG, B. 1995. Implementing dynamic code assembly for client-based query processing. In Proceedings of the International Symposium for Advanced Applications, (DASFAA) (Singapore, April), 264- 272.]] Google Scholar
- TOMASIC, A., RASCHID, L., AND VALDURIEZ, P. 1998. Scaling acccess to distributed heterogeneous data sources with DISCO. IEEE Transactions on Knowledge and Data Engineering 10, 5 (Oct.), 808-823.]] Google Scholar
- ULLMAN, J. 1988. Principles of Data and Knowledge-Base Systems, Vol. I. Computer Science Press, Woodland Hills, CA.]] Google Scholar
- URHAN,T.AND FRANKLIN, M. 1999. Xjoin: Getting Fast Answers from Slow and Bursty Networks. Technical report CS-TR-3994 (Feb.), University of Maryland, College Park.]]Google Scholar
- URHAN, T., FRANKLIN, M., AND AMSALEG, L. 1998. Cost based query scrambling for initial delays. In Proceedings of the ACM SIGMOD Conference on Management of Data (Seattle, WA, June), 130- 141.]] Google Scholar
- VALDURIEZ,P.AND GARDARIN, G. 1984. Join and semijoin algorithms for a multiprocessor database machine. ACM Transactions on Database Systems 9, 1 (March), 133-161.]] Google Scholar
- WIDOM, J. 1995. Research problems in data warehousing. In Proceedings of the International Conference on Information and Knowledge Management (Baltimore, MD, Nov.), 25- 30.]] Google Scholar
- WIEDERHOLD, G. 1993. Intelligent integration of information. In Proceedings of the ACM SIGMOD Conference on Management of Data (Washington, DC, May), 434-437.]] Google Scholar
- WILLIAMS, R., DANIELS, D., HAAS, L., LAPIS, G., LIND-SAY, B., NG, P., OBERMARCK, R., SELINGER,P., WALKER, A., WILMS,P.,AND YOST, R. 1981. R : An Overview of the Architecture. IBM Research, San Jose, CA, RJ3325. Reprinted in: M. Stonebraker (ed.), Readings in Database Systems, Morgan Kaufmann Publishers, 1994, 515- 536.]] Google Scholar
- WILSHUT,A.AND APERS, P. 1991. Dataflow query execution in a parallel main memory. In Proceedings of the International IEEE Conference on Parallel and Distributed Information Systems (Miami, FL, Dec.), 68-77.]] Google Scholar
- WOLFSON, O., JAJODIA,S.,AND HUANG, Y. 1997. An adaptive data replication algorithm. ACM Transactions on Database Systems 22, 42 (June), 255-314.]] Google Scholar
- YANG, J., KARLAPALEM, K., AND LI, Q. 1997. Algorithms for materialized view design in data warehousing environment. In Proceedings of the Conference on Very Large Data Bases (VLDB) (Athens, Greece, Aug.), 136- 145.]] Google Scholar
- YU,C.AND CHANG, C. 1984. Distributed query processing. ACM Computing Surveys 16, 4 (Dec.), 399-433.]] Google Scholar
- YU,C.AND MENG, W. 1997. Principles of Database Query Processing for Advanced Applications. Morgan Kaufmann Publishers, San Mateo, CA.]] Google Scholar
- ZAHARIOUDAKIS,M.AND CAREY, M. 1997. Highly concurrent cache consistency for indices in clientserver database systems. In Proceedings of the ACM SIGMOD Conference on Management of Data (Tucson, AZ, May), 50-61.]] Google Scholar
- ZHU,Q.AND LARSON, P. 1994. A query sampling method of estimating local cost parameters in a multidatabase system. In Proceedings IEEE Conference on Data Engineering (Houston, TX, USA, Feb.), 144-153.]] Google Scholar
Recommendations
Query processing in a system for distributed databases (SDD-1)
This paper describes the techniques used to optimize relational queries in the SDD-1 distributed database system. Queries are submitted to SDD-1 in a high-level procedural language called Datalanguage. Optimization begins by translating each ...
An Optimal Algorithm for Processing Distributed Star Queries
The problem of optimal query processing in distributed database systems was shown to be NP-hard. However, for a special type of queries called star queries, we have developed a polynomial optimal algorithm. Semijoin tactics are applied for query ...
Improvement Algorithms for Semijoin Query Processing Programs in Distributed Database Systems
The problem of optimal query processing in distributed database systems was shown to be NP-hard. This means that heuristic algorithms are necessary to solve the query processing problem. In this paper, we describe algorithms to improve the solutions ...
Comments