Skip to main content
Erschienen in: Journal of Intelligent Information Systems 1/2010

01.02.2010

Processing top-N relational queries by learning

verfasst von: Liang Zhu, Weiyi Meng, Chunnian Liu, Wenzhu Yang, Dazhong Liu

Erschienen in: Journal of Intelligent Information Systems | Ausgabe 1/2010

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

A top-N selection query against a relation is to find the N tuples that satisfy the query condition the best but not necessarily completely. In this paper, we propose a new method for evaluating top-N queries against a relation. This method employs a learning-based strategy. Initially, this method finds and saves the optimal search spaces for a small number of random top-N queries. The learned knowledge is then used to evaluate new queries. Extensive experiments are carried out to measure the performance of this strategy and the results indicate that it is highly competitive with existing techniques for both low-dimensional and high-dimensional data. Furthermore, the knowledge base can be updated based on new user queries to reflect new query patterns so that frequently submitted queries can be processed most efficiently. The maintenance and stability of the knowledge base are also addressed in the paper.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
In all the figures in Section 5.5, we arrange the legends and the corresponding curves in the same order to reduce confusion; the order of legends is from left to right (if any), and then from top to bottom.
 
Literatur
Zurück zum Zitat Balke, W., Nejdl, W., Siberski, W., & Thaden, U. (2005). Progressive distributed top-k retrieval in peer-to-peer networks. In Proceedings of the 21st international conference on data engineering (ICDE’05) (pp. 174–185). Tokyo, Japan. Balke, W., Nejdl, W., Siberski, W., & Thaden, U. (2005). Progressive distributed top-k retrieval in peer-to-peer networks. In Proceedings of the 21st international conference on data engineering (ICDE’05) (pp. 174–185). Tokyo, Japan.
Zurück zum Zitat Bast, H., Majumdar, D., Schenkel, R., Theobald, M., & Weikum, G. (2006). IO-top-k: Index-access optimized top-k query processing. In Proceedings of 29th international conference on very large data bases (VLDB’06) (pp. 475–486). Seoul, Korea. Bast, H., Majumdar, D., Schenkel, R., Theobald, M., & Weikum, G. (2006). IO-top-k: Index-access optimized top-k query processing. In Proceedings of 29th international conference on very large data bases (VLDB’06) (pp. 475–486). Seoul, Korea.
Zurück zum Zitat Bowerman, B. L., & O’Connell, R. T. (1993). Forecasting and time series: An applied approach (3rd ed.). Pacific Grove: Brooks/Cole.MATH Bowerman, B. L., & O’Connell, R. T. (1993). Forecasting and time series: An applied approach (3rd ed.). Pacific Grove: Brooks/Cole.MATH
Zurück zum Zitat Bruno, N., Chaudhuri, S., & Gravano, L. (2001). STHoles: A multidimensional workload-aware histogram. In Proceedings ACM international conference on management of data (SIGMOD’01) (pp. 211–222). Santa Barbara, California, USA. Bruno, N., Chaudhuri, S., & Gravano, L. (2001). STHoles: A multidimensional workload-aware histogram. In Proceedings ACM international conference on management of data (SIGMOD’01) (pp. 211–222). Santa Barbara, California, USA.
Zurück zum Zitat Bruno, N., Chaudhuri, S., & Gravano, L. (2002). Top-k selection queries over relational databases: Mapping strategies and performance evaluation. ACM Transactions on Database Systems, 27(2), 153–187. doi:10.1145/568518.568519.CrossRef Bruno, N., Chaudhuri, S., & Gravano, L. (2002). Top-k selection queries over relational databases: Mapping strategies and performance evaluation. ACM Transactions on Database Systems, 27(2), 153–187. doi:10.​1145/​568518.​568519.CrossRef
Zurück zum Zitat Carey, M., & Kossmann, D. (1997). On saying “enough already!” in SQL. In Proceedings ACM international conference on management of data (SIGMOD’97) (pp. 219–230). Tucson, Arizona, USA. Carey, M., & Kossmann, D. (1997). On saying “enough already!” in SQL. In Proceedings ACM international conference on management of data (SIGMOD’97) (pp. 219–230). Tucson, Arizona, USA.
Zurück zum Zitat Carey, M., & Kossmann, D. (1998). Reducing the braking distance of an SQL query engine. In Proceedings of 24th international conference on very large data bases (VLDB’98) (pp. 158–169). New York City, New York, USA. Carey, M., & Kossmann, D. (1998). Reducing the braking distance of an SQL query engine. In Proceedings of 24th international conference on very large data bases (VLDB’98) (pp. 158–169). New York City, New York, USA.
Zurück zum Zitat Chang, Y.-C., Bergman, L. D., Castelli, V., Li, C.-S., Lo, M.-L., & Smith, J. R. (2000). The onion technique: Indexing for linear optimization queries. In Proceedings ACM international conference on management of data (SIGMOD’00) (pp. 391–402). Dallas, Texas, USA. Chang, Y.-C., Bergman, L. D., Castelli, V., Li, C.-S., Lo, M.-L., & Smith, J. R. (2000). The onion technique: Indexing for linear optimization queries. In Proceedings ACM international conference on management of data (SIGMOD’00) (pp. 391–402). Dallas, Texas, USA.
Zurück zum Zitat Chaudhuri, S., & Gravano, L. (1999). Evaluating top-k selection queries. In Proceedings of 25th international conference on very large data bases (VLDB’99) (pp. 397–410). Edinburgh, Scotland, UK. Chaudhuri, S., & Gravano, L. (1999). Evaluating top-k selection queries. In Proceedings of 25th international conference on very large data bases (VLDB’99) (pp. 397–410). Edinburgh, Scotland, UK.
Zurück zum Zitat Chaudhuri, S., Gravano, L., & Marian, A. (2004). Optimizing top-k selection queries over multimedia repositories. IEEE Transactions on Knowledge and Data Engineering, 16(8), 992–1009. doi:10.1109/TKDE.2004.30.CrossRef Chaudhuri, S., Gravano, L., & Marian, A. (2004). Optimizing top-k selection queries over multimedia repositories. IEEE Transactions on Knowledge and Data Engineering, 16(8), 992–1009. doi:10.​1109/​TKDE.​2004.​30.CrossRef
Zurück zum Zitat Chen, C., & Ling, Y. (2002). A sampling-based estimator for top-k selection query. In Proceedings of the 18th international conference on data engineering (ICDE’02) (pp. 617–627). San Jose, California. Chen, C., & Ling, Y. (2002). A sampling-based estimator for top-k selection query. In Proceedings of the 18th international conference on data engineering (ICDE’02) (pp. 617–627). San Jose, California.
Zurück zum Zitat Chen, Y., & Meng, W. (2003). Top-N query: Query language, distance function, and processing strategies. In International conference on web-age information management (pp. 458–470). Chengdu, China: Springer. Chen, Y., & Meng, W. (2003). Top-N query: Query language, distance function, and processing strategies. In International conference on web-age information management (pp. 458–470). Chengdu, China: Springer.
Zurück zum Zitat Das, G., Gunopulos, D., & Koudas, N. (2006). Answering top-k queries using views. In Proceedings of 29th international conference on very large data bases (VLDB’06) (pp. 451–462). Seoul, Korea. Das, G., Gunopulos, D., & Koudas, N. (2006). Answering top-k queries using views. In Proceedings of 29th international conference on very large data bases (VLDB’06) (pp. 451–462). Seoul, Korea.
Zurück zum Zitat Donjerkovic, D., & Ramakrishnan, R. (1999). Probabilistic optimization of top N queries. In Proceedings of 25th international conference on very large data bases (VLDB’99) (pp. 411–422). Edinburgh, Scotland, UK. Donjerkovic, D., & Ramakrishnan, R. (1999). Probabilistic optimization of top N queries. In Proceedings of 25th international conference on very large data bases (VLDB’99) (pp. 411–422). Edinburgh, Scotland, UK.
Zurück zum Zitat Fagin, R., Lotem, A., & Naor, M. (2001). Optimal aggregation algorithms for middleware. In Proceedings of the twentieth ACM symposium on principles of database systems (PODS’01) (pp. 102–113). Santa Barbara, California, USA. Fagin, R., Lotem, A., & Naor, M. (2001). Optimal aggregation algorithms for middleware. In Proceedings of the twentieth ACM symposium on principles of database systems (PODS’01) (pp. 102–113). Santa Barbara, California, USA.
Zurück zum Zitat Fleming, W. (1977). Functions of several variables, Addison-Wesley, 1965 (2nd ed.). New York: Springer. Fleming, W. (1977). Functions of several variables, Addison-Wesley, 1965 (2nd ed.). New York: Springer.
Zurück zum Zitat Habich, D., Lehner, W., & Hinneburg, A. (2005). Optimizing multiple top-K queries over joins. In Proceedings of the 17th international conference on scientific and statistical database management (pp. 195–204). Santa Barbara, CA, USA. Habich, D., Lehner, W., & Hinneburg, A. (2005). Optimizing multiple top-K queries over joins. In Proceedings of the 17th international conference on scientific and statistical database management (pp. 195–204). Santa Barbara, CA, USA.
Zurück zum Zitat Hristidis, V., Koudas, N., & Papakonstantinou, Y. (2001). PREFER: A system for the efficient execution of multi-parametric ranked queries. In Proceedings of the 2001 ACM international conference on management of data (SIGMOD’01) (pp. 259–270). Santa Barbara, California, USA. Hristidis, V., Koudas, N., & Papakonstantinou, Y. (2001). PREFER: A system for the efficient execution of multi-parametric ranked queries. In Proceedings of the 2001 ACM international conference on management of data (SIGMOD’01) (pp. 259–270). Santa Barbara, California, USA.
Zurück zum Zitat Ilyas, I., Aref, W., & Elmagarmid, A. (2002). Joining ranked inputs in practice. In Proceedings of 28th international conference on very large data bases (VLDB’02) (pp. 950–961). Hong Kong, China. Ilyas, I., Aref, W., & Elmagarmid, A. (2002). Joining ranked inputs in practice. In Proceedings of 28th international conference on very large data bases (VLDB’02) (pp. 950–961). Hong Kong, China.
Zurück zum Zitat Ilyas, I., Shah, R., Aref, W., Vitter, J., & Elmagarmid, A. (2004b). Rank-aware query optimization. In Proceedings ACM international conference on management of data (SIGMOD’04) (pp. 203–214). Paris, France. Ilyas, I., Shah, R., Aref, W., Vitter, J., & Elmagarmid, A. (2004b). Rank-aware query optimization. In Proceedings ACM international conference on management of data (SIGMOD’04) (pp. 203–214). Paris, France.
Zurück zum Zitat Lee, J., Kim, D., & Chung, C. (1999). Multi-dimensional selectivity estimation using compressed histogram information. In Proceedings ACM international conference on management of data (SIGMOD’99) (pp. 205–214). Philadelphia, Pennsylvania, USA. Lee, J., Kim, D., & Chung, C. (1999). Multi-dimensional selectivity estimation using compressed histogram information. In Proceedings ACM international conference on management of data (SIGMOD’99) (pp. 205–214). Philadelphia, Pennsylvania, USA.
Zurück zum Zitat Li, C., Chang, K., Ilyas, I., & Song, S. (2005). RankSQL, query algebra and optimization for relational top-k queries. In Proceedings ACM international conference on management of data (SIGMOD’05) (pp. 131–142). Baltimore, Maryland, USA. Li, C., Chang, K., Ilyas, I., & Song, S. (2005). RankSQL, query algebra and optimization for relational top-k queries. In Proceedings ACM international conference on management of data (SIGMOD’05) (pp. 131–142). Baltimore, Maryland, USA.
Zurück zum Zitat Michel, S., Triantafillou, P., & Weikum, G. (2005). KLEE: A framework for distributed top-k query algorithms. In Proceedings of the 31st international conference on very large data bases (VLDB’05) (pp. 637–648). Trondheim, Norway. Michel, S., Triantafillou, P., & Weikum, G. (2005). KLEE: A framework for distributed top-k query algorithms. In Proceedings of the 31st international conference on very large data bases (VLDB’05) (pp. 637–648). Trondheim, Norway.
Zurück zum Zitat Motro, A. (1988). VAGUE: A user interface to relational databases that permits vague queries. ACM Transactions on Office Information Systems, 6(3), 187–214. doi:10.1145/45945.48027.CrossRef Motro, A. (1988). VAGUE: A user interface to relational databases that permits vague queries. ACM Transactions on Office Information Systems, 6(3), 187–214. doi:10.​1145/​45945.​48027.CrossRef
Zurück zum Zitat Silberschatz, A., Korth, H. F., & Sudarshan, S. (2002). Database system concepts (4th ed.). New York: McGraw-Hill. Silberschatz, A., Korth, H. F., & Sudarshan, S. (2002). Database system concepts (4th ed.). New York: McGraw-Hill.
Zurück zum Zitat Soliman, M. A., Chang, K. C.-C., & Ilyas, I. F. (2007). Top-k query processing in uncertain databases. In Proceedings of the 2007 IEEE 23rd international conference on data engineering (ICDE’07) (pp. 896–905). Istanbul, Turkey. Soliman, M. A., Chang, K. C.-C., & Ilyas, I. F. (2007). Top-k query processing in uncertain databases. In Proceedings of the 2007 IEEE 23rd international conference on data engineering (ICDE’07) (pp. 896–905). Istanbul, Turkey.
Zurück zum Zitat Theobald, M., Weikum, G., & Schenkel, R. (2004). Top-k query evaluation with probabilistic guarantees. In Proceedings of the thirtieth international conference on very large data bases (VLDB’04) (pp. 648–659). Toronto, Canada. Theobald, M., Weikum, G., & Schenkel, R. (2004). Top-k query evaluation with probabilistic guarantees. In Proceedings of the thirtieth international conference on very large data bases (VLDB’04) (pp. 648–659). Toronto, Canada.
Zurück zum Zitat Vlachou, A., Doulkeridis, C., Nørvåg, K., & Vazirgiannis, M. (2008). On efficient top-k query processing in highly distributed environments. In Proceedings ACM international conference on management of data (SIGMOD’08) (pp. 753–764). Vancouver, BC, Canada. Vlachou, A., Doulkeridis, C., Nørvåg, K., & Vazirgiannis, M. (2008). On efficient top-k query processing in highly distributed environments. In Proceedings ACM international conference on management of data (SIGMOD’08) (pp. 753–764). Vancouver, BC, Canada.
Zurück zum Zitat Xin, D., Han, J., Cheng, H., & Li, X. (2006). Answering top-k queries with multi-dimensional selections: The ranking cube approach. In Proceedings of 29th international conference on very large data bases (VLDB’06) (pp. 463–474). Seoul, Korea. Xin, D., Han, J., Cheng, H., & Li, X. (2006). Answering top-k queries with multi-dimensional selections: The ranking cube approach. In Proceedings of 29th international conference on very large data bases (VLDB’06) (pp. 463–474). Seoul, Korea.
Zurück zum Zitat Xin, D., Han, J., & Chang, K. C.-C. (2007). Progressive and selective merge. computing top-k with ad-hoc ranking functions. In Proceedings ACM international conference on management of data (SIGMOD’07) (pp. 103–114). Beijing, China. Xin, D., Han, J., & Chang, K. C.-C. (2007). Progressive and selective merge. computing top-k with ad-hoc ranking functions. In Proceedings ACM international conference on management of data (SIGMOD’07) (pp. 103–114). Beijing, China.
Zurück zum Zitat Yi, K., Yu, H., Yang, J., Xia, G., & Chen, Y. (2003). Efficient maintenance of materialized top-k views. In Proceedings of the 19th international conference on data engineering (ICDE’03) (pp. 189–200). Bangalore, India. Yi, K., Yu, H., Yang, J., Xia, G., & Chen, Y. (2003). Efficient maintenance of materialized top-k views. In Proceedings of the 19th international conference on data engineering (ICDE’03) (pp. 189–200). Bangalore, India.
Zurück zum Zitat Yiu, M. L., & Mamoulis, N. (2007). Efficient processing of top-k dominating queries on multi-dimensional data. In Proceedings of 33rd international conference on very large data bases (VLDB’07) (pp. 483–494). Vienna, Austria. Yiu, M. L., & Mamoulis, N. (2007). Efficient processing of top-k dominating queries on multi-dimensional data. In Proceedings of 33rd international conference on very large data bases (VLDB’07) (pp. 483–494). Vienna, Austria.
Zurück zum Zitat Yu, C., Philip, G., & Meng, W. (2003). Distributed top-N query processing with possibly uncooperative local systems. In Proceedings of 29th international conference on very large data bases (VLDB’03) (pp. 117–128). Berlin, Germany. Yu, C., Philip, G., & Meng, W. (2003). Distributed top-N query processing with possibly uncooperative local systems. In Proceedings of 29th international conference on very large data bases (VLDB’03) (pp. 117–128). Berlin, Germany.
Zurück zum Zitat Yu, C., Sharma, P., Meng, W., & Qin, Y. (2001). Database selection for processing k nearest neighbors queries in distributed environments. In ACM/IEEE joint conference on digital libraries (JCDL’01) (pp. 215–222). Roanoke, Virginia, USA. Yu, C., Sharma, P., Meng, W., & Qin, Y. (2001). Database selection for processing k nearest neighbors queries in distributed environments. In ACM/IEEE joint conference on digital libraries (JCDL’01) (pp. 215–222). Roanoke, Virginia, USA.
Zurück zum Zitat Zhu, L., & Meng, W. (2004). Learning-based top-N selection query evaluation over relational databases. In Advances in web-age information management: 5th international conference (WAIM’04) (pp. 197–207). Dalian, China. Zhu, L., & Meng, W. (2004). Learning-based top-N selection query evaluation over relational databases. In Advances in web-age information management: 5th international conference (WAIM’04) (pp. 197–207). Dalian, China.
Metadaten
Titel
Processing top-N relational queries by learning
verfasst von
Liang Zhu
Weiyi Meng
Chunnian Liu
Wenzhu Yang
Dazhong Liu
Publikationsdatum
01.02.2010
Verlag
Springer US
Erschienen in
Journal of Intelligent Information Systems / Ausgabe 1/2010
Print ISSN: 0925-9902
Elektronische ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-009-0078-7

Premium Partner