Skip to main content
Erschienen in: The Journal of Supercomputing 11/2017

08.05.2017

Exploring correlation for fast skyline computation

verfasst von: Boseon Yu, Wonik Choi, Ling Liu

Erschienen in: The Journal of Supercomputing | Ausgabe 11/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Scaling skyline queries over high-dimensional datasets remains to be challenging due to the fact that most existing algorithms assume dimensional independence when establishing the worst-case complexity by discarding correlation distribution. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets with three novel features: First, it offers a fast hash-based method to prune non-skyline points by utilizing data correlation characteristics and speed up the overall skyline evaluation for correlated datasets. Second, we develop \(HashSkyline_{GPU}\), which can dramatically reduce the response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, the HashSkyline approach uses the pivot cell-based mechanism combined with the correlation threshold to determine the correlation distribution characteristics for a given dataset, enabling adaptive configuration of HashSkyline for skyline query evaluation by auto-switching of \(HashSkyline_{CPU}\) and \(HashSkyline_{GPU}\). We evaluate the validity of HashSkyline using both synthetic datasets and real datasets. Our experiments show that HashSkyline consumes significantly less pre-processing cost and achieves significantly higher overall query performance, compared to existing state-of-the-art algorithms.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Balke WT, Güntzer U, Zheng JX (2004) Efficient distributed skylining for web information systems. In: International Conference on Extending Database Technology, pp. 256–273. Springer Balke WT, Güntzer U, Zheng JX (2004) Efficient distributed skylining for web information systems. In: International Conference on Extending Database Technology, pp. 256–273. Springer
2.
Zurück zum Zitat Bartolini I, Ciaccia P, Patella M (2008) Efficient sort-based skyline evaluation. ACM Trans Database Syst 33(4):31CrossRef Bartolini I, Ciaccia P, Patella M (2008) Efficient sort-based skyline evaluation. ACM Trans Database Syst 33(4):31CrossRef
3.
Zurück zum Zitat Bentley JL, Stanat DF, Williams EH (1977) The complexity of finding fixed-radius near neighbors. Inform Process Lett 6(6):209–212CrossRefMATHMathSciNet Bentley JL, Stanat DF, Williams EH (1977) The complexity of finding fixed-radius near neighbors. Inform Process Lett 6(6):209–212CrossRefMATHMathSciNet
4.
Zurück zum Zitat Bøgh KS, Assent I, Magnani M (2013) Efficient GPU-based skyline computation. In: Proceedings of the Ninth International Workshop on Data Management on New Hardware, p 5. ACM Bøgh KS, Assent I, Magnani M (2013) Efficient GPU-based skyline computation. In: Proceedings of the Ninth International Workshop on Data Management on New Hardware, p 5. ACM
5.
Zurück zum Zitat Bøgh KS, Chester S, Assent I (2016) Skyalign: a portable, work-efficient skyline algorithm for multicore and gpu architectures. VLDB J 25(6):817–841CrossRef Bøgh KS, Chester S, Assent I (2016) Skyalign: a portable, work-efficient skyline algorithm for multicore and gpu architectures. VLDB J 25(6):817–841CrossRef
6.
Zurück zum Zitat Borzsony S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of the 17th International Conference on Data Engineering, pp 421–430. IEEE Borzsony S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of the 17th International Conference on Data Engineering, pp 421–430. IEEE
7.
Zurück zum Zitat Chan CY, Eng PK, Tan KL (2005) Stratified computation of skylines with partially-ordered domains. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of data, pp 203–214. ACM Chan CY, Eng PK, Tan KL (2005) Stratified computation of skylines with partially-ordered domains. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of data, pp 203–214. ACM
8.
Zurück zum Zitat Cui B, Lu H, Xu Q, Chen L, Dai Y, Zhou Y (2008) Parallel distributed processing of constrained skyline queries by filtering. In: 2008 IEEE 24th International Conference on Data Engineering, pp 546–555. IEEE Cui B, Lu H, Xu Q, Chen L, Dai Y, Zhou Y (2008) Parallel distributed processing of constrained skyline queries by filtering. In: 2008 IEEE 24th International Conference on Data Engineering, pp 546–555. IEEE
9.
Zurück zum Zitat Dean J, Ghemawat S (2008) Mapreduce: simplified data processing on large clusters. Commun ACM 51(1):107–113CrossRef Dean J, Ghemawat S (2008) Mapreduce: simplified data processing on large clusters. Commun ACM 51(1):107–113CrossRef
10.
Zurück zum Zitat Dellis E, Seeger B (2007) Efficient computation of reverse skyline queries. In: Proceedings of the 33rd International Conference on Very Large Data Bases, pp 291–302. VLDB Endowment Dellis E, Seeger B (2007) Efficient computation of reverse skyline queries. In: Proceedings of the 33rd International Conference on Very Large Data Bases, pp 291–302. VLDB Endowment
11.
Zurück zum Zitat Dellis E, Vlachou A, Vladimirskiy I, Seeger B, Theodoridis Y (2006) Constrained subspace skyline computation. In: Proceedings of the 15th ACM International Conference on Information and Knowledge Management, pp 415–424. ACM Dellis E, Vlachou A, Vladimirskiy I, Seeger B, Theodoridis Y (2006) Constrained subspace skyline computation. In: Proceedings of the 15th ACM International Conference on Information and Knowledge Management, pp 415–424. ACM
12.
Zurück zum Zitat Deng K, Zhou X, Tao H (2007) Multi-source skyline query processing in road networks. In: 2007 IEEE 23rd International Conference on Data Engineering, pp 796–805. IEEE Deng K, Zhou X, Tao H (2007) Multi-source skyline query processing in road networks. In: 2007 IEEE 23rd International Conference on Data Engineering, pp 796–805. IEEE
13.
Zurück zum Zitat Godfrey P, Shipley R, Gryz J (2005) Maximal vector computation in large data sets. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp 229–240. VLDB Endowment Godfrey P, Shipley R, Gryz J (2005) Maximal vector computation in large data sets. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp 229–240. VLDB Endowment
14.
Zurück zum Zitat Hjaltason GR, Samet H (1999) Distance browsing in spatial databases. ACM Trans Database Syst 24(2):265–318CrossRef Hjaltason GR, Samet H (1999) Distance browsing in spatial databases. ACM Trans Database Syst 24(2):265–318CrossRef
15.
Zurück zum Zitat Hsueh YL, Zimmermann R, Ku WS (2008) Efficient updates for continuous skyline computations. In: International Conference on Database and Expert Systems Applications, pp 419–433. Springer Hsueh YL, Zimmermann R, Ku WS (2008) Efficient updates for continuous skyline computations. In: International Conference on Database and Expert Systems Applications, pp 419–433. Springer
16.
Zurück zum Zitat Huang Z, Lu H, Ooi BC, Tung AK (2006) Continuous skyline queries for moving objects. IEEE Trans Knowl Data Eng 18(12):1645–1658CrossRef Huang Z, Lu H, Ooi BC, Tung AK (2006) Continuous skyline queries for moving objects. IEEE Trans Knowl Data Eng 18(12):1645–1658CrossRef
17.
Zurück zum Zitat Jan C, Parke G, Jarek G, Dongming L (2003) Skyline with presorting. In: 19th International Conference on Data Engineering, Bangalore, India, p 717 Jan C, Parke G, Jarek G, Dongming L (2003) Skyline with presorting. In: 19th International Conference on Data Engineering, Bangalore, India, p 717
18.
Zurück zum Zitat Köhler H, Yang J, Zhou, X (2011) Efficient parallel skyline processing using hyperplane projections. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pp 85–96. ACM Köhler H, Yang J, Zhou, X (2011) Efficient parallel skyline processing using hyperplane projections. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pp 85–96. ACM
19.
Zurück zum Zitat Kossmann D, Ramsak F, Rost S (2002) Shooting stars in the sky: an online algorithm for skyline queries. In: Proceedings of the 28th International Conference on Very Large Data Bases, pp. 275–286. VLDB Endowment Kossmann D, Ramsak F, Rost S (2002) Shooting stars in the sky: an online algorithm for skyline queries. In: Proceedings of the 28th International Conference on Very Large Data Bases, pp. 275–286. VLDB Endowment
20.
Zurück zum Zitat Lee J, Hwang Sw (2010) Bskytree: scalable skyline computation using a balanced pivot selection. In: Proceedings of the 13th International Conference on Extending Database Technology, pp 195–206. ACM Lee J, Hwang Sw (2010) Bskytree: scalable skyline computation using a balanced pivot selection. In: Proceedings of the 13th International Conference on Extending Database Technology, pp 195–206. ACM
21.
Zurück zum Zitat Lee J, Hwang Sw (2010) Qskycube: efficient skycube computation using point-based space partitioning. Proc VLDB Endow 4(3):185–196CrossRef Lee J, Hwang Sw (2010) Qskycube: efficient skycube computation using point-based space partitioning. Proc VLDB Endow 4(3):185–196CrossRef
22.
Zurück zum Zitat Lee KC, Lee WC, Zheng B, Li H, Tian Y (2010) Z-sky: an efficient skyline query processing framework based on z-order. VLDB J 19(3):333–362CrossRef Lee KC, Lee WC, Zheng B, Li H, Tian Y (2010) Z-sky: an efficient skyline query processing framework based on z-order. VLDB J 19(3):333–362CrossRef
23.
Zurück zum Zitat Lee KC, Zheng B, Li H, Lee WC (2007) Approaching the skyline in z order. In: Proceedings of the 33rd International Conference on Very Large Data Bases, pp 279–290. VLDB Endowment Lee KC, Zheng B, Li H, Lee WC (2007) Approaching the skyline in z order. In: Proceedings of the 33rd International Conference on Very Large Data Bases, pp 279–290. VLDB Endowment
24.
Zurück zum Zitat Lian X, Chen L (2008) Monochromatic and bichromatic reverse skyline search over uncertain databases. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp 213–226. ACM Lian X, Chen L (2008) Monochromatic and bichromatic reverse skyline search over uncertain databases. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp 213–226. ACM
25.
Zurück zum Zitat Liu B, Chan CY (2010) Zinc: efficient indexing for skyline computation. Proc VLDB Endow 4(3):197–207CrossRef Liu B, Chan CY (2010) Zinc: efficient indexing for skyline computation. Proc VLDB Endow 4(3):197–207CrossRef
26.
27.
Zurück zum Zitat Papadias D, Tao Y, Fu G, Seeger B (2003) An optimal and progressive algorithm for skyline queries. In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pp 467–478. ACM Papadias D, Tao Y, Fu G, Seeger B (2003) An optimal and progressive algorithm for skyline queries. In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pp 467–478. ACM
28.
Zurück zum Zitat Park S, Kim T, Park J, Kim J, Im H (2009) Parallel skyline computation on multicore architectures. In: 2009 IEEE 25th International Conference on Data Engineering, pp 760–771. IEEE Park S, Kim T, Park J, Kim J, Im H (2009) Parallel skyline computation on multicore architectures. In: 2009 IEEE 25th International Conference on Data Engineering, pp 760–771. IEEE
29.
Zurück zum Zitat Pei J, Jin W, Ester M, Tao Y (2005) Catching the best views of skyline: a semantic approach based on decisive subspaces. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp 253–264. VLDB Endowment Pei J, Jin W, Ester M, Tao Y (2005) Catching the best views of skyline: a semantic approach based on decisive subspaces. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp 253–264. VLDB Endowment
30.
Zurück zum Zitat Roussopoulos N, Kelley S, Vincent F (1995) Nearest neighbor queries. In: ACM sigmod record, vol 24, pp 71–79. ACM Roussopoulos N, Kelley S, Vincent F (1995) Nearest neighbor queries. In: ACM sigmod record, vol 24, pp 71–79. ACM
31.
Zurück zum Zitat Sacharidis D, Papadopoulos S, Papadias D (2009) Topologically sorted skylines for partially ordered domains. In: IEEE 25th International Conference on Data Engineering, ICDE’09, pp 1072–1083. IEEE Sacharidis D, Papadopoulos S, Papadias D (2009) Topologically sorted skylines for partially ordered domains. In: IEEE 25th International Conference on Data Engineering, ICDE’09, pp 1072–1083. IEEE
32.
Zurück zum Zitat Shang H, Kitsuregawa M (2013) Skyline operator on anti-correlated distributions. Proc VLDB Endow 6(9):649–660CrossRef Shang H, Kitsuregawa M (2013) Skyline operator on anti-correlated distributions. Proc VLDB Endow 6(9):649–660CrossRef
33.
Zurück zum Zitat Sharifzadeh M, Shahabi C (2006) The spatial skyline queries. In: Proceedings of the 32nd International Conference on Very Large Data Bases, pp 751–762. VLDB Endowment Sharifzadeh M, Shahabi C (2006) The spatial skyline queries. In: Proceedings of the 32nd International Conference on Very Large Data Bases, pp 751–762. VLDB Endowment
34.
Zurück zum Zitat Tan KL, Eng PK, Ooi BC et al (2001) Efficient progressive skyline computation. VLDB 1:301–310 Tan KL, Eng PK, Ooi BC et al (2001) Efficient progressive skyline computation. VLDB 1:301–310
35.
Zurück zum Zitat Tao Y, Xiao X, Pei J (2006) Subsky: Efficient computation of skylines in subspaces. In: 22nd International Conference on Data Engineering (ICDE’06), pp 65–65. IEEE Tao Y, Xiao X, Pei J (2006) Subsky: Efficient computation of skylines in subspaces. In: 22nd International Conference on Data Engineering (ICDE’06), pp 65–65. IEEE
36.
Zurück zum Zitat Tiakas E, Papadopoulos AN, Manolopoulos Y (2011) Progressive processing of subspace dominating queries. VLDB J Int J Very Large Data Bases 20(6):921–948CrossRef Tiakas E, Papadopoulos AN, Manolopoulos Y (2011) Progressive processing of subspace dominating queries. VLDB J Int J Very Large Data Bases 20(6):921–948CrossRef
37.
Zurück zum Zitat Trimponias G, Bartolini I, Papadias D, Yang Y (2013) Skyline processing on distributed vertical decompositions. IEEE Trans Knowl Data Eng 25(4):850–862CrossRef Trimponias G, Bartolini I, Papadias D, Yang Y (2013) Skyline processing on distributed vertical decompositions. IEEE Trans Knowl Data Eng 25(4):850–862CrossRef
38.
Zurück zum Zitat Vlachou A, Doulkeridis C, Kotidis Y (2008) Angle-based space partitioning for efficient parallel skyline computation. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp 227–238. ACM Vlachou A, Doulkeridis C, Kotidis Y (2008) Angle-based space partitioning for efficient parallel skyline computation. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp 227–238. ACM
39.
Zurück zum Zitat Wu P, Zhang C, Feng Y, Zhao BY, Agrawal D, El Abbadi A (2006) Parallelizing skyline queries for scalable distribution. In: International Conference on Extending Database Technology, pp 112–130. Springer Wu P, Zhang C, Feng Y, Zhao BY, Agrawal D, El Abbadi A (2006) Parallelizing skyline queries for scalable distribution. In: International Conference on Extending Database Technology, pp 112–130. Springer
40.
Zurück zum Zitat Yuan Y, Lin X, Liu Q, Wang W, Yu JX (2005) Zhang, Q.: Efficient computation of the skyline cube. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp 241–252. VLDB Endowment Yuan Y, Lin X, Liu Q, Wang W, Yu JX (2005) Zhang, Q.: Efficient computation of the skyline cube. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp 241–252. VLDB Endowment
41.
Zurück zum Zitat Zhang Z, Cheng R, Papadias D, Tung AK (2009) Minimizing the communication cost for continuous skyline maintenance. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp 495–508. ACM Zhang Z, Cheng R, Papadias D, Tung AK (2009) Minimizing the communication cost for continuous skyline maintenance. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp 495–508. ACM
Metadaten
Titel
Exploring correlation for fast skyline computation
verfasst von
Boseon Yu
Wonik Choi
Ling Liu
Publikationsdatum
08.05.2017
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 11/2017
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-017-2064-0

Weitere Artikel der Ausgabe 11/2017

The Journal of Supercomputing 11/2017 Zur Ausgabe