Skip to main content
Erschienen in: Knowledge and Information Systems 2/2014

01.11.2014 | Regular Paper

Parallelizing skyline queries over uncertain data streams with sliding window partitioning and grid index

verfasst von: Xiaoyong Li, Yijie Wang, Xiaoling Li, Yuan Wang

Erschienen in: Knowledge and Information Systems | Ausgabe 2/2014

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Skyline query processing over uncertain data streams has attracted considerable attention in database community recently, due to its importance in helping users make intelligent decisions over complex data in many real applications. Although lots of recent efforts have been conducted to the skyline computation over data streams in a centralized environment typically with one processor, they cannot be well adapted to the skyline queries over complex uncertain streaming data, due to the computational complexity of the query and the limited processing capability. Furthermore, none of the existing studies on parallel skyline computation can effectively address the skyline query problem over uncertain data streams, as they are all developed to address the problem of parallel skyline queries over static certain data sets. In this paper, we formally define the parallel query problem over uncertain data streams with the sliding window streaming model. Particularly, for the first time, we propose an effective framework, named distributed parallel framework to address the problem based on the sliding window partitioning. Furthermore, we propose an efficient approach (parallel streaming skyline) to further optimize the parallel skyline computation with an optimized streaming item mapping strategy and the grid index. Extensive experiments with real deployment over synthetic and real data are conducted to demonstrate the effectiveness and efficiency of the proposed techniques.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Afrati FN, Koutris P, Suciu D, Ullman JD (2012) Parallel skyline queries. In: Proceedings of international conference on data theory (ICDT). ACM Afrati FN, Koutris P, Suciu D, Ullman JD (2012) Parallel skyline queries. In: Proceedings of international conference on data theory (ICDT). ACM
2.
Zurück zum Zitat Atallah M, Qi Y (2009) Computing all skyline probabilities for uncertain data. In: Proceedings of the ACM symposium on principles of database systems (PODS), pp 279–287 Atallah M, Qi Y (2009) Computing all skyline probabilities for uncertain data. In: Proceedings of the ACM symposium on principles of database systems (PODS), pp 279–287
3.
Zurück zum Zitat Babcock B, Babu S, Datar M, Motwani R, Widom J (2002) Models and issues in data stream. In: Proceedings of ACM Symposium on principles of database systems (PODS), pp 1–16 Babcock B, Babu S, Datar M, Motwani R, Widom J (2002) Models and issues in data stream. In: Proceedings of ACM Symposium on principles of database systems (PODS), pp 1–16
4.
Zurück zum Zitat Böhm C, Fiedler F, Oswald A, Plant C, Wackersreuther B (2009) Probabilistic skyline queries. In: Proceedings of ACM CIKM, pp 651–660 Böhm C, Fiedler F, Oswald A, Plant C, Wackersreuther B (2009) Probabilistic skyline queries. In: Proceedings of ACM CIKM, pp 651–660
5.
Zurück zum Zitat Börzsönyi S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of the 17th international conference on data engineering (ICDE), pp 421–430 Börzsönyi S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of the 17th international conference on data engineering (ICDE), pp 421–430
6.
Zurück zum Zitat Chan C, Jagadish H, Tan K, Tung A, Zhang Z (2006) Finding k-dominant skylines in high dimensional space. In: Proceedings of the international conference on management of data (SIGMOD), ACM, pp 503–514 Chan C, Jagadish H, Tan K, Tung A, Zhang Z (2006) Finding k-dominant skylines in high dimensional space. In: Proceedings of the international conference on management of data (SIGMOD), ACM, pp 503–514
7.
Zurück zum Zitat Chen L, Lian X (2008) Dynamic skyline queries in metric spaces. In: Proceedings of the 11th international conference on Extending database technology: advances in database technology (EDBT), ACM. pp 333–343 Chen L, Lian X (2008) Dynamic skyline queries in metric spaces. In: Proceedings of the 11th international conference on Extending database technology: advances in database technology (EDBT), ACM. pp 333–343
8.
Zurück zum Zitat Chen L, Özsu M, Oria V (2005) Robust and fast similarity search for moving object trajectories. In: Proceedings of ACM SIGMOD, pp 491–502 Chen L, Özsu M, Oria V (2005) Robust and fast similarity search for moving object trajectories. In: Proceedings of ACM SIGMOD, pp 491–502
9.
Zurück zum Zitat Cheng R, Kalashnikov D, Prabhakar S (2004) Querying imprecise data in moving object environments. IEEE Trans Knowl Data Eng (TKDE) 16(9):1112–1127CrossRef Cheng R, Kalashnikov D, Prabhakar S (2004) Querying imprecise data in moving object environments. IEEE Trans Knowl Data Eng (TKDE) 16(9):1112–1127CrossRef
10.
Zurück zum Zitat Dalvi N, Suciu D (2007) Efficient query evaluation on probabilistic databases. VLDB J 16(4):523–544CrossRef Dalvi N, Suciu D (2007) Efficient query evaluation on probabilistic databases. VLDB J 16(4):523–544CrossRef
11.
Zurück zum Zitat Dehne F, Fabri A, Rau-Chaplin A (1993) Scalable parallel geometric algorithms for coarse grained multicomputers. In: Proceedings of the ninth annual symposium on computational geometry, ACM. pp 298–307 Dehne F, Fabri A, Rau-Chaplin A (1993) Scalable parallel geometric algorithms for coarse grained multicomputers. In: Proceedings of the ninth annual symposium on computational geometry, ACM. pp 298–307
12.
Zurück zum Zitat Diao Y, Li B, Liu A, Peng L, Sutton C, Tran T, Zink M (2009) Capturing data uncertainty in high-volume stream processing. In: Conference on innovative data systems research (CIDR) Diao Y, Li B, Liu A, Peng L, Sutton C, Tran T, Zink M (2009) Capturing data uncertainty in high-volume stream processing. In: Conference on innovative data systems research (CIDR)
13.
Zurück zum Zitat Ding X, Jin H (2012) Efficient and progressive algorithms for distributed skyline queries over uncertain data. IEEE Trans Knowl Data Eng (TKDE) 24(8):1448–1462CrossRef Ding X, Jin H (2012) Efficient and progressive algorithms for distributed skyline queries over uncertain data. IEEE Trans Knowl Data Eng (TKDE) 24(8):1448–1462CrossRef
14.
Zurück zum Zitat Ding X, Lian X, Chen L, Jin H (2012) Continuous monitoring of skylines over uncertain data streams. Inf Sci 184(1):196–214 Ding X, Lian X, Chen L, Jin H (2012) Continuous monitoring of skylines over uncertain data streams. Inf Sci 184(1):196–214
15.
Zurück zum Zitat Gehrke J, Korn F, Srivastava D (2001) On computing correlated aggregates over continual data stream. ACM SIGMOD Rec 30(2):13–24CrossRef Gehrke J, Korn F, Srivastava D (2001) On computing correlated aggregates over continual data stream. ACM SIGMOD Rec 30(2):13–24CrossRef
16.
Zurück zum Zitat Hose K, Vlachou A (2012) A survey of skyline processing in highly distributed environments. VLDB J 21(3):359–384CrossRef Hose K, Vlachou A (2012) A survey of skyline processing in highly distributed environments. VLDB J 21(3):359–384CrossRef
17.
Zurück zum Zitat Huang Z, Sun S, Wang W (2010) Efficient mining of skyline objects in subspaces over data streams. Knowl Inf Syst (KAIS) 22(2):159–183CrossRef Huang Z, Sun S, Wang W (2010) Efficient mining of skyline objects in subspaces over data streams. Knowl Inf Syst (KAIS) 22(2):159–183CrossRef
18.
Zurück zum Zitat Jayram T, McGregor A, Muthukrishnan S, Vee E (2008) Estimating statistical aggregates on probabilistic data streams. ACM Trans Database Syst (TODS) 33(4):26CrossRef Jayram T, McGregor A, Muthukrishnan S, Vee E (2008) Estimating statistical aggregates on probabilistic data streams. ACM Trans Database Syst (TODS) 33(4):26CrossRef
19.
Zurück zum Zitat Jeffery S, Franklin M, Garofalakis M (2008) An adaptive RFID middleware for supporting metaphysical data independence. VLDB J 17(2):265–289CrossRef Jeffery S, Franklin M, Garofalakis M (2008) An adaptive RFID middleware for supporting metaphysical data independence. VLDB J 17(2):265–289CrossRef
20.
Zurück zum Zitat Jiang B, Pei J (2009) Online interval skyline queries on time series. In: Proceedings of the 25th international conference on data engineering (ICDE), IEEE. pp. 1036–1047 Jiang B, Pei J (2009) Online interval skyline queries on time series. In: Proceedings of the 25th international conference on data engineering (ICDE), IEEE. pp. 1036–1047
21.
Zurück zum Zitat Kanagal B, Deshpande A (2008) Online filtering, smoothing and probabilistic modeling of streaming data. In: Proceedings of the 24th international conference on data engineering (ICDE), IEEE Kanagal B, Deshpande A (2008) Online filtering, smoothing and probabilistic modeling of streaming data. In: Proceedings of the 24th international conference on data engineering (ICDE), IEEE
22.
Zurück zum Zitat Khalefa M, Mokbel M, Levandoski J (2010) Skyline query processing for uncertain data. In: Proceedings of the 19th ACM international conference on information and knowledge management (CIKM), pp 1293–1296 Khalefa M, Mokbel M, Levandoski J (2010) Skyline query processing for uncertain data. In: Proceedings of the 19th ACM international conference on information and knowledge management (CIKM), pp 1293–1296
23.
Zurück zum Zitat Köhler H, Yang J, Zhou X (2011) Efficient parallel skyline processing using hyperplane projections. In: Proceedings of the international conference on Management of data (SIGMOD), ACM. pp 85–96 Köhler H, Yang J, Zhou X (2011) Efficient parallel skyline processing using hyperplane projections. In: Proceedings of the international conference on Management of data (SIGMOD), ACM. pp 85–96
24.
Zurück zum Zitat Kontaki M, Papadopoulos A, Manolopoulos Y (2008) Continuous k-dominant skyline computation on multidimensional data streams. In: Proceedings of the ACM symposium on applied computing, ACM. pp 956–960 Kontaki M, Papadopoulos A, Manolopoulos Y (2008) Continuous k-dominant skyline computation on multidimensional data streams. In: Proceedings of the ACM symposium on applied computing, ACM. pp 956–960
25.
Zurück zum Zitat Kontaki M, Papadopoulos A, Manolopoulos Y (2011) Continuous top-k dominating queries. In: IEEE transactions on knowledge and data engineering (TKDE) Kontaki M, Papadopoulos A, Manolopoulos Y (2011) Continuous top-k dominating queries. In: IEEE transactions on knowledge and data engineering (TKDE)
26.
Zurück zum Zitat Kurose J, Lyons E, McLaughlin D, Pepyne D, Philips B, Westbrook D, Zink M (2006) An end-user-responsive sensor network architecture for hazardous weather detection, prediction and response. In: Proceedings of AINTEC, pp 1–15 Kurose J, Lyons E, McLaughlin D, Pepyne D, Philips B, Westbrook D, Zink M (2006) An end-user-responsive sensor network architecture for hazardous weather detection, prediction and response. In: Proceedings of AINTEC, pp 1–15
27.
Zurück zum Zitat Lee K, Zheng B, Li H, Lee W (2007) Approaching the skyline in z order. In: Proceedings of the 33rd international conference on very large data bases (VLDB), VLDB Endowment. pp 279–290 Lee K, Zheng B, Li H, Lee W (2007) Approaching the skyline in z order. In: Proceedings of the 33rd international conference on very large data bases (VLDB), VLDB Endowment. pp 279–290
28.
Zurück zum Zitat Lian X, Chen L (2008) Monochromatic and bichromatic reverse skyline search over uncertain data. In: Proceedings of ACM SIGMOD, pp 213–226 Lian X, Chen L (2008) Monochromatic and bichromatic reverse skyline search over uncertain data. In: Proceedings of ACM SIGMOD, pp 213–226
29.
Zurück zum Zitat Lin X, Lu H, Xu J, Yu J (2004) Continuously maintaining quantile summaries of the most recent n elements over a data stream. In: Proceedings of IEEE ICDE, pp 362–373 Lin X, Lu H, Xu J, Yu J (2004) Continuously maintaining quantile summaries of the most recent n elements over a data stream. In: Proceedings of IEEE ICDE, pp 362–373
30.
Zurück zum Zitat Lin X, Yuan Y, Wang W, Lu H (2005) Stabbing the sky: efficient skyline computation over sliding windows. In: Proceedings of the 21st international conference on data engineering (ICDE), IEEE. pp 502–513 Lin X, Yuan Y, Wang W, Lu H (2005) Stabbing the sky: efficient skyline computation over sliding windows. In: Proceedings of the 21st international conference on data engineering (ICDE), IEEE. pp 502–513
31.
Zurück zum Zitat Lin X, Zhang Y, Zhang W, Cheema M (2011) Stochastic skyline operator. In: Proceedings of the 27th international conference on data engineering (ICDE), IEEE. pp 721–732 Lin X, Zhang Y, Zhang W, Cheema M (2011) Stochastic skyline operator. In: Proceedings of the 27th international conference on data engineering (ICDE), IEEE. pp 721–732
32.
Zurück zum Zitat Lu H, Zhou Y, Haustad J (2013) Efficient and scalable continuous skyline monitoring in two-tier streaming settings. Inf Syst 38(1):68–81CrossRef Lu H, Zhou Y, Haustad J (2013) Efficient and scalable continuous skyline monitoring in two-tier streaming settings. Inf Syst 38(1):68–81CrossRef
33.
Zurück zum Zitat Lu X, Wang H, Wang J, Xu J, Li D (2011) Internet-based virtual computing environment: beyond the data center as a computer. Future Gen Comput Syst (FGCS) 29:309–322CrossRef Lu X, Wang H, Wang J, Xu J, Li D (2011) Internet-based virtual computing environment: beyond the data center as a computer. Future Gen Comput Syst (FGCS) 29:309–322CrossRef
34.
35.
Zurück zum Zitat Park S, Kim T, Park J, Kim J, Im H (2009) Parallel skyline computation on multicore architectures. In: Proceedings of the 25th international conference on data engineering (ICDE), IEEE. pp. 760–771 Park S, Kim T, Park J, Kim J, Im H (2009) Parallel skyline computation on multicore architectures. In: Proceedings of the 25th international conference on data engineering (ICDE), IEEE. pp. 760–771
36.
Zurück zum Zitat Pei J, Jiang B, Lin X, Yuan Y (2007) Probabilistic skylines on uncertain data. In: Proceedings of the 33rd international conference on very large data bases (VLDB), pp 15–26 Pei J, Jiang B, Lin X, Yuan Y (2007) Probabilistic skylines on uncertain data. In: Proceedings of the 33rd international conference on very large data bases (VLDB), pp 15–26
37.
Zurück zum Zitat Ré C, Dalvi N, Suciu D (2007) Efficient top-k query evaluation on probabilistic data. In: Proceedings of the 23rd international conference on data engineering (ICDE), pp 886–895 Ré C, Dalvi N, Suciu D (2007) Efficient top-k query evaluation on probabilistic data. In: Proceedings of the 23rd international conference on data engineering (ICDE), pp 886–895
38.
Zurück zum Zitat Ré C, Letchner J, Balazinksa M, Suciu D (2008) Event queries on correlated probabilistic streams. In: Proceedings of ACM SIGMOD, pp 715–728 Ré C, Letchner J, Balazinksa M, Suciu D (2008) Event queries on correlated probabilistic streams. In: Proceedings of ACM SIGMOD, pp 715–728
39.
Zurück zum Zitat Rocha-Junior J, Vlachou A, Doulkeridis C, Nørvåg K (2009) Agids: A grid-based strategy for distributed skyline query processing. In: Proceedings of data management in grid and peer-to-peer systems (Globe) Rocha-Junior J, Vlachou A, Doulkeridis C, Nørvåg K (2009) Agids: A grid-based strategy for distributed skyline query processing. In: Proceedings of data management in grid and peer-to-peer systems (Globe)
40.
Zurück zum Zitat Rocha-Junior J, Vlachou A, Doulkeridis C, Nørvåg K (2011) Efficient execution plans for distributed skyline query processing. In: Proceedings of the 14th international conference on extending database technology (EDBT), ACM. pp 271–282 Rocha-Junior J, Vlachou A, Doulkeridis C, Nørvåg K (2011) Efficient execution plans for distributed skyline query processing. In: Proceedings of the 14th international conference on extending database technology (EDBT), ACM. pp 271–282
41.
Zurück zum Zitat Sharifzadeh M, Shahabi C (2006) The spatial skyline queries. In: Proceedings of the 32nd international conference on very large data bases (VLDB), VLDB Endowment, pp 751–762 Sharifzadeh M, Shahabi C (2006) The spatial skyline queries. In: Proceedings of the 32nd international conference on very large data bases (VLDB), VLDB Endowment, pp 751–762
42.
Zurück zum Zitat Sun S, Huang Z, Zhong H, Dai D, Liu H, Li J (2010) Efficient monitoring of skyline queries over distributed data streams. Knowl Inf Syst (KAIS) 25(3):575–606CrossRef Sun S, Huang Z, Zhong H, Dai D, Liu H, Li J (2010) Efficient monitoring of skyline queries over distributed data streams. Knowl Inf Syst (KAIS) 25(3):575–606CrossRef
43.
Zurück zum Zitat Tao Y, Papadias D (2006) Maintaining sliding window skylines on data streams. IEEE Trans Knowl Data Eng (TKDE) 18(3):377–391 Tao Y, Papadias D (2006) Maintaining sliding window skylines on data streams. IEEE Trans Knowl Data Eng (TKDE) 18(3):377–391
44.
Zurück zum Zitat Vlachou A, Doulkeridis C, Kotidis Y (2008) Angle-based space partitioning for efficient parallel skyline computation. In: Proceedings of the international conference on management of data (SIGMOD), ACM, pp 227–238 Vlachou A, Doulkeridis C, Kotidis Y (2008) Angle-based space partitioning for efficient parallel skyline computation. In: Proceedings of the international conference on management of data (SIGMOD), ACM, pp 227–238
45.
Zurück zum Zitat Wang S, Ooi B, Tung A, Xu L (2007) Efficient skyline query processing on peer-to-peer networks. In: Proceedings of the 23rd international conference on data engineering (ICDE), IEEE, pp 1126–1135 Wang S, Ooi B, Tung A, Xu L (2007) Efficient skyline query processing on peer-to-peer networks. In: Proceedings of the 23rd international conference on data engineering (ICDE), IEEE, pp 1126–1135
46.
Zurück zum Zitat Wang Y, Li S (2006) Research and performance evaluation of data replication technology in distributed storage systems. Comput Math Appl 51(11):1625–1632CrossRef Wang Y, Li S (2006) Research and performance evaluation of data replication technology in distributed storage systems. Comput Math Appl 51(11):1625–1632CrossRef
47.
Zurück zum Zitat Wang Y, Li X, Li X, Wang Y (2013) A survey of queries over uncertain data. Knowl Inf Syst (KAIS) 37(3):485–530CrossRef Wang Y, Li X, Li X, Wang Y (2013) A survey of queries over uncertain data. Knowl Inf Syst (KAIS) 37(3):485–530CrossRef
48.
Zurück zum Zitat Wu P, Zhang C, Feng Y, Zhao B, Agrawal D, El Abbadi A (2006) Parallelizing skyline queries for scalable distribution. In: Proceedings of the international conference on extending database technology: advances in database technology (EDBT), pp 112–130 Wu P, Zhang C, Feng Y, Zhao B, Agrawal D, El Abbadi A (2006) Parallelizing skyline queries for scalable distribution. In: Proceedings of the international conference on extending database technology: advances in database technology (EDBT), pp 112–130
49.
Zurück zum Zitat Xin J, Wang G, Chen L, Zhang X, Wang Z (2007) Continuously maintaining sliding window skylines in a sensor network. In: Proceedings of the international conference on database systems for advanced applications (DASFAA), pp 509–521 Xin J, Wang G, Chen L, Zhang X, Wang Z (2007) Continuously maintaining sliding window skylines in a sensor network. In: Proceedings of the international conference on database systems for advanced applications (DASFAA), pp 509–521
50.
51.
Zurück zum Zitat Zhang W, Lin X, Zhang Y, Wang W, Yu J (2009) Probabilistic skyline operator over sliding windows. In: Proceedings of international conference on data engineering (ICDE), pp 1060–1071 Zhang W, Lin X, Zhang Y, Wang W, Yu J (2009) Probabilistic skyline operator over sliding windows. In: Proceedings of international conference on data engineering (ICDE), pp 1060–1071
52.
Zurück zum Zitat Zhang Y, Zhang W, Lin X, Jiang B, Pei J (2011) Ranking uncertain sky: the probabilistic top-k skyline operator. Inf Syst 36(5):898–915CrossRef Zhang Y, Zhang W, Lin X, Jiang B, Pei J (2011) Ranking uncertain sky: the probabilistic top-k skyline operator. Inf Syst 36(5):898–915CrossRef
53.
Zurück zum Zitat Zhang Z, Cheng R, Papadias D, Tung A (2009) Minimizing the communication cost for continuous skyline maintenance. In: Proceedings of the international conference on management of data (SIGMOD), ACM, pp 495–508 Zhang Z, Cheng R, Papadias D, Tung A (2009) Minimizing the communication cost for continuous skyline maintenance. In: Proceedings of the international conference on management of data (SIGMOD), ACM, pp 495–508
Metadaten
Titel
Parallelizing skyline queries over uncertain data streams with sliding window partitioning and grid index
verfasst von
Xiaoyong Li
Yijie Wang
Xiaoling Li
Yuan Wang
Publikationsdatum
01.11.2014
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 2/2014
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-013-0725-8

Weitere Artikel der Ausgabe 2/2014

Knowledge and Information Systems 2/2014 Zur Ausgabe