
2016 | OriginalPaper | Chapter

Planning Ahead: Stream-Driven Linked-Data Access Under Update-Budget Constraints

Authors: Shen Gao, Daniele Dell’Aglio, Soheila Dehghanzadeh, Abraham Bernstein, Emanuele Della Valle, Alessandra Mileo

Published in: The Semantic Web – ISWC 2016

Publisher: Springer International Publishing


Abstract

Data stream applications are becoming increasingly popular on the web. In these applications, one query pattern is especially prominent: a join between a continuous data stream and some background data (BGD). Often, the target BGD is large, maintained externally, slowly changing, and costly to query (in terms of both time and money). Hence, practical applications usually maintain a local (cached) view of the relevant BGD. Because these caches are not updated in step with the original BGD, they should be refreshed under realistic budget constraints (in terms of latency, computation time, and possibly financial cost) to avoid stale data leading to wrong answers. This paper proposes to model the join between streams and the BGD as a bipartite graph. By exploiting the graph structure, we keep the quality of results good enough without refreshing the entire cache for each evaluation. We also introduce two extensions to this method. First, we consider a continuous join between recent portions of a data stream and some BGD to focus on updates that have the longest effect. Second, we consider the future impact of a query to the BGD by proposing to delay some updates to provide fresher answers in the future. By extending an existing stream processor with the proposed policies, we empirically show that we can improve result freshness by 93% over baseline algorithms such as Random Selection or Least Recently Updated.
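
To make the bipartite-graph idea concrete, the following is a minimal sketch and not the policy proposed in the paper. It assumes that each stream item in the current window joins with exactly one cached BGD entry through a join key, and that a stale entry is worth refreshing in proportion to the number of stream items it joins with. The names build_bipartite, pick_refreshes, stale_keys, and budget are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_bipartite(window, join_key):
    """Map each BGD key to the stream items in the window that join with it.

    The resulting dict is the edge set of a bipartite graph:
    BGD entries on one side, stream items on the other.
    """
    edges = defaultdict(list)
    for item in window:
        edges[join_key(item)].append(item)
    return edges


def pick_refreshes(edges, stale_keys, budget):
    """Greedy heuristic (an assumption of this sketch): refresh the stale BGD
    entries that are connected to the most stream items, up to the budget."""
    candidates = [key for key in edges if key in stale_keys]
    # Sort by descending degree; break ties by key for a deterministic order.
    candidates.sort(key=lambda key: (-len(edges[key]), key))
    return candidates[:budget]


# Toy window of stream items; "user" plays the role of the join key.
window = [
    {"user": "alice", "text": "a"},
    {"user": "bob", "text": "b"},
    {"user": "alice", "text": "c"},
    {"user": "carol", "text": "d"},
    {"user": "alice", "text": "e"},
    {"user": "bob", "text": "f"},
]

edges = build_bipartite(window, lambda item: item["user"])
chosen = pick_refreshes(edges, stale_keys={"alice", "bob", "carol"}, budget=2)
print(chosen)
```

With a budget of two refreshes, this heuristic spends them on the two entries that affect the most join results in the window, instead of refreshing the whole cache. A Random Selection or Least Recently Updated baseline would differ only in how the candidates are ordered.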


Footnotes
4
We assume that the aggregation is performed locally in the query processor and not in the remote BGD. This happens, e.g., when the BGD is not SPARQL 1.1 compliant.
 
5
We acknowledge that not all types of budget can be saved for the future (e.g., a fixed amount of bandwidth cannot be saved). Other types of budget, such as per-request supplier charges, a limited data plan, or limited power, can be saved.
 
Metadata
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46523-4_16
