Published in: The VLDB Journal 6/2020

23-05-2020 | Regular Paper

BAD to the bone: Big Active Data at its core

Authors: Steven Jacobs, Xikui Wang, Michael J. Carey, Vassilis J. Tsotras, Md Yusuf Sarwar Uddin



Abstract

Virtually all of today’s Big Data systems are passive in nature, responding to queries posted by their users. Instead, we are working to shift Big Data platforms from passive to active. In our view, a Big Active Data (BAD) system should continuously and reliably capture Big Data while enabling timely and automatic delivery of relevant information to a large pool of interested users, as well as supporting retrospective analyses of historical information. While various scalable streaming query engines have been created, their active behavior is limited to a (relatively) small window of the incoming data. To this end, we have created a BAD platform that combines ideas and capabilities from both Big Data and Active Data (e.g., publish/subscribe, streaming engines). It supports complex subscriptions that consider not only newly arrived items but also their relationships to past, stored data. Further, it can provide actionable notifications by enriching the subscription results with other useful data. Our platform extends an existing open-source Big Data Management System, Apache AsterixDB, with an active toolkit. The toolkit contains features to rapidly ingest semistructured data, share execution pipelines among users, manage scaled user data subscriptions, and actively monitor the state of the data to produce individualized information for each user. This paper describes the features and design of our current BAD data platform and demonstrates its ability to scale without sacrificing query capabilities or result individualization.


Footnotes
1
There is a distributed variant of Postgres—from Greenplum—that provided database triggers in an earlier version. However, triggers have been removed in the current version due to their unreliable behavior in a distributed setting [62].
 
Literature
1.
go back to reference Abadi, D.J., Ahmad, Y., Balazinska, M., Çetintemel, U., Cherniack, M., Hwang, J., Lindner, W., Maskey, A., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., Zdonik S.B (2005) The design of the borealis stream processing engine. In: CIDR 2005, Second Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 4-7, 2005, Online Proceedings, pp. 277–289 (2005). www.cidrdb.org Abadi, D.J., Ahmad, Y., Balazinska, M., Çetintemel, U., Cherniack, M., Hwang, J., Lindner, W., Maskey, A., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., Zdonik S.B (2005) The design of the borealis stream processing engine. In: CIDR 2005, Second Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 4-7, 2005, Online Proceedings, pp. 277–289 (2005). www.​cidrdb.​org
2.
go back to reference Abadi, D.J., Carney, D., Çetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., Zdonik, S.B.: Aurora: a new model and architecture for data stream management. VLDB J. 12(2), 120–139 (2003)CrossRef Abadi, D.J., Carney, D., Çetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., Zdonik, S.B.: Aurora: a new model and architecture for data stream management. VLDB J. 12(2), 120–139 (2003)CrossRef
3.
go back to reference Agrawal, P., Silberstein, A., Cooper, B.F., Srivastava, U., Ramakrishnan, R. Asynchronous view maintenance for VLSD databases. In: Çetintemel, U., Zdonik, S.B., Kossmann, D., Tatbul, N. (ed.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29–July 2, 2009, pp. 179–192. ACM (2009) Agrawal, P., Silberstein, A., Cooper, B.F., Srivastava, U., Ramakrishnan, R. Asynchronous view maintenance for VLSD databases. In:  Çetintemel, U., Zdonik, S.B., Kossmann, D., Tatbul, N. (ed.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29–July 2, 2009, pp. 179–192. ACM (2009)
4.
go back to reference Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D.: The stratosphere platform for big data analytics. VLDB J. 23(6), 939–964 (2014)CrossRef Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D.: The stratosphere platform for big data analytics. VLDB J. 23(6), 939–964 (2014)CrossRef
5.
go back to reference Alkowaileet, W.Y., Alsubaiee, S., Carey, M.J., Li, C., Ramampiaro, H., Sinthong, P., Wang X. End-to-end machine learning with Apache AsterixDB. In: Schelter, S., Seufert, S., Kumar, A. (ed.) Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, DEEM@SIGMOD 2018, Houston, TX, USA, June 15, 2018, pp. 6:1–6:10. ACM (2018) Alkowaileet, W.Y., Alsubaiee, S., Carey, M.J., Li, C., Ramampiaro, H., Sinthong, P., Wang X. End-to-end machine learning with Apache AsterixDB. In: Schelter, S., Seufert, S., Kumar, A. (ed.) Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, DEEM@SIGMOD 2018, Houston, TX, USA, June 15, 2018, pp. 6:1–6:10. ACM (2018)
6.
go back to reference Alsubaiee, S., Altowim, Y., Altwaijry, H., Behm, A., Borkar, V.R., Bu, Y., Carey, M.J., Cetindil, I., Cheelangi, M., Faraaz, K., Gabrielova, E., Grover, R., Heilbron, Z., Kim, Y., Li, C., Li, G., Ok, J.M., Onose, N., Pirzadeh, P., Tsotras, V.J., Vernica, R., Wen, J., Westmann, T.: AsterixDB: a scalable, open source BDMS. PVLDB 7(14), 1905–1916 (2014) Alsubaiee, S., Altowim, Y., Altwaijry, H., Behm, A., Borkar, V.R., Bu, Y., Carey, M.J., Cetindil, I., Cheelangi, M., Faraaz, K., Gabrielova, E., Grover, R., Heilbron, Z., Kim, Y., Li, C., Li, G., Ok, J.M., Onose, N., Pirzadeh, P., Tsotras, V.J., Vernica, R., Wen, J., Westmann, T.: AsterixDB: a scalable, open source BDMS. PVLDB 7(14), 1905–1916 (2014)
7.
go back to reference Alsubaiee, S., Behm, A., Borkar, V.R., Heilbron, Z., Kim, Y., Carey, M.J., Dreseler, M., Li, C.: Storage management in AsterixDB. PVLDB 7(10), 841–852 (2014) Alsubaiee, S., Behm, A., Borkar, V.R., Heilbron, Z., Kim, Y., Carey, M.J., Dreseler, M., Li, C.: Storage management in AsterixDB. PVLDB 7(10), 841–852 (2014)
14.
go back to reference Arasu, A., Babcock, B., Babu, S., Datar, M., Ito, K., Motwani, R., Nishizawa, I., Srivastava, U., Thomas, D., Varma, R., Widom, J.: STREAM: the stanford stream data manager. IEEE Data Eng. Bull. 26(1), 19–26 (2003) Arasu, A., Babcock, B., Babu, S., Datar, M., Ito, K., Motwani, R., Nishizawa, I., Srivastava, U., Thomas, D., Varma, R., Widom, J.: STREAM: the stanford stream data manager. IEEE Data Eng. Bull. 26(1), 19–26 (2003)
15.
go back to reference Armbrust, M., Das, T., Torres, J., Yavuz, B., Zhu, S., Xin, R., Ghodsi, A., Stoica, I., Zaharia, M. Structured Streaming: a declarative API for real-time applications in apache Spark. In: Das, G., Jermaine, C.M., Bernstein P.A. (ed.) Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018, pp. 601–613. ACM (2018) Armbrust, M., Das, T., Torres, J., Yavuz, B., Zhu, S., Xin, R., Ghodsi, A., Stoica, I., Zaharia, M. Structured Streaming: a declarative API for real-time applications in apache Spark. In: Das, G., Jermaine, C.M., Bernstein P.A. (ed.) Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018, pp. 601–613. ACM (2018)
16.
go back to reference Atikoglu, B. Xu, Y., Frachtenberg, E., Jiang, S., Paleczny, M. Workload analysis of a large-scale key-value store. In: Harrison, P.G., Arlitt, M.F., Casale, G. (ed.) ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’12, London, United Kingdom, June 11–15, 2012, pp. 53–64. ACM (2012) Atikoglu, B. Xu, Y., Frachtenberg, E., Jiang, S., Paleczny, M. Workload analysis of a large-scale key-value store. In: Harrison, P.G., Arlitt, M.F., Casale, G. (ed.) ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’12, London, United Kingdom, June 11–15, 2012, pp. 53–64. ACM (2012)
17.
go back to reference Babu, S., Widom, J.: Continuous queries over data streams. SIGMOD Rec. 30(3), 109–120 (2001)CrossRef Babu, S., Widom, J.: Continuous queries over data streams. SIGMOD Rec. 30(3), 109–120 (2001)CrossRef
19.
go back to reference Bainomugisha, E., Carreton, A.L., Cutsem, T.V., Mostinckx, S., Meuter, W.D.: A survey on reactive programming. ACM Comput. Surv. 45(4), :52:1–52:34 (2013)CrossRef Bainomugisha, E., Carreton, A.L., Cutsem, T.V., Mostinckx, S., Meuter, W.D.: A survey on reactive programming. ACM Comput. Surv. 45(4), :52:1–52:34 (2013)CrossRef
20.
go back to reference Bamba, B., Liu, L., Yu, P.S., Zhang, G., Doo, M. Scalable processing of spatial alarms. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds.) High Performance Computing-HiPC 2008, 15th International Conference, Bangalore, India, December 17-20, 2008. Proceedings, volume 5374 of Lecture Notes in Computer Science, pp. 232–244. Springer (2008) Bamba, B., Liu, L., Yu, P.S., Zhang, G., Doo, M. Scalable processing of spatial alarms. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds.) High Performance Computing-HiPC 2008, 15th International Conference, Bangalore, India, December 17-20, 2008. Proceedings, volume 5374 of Lecture Notes in Computer Science, pp. 232–244. Springer (2008)
21.
go back to reference Borkar, V.R., Bu, Y., Onose, Jr. E.P.C. N., Westmann, T., Pirzadeh, P., Carey, M.J., Tsotras, V.J. Algebricks: a data model-agnostic compiler backend for big data languages. In: Ghandeharizadeh, S., Barahmand, S., Balazinska, M., Freedman, M.J. (eds.) Proceedings of the Sixth ACM Symposium on Cloud Computing, SoCC 2015, Kohala Coast, Hawaii, USA, August 27–29, 2015, pp. 422–433. ACM (2015) Borkar, V.R., Bu, Y., Onose, Jr. E.P.C. N., Westmann, T., Pirzadeh, P., Carey, M.J., Tsotras, V.J. Algebricks: a data model-agnostic compiler backend for big data languages. In: Ghandeharizadeh, S., Barahmand, S., Balazinska, M., Freedman, M.J. (eds.) Proceedings of the Sixth ACM Symposium on Cloud Computing, SoCC 2015, Kohala Coast, Hawaii, USA, August 27–29, 2015, pp. 422–433. ACM (2015)
22.
go back to reference Borkar, V.R., Carey, M.J., Grover, R., Onose, N., Vernica, R. Hyracks: a flexible and extensible foundation for data-intensive computing. In: Abiteboul, S., Böhm, K., Koch, C., Tan, K. (eds.) Proceedings of the 27th International Conference on Data Engineering, ICDE 2011, April 11–16, 2011, Hannover, Germany, pp. 1151–1162. IEEE Computer Society (2011) Borkar, V.R., Carey, M.J., Grover, R., Onose, N., Vernica, R. Hyracks: a flexible and extensible foundation for data-intensive computing. In: Abiteboul, S., Böhm, K., Koch, C., Tan, K. (eds.) Proceedings of the 27th International Conference on Data Engineering, ICDE 2011, April 11–16, 2011, Hannover, Germany, pp. 1151–1162. IEEE Computer Society (2011)
23.
go back to reference Borkar, D., Mayuram, R., Sangudi, G., Carey, M.J. Have your data and query it too: from key-value caching to big data management. In: Özcan, F., Koutrika, G., Madden, S. (ed.) Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26–July 01, 2016, pp. 239–251. ACM (2016) Borkar, D., Mayuram, R., Sangudi, G., Carey, M.J. Have your data and query it too: from key-value caching to big data management. In: Özcan, F., Koutrika, G., Madden, S. (ed.) Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26–July 01, 2016, pp. 239–251. ACM (2016)
24.
go back to reference Carbone, P., Katsifodimos, A., Ewen, S., Markl, V., Haridi, S., Tzoumas, K.: Apache Flink™: Stream and batch processing in a single engine. IEEE Data Eng. Bull. 38(4), 28–38 (2015) Carbone, P., Katsifodimos, A., Ewen, S., Markl, V., Haridi, S., Tzoumas, K.: Apache Flink™: Stream and batch processing in a single engine. IEEE Data Eng. Bull. 38(4), 28–38 (2015)
25.
go back to reference Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., Shah, M.A. Telegraphcq: continuous dataflow processing for an uncertain world. In:CIDR 2003, First Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 5-8, 2003, Online Proceedings. www.cidrdb.org (2003) Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., Shah, M.A. Telegraphcq: continuous dataflow processing for an uncertain world. In:CIDR 2003, First Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 5-8, 2003, Online Proceedings. www.cidrdb.org (2003)
26.
go back to reference Chen, J., DeWitt, D.J., Tian, F., Wang, Y. Niagaracq: a scalable continuous query system for internet databases. In: Chen, W., Naughton, J.F., Bernstein, P.A. (eds.) Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16–18, 2000, Dallas, Texas, USA, pp. 379–390. ACM (2000) Chen, J., DeWitt, D.J., Tian, F., Wang, Y. Niagaracq: a scalable continuous query system for internet databases. In: Chen, W., Naughton, J.F., Bernstein, P.A. (eds.) Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16–18, 2000, Dallas, Texas, USA, pp. 379–390. ACM (2000)
27.
go back to reference Chintapalli, S., Dagit, D., Evans, B., Farivar, R., Graves, T., Holderbaugh, M., Liu, Z., Nusbaum, K., Patil, K., Peng, B., Poulosky, P. Benchmarking streaming computation engines: storm, Flink and Spark Streaming. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2016, Chicago, IL, USA, May 23–27, 2016, pp. 1789–1792. IEEE Computer Society (2016) Chintapalli, S., Dagit, D., Evans, B., Farivar, R., Graves, T., Holderbaugh, M., Liu, Z., Nusbaum, K., Patil, K., Peng, B., Poulosky, P. Benchmarking streaming computation engines: storm, Flink and Spark Streaming. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2016, Chicago, IL, USA, May 23–27, 2016, pp. 1789–1792. IEEE Computer Society (2016)
28.
go back to reference Chirkova, R., Yang, J.: Materialized views. Found. Trends Databases 4(4), 295–405 (2012)CrossRef Chirkova, R., Yang, J.: Materialized views. Found. Trends Databases 4(4), 295–405 (2012)CrossRef
30.
go back to reference Dalvi, B.B., Kshirsagar, M., Sudarshan, S.: Keyword search on external memory data graphs. PVLDB 1(1), 1189–1204 (2008) Dalvi, B.B., Kshirsagar, M., Sudarshan, S.: Keyword search on external memory data graphs. PVLDB 1(1), 1189–1204 (2008)
31.
go back to reference Dayal, U., Blaustein, B.T., Buchmann, A.P., Chakravarthy, U.S., Hsu, M., Ledin, R., McCarthy, D.R., Rosenthal, A., Sarin, S.K., Carey, M.J., Livny, M., Jauhari, R.: The HiPAC project: combining active databases and timing constraints. SIGMOD Rec. 17(1), 51–70 (1988)CrossRef Dayal, U., Blaustein, B.T., Buchmann, A.P., Chakravarthy, U.S., Hsu, M., Ledin, R., McCarthy, D.R., Rosenthal, A., Sarin, S.K., Carey, M.J., Livny, M., Jauhari, R.: The HiPAC project: combining active databases and timing constraints. SIGMOD Rec. 17(1), 51–70 (1988)CrossRef
32.
go back to reference Dean, J., Ghemawat, S. Mapreduce: Simplified data processing on large clusters. In: Brewer, E.A., Chen, P. (eds) 6th Symposium on Operating System Design and Implementation (OSDI 2004), San Francisco, California, USA, December 6–8, 2004, pp. 137–150. USENIX Association (2004) Dean, J., Ghemawat, S. Mapreduce: Simplified data processing on large clusters. In: Brewer, E.A., Chen, P. (eds) 6th Symposium on Operating System Design and Implementation (OSDI 2004), San Francisco, California, USA, December 6–8, 2004, pp. 137–150. USENIX Association (2004)
33.
go back to reference DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W. Dynamo: amazon’s highly available key-value store. In: Bressoud, T.C., Kaashoek, M.F. (eds.) Proceedings of the 21st ACM Symposium on Operating Systems Principles 2007, SOSP 2007, Stevenson, Washington, USA, October 14–17, 2007, pp. 205–220. ACM (2007) DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W. Dynamo: amazon’s highly available key-value store. In: Bressoud, T.C., Kaashoek, M.F. (eds.) Proceedings of the 21st ACM Symposium on Operating Systems Principles 2007, SOSP 2007, Stevenson, Washington, USA, October 14–17, 2007, pp. 205–220. ACM (2007)
34.
go back to reference Desta, M.S., Hyytiä, E., Keränen, A., Kärkkäinen, T., Ott, J. Evaluating (geo) content sharing with the ONE simulator. In: Nikoletseas, S.E., Rumín, Á C. (eds.) MobiWac’13, Proceedings of the 11th ACM International Symposium on Mobility Management and Wireless Access, Barcelona, Spain, November 3–8, 2013, pp. 37–40. ACM (2013) Desta, M.S., Hyytiä, E., Keränen, A., Kärkkäinen, T., Ott, J. Evaluating (geo) content sharing with the ONE simulator. In: Nikoletseas, S.E., Rumín, Á C. (eds.) MobiWac’13, Proceedings of the 11th ACM International Symposium on Mobility Management and Wireless Access, Barcelona, Spain, November 3–8, 2013, pp. 37–40. ACM (2013)
35.
go back to reference Dindar, N., Güç, B., Lau, P., Özal, A., Soner, M., Tatbul, N. Dejavu: declarative pattern matching over live and archived streams of events. In: Çetintemel, U., Zdonik, S.B., Kossmann, D., Tatbul, N. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29–July 2, 2009, pp. 1023–1026. ACM (2009) Dindar, N., Güç, B., Lau, P., Özal, A., Soner, M., Tatbul, N. Dejavu: declarative pattern matching over live and archived streams of events. In: Çetintemel, U., Zdonik, S.B., Kossmann, D., Tatbul, N. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29–July 2, 2009, pp. 1023–1026. ACM (2009)
36.
go back to reference Escriva, R., Wong, B., Sirer, E.G. Hyperdex: a distributed, searchable key-value store. In: Eggert, L., Ott, J., Padmanabhan, V.N., Varghese, G. (eds) ACM SIGCOMM 2012 Conference, SIGCOMM ’12, Helsinki, Finland-August 13–17, 2012, pp. 25–36. ACM (2012) Escriva, R., Wong, B., Sirer, E.G. Hyperdex: a distributed, searchable key-value store. In: Eggert, L., Ott, J., Padmanabhan, V.N., Varghese, G. (eds) ACM SIGCOMM 2012 Conference, SIGCOMM ’12, Helsinki, Finland-August 13–17, 2012, pp. 25–36. ACM (2012)
37.
go back to reference Eugster, P.T., Felber, P., Guerraoui, R., Kermarrec, A.: The many faces of publish/subscribe. ACM Comput. Surv. 35(2), 114–131 (2003)CrossRef Eugster, P.T., Felber, P., Guerraoui, R., Kermarrec, A.: The many faces of publish/subscribe. ACM Comput. Surv. 35(2), 114–131 (2003)CrossRef
38.
go back to reference Gedik, B., Andrade, H., Wu, K., Yu, P.S., Doo, M. SPADE: the systems declarative stream processing engine. In: Wang, J.T. (ed.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10–12, 2008, pp. 1123–1134. ACM (2008) Gedik, B., Andrade, H., Wu, K., Yu, P.S., Doo, M. SPADE: the systems declarative stream processing engine. In: Wang, J.T. (ed.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10–12, 2008, pp. 1123–1134. ACM (2008)
39.
go back to reference Golab, L., Johnson, T., Shkapenyuk, V. Scheduling updates in a real-time stream warehouse. In: Ioannidis, Y.E., Lee, D.L., Ng, R.T. (ed.) Proceedings of the 25th International Conference on Data Engineering, ICDE 2009, March 29 2009–April 2 2009, Shanghai, China, pp. 1207–1210. IEEE Computer Society (2009) Golab, L., Johnson, T., Shkapenyuk, V. Scheduling updates in a real-time stream warehouse. In: Ioannidis, Y.E., Lee, D.L., Ng, R.T. (ed.) Proceedings of the 25th International Conference on Data Engineering, ICDE 2009, March 29 2009–April 2 2009, Shanghai, China, pp. 1207–1210. IEEE Computer Society (2009)
40.
go back to reference Goldberg, D., Nichols, D.A., Oki, B.M., Terry, D.B.: Using collaborative filtering to weave an information tapestry. Commun. ACM 35(12), 61–70 (1992)CrossRef Goldberg, D., Nichols, D.A., Oki, B.M., Terry, D.B.: Using collaborative filtering to weave an information tapestry. Commun. ACM 35(12), 61–70 (1992)CrossRef
41.
go back to reference Grover, R., Carey, M.J. Data ingestion in AsterixDB. In: Alonso, G., Geerts, F., Popa, L., Barceló, P., Teubner, J., Ugarte, M., den Bussche, J.V., Paredaens, J. (eds.) Proceedings of the 18th International Conference on Extending Database Technology, EDBT 2015, Brussels, Belgium, March 23–27, 2015, pp. 605–616. OpenProceedings.org (2015) Grover, R., Carey, M.J. Data ingestion in AsterixDB. In: Alonso, G., Geerts, F., Popa, L., Barceló, P., Teubner, J., Ugarte, M., den Bussche, J.V., Paredaens, J. (eds.) Proceedings of the 18th International Conference on Extending Database Technology, EDBT 2015, Brussels, Belgium, March 23–27, 2015, pp. 605–616. OpenProceedings.org (2015)
42.
go back to reference Hanson, E.N., Carnes, C., Huang, L., Konyala, M., Noronha, L., Parthasarathy, S., Park, J.B., Vernon, A. Scalable trigger processing. In: Kitsuregawa, M., Papazoglou, M.P., Pu, C. (eds.) Proceedings of the 15th International Conference on Data Engineering, Sydney, Australia, March 23–26, 1999, pp. 266–275. IEEE Computer Society (1999) Hanson, E.N., Carnes, C., Huang, L., Konyala, M., Noronha, L., Parthasarathy, S., Park, J.B., Vernon, A. Scalable trigger processing. In: Kitsuregawa, M., Papazoglou, M.P., Pu, C. (eds.) Proceedings of the 15th International Conference on Data Engineering, Sydney, Australia, March 23–26, 1999, pp. 266–275. IEEE Computer Society (1999)
43.
go back to reference Hanson, E.N.: The design and implementation of the ariel active database rule system. IEEE Trans. Knowl. Data Eng. 8(1), 157–172 (1996)CrossRef Hanson, E.N.: The design and implementation of the ariel active database rule system. IEEE Trans. Knowl. Data Eng. 8(1), 157–172 (1996)CrossRef
44.
go back to reference Jafarpour, H., Hore, B., Mehrotra, S., Venkatasubramanian, N. Subscription subsumption evaluation for content-based publish/subscribe systems. In: Issarny, V., Schantz, R.E. (eds.) Middleware 2008, ACM/IFIP/USENIX 9th International Middleware Conference, Leuven, Belgium, December 1–5, 2008, Proceedings, volume 5346 of Lecture Notes in Computer Science, pp. 62–81. Springer (2008) Jafarpour, H., Hore, B., Mehrotra, S., Venkatasubramanian, N. Subscription subsumption evaluation for content-based publish/subscribe systems. In: Issarny, V., Schantz, R.E. (eds.) Middleware 2008, ACM/IFIP/USENIX 9th International Middleware Conference, Leuven, Belgium, December 1–5, 2008, Proceedings, volume 5346 of Lecture Notes in Computer Science, pp. 62–81. Springer (2008)
45.
go back to reference Jin, Y., Strom, R.E. Relational subscription middleware for internet-scale publish-subscribe. In: Jacobsen, H.(ed.) Proceedings of the 2nd International Workshop on Distributed Event-Based Systems, DEBS: Sunday, June 8th, 2003, p. 2003. ACM, San Diego, California, USA (in conjunction with SIGMOD/PODS) (2003) Jin, Y., Strom, R.E. Relational subscription middleware for internet-scale publish-subscribe. In: Jacobsen, H.(ed.) Proceedings of the 2nd International Workshop on Distributed Event-Based Systems, DEBS: Sunday, June 8th, 2003, p. 2003. ACM, San Diego, California, USA (in conjunction with SIGMOD/PODS) (2003)
46.
go back to reference Keränen, A., Ott, J., Kärkkäinen, T. The ONE simulator for DTN protocol evaluation. In: Dalle, O., Wainer, G.A., Perrone, L.F., Stea, G. (eds.) Proceedings of the 2nd International Conference on Simulation Tools and Techniques for Communications, Networks and Systems, SimuTools 2009, Rome, Italy, March 2–6, 2009, p. 55. ICST/ACM (2009) Keränen, A.,  Ott, J.,  Kärkkäinen, T. The ONE simulator for DTN protocol evaluation. In: Dalle, O., Wainer, G.A., Perrone, L.F., Stea, G. (eds.) Proceedings of the 2nd International Conference on Simulation Tools and Techniques for Communications, Networks and Systems, SimuTools 2009, Rome, Italy, March 2–6, 2009, p. 55. ICST/ACM (2009)
47.
go back to reference Kiran, M., Murphy, P., Monga, I., Dugan, J., Baveja, S.S. Lambda architecture for cost-effective batch and speed big data processing. In: 2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29–November 1, 2015, pp. 2785–2792. IEEE Computer Society (2015) Kiran, M., Murphy, P., Monga, I., Dugan, J., Baveja, S.S. Lambda architecture for cost-effective batch and speed big data processing. In: 2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29–November 1, 2015, pp. 2785–2792. IEEE Computer Society (2015)
48.
go back to reference Krämer, J., Seeger, B. PIPES: a public infrastructure for processing and exploring streams. In: Weikum, G., König, A.C., Deßloch, S. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13–18, 2004, pp. 925–926. ACM (2004) Krämer, J., Seeger, B. PIPES: a public infrastructure for processing and exploring streams. In: Weikum, G., König, A.C., Deßloch, S. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13–18, 2004, pp. 925–926. ACM (2004)
49.
go back to reference Kreps, J., Narkhede, N., Rao, J., et al.: Kafka: a distributed messaging system for log processing. Proc. NetDB 11, 1–7 (2011) Kreps, J., Narkhede, N., Rao, J., et al.: Kafka: a distributed messaging system for log processing. Proc. NetDB 11, 1–7 (2011)
50.
go back to reference Lee, K., Liu, L., Palanisamy, B., Yigitoglu, E.: Road network-aware spatial alarms. IEEE Trans. Mob. Comput. 15(1), 188–201 (2016)CrossRef Lee, K., Liu, L., Palanisamy, B., Yigitoglu, E.: Road network-aware spatial alarms. IEEE Trans. Mob. Comput. 15(1), 188–201 (2016)CrossRef
51.
go back to reference Li, M., Ye, F., Kim, M., Chen, H., Lei, H. A scalable and elastic publish/subscribe service. In: 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2011, Anchorage, Alaska, USA, 16–20 May, 2011-Conference Proceedings, pp. 1254–1265. IEEE (2011) Li, M., Ye, F., Kim, M., Chen, H., Lei, H. A scalable and elastic publish/subscribe service. In: 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2011, Anchorage, Alaska, USA, 16–20 May, 2011-Conference Proceedings, pp. 1254–1265. IEEE (2011)
52.
go back to reference Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: Distributed graphlab: a framework for machine learning in the cloud. PVLDB 5(8), 716–727 (2012) Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: Distributed graphlab: a framework for machine learning in the cloud. PVLDB 5(8), 716–727 (2012)
53.
go back to reference Luo, C., Carey, M.J.: Efficient data ingestion and query processing for LSM-based storage systems. PVLDB 12(5), 531–543 (2019) Luo, C., Carey, M.J.: Efficient data ingestion and query processing for LSM-based storage systems. PVLDB 12(5), 531–543 (2019)
54.
go back to reference Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G. Pregel: a system for large-scale graph processing. In: Elmagarmid, A.K., Agrawal, D. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6–10, 2010, pp. 135–146. ACM (2010) Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G. Pregel: a system for large-scale graph processing. In: Elmagarmid, A.K., Agrawal, D. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6–10, 2010, pp. 135–146. ACM (2010)
55.
go back to reference Markowetz, A., Yang, Y., Papadias, D. Keyword search on relational data streams. In: Chan, C.Y., Ooi, B.C., Zhou, A. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12–14, 2007, pp. 605–616. ACM (2007) Markowetz, A., Yang, Y., Papadias, D. Keyword search on relational data streams. In: Chan, C.Y., Ooi, B.C., Zhou, A. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12–14, 2007, pp. 605–616. ACM (2007)
56.
go back to reference Milo, T., Zur, T., Verbin, E. Boosting topic-based publish-subscribe systems with dynamic clustering. In: Chan, C.Y., Ooi, B.C., Zhou, A. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12–14, 2007, pp. 749–760. ACM (2007) Milo, T., Zur, T., Verbin, E. Boosting topic-based publish-subscribe systems with dynamic clustering. In: Chan, C.Y., Ooi, B.C., Zhou, A. (eds.) Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12–14, 2007, pp. 749–760. ACM (2007)
58.
go back to reference Moro, M.M., Bakalov, P., Tsotras, V.J. Early profile pruning on xml-aware publish/subscribe systems. In: Koch, C., Gehrke, J., Garofalakis, M.N., Srivastava, D., Aberer, K., Deshpande, A., Florescu, D., Chan, C.Y., Ganti, V., Kanne, C., Klas, W., Neuhold, E.J. (eds.) Proceedings of the 33rd International Conference on Very Large Data Bases, University of Vienna, Austria, September 23–27, 2007, pp. 866–877. ACM (2007) Moro, M.M., Bakalov, P., Tsotras, V.J. Early profile pruning on xml-aware publish/subscribe systems. In: Koch, C., Gehrke, J., Garofalakis, M.N., Srivastava, D., Aberer, K., Deshpande, A., Florescu, D., Chan, C.Y., Ganti, V., Kanne, C., Klas, W., Neuhold, E.J. (eds.) Proceedings of the 33rd International Conference on Very Large Data Bases, University of Vienna, Austria, September 23–27, 2007, pp. 866–877. ACM (2007)
59.
Nikolic, M., Elseidy, M., Koch, C.: LINVIEW: incremental view maintenance for complex analytical queries. In: Dyreson, C.E., Li, F., Özsu, M.T. (eds.) International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22–27, 2014, pp. 253–264. ACM (2014)
63.
Qader, M.A., Hristidis, V.: DualDB: an efficient LSM-based publish/subscribe storage system. In: Proceedings of the 29th International Conference on Scientific and Statistical Database Management, Chicago, IL, USA, June 27–29, 2017, pp. 24:1–24:6. ACM (2017)
64.
Quass, D., Widom, J.: On-line warehouse view maintenance. In: Peckham, J. (ed.) SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13–15, 1997, Tucson, Arizona, USA, pp. 393–404. ACM Press (1997)
65.
Saigaonkar, S., Rao, M., Mantha, S.: Publish subscribe system based on ontology and XML filtering. In: 2011 3rd International Conference on Computer Research and Development, Vol. 1, pp. 154–158. IEEE (2011)
66.
Stonebraker, M., Rowe, L.A.: The design of POSTGRES. In: Zaniolo, C. (ed.) Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 28–30, 1986, pp. 340–355. ACM Press (1986)
67.
Thakur, G.S., Bhaduri, B.L., Piburn, J.O., Sims, K.M., Stewart, R.N., Urban, M.L.: Planetsense: a real-time streaming and spatio-temporal analytics platform for gathering geo-spatial intelligence from open source data. In: Bao, J., Sengstock, C., Ali, M.E., Huang, Y., Gertz, M., Renz, M., Sankaranarayanan, J. (eds.) Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3–6, 2015, pp. 11:1–11:4. ACM (2015)
68.
Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Anthony, S., Liu, H., Wyckoff, P., Murthy, R.: Hive - A warehousing solution over a map-reduce framework. PVLDB 2(2), 1626–1629 (2009)
69.
Uddin, M.Y.S., Venkatasubramanian, N.: Edge caching for enriched notifications delivery in big active data. In: 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018, Vienna, Austria, July 2–6, 2018, pp. 696–705. IEEE Computer Society (2018)
70.
United States Geological Survey: ShakeCast (2014). earthquake.usgs.gov/research/software/shakecast/
71.
Wang, X., Carey, M.J.: An IDEA: an ingestion framework for data enrichment in AsterixDB. PVLDB 12(11), 1485–1498 (2019)
72.
Wang, X., Zhang, W., Zhang, Y., Lin, X., Huang, Z.: Top-k spatial-keyword publish/subscribe over sliding window. VLDB J. 26(3), 301–326 (2017)
73.
Widom, J., Cochrane, R., Lindsay, B.G.: Implementing set-oriented production rules as an extension to Starburst. In: Lohman, G.M., Sernadas, A., Camps, R. (eds.) 17th International Conference on Very Large Data Bases, September 3–6, 1991, Barcelona, Catalonia, Spain, Proceedings, pp. 275–285. Morgan Kaufmann (1991)
74.
Yan, D., Bu, Y., Tian, Y., Deshpande, A., Cheng, J.: Big graph analytics systems. In: Özcan, F., Koutrika, G., Madden, S. (eds.) Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26–July 01, 2016, pp. 2241–2243. ACM (2016)
75.
Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Gribble, S.D., Katabi, D. (eds.) Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, San Jose, CA, USA, April 25–27, 2012, pp. 15–28. USENIX Association (2012)
76.
Zaharia, M., Xin, R.S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., Franklin, M.J., Ghodsi, A., Gonzalez, J., Shenker, S., Stoica, I.: Apache Spark: a unified engine for big data processing. Commun. ACM 59(11), 56–65 (2016)
77.
Zhao, Y., Kim, K., Venkatasubramanian, N.: DYNATOPS: a dynamic topic-based publish/subscribe architecture. In: Chakravarthy, S., Urban, S.D., Pietzuch, P.R., Rundensteiner, E.A. (eds.) The 7th ACM International Conference on Distributed Event-Based Systems, DEBS ’13, Arlington, TX, USA, June 29–July 03, 2013, pp. 75–86. ACM (2013)
Metadata
Title
BAD to the bone: Big Active Data at its core
Authors
Steven Jacobs
Xikui Wang
Michael J. Carey
Vassilis J. Tsotras
Md Yusuf Sarwar Uddin
Publication date
23-05-2020
Publisher
Springer Berlin Heidelberg
Published in
The VLDB Journal / Issue 6/2020
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-020-00616-7
