Published in: Mobile Networks and Applications, Issue 2/2014

1 April 2014

Big Data: A Survey

Authors: Min Chen, Shiwen Mao, Yunhao Liu

Abstract

In this paper, we review the background and state of the art of big data. We first introduce the general background of big data and review related technologies, such as cloud computing, the Internet of Things, data centers, and Hadoop. We then focus on the four phases of the big data value chain: data generation, data acquisition, data storage, and data analysis. For each phase, we introduce the general background, discuss the technical challenges, and review the latest advances. Finally, we examine several representative applications of big data, including enterprise management, the Internet of Things, online social networks, medical applications, collective intelligence, and the smart grid. These discussions aim to give readers a comprehensive overview and a big-picture view of this exciting area. The survey concludes with a discussion of open problems and future directions.

103.
Zurück zum Zitat Moretti C, Bulosan J, Thain D, Flynn PJ (2008) All-pairs: an abstraction for data-intensive cloud computing. In: Parallel and distributed processing, 2008. IEEE international symposium on IPDPS 2008. IEEE, pp 1–11 Moretti C, Bulosan J, Thain D, Flynn PJ (2008) All-pairs: an abstraction for data-intensive cloud computing. In: Parallel and distributed processing, 2008. IEEE international symposium on IPDPS 2008. IEEE, pp 1–11
104.
Zurück zum Zitat Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data. ACM, pp 135–146 Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data. ACM, pp 135–146
105.
Zurück zum Zitat Bu Y, Bill H, Balazinska M, Ernst MD (2010) Haloop: efficient iterative data processing on large clusters. Proc VLDB Endowment 3(1-2):285–296CrossRef Bu Y, Bill H, Balazinska M, Ernst MD (2010) Haloop: efficient iterative data processing on large clusters. Proc VLDB Endowment 3(1-2):285–296CrossRef
106.
Zurück zum Zitat Ekanayake J, Li H, Zhang B, Gunarathne T, Bae S-H, Qiu J, Fox G (2010) Twister: a runtime for iterative mapreduce. In Proceedings of the 19th ACM international symposium on high performance distributed computing. ACM, pp 810–818 Ekanayake J, Li H, Zhang B, Gunarathne T, Bae S-H, Qiu J, Fox G (2010) Twister: a runtime for iterative mapreduce. In Proceedings of the 19th ACM international symposium on high performance distributed computing. ACM, pp 810–818
107.
Zurück zum Zitat Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin M, Shenker S, Stoica I (2012) Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX conference on networked systems design and implementation. USENIX Association, pp 2–2 Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin M, Shenker S, Stoica I (2012) Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX conference on networked systems design and implementation. USENIX Association, pp 2–2
108.
Zurück zum Zitat Bhatotia P, Wieder A, Rodrigues R, Acar UA, Pasquin R (2011) Incoop: mapreduce for incremental computations. In: Proceedings of the 2nd ACM symposium on cloud computing. ACM, p 7 Bhatotia P, Wieder A, Rodrigues R, Acar UA, Pasquin R (2011) Incoop: mapreduce for incremental computations. In: Proceedings of the 2nd ACM symposium on cloud computing. ACM, p 7
109.
Zurück zum Zitat Murray DG, Schwarzkopf M, Smowton C, Smith S, Madhavapeddy A, Hand S (2011) Ciel: a universal execution engine for distributed data-flow computing. In: Proceedings of the 8th USENIX conference on Networked systems design and implementation. p 9 Murray DG, Schwarzkopf M, Smowton C, Smith S, Madhavapeddy A, Hand S (2011) Ciel: a universal execution engine for distributed data-flow computing. In: Proceedings of the 8th USENIX conference on Networked systems design and implementation. p 9
110.
Zurück zum Zitat Anderson TW (1958) An introduction to multivariate statistical analysis, vol 2. Wiley, New York Anderson TW (1958) An introduction to multivariate statistical analysis, vol 2. Wiley, New York
111.
Zurück zum Zitat Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Philip SY et al (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37CrossRef Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Philip SY et al (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37CrossRef
113.
Zurück zum Zitat Berthold MR, Cebron N, Dill F, Gabriel TR, Kötter T, Meinl T, Ohl P, Sieb C, Thiel K, Wiswedel B (2008) KNIME: the Konstanz information miner. Springer Berthold MR, Cebron N, Dill F, Gabriel TR, Kötter T, Meinl T, Ohl P, Sieb C, Thiel K, Wiswedel B (2008) KNIME: the Konstanz information miner. Springer
114.
Zurück zum Zitat Sallam RL, Richardson J, Hagerty J, Hostmann B (2011) Magic quadrant for business intelligence platforms. CT, Gartner Group, Stamford Sallam RL, Richardson J, Hagerty J, Hostmann B (2011) Magic quadrant for business intelligence platforms. CT, Gartner Group, Stamford
115.
Zurück zum Zitat Beyond the PC. Special Report on Personal Technology (2011) Beyond the PC. Special Report on Personal Technology (2011)
116.
Zurück zum Zitat Goff SA, Vaughn M, McKay S, Lyons E, Stapleton AE, Gessler D, Matasci N, Wang L, Hanlon M, Lenards A et al (2011) The iplant collaborative: cyberinfrastructure for plant biology. Front Plant Sci 34(2):1–16. doi:10.3389/fpls.2011.00034 Goff SA, Vaughn M, McKay S, Lyons E, Stapleton AE, Gessler D, Matasci N, Wang L, Hanlon M, Lenards A et al (2011) The iplant collaborative: cyberinfrastructure for plant biology. Front Plant Sci 34(2):1–16. doi:10.​3389/​fpls.​2011.​00034
117.
Zurück zum Zitat Baah GK, Gray A, Harrold MJ (2006) On-line anomaly detection of deployed software: a statistical machine learning approach. In: Proceedings of the 3rd international workshop on Software quality assurance. ACM, pp 70–77 Baah GK, Gray A, Harrold MJ (2006) On-line anomaly detection of deployed software: a statistical machine learning approach. In: Proceedings of the 3rd international workshop on Software quality assurance. ACM, pp 70–77
118.
Zurück zum Zitat Moeng M, Melhem R (2010) Applying statistical machine learning to multicore voltage & frequency scaling. In: Proceedings of the 7th ACM international conference on computing frontiers. ACM, pp 277–286 Moeng M, Melhem R (2010) Applying statistical machine learning to multicore voltage & frequency scaling. In: Proceedings of the 7th ACM international conference on computing frontiers. ACM, pp 277–286
119.
Zurück zum Zitat Gaber MM, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM Sigmod Record 34(2):18–26CrossRef Gaber MM, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM Sigmod Record 34(2):18–26CrossRef
120.
Zurück zum Zitat Verykios VS, Bertino E, Fovino IN, Provenza LP, Saygin Y, Theodoridis Y (2004) State-of-the-art in privacy preserving data mining. ACM Sigmod Record 33(1):50–57CrossRef Verykios VS, Bertino E, Fovino IN, Provenza LP, Saygin Y, Theodoridis Y (2004) State-of-the-art in privacy preserving data mining. ACM Sigmod Record 33(1):50–57CrossRef
121.
Zurück zum Zitat van der Aalst W (2012) Process mining: overview and opportunities. ACM Transac Manag Inform Syst (TMIS) 3(2):7 van der Aalst W (2012) Process mining: overview and opportunities. ACM Transac Manag Inform Syst (TMIS) 3(2):7
122.
Zurück zum Zitat Manning CD, Schütze H (1999) Foundations of statistical natural language processing, vol 999. MIT Press Manning CD, Schütze H (1999) Foundations of statistical natural language processing, vol 999. MIT Press
123.
Zurück zum Zitat Pal SK, Talwar V, Mitra P (2002) Web mining in soft computing framework, relevance, state of the art and future directions. IEEE Transac Neural Netw 13(5):1163–1177CrossRef Pal SK, Talwar V, Mitra P (2002) Web mining in soft computing framework, relevance, state of the art and future directions. IEEE Transac Neural Netw 13(5):1163–1177CrossRef
124.
Zurück zum Zitat Chakrabarti S (2000) Data mining for hypertext: a tutorial survey. ACM SIGKDD Explor Newsl 1(2):1–11CrossRef Chakrabarti S (2000) Data mining for hypertext: a tutorial survey. ACM SIGKDD Explor Newsl 1(2):1–11CrossRef
125.
Zurück zum Zitat Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30(1):107–117CrossRef Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30(1):107–117CrossRef
126.
Zurück zum Zitat Konopnicki D, Shmueli O (1995) W3qs: a query system for the world-wide web. In: VLDB, vol 95. pp 54–65 Konopnicki D, Shmueli O (1995) W3qs: a query system for the world-wide web. In: VLDB, vol 95. pp 54–65
127.
Zurück zum Zitat Chakrabarti S, Van den Berg M, Dom B (1999) Focused crawling: a new approach to topic-specific web resource discovery. Comput Netw 31(11):1623–1640CrossRef Chakrabarti S, Van den Berg M, Dom B (1999) Focused crawling: a new approach to topic-specific web resource discovery. Comput Netw 31(11):1623–1640CrossRef
128.
Zurück zum Zitat Ding D, Metze F, Rawat S, Schulam PF, Burger S, Younessian E, Bao L, Christel MG, Hauptmann A (2012) Beyond audio and video retrieval: towards multimedia summarization. In: Proceedings of the 2nd ACM international conference on multimedia retrieval. ACM, pp 2 Ding D, Metze F, Rawat S, Schulam PF, Burger S, Younessian E, Bao L, Christel MG, Hauptmann A (2012) Beyond audio and video retrieval: towards multimedia summarization. In: Proceedings of the 2nd ACM international conference on multimedia retrieval. ACM, pp 2
129.
Zurück zum Zitat Wang M, Ni B, Hua X-S, Chua T-S (2012) Assistive tagging: a survey of multimedia tagging with human-computer joint exploration. ACM Comput Surv (CSUR) 44(4):25CrossRef Wang M, Ni B, Hua X-S, Chua T-S (2012) Assistive tagging: a survey of multimedia tagging with human-computer joint exploration. ACM Comput Surv (CSUR) 44(4):25CrossRef
130.
Zurück zum Zitat Lew MS, Sebe N, Djeraba C, Jain R (2006) Content-based multimedia information retrieval: state of the art and challenges. ACM Trans Multimed Comput Commun Appl (TOMCCAP) 2(1):1–19CrossRef Lew MS, Sebe N, Djeraba C, Jain R (2006) Content-based multimedia information retrieval: state of the art and challenges. ACM Trans Multimed Comput Commun Appl (TOMCCAP) 2(1):1–19CrossRef
131.
Zurück zum Zitat Hu W, Xie N, Li L, Zeng X, Maybank S (2011) A survey on visual content-based video indexing and retrieval. IEEE Trans Syst Man Cybern Part C Appl Rev 41(6):797–819CrossRef Hu W, Xie N, Li L, Zeng X, Maybank S (2011) A survey on visual content-based video indexing and retrieval. IEEE Trans Syst Man Cybern Part C Appl Rev 41(6):797–819CrossRef
132.
Zurück zum Zitat Park Y-J, Chang K-N (2009) Individual and group behavior-based customer profile model for personalized product recommendation. Expert Syst Appl 36(2):1932–1939MathSciNetCrossRef Park Y-J, Chang K-N (2009) Individual and group behavior-based customer profile model for personalized product recommendation. Expert Syst Appl 36(2):1932–1939MathSciNetCrossRef
133.
Zurück zum Zitat Barragáns-Martínez AB, Costa-Montenegro E, Burguillo JC, Rey-López M, Mikic-Fonte FA, Peleteiro A (2010) A hybrid content-based and item-based collaborative filtering approach to recommend tv programs enhanced with singular value decomposition. Inf Sci 180(22):4290–4311CrossRef Barragáns-Martínez AB, Costa-Montenegro E, Burguillo JC, Rey-López M, Mikic-Fonte FA, Peleteiro A (2010) A hybrid content-based and item-based collaborative filtering approach to recommend tv programs enhanced with singular value decomposition. Inf Sci 180(22):4290–4311CrossRef
134.
Zurück zum Zitat Naphade M, Smith JR, Tesic J, Chang S-F, Hsu W, Kennedy L, Hauptmann A, Curtis J (2006) Large-scale concept ontology for multimedia. IEEE Multimedia 13(3):86–91CrossRef Naphade M, Smith JR, Tesic J, Chang S-F, Hsu W, Kennedy L, Hauptmann A, Curtis J (2006) Large-scale concept ontology for multimedia. IEEE Multimedia 13(3):86–91CrossRef
135.
Zurück zum Zitat Ma Z, Yang Y, Cai Y, Sebe N, Hauptmann AG (2012) Knowledge adaptation for ad hoc multimedia event detection with few exemplars. In: Proceedings of the 20th ACM international conference on multimedia. ACM, pp 469–478 Ma Z, Yang Y, Cai Y, Sebe N, Hauptmann AG (2012) Knowledge adaptation for ad hoc multimedia event detection with few exemplars. In: Proceedings of the 20th ACM international conference on multimedia. ACM, pp 469–478
136.
Zurück zum Zitat Hirsch JE (2005) An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA 102(46):16569CrossRef Hirsch JE (2005) An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA 102(46):16569CrossRef
137.
Zurück zum Zitat Watts DJ (2004) Six degrees: the science of a connected age. WW Norton & Company Watts DJ (2004) Six degrees: the science of a connected age. WW Norton & Company
138.
Zurück zum Zitat Aggarwal CC (2011) An introduction to social network data analytics. Springer Aggarwal CC (2011) An introduction to social network data analytics. Springer
139.
Zurück zum Zitat Scellato S, Noulas A, Mascolo C (2011) Exploiting place features in link prediction on location-based social networks. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1046–1054 Scellato S, Noulas A, Mascolo C (2011) Exploiting place features in link prediction on location-based social networks. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1046–1054
140.
Zurück zum Zitat Ninagawa A, Eguchi K (2010) Link prediction using probabilistic group models of network structure. In: Proceedings of the 2010 ACM symposium on applied Computing. ACM, pp 1115–1116 Ninagawa A, Eguchi K (2010) Link prediction using probabilistic group models of network structure. In: Proceedings of the 2010 ACM symposium on applied Computing. ACM, pp 1115–1116
141.
Zurück zum Zitat Dunlavy DM, Kolda TG, Acar E (2011) Temporal link prediction using matrix and tensor factorizations. ACM Transac Knowl Discov Data (TKDD) 5(2):10 Dunlavy DM, Kolda TG, Acar E (2011) Temporal link prediction using matrix and tensor factorizations. ACM Transac Knowl Discov Data (TKDD) 5(2):10
142.
Zurück zum Zitat Leskovec J, Lang KJ, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on World wide web. ACM, pp 631–640 Leskovec J, Lang KJ, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on World wide web. ACM, pp 631–640
143.
Zurück zum Zitat Du N, Wu B, Pei X, Wang B, Xu L (2007) Community detection in large-scale social networks. In: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis. ACM, pp 16–25 Du N, Wu B, Pei X, Wang B, Xu L (2007) Community detection in large-scale social networks. In: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis. ACM, pp 16–25
144.
Zurück zum Zitat Garg S, Gupta T, Carlsson N, Mahanti A (2009) Evolution of an online social aggregation network: an empirical study. In: Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference. ACM, pp 315–321 Garg S, Gupta T, Carlsson N, Mahanti A (2009) Evolution of an online social aggregation network: an empirical study. In: Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference. ACM, pp 315–321
145.
Zurück zum Zitat Allamanis M, Scellato S, Mascolo C (2012) Evolution of a location-based online social network: analysis and models. In: Proceedings of the 2012 ACM conference on Internet measurement conference. ACM, pp 145–158 Allamanis M, Scellato S, Mascolo C (2012) Evolution of a location-based online social network: analysis and models. In: Proceedings of the 2012 ACM conference on Internet measurement conference. ACM, pp 145–158
146.
Zurück zum Zitat Gong NZ, Xu W, Huang L, Mittal P, Stefanov E, Sekar V, Song D (2012) Evolution of social-attribute networks: measurements, modeling, and implications using google+. In: Proceedings of the 2012 ACM conference on Internet measurement conference. ACM, pp 131–144 Gong NZ, Xu W, Huang L, Mittal P, Stefanov E, Sekar V, Song D (2012) Evolution of social-attribute networks: measurements, modeling, and implications using google+. In: Proceedings of the 2012 ACM conference on Internet measurement conference. ACM, pp 131–144
147.
Zurück zum Zitat Zheleva E, Sharara H, Getoor L (2009) Co-evolution of social and affiliation networks. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1007–1016 Zheleva E, Sharara H, Getoor L (2009) Co-evolution of social and affiliation networks. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1007–1016
148.
Zurück zum Zitat Tang J, Sun J, Wang C, Yang Z (2009) Social influence analysis in large-scale networks. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 807–816 Tang J, Sun J, Wang C, Yang Z (2009) Social influence analysis in large-scale networks. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 807–816
149.
Zurück zum Zitat Li Y, Chen W, Wang Y, Zhang Z-L (2013) Influence diffusion dynamics and influence maximization in social networks with friend and foe relationships. In Proceedings of the sixth ACM international conference on Web search and data mining. ACM, pp 657–666 Li Y, Chen W, Wang Y, Zhang Z-L (2013) Influence diffusion dynamics and influence maximization in social networks with friend and foe relationships. In Proceedings of the sixth ACM international conference on Web search and data mining. ACM, pp 657–666
150.
Zurück zum Zitat Dai W, Chen Y, Xue G-R, Yang Q, Yu Y (2008) Translated learning: transfer learning across different feature spaces: In: Advances in neural information processing systems. pp 353–360 Dai W, Chen Y, Xue G-R, Yang Q, Yu Y (2008) Translated learning: transfer learning across different feature spaces: In: Advances in neural information processing systems. pp 353–360
152.
Zurück zum Zitat Rhee Y, Lee J (2009) On modeling a model of mobile community: designing user interfaces to support group interaction. Interactions 16(6):46–51CrossRef Rhee Y, Lee J (2009) On modeling a model of mobile community: designing user interfaces to support group interaction. Interactions 16(6):46–51CrossRef
153.
Zurück zum Zitat Han J, Lee J-G, Gonzalez H, Li X (2008) Mining massive rfid, trajectory, and traffic data sets. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, p 2 Han J, Lee J-G, Gonzalez H, Li X (2008) Mining massive rfid, trajectory, and traffic data sets. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, p 2
154.
Zurück zum Zitat Garg MK, Kim D-J, Turaga DS, Prabhakaran B (2010) Multimodal analysis of body sensor network data streams for real-time healthcare. In: Proceedings of the international conference on multimedia information retrieval. ACM, pp 469–478 Garg MK, Kim D-J, Turaga DS, Prabhakaran B (2010) Multimodal analysis of body sensor network data streams for real-time healthcare. In: Proceedings of the international conference on multimedia information retrieval. ACM, pp 469–478
155.
Zurück zum Zitat Park Y, Ghosh J (2012) A probabilistic imputation framework for predictive analysis using variably aggregated, multi-source healthcare data. In: Proceedings of the 2nd ACM SIGHIT international health informatics symposium. ACM, pp 445–454 Park Y, Ghosh J (2012) A probabilistic imputation framework for predictive analysis using variably aggregated, multi-source healthcare data. In: Proceedings of the 2nd ACM SIGHIT international health informatics symposium. ACM, pp 445–454
156.
Zurück zum Zitat Tasevski P (2011) Password attacks and generation strategies. Tartu University: Faculty of Mathematics and Computer Sciences Tasevski P (2011) Password attacks and generation strategies. Tartu University: Faculty of Mathematics and Computer Sciences
Metadata
Title
Big Data: A Survey
Authors
Min Chen
Shiwen Mao
Yunhao Liu
Publication date
1 April 2014
Publisher
Springer US
Published in
Mobile Networks and Applications / Issue 2/2014
Print ISSN: 1383-469X
Electronic ISSN: 1572-8153
DOI
https://doi.org/10.1007/s11036-013-0489-0
