Published in: Knowledge and Information Systems 2/2017

23-12-2016 | Regular Paper

FIU-Miner (a fast, integrated, and user-friendly system for data mining) and its applications

Authors: Tao Li, Chunqiu Zeng, Wubai Zhou, Wei Xue, Yue Huang, Zheng Liu, Qifeng Zhou, Bin Xia, Qing Wang, Wentao Wang, Xiaolong Zhu


Abstract

The advent of the Big Data era drives data analysts from different domains to use data mining techniques for data analysis. However, performing data analysis in a specific domain is not trivial; it often requires complex task configuration, onerous integration of algorithms, and efficient execution in distributed environments. Little effort has been devoted to developing effective tools that help data analysts conduct complex data analysis tasks. In this paper, we design and implement FIU-Miner, a Fast, Integrated, and User-friendly system to ease data analysis. FIU-Miner allows users to rapidly configure a complex data analysis task without writing a single line of code. It also helps users conveniently import and integrate different analysis programs. Further, it balances resource utilization and task execution in heterogeneous environments. Case studies of real-world applications demonstrate the effectiveness of the proposed system.


Metadata
Title
FIU-Miner (a fast, integrated, and user-friendly system for data mining) and its applications
Authors
Tao Li
Chunqiu Zeng
Wubai Zhou
Wei Xue
Yue Huang
Zheng Liu
Qifeng Zhou
Bin Xia
Qing Wang
Wentao Wang
Xiaolong Zhu
Publication date
23-12-2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-016-1014-0
