Skip to main content
Top
Published in: Soft Computing 5/2013

01-05-2013 | Methodologies and Application

Large data sets classification using convex–concave hull and support vector machine

Authors: Asdrúbal López Chau, Xiaoou Li, Wen Yu

Published in: Soft Computing | Issue 5/2013

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Normal support vector machine (SVM) is not suitable for classification of large data sets because of high training complexity. Convex hull can simplify the SVM training. However, the classification accuracy becomes lower when there exist inseparable points. This paper introduces a novel method for SVM classification, called convex–concave hull SVM (CCH-SVM). After grid processing, the convex hull is used to find extreme points. Then, we use Jarvis march method to determine the concave (non-convex) hull for the inseparable points. Finally, the vertices of the convex–concave hull are applied for SVM training. The proposed CCH-SVM classifier has distinctive advantages on dealing with large data sets. We apply the proposed method on several benchmark problems. Experimental results demonstrate that our approach has good classification accuracy while the training is significantly faster than other SVM classifiers. Compared with the other convex hull SVM methods, the classification accuracy is higher.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
go back to reference Bennett KP, Bredensteiner EJ (2000a) Geometry in learning. In: Gorini C (eds), Geometry at work. Mathematical Association of America, pp 132–145 Bennett KP, Bredensteiner EJ (2000a) Geometry in learning. In: Gorini C (eds), Geometry at work. Mathematical Association of America, pp 132–145
go back to reference Bennett K.P., Bredensteiner E.J. (2000b) Duality and geometry in SVM classifiers. 17th International Conference on Machine Learning, San Francisco Bennett K.P., Bredensteiner E.J. (2000b) Duality and geometry in SVM classifiers. 17th International Conference on Machine Learning, San Francisco
go back to reference Berg M, Cheong O, Kreveld M, Overmars M (2008) Computational geometry: algorithms and applications. Springer, Berlin Berg M, Cheong O, Kreveld M, Overmars M (2008) Computational geometry: algorithms and applications. Springer, Berlin
go back to reference Cervantes J, Li X, Yu W, Li K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71:611–619CrossRef Cervantes J, Li X, Yu W, Li K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71:611–619CrossRef
go back to reference Collobert R, Bengio S (2001) SVMTorch: support vector machines for large regression problems. J Mach Learn Res 1:143–160MathSciNet Collobert R, Bengio S (2001) SVMTorch: support vector machines for large regression problems. J Mach Learn Res 1:143–160MathSciNet
go back to reference Collobert R, Sinz F, Weston J, Bottou L (2006) Trading convexity for scalability. 23rd international conference on machine learning, Pittsburgh, pp 201–208 Collobert R, Sinz F, Weston J, Bottou L (2006) Trading convexity for scalability. 23rd international conference on machine learning, Pittsburgh, pp 201–208
go back to reference Crisp DJ, Burges CJC (2000) A geometric interpretation of υ-SVM classifiers. NIPS 12:244–250 Crisp DJ, Burges CJC (2000) A geometric interpretation of υ-SVM classifiers. NIPS 12:244–250
go back to reference Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
go back to reference Eddy W (1977) A new convex hull algorithm for planar sets. ACM Trans Math Softw 3(4):398–403MATHCrossRef Eddy W (1977) A new convex hull algorithm for planar sets. ACM Trans Math Softw 3(4):398–403MATHCrossRef
go back to reference Franc V, Hlavac V (2003) An iterative algorithm learning the maximal margin classifier. Pattern Recogn Lett 36:1985–1996MATHCrossRef Franc V, Hlavac V (2003) An iterative algorithm learning the maximal margin classifier. Pattern Recogn Lett 36:1985–1996MATHCrossRef
go back to reference Gilbert EG (1966) An iterative procedure for computing the minimum of a quadratic form on a convex set. SIAM J Control Optim 4(1):61–79MATHCrossRef Gilbert EG (1966) An iterative procedure for computing the minimum of a quadratic form on a convex set. SIAM J Control Optim 4(1):61–79MATHCrossRef
go back to reference Graham RL (1972) An efficient algorithm for dutennining the convex hull of a finite pianar set. Inf Process Lett 1:132–133MATHCrossRef Graham RL (1972) An efficient algorithm for dutennining the convex hull of a finite pianar set. Inf Process Lett 1:132–133MATHCrossRef
go back to reference Guo G, Zhang J-S (2007) Reducing examples to accelerate support vector regression. Pattern Recognit Lett 28:2173–2183CrossRef Guo G, Zhang J-S (2007) Reducing examples to accelerate support vector regression. Pattern Recognit Lett 28:2173–2183CrossRef
go back to reference Hsu C-W, Chang C-C, Lin C-J (2010) A practical guide to support vector classification. Bioinform Biol Insights 1(1):1–16MathSciNet Hsu C-W, Chang C-C, Lin C-J (2010) A practical guide to support vector classification. Bioinform Biol Insights 1(1):1–16MathSciNet
go back to reference Jarvis RA (1973) On the identification of the convex hull of a finite set of points in the plane. Inf Process Lett 2:18–21MATHCrossRef Jarvis RA (1973) On the identification of the convex hull of a finite set of points in the plane. Inf Process Lett 2:18–21MATHCrossRef
go back to reference Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, Berlin Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, Berlin
go back to reference Keerthi SS, Gilbert EG (2002) Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn 46:351–360MATHCrossRef Keerthi SS, Gilbert EG (2002) Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn 46:351–360MATHCrossRef
go back to reference Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Trans Neural Netw 11(12):124–137 Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Trans Neural Netw 11(12):124–137
go back to reference Li Y (2011) Selecting training points for one-class support vector machines. Pattern Recognit Lett 32:1517–1522CrossRef Li Y (2011) Selecting training points for one-class support vector machines. Pattern Recognit Lett 32:1517–1522CrossRef
go back to reference Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17(3):671–682CrossRef Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17(3):671–682CrossRef
go back to reference Mitchell BF, Dem’yanov VF, Malozemov VN (1971) Finding the point of a polyhedron closest to the origin. Vestinik Leningrad Gos Univ 13:38–45 Mitchell BF, Dem’yanov VF, Malozemov VN (1971) Finding the point of a polyhedron closest to the origin. Vestinik Leningrad Gos Univ 13:38–45
go back to reference Moreira A, Santos MY (2007) Concave hull: a K-nearest neighbours approach for the computation of the region occupied by a set of points. GRAPP (GM/R), pp 61–68 Moreira A, Santos MY (2007) Concave hull: a K-nearest neighbours approach for the computation of the region occupied by a set of points. GRAPP (GM/R), pp 61–68
go back to reference Pizzuti C, Talia D (2003) P-auto class: scalable parallel clustering for mining large data sets. IEEE Trans Knowl Data Eng 15(3):629–641CrossRef Pizzuti C, Talia D (2003) P-auto class: scalable parallel clustering for mining large data sets. IEEE Trans Knowl Data Eng 15(3):629–641CrossRef
go back to reference Platt J. (1998) Fast training of support vector machine using sequential minimal optimization, advances in kernel methods: support vector machine. MIT Press, Cambridge Platt J. (1998) Fast training of support vector machine using sequential minimal optimization, advances in kernel methods: support vector machine. MIT Press, Cambridge
go back to reference Schlesinger MI, Kalmykov VG, Suchorukov AA, Sravnitelnyj (1981) Comparative analysis of algorithms synthesising linear decision rule for analysis of complex hypotheses. Automatika 1:3–9 Schlesinger MI, Kalmykov VG, Suchorukov AA, Sravnitelnyj (1981) Comparative analysis of algorithms synthesising linear decision rule for analysis of complex hypotheses. Automatika 1:3–9
go back to reference Tsang IW, Kwok JT, Cheung PM (2005) Core vector machines: fast SVM training on very large data sets. J Mach Learn Res 6:363–392MathSciNetMATH Tsang IW, Kwok JT, Cheung PM (2005) Core vector machines: fast SVM training on very large data sets. J Mach Learn Res 6:363–392MathSciNetMATH
go back to reference Vapnik V (1995) The nature of statistical learning theory. Springer, New York Vapnik V (1995) The nature of statistical learning theory. Springer, New York
go back to reference Xia C, Hsu W, Lee ML, Ooi BC (2006) BORDER: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18(3):289–303CrossRef Xia C, Hsu W, Lee ML, Ooi BC (2006) BORDER: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18(3):289–303CrossRef
go back to reference Yu W, Li X (2008) On-line fuzzy modeling via clustering and support vector machines. Inf Sci 178:4264–4279MATHCrossRef Yu W, Li X (2008) On-line fuzzy modeling via clustering and support vector machines. Inf Sci 178:4264–4279MATHCrossRef
go back to reference Yu H, Yang J, Han J (2003) Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the 9th ACM SIGKDD 2003 Washington, DC Yu H, Yang J, Han J (2003) Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the 9th ACM SIGKDD 2003 Washington, DC
go back to reference Yuille AL, Rangarajan A (2003) The concave-convex procedure. Neural Comput Appl 15(4):915–936MATHCrossRef Yuille AL, Rangarajan A (2003) The concave-convex procedure. Neural Comput Appl 15(4):915–936MATHCrossRef
Metadata
Title
Large data sets classification using convex–concave hull and support vector machine
Authors
Asdrúbal López Chau
Xiaoou Li
Wen Yu
Publication date
01-05-2013
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 5/2013
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-012-0954-x

Other articles of this Issue 5/2013

Soft Computing 5/2013 Go to the issue

Premium Partner