2018 | Original Paper | Book Chapter

Granular Computing Techniques for Bioinformatics Pattern Recognition Problems in Non-metric Spaces

Written by: Alessio Martino, Alessandro Giuliani, Antonello Rizzi

Published in: Computational Intelligence for Pattern Recognition

Publisher: Springer International Publishing

Abstract

Computational intelligence and pattern recognition techniques are attracting growing attention as the main computing tools in bioinformatics applications. Biology, by definition, deals with complex systems, and computational intelligence is an effective approach to the general problem of modelling complex systems. Moreover, most data available in shared databases are represented by sequences and graphs, which demands the definition of meaningful dissimilarity measures between patterns; these measures are often non-metric in nature. Especially in such cases, evolutionary and fully automatic machine learning systems are needed to deal with parametric dissimilarity measures and/or to perform suitable feature selection. Alongside other approaches, such as kernel methods and embedding in dissimilarity spaces, granular computing is a very promising framework, not only for designing effective data-driven modelling systems able to determine automatically the correct representation (abstraction) level, but also for giving field experts (biologists) the possibility to investigate the information granules (frequent substructures) that the machine learning system has discovered as the most relevant for the problem at hand. We expect that many important discoveries in biology and medicine in the near future will be driven by an increasingly strong integration between ongoing research in the natural sciences and modern inductive modelling tools based on computational intelligence, pattern recognition and granular computing techniques.

Footnotes
1
For example, consider a classification or clustering algorithm driven by the Euclidean distance. A common problem with the Euclidean distance is that features spanning a wider range of values have more influence on the resulting distance. Normalising all attributes to the same range (usually [0, 1] or \([-1,+1]\)) ensures a fair contribution from all attributes, regardless of their original range.
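The effect described in this footnote can be illustrated with a minimal sketch. The toy data, the function names `min_max_normalise` and `euclidean`, and the choice of min-max scaling to [0, 1] are assumptions made for illustration; they are not taken from the chapter.

```python
import math


def min_max_normalise(rows, low=0.0, high=1.0):
    """Rescale every column (feature) of `rows` to the range [low, high]."""
    columns = list(zip(*rows))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to `low`.
            low + (high - low) * (x - mn) / (mx - mn) if mx > mn else low
            for x, mn, mx in zip(row, mins, maxs)
        ])
    return scaled


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: feature 0 spans [0, 1], feature 1 spans thousands.
data = [[0.0, 1000.0], [1.0, 1010.0], [0.0, 3000.0]]

# Before normalisation, feature 1 dominates every distance.
raw_01 = euclidean(data[0], data[1])
raw_02 = euclidean(data[0], data[2])

# After normalisation, both features contribute on the same scale.
norm = min_max_normalise(data)
norm_01 = euclidean(norm[0], norm[1])
norm_02 = euclidean(norm[0], norm[2])
```

On the raw data, pattern 0 looks far closer to pattern 1 than to pattern 2 only because the second feature has a much wider range; after rescaling, a full-range difference in either feature counts equally.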
 
2
In Statistics, outliers are “anomalous data” that for a given dissimilarity measure lie far away from most observations.
 
3
Non sunt multiplicanda entia sine necessitate (Entities are not to be multiplied without necessity), commonly known as Ockham’s Razor (William of Ockham, circa 1287–1347). This criterion states that, among a set of predictive models with the same performance, the simplest one (i.e. the one with the simplest decision surfaces) should be preferred. It is certainly one of the fundamental axioms for thoughtful and practical data-driven modelling.
 
4
Also known as hyperparameters in machine learning terminology.
 
5
That is why evolutionary optimisation metaheuristics fall within the derivative-free methods.
 
6
A common choice for a genetic algorithm fitness function takes into account both the model performance and its structural complexity. Specifically, the former should be maximised, whereas the latter should be minimised in order to avoid overfitting (cf. Ockham’s Razor).
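One simple way to combine the two terms mentioned in this footnote is a convex combination, sketched below. The function name `fitness`, the weight `alpha` and the assumption that both accuracy and complexity are already normalised to [0, 1] are illustrative choices, not the chapter's own formulation.

```python
def fitness(accuracy, complexity, alpha=0.9):
    """Fitness to be maximised by the genetic algorithm.

    accuracy   -- model performance in [0, 1], higher is better
    complexity -- normalised structural complexity in [0, 1], lower is better
    alpha      -- hypothetical trade-off weight in [0, 1]
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Reward performance and penalise complexity in a single scalar.
    return alpha * accuracy + (1.0 - alpha) * (1.0 - complexity)


# Two hypothetical models with the same accuracy: the simpler one scores higher.
simple_model = fitness(accuracy=0.9, complexity=0.2)
complex_model = fitness(accuracy=0.9, complexity=0.8)
```

With equal accuracy, the simpler model receives the higher fitness, which is the behaviour Ockham's Razor asks for; `alpha` controls how much accuracy may be traded for simplicity.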
 
7
That is why, throughout most of the chapter, the generic term (dis)similarity will be used unless explicitly specified otherwise.
 
8
Indeed, the anatomical structure changes in the order of months/years depending on the age of subjects.
 
9
A finite set of points equipped with a notion of distance in a finite multidimensional space.
 
10
According to which the distance between two strings of equal length is given by the number of mismatches.
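The definition in this footnote (the Hamming distance) translates directly into a short function. The name `hamming_distance` and the example strings are illustrative.

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(s, t))


# Illustrative sequences that differ at positions 2 and 5 (0-based).
d = hamming_distance("GATTACA", "GACTATA")
```

The explicit length check matters because the definition only covers strings of equal length; for strings of different lengths an edit distance such as Levenshtein's is needed instead.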
 
11
Also known as the Gram Matrix, after Danish mathematician Jørgen Pedersen Gram.
 
12
If the similarity measure at hand is not symmetric, patterns’ distance vectors as taken by rows or columns will be different. In order to overcome this problem, one can ‘force’ a similarity measure to be symmetric by considering \(\mathbf {S}:=(\mathbf {S}+\mathbf {S}^T)/2\) (e.g. [14]).
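The symmetrisation \(\mathbf {S}:=(\mathbf {S}+\mathbf {S}^T)/2\) from this footnote can be sketched in a few lines. The function name `symmetrise` and the 2×2 example matrix are assumptions for illustration only.

```python
def symmetrise(S):
    """Return (S + S^T) / 2 for a square matrix given as a list of rows."""
    n = len(S)
    if any(len(row) != n for row in S):
        raise ValueError("S must be a square matrix")
    return [[(S[i][j] + S[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical non-symmetric similarity matrix: s(0, 1) != s(1, 0).
S = [[1.0, 0.25],
     [0.75, 1.0]]
S_sym = symmetrise(S)
```

After this step, row i and column i of the matrix coincide, so a pattern's vector of similarities is the same whether it is read by rows or by columns.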
 
13
Also known as the Krebs cycle.
 
14
Protein molecules driving the folding of other protein systems.
 
15
Indeed, the absolute magnitude of the metabolic rate can vary for many reasons, ranging from anatomical differences among patients to their current nutritional state.
 
References
1.
S. Alelyani, J. Tang, H. Liu, Feature selection for clustering: a review. Data Clust. Algorithms Appl. 29, 110–121 (2013)
2.
C. Anderson, The end of theory: the data deluge makes the scientific method obsolete. Wired Mag. 16(7), 16–07 (2008)
3.
M. Ankerst, M.M. Breunig, H.P. Kriegel, J. Sander, OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Rec. 28, 49–60 (1999)
4.
A. Bargiela, W. Pedrycz, Granular Computing: An Introduction (Kluwer Academic Publishers, Boston, 2003)
5.
V. Beckers, L.M. Dersch, K. Lotz, G. Melzer, O.E. Bläsing, R. Fuchs, T. Ehrhardt, C. Wittmann, In silico metabolic network analysis of Arabidopsis leaves. BMC Syst. Biol. 10(1), 102 (2016)
6.
J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
7.
H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne, The protein data bank. Nucleic Acids Res. 28(1), 235–242 (2000)
8.
F.M. Bianchi, L. Livi, A. Rizzi, A. Sadeghian, A granular computing approach to the design of optimized graph classification systems. Soft Comput. 18(2), 393–412 (2014)
9.
F.M. Bianchi, S. Scardapane, A. Rizzi, A. Uncini, A. Sadeghian, Granular computing techniques for classification and semantic characterization of structured data. Cogn. Comput. 8(3), 442–461 (2016)
10.
P.S. Bradley, O.L. Mangasarian, W.N. Street, Clustering via concave minimization, in Advances in Neural Information Processing Systems (1997), pp. 368–374
11.
E. Bullmore, O. Sporns, Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10(3), 186–198 (2009)
13.
C. Cellucci, Rethinking Logic: Logic in Relation to Mathematics, Evolution, and Method (Springer Science & Business Media, 2013)
14.
Y. Chen, E.K. Garcia, M.R. Gupta, A. Rahimi, L. Cazzanti, Similarity-based classification: concepts and algorithms. J. Mach. Learn. Res. 10, 747–776 (2009)
15.
Y. Chen, M.R. Gupta, B. Recht, Learning kernels from indefinite similarities, in Proceedings of the 26th Annual International Conference on Machine Learning (ACM, 2009), pp. 145–152
16.
A. Colorni, M. Dorigo, V. Maniezzo, Distributed optimization by ant colonies, in Toward a Practice of Autonomous Systems: Proceedings of the First European Conference on Artificial Life (MIT Press, 1992), p. 134
17.
D. Counsell, A review of bioinformatics education in the UK. Brief. Bioinform. 4(1), 7–21 (2003)
18.
J. Damoiseaux, S. Rombouts, F. Barkhof, P. Scheltens, C. Stam, S.M. Smith, C. Beckmann, Consistent resting-state networks across healthy subjects. Proc. Natl. Acad. Sci. 103(37), 13848–13853 (2006)
19.
D.L. Davies, D.W. Bouldin, A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 2, 224–227 (1979)
20.
G. Del Vescovo, L. Livi, F.M. Frattale Mascioli, A. Rizzi, On the problem of modeling structured data with the MinSOD representative. Int. J. Comput. Theory Eng. 6(1), 9 (2014)
21.
A. Di Noia, P. Montanari, A. Rizzi, Occupational diseases risk prediction by cluster analysis and genetic optimization, in Proceedings of the International Joint Conference on Computational Intelligence (SCITEPRESS-Science and Technology Publications, Lda, 2014), pp. 68–75
22.
A. Di Noia, P. Montanari, A. Rizzi, Occupational diseases risk prediction by genetic optimization: towards a non-exclusive classification approach, in Computational Intelligence (Springer, Berlin, 2016), pp. 63–77
23.
L. Di Paola, M. De Ruvo, P. Paci, D. Santoni, A. Giuliani, Protein contact networks: an emerging paradigm in chemistry. Chem. Rev. 113(3), 1598–1613 (2012)
24.
M. Ester, H.P. Kriegel, J. Sander, X. Xu et al., A density-based algorithm for discovering clusters in large spatial databases with noise. KDD 96, 226–231 (1996)
25.
M.D. Fox, M.E. Raichle, Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging. Nat. Rev. Neurosci. 8(9), 700–711 (2007)
26.
K.J. Friston, C.D. Frith, R.S. Frackowiak, R. Turner, Characterizing dynamic brain responses with fMRI: a multivariate approach. Neuroimage 2(2), 166–172 (1995)
27.
J. Gao, B. Barzel, A.L. Barabási, Universal resilience patterns in complex networks. Nature 530(7590), 307–312 (2016)
28.
A. Giuliani, S. Filippi, M. Bertolaso, Why network approach can promote a new way of thinking in biology. Front. Genet. 5 (2014)
29.
A. Giuliani, A. Krishnan, J.P. Zbilut, M. Tomita, Proteins as networks: usefulness of graph theory in protein science. Curr. Protein Peptide Sci. 9(1), 28–38 (2008)
30.
D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning (Addison-Wesley, USA, 1989)
31.
M.D. Greicius, B. Krasnow, A.L. Reiss, V. Menon, Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. Proc. Natl. Acad. Sci. 100(1), 253–258 (2003)
32.
S. Guha, R. Rastogi, K. Shim, CURE: an efficient clustering algorithm for large databases. ACM SIGMOD Rec. 27, 73–84 (1998)
33.
34.
B. He, K. Wang, Y. Liu, B. Xue, V.N. Uversky, A.K. Dunker, Predicting intrinsic disorder in proteins: an overview. Cell Res. 19(8), 929–949 (2009)
35.
D.R. Hofstadter, I Am a Strange Loop (Basic Books, 2007)
36.
J. Horgan, From complexity to perplexity. Sci. Am. 272(6), 104–109 (1995)
37.
A.K. Jain, M.N. Murty, P.J. Flynn, Data clustering: a review. ACM Comput. Surv. (CSUR) 31(3), 264–323 (1999)
38.
G. Jurman, R. Visintainer, C. Furlanello, An introduction to spectral distances in networks. Front. Artif. Intell. Appl. 226, 227–234 (2011)
39.
L. Kaufman, P. Rousseeuw, Clustering by means of medoids. Stat. Data Anal. Based L1-Norm Relat. Methods, 405–416 (1987)
40.
J. Kennedy, R. Eberhart, Particle swarm optimization, in Proceedings of the IEEE International Conference on Neural Networks, vol. 4 (IEEE, 1995), pp. 1942–1948
41.
S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
42.
V.I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady 10, 707–710 (1966)
43.
A.W.C. Liew, H. Yan, M. Yang, Pattern recognition techniques for the emerging field of bioinformatics: a review. Pattern Recognition 38(11), 2055–2073 (2005)
44.
L. Livi, A. Giuliani, A. Rizzi, Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem. Inf. Sci. 326, 134–145 (2016)
45.
L. Livi, A. Giuliani, A. Sadeghian, Characterization of graphs for protein structure modeling and recognition of solubility. Curr. Bioinform. 11(1), 106–114 (2016)
46.
L. Livi, E. Maiorino, A. Giuliani, A. Rizzi, A. Sadeghian, A generative model for protein contact networks. J. Biomol. Struct. Dyn. 34(7), 1441–1454 (2016)
48.
L. Livi, A. Rizzi, A. Sadeghian, Optimized dissimilarity space embedding for labeled graphs. Inf. Sci. 266, 47–64 (2014)
49.
L. Livi, A. Rizzi, A. Sadeghian, Granular modeling and computing approaches for intelligent analysis of non-geometric data. Appl. Soft Comput. 27, 567–574 (2015)
50.
L. Livi, A. Sadeghian, Granular computing, computational intelligence, and the analysis of non-geometric input spaces. Granul. Comput. 1(1), 13–20 (2016)
52.
J. MacQueen, Some methods for classification and analysis of multivariate observations, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1 (Oakland, USA, 1967), pp. 281–297
53.
H.A. Maghawry, M.C. Mostafa, M.H. Abdul-Aziz, T.E. Gharib, A modified cutoff scanning matrix protein representation for enhancing protein function prediction, in 9th International Conference on Informatics and Systems (INFOS) (IEEE, 2014), pp. DEKM–40
54.
E. Maiorino, A. Rizzi, A. Sadeghian, A. Giuliani, Spectral reconstruction of protein contact networks. Phys. A: Stat. Mech. Appl. 471, 804–817 (2017)
55.
A. Martino, E. Maiorino, A. Giuliani, M. Giampieri, A. Rizzi, Supervised approaches for function prediction of proteins contact networks from topological structure information, in Scandinavian Conference on Image Analysis (Springer, Berlin, 2017), pp. 285–296
56.
A. Martino, A. Rizzi, F.M. Frattale Mascioli, Efficient approaches for solving the large-scale k-medoids problem, in Proceedings of the 9th International Joint Conference on Computational Intelligence. IJCCI, vol. 1 (INSTICC, 2017), pp. 338–347
57.
J. Mercer, Functions of positive and negative type, and their connection with the theory of integral equations, in Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol. 209 (1909), pp. 415–446
58.
D.C. Mikulecky, Network thermodynamics and complexity: a transition to relational systems theory. Comput. Chem. 25(4), 369–391 (2001)
59.
T.M. Mitchell, Machine Learning (McGraw-Hill, Boston, MA, 1997)
60.
M. Neuhaus, H. Bunke, Bridging the Gap Between Graph Edit Distance and Kernel Machines, vol. 68 (World Scientific, 2007)
61.
M. Pagani, A. Giuliani, J. Öberg, A. Chincarini, S. Morbelli, A. Brugnolo, D. Arnaldi, A. Picco, M. Bauckneht, A. Buschiazzo et al., Predicting the transition from normal aging to Alzheimer’s disease: a statistical mechanistic evaluation of FDG-PET data. NeuroImage 141, 282–290 (2016)
62.
M. Pagani, A. Giuliani, J. Öberg, F. De Carli, S. Morbelli, N. Girtler, F. Bongioanni, D. Arnaldi, J. Accardo, M. Bauckneht et al., Progressive disgregation of brain networking from normal aging to Alzheimer’s disease. Independent component analysis on FDG-PET data. J. Nucl. Med. jnumed–116 (2017)
63.
E. Parzen, On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962)
64.
M. Pascual, S.A. Levin, From individuals to population densities: searching for the intermediate scale of nontrivial determinism. Ecology 80(7), 2225–2236 (1999)
65.
K. Pearson, Mathematical contributions to the theory of evolution. III. Regression, heredity, and panmixia, in Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol. 187 (1896), pp. 253–318
66.
E. Pękalska, R.P. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Applications (World Scientific, 2005)
67.
K. Peng, P. Radivojac, S. Vucetic, A.K. Dunker, Z. Obradovic, Length-dependent prediction of protein intrinsic disorder. BMC Bioinform. 7(1), 208 (2006)
68.
J.B. Pereira, M. Mijalkov, E. Kakaei, P. Mecocci, B. Vellas, M. Tsolaki, I. Kłoszewska, H. Soininen, C. Spenger, S. Lovestone et al., Disrupted network topology in patients with stable and progressive mild cognitive impairment and Alzheimer’s disease. Cereb. Cortex 26(8), 3476–3493 (2016)
69.
F. Possemato, A. Rizzi, Automatic text categorization by a granular computing approach: facing unbalanced data sets, in The International Joint Conference on Neural Networks (IJCNN) (IEEE, 2013), pp. 1–8
70.
J.S. Richardson, The anatomy and taxonomy of protein structure. Adv. Protein Chem. 34, 167–339 (1981)
71.
D. de Ridder, J. de Ridder, M.J. Reinders, Pattern recognition in bioinformatics. Brief. Bioinform. 14(5), 633–647 (2013)
72.
A. Rizzi, F. Possemato, L. Livi, A. Sebastiani, A. Giuliani, F.M. Frattale Mascioli, A dissimilarity-based classifier for generalized sequences by a granular computing approach, in The International Joint Conference on Neural Networks (IJCNN) (IEEE, 2013), pp. 1–8
73.
P. Romero, Z. Obradovic, X. Li, E.C. Garner, C.J. Brown, A.K. Dunker, Sequence complexity of disordered protein. Proteins Struct. Funct. Bioinform. 42(1), 38–48 (2001)
74.
P.J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
75.
M. Rubinov, O. Sporns, Complex network measures of brain connectivity: uses and interpretations. Neuroimage 52(3), 1059–1069 (2010)
76.
Zurück zum Zitat H. Sakoe, S. Chiba, Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 26(1), 43–49 (1978)CrossRef H. Sakoe, S. Chiba, Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 26(1), 43–49 (1978)CrossRef
77.
Zurück zum Zitat B. Schölkopf, A.J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (MIT press, 2002) B. Schölkopf, A.J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (MIT press, 2002)
78.
J. Shawe-Taylor, N. Cristianini, Kernel Methods for Pattern Analysis (Cambridge University Press, Cambridge, 2004)
79.
D.H. Silverman, G.W. Small, C.Y. Chang, C.S. Lu, M.A.K. de Aburto, W. Chen, J. Czernin, S.I. Rapoport, P. Pietrini, G.E. Alexander et al., Positron emission tomography in evaluation of dementia: regional brain metabolism and long-term outcome. JAMA 286(17), 2120–2127 (2001)
80.
G.P. Singh, M. Ganapathi, D. Dash, Role of intrinsic disorder in transient interactions of hub proteins. Proteins Struct. Funct. Bioinform. 66(4), 761–765 (2007)
81.
J. Smucny, K.P. Wylie, J.R. Tregellas, Functional magnetic resonance imaging of intrinsic brain networks for translational drug discovery. Trends Pharmacol. Sci. 35(8), 397–403 (2014)
82.
C. Soguero-Ruiz, K. Hindberg, J.L. Rojo-Álvarez, S.O. Skrøvseth, F. Godtliebsen, K. Mortensen, A. Revhaug, R.O. Lindsetmo, K.M. Augestad, R. Jenssen, Support vector feature selection for early detection of anastomosis leakage from bag-of-words in electronic health records. IEEE J. Biomed. Health Inf. 20(5), 1404–1415 (2016)
83.
P.G. Spetsieris, J.H. Ko, C.C. Tang, A. Nazem, W. Sako, S. Peng, Y. Ma, V. Dhawan, D. Eidelberg, Metabolic resting-state brain networks in health and disease. Proc. Natl. Acad. Sci. 112(8), 2563–2568 (2015)
84.
J.M. Stanton, Galton, Pearson, and the peas: a brief history of linear regression for statistics instructors. J. Stat. Education 9(3), 1–16 (2001)
85.
S. Theodoridis, K. Koutroumbas, Pattern Recognition, 4th edn. (Academic Press, 2008)
86.
M.K. Transtrum, B.B. Machta, K.S. Brown, B.C. Daniels, C.R. Myers, J.P. Sethna, Perspective: sloppiness and emergent theories in physics, biology, and beyond. J. Chem. Phys. 143(1), 07B201_1 (2015)
87.
V.N. Uversky, Natively unfolded proteins: a point where biology waits for physics. Protein Sci. 11(4), 739–756 (2002)
88.
B.C. Van Wijk, C.J. Stam, A. Daffertshofer, Comparing brain networks of different size and connectivity density using graph theory. PLoS ONE 5(10), e13701 (2010)
89.
J.P. Vert, K. Tsuda, B. Schölkopf, A primer on kernel methods, in Kernel Methods in Computational Biology (2004), pp. 35–70
90.
Y.C. Wang, Y. Wang, Z.X. Yang, N.Y. Deng, Support vector machine prediction of enzyme function with conjoint triad feature and hierarchical context. BMC Syst. Biol. 5(1), S6 (2011)
91.
L. Wasserman, Topological data analysis. Ann. Rev. Stat. Appl. 5(1) (2018)
92.
W. Weaver, Science and complexity. Am. Sci. 36(4), 536 (1948)
93.
A. Wright, A.B. McCoy, S. Henkin, A. Kale, D.F. Sittig, Use of a support vector machine for categorizing free-text notes: assessment of accuracy across two institutions. J. Am. Med. Inf. Assoc. 20(5), 887–890 (2013)
94.
Y. Yang, L. Han, Y. Yuan, J. Li, N. Hei, H. Liang, Gene co-expression network analysis reveals common system-level properties of prognostic genes across cancer types. Nat. Commun. 5, 3231 (2014)
95.
F. Yates, K. Mather, Ronald Aylmer Fisher, 1890–1962. Biogr. Mem. Fellows R. Soc. 9, 91–129 (1963)
96.
L.A. Zadeh, Soft computing and fuzzy logic. IEEE Softw. 11(6), 48–56 (1994)
97.
T. Zhang, R. Ramakrishnan, M. Livny, BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Rec. 25, 103–114 (1996)
Metadata
Title
Granular Computing Techniques for Bioinformatics Pattern Recognition Problems in Non-metric Spaces
Authors
Alessio Martino
Alessandro Giuliani
Antonello Rizzi
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-89629-8_3