Comparative Study of Machine Learning Techniques for Genome Scale Discrimination of Recombinant HIV-1 Strains
Whole genomes of HIV-1 strains were analyzed to discriminate circulating recombinant forms from non-recombinant genomes using naïve Bayes, logistic regression, support vector machine, k-nearest neighbor and classification tree classifiers, with codon frequencies
as sequence attributes. The performance of the five techniques was compared on classification accuracy, sensitivity, specificity, Matthews correlation coefficient and Brier score. The techniques were also compared using receiver operating characteristic (ROC) curves and
calibration plots. All techniques were validated using tenfold cross-validation and evaluated on a training data set of 4215 genomes, comprising 3004 non-recombinant strains and 1211 circulating recombinant strains. The highest classification accuracy, 94.47%,
was achieved by k-nearest neighbor under tenfold cross-validation. Naïve Bayes, logistic regression, support vector machine and classification tree achieved classification accuracies of 84.49%, 88.28%, 92.22% and 86.31%, respectively,
under tenfold cross-validation. On the ROC curve, k-nearest neighbor also performed best, with an area under the curve of 0.9754. Our results indicate that supervised machine learning techniques can be applied effectively to discriminate
recombinant from non-recombinant strains of HIV-1 at genome scale using codon frequencies.
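
As an illustration of the quantities named in the abstract, the following is a minimal Python sketch of how codon frequencies might be derived from a genome sequence and how the Matthews correlation coefficient and Brier score are computed. It is not the authors' implementation: the function names, the single reading frame starting at position 0, and the skipping of triplets that contain ambiguous bases are all assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so every genome yields the same feature layout.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(sequence, frame=0):
    """Relative frequency of each codon, read in one frame (assumed: frame 0)."""
    seq = sequence.upper().replace("U", "T")
    triplets = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Assumption: triplets with ambiguous bases (e.g. N) are skipped.
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = recombinant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def brier_score(y_true, probabilities):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probabilities)) / len(y_true)


if __name__ == "__main__":
    features = codon_frequencies("ATGATGTAA")  # toy sequence, not real HIV-1 data
    print(features[CODONS.index("ATG")])
    print(matthews_cc(*confusion_counts([1, 1, 0, 0], [1, 1, 0, 0])))
```

Each genome becomes a 64-element vector that any of the five classifiers could take as input. Sensitivity and specificity follow from the same confusion counts as tp / (tp + fn) and tn / (tn + fp).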
Keywords: CLASSIFICATION; HIV-1; MACHINE LEARNING; NON RECOMBINANT; RECOMBINANT
Document Type: Research Article
Publication date: 01 April 2016
- Journal of Medical Imaging and Health Informatics (JMIHI) is a medium to disseminate novel experimental and theoretical research results in the field of biomedicine, biology, clinical, rehabilitation engineering, medical image processing, bio-computing, D2H2, and other health related areas.