Although Support Vector Machines (SVMs) can identify patterns in high-dimensional kernel spaces, their performance is determined by two main factors: the SVM cost parameter and the kernel parameters. This paper identifies a mechanism to extract meta-features from string datasets, and derives a
string kernel SVM optimization method. In the method, a
is trained on the string meta-features computed for each dataset in a string dataset pool, together with the learning algorithm parameters and accuracy information, to predict the optimal parameter combination for a given string classification task. In the experiments, the
SVM were optimized using the proposed algorithm over four string datasets: spam, Reuters-21578, Network Application Detection and e-News Categorization. The experimental results show that the proposed algorithm produces parameter combinations that yield good string classification accuracies for
SVM on all string datasets.
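
The meta-learning idea summarized above can be illustrated with a minimal sketch. The abstract does not specify the meta-features or the learner, so everything below is an assumption made for illustration: the four meta-features, the function names `string_meta_features` and `recommend_parameters`, the 1-nearest-neighbour stand-in for the trained model, and the parameter values stored in the toy pool are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def string_meta_features(strings, labels):
    """Compute a small, hypothetical set of meta-features for a labelled string dataset."""
    lengths = [len(s) for s in strings]
    alphabet = set("".join(strings))
    counts = Counter(labels)
    total = len(labels)
    # Shannon entropy of the class distribution, in bits.
    class_entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return [
        math.log10(len(strings)),      # dataset size (log scale)
        sum(lengths) / len(lengths),   # mean string length
        float(len(alphabet)),          # alphabet size
        class_entropy,                 # class balance
    ]


def recommend_parameters(pool, query_features):
    """Return the parameters stored with the most similar pool dataset.

    This 1-nearest-neighbour lookup is an assumed stand-in for the trained model;
    it uses unscaled Euclidean distance for brevity.
    """
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = min(pool, key=lambda entry: distance(entry["features"], query_features))
    return nearest["params"]


# Toy dataset pool. The "best" parameters are made-up placeholders; in practice they
# would come from a parameter search on each pool dataset.
pool = [
    {
        "features": string_meta_features(["aaab", "abab", "bbba", "baba"], [0, 0, 1, 1]),
        "params": {"C": 1.0, "ngram": 2},
    },
    {
        "features": string_meta_features(
            ["the quick brown fox", "lazy dogs sleep all day",
             "spam offer click now", "win a free prize today"],
            [0, 0, 1, 1],
        ),
        "params": {"C": 10.0, "ngram": 4},
    },
]

# A new task made of short strings over a two-letter alphabet.
new_task = string_meta_features(["abba", "baab", "aabb"], [0, 1, 1])
suggested = recommend_parameters(pool, new_task)
```

Here `suggested` holds the parameters of the pool dataset whose meta-features are closest to those of the new task. A realistic setup would scale the meta-features and fit a regression or ranking model over a much larger pool rather than rely on a single nearest neighbour.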