In a globalized world, the need to speak foreign languages, particularly English, is imperative. One challenge in teaching foreign languages to millions of learners is that, although teaching content is widely available, speaking skills are harder to develop than vocabulary, because feedback from a teacher is needed to correct pronunciation, intonation, and similar aspects of speech. There are currently no automated tools for evaluating the fluency or pronunciation level of language students, so this evaluation, which is needed even to place a student at the right level, requires an interview with a language teacher. We propose a supervised machine-learning method for automatically evaluating both the fluency and the pronunciation of a language student, as well as for detecting specific pronunciation mistakes, taking English as the target language. To train a classifier for the classes “low”, “intermediate” and “high”, we first built datasets of audio recordings of English-learning students speaking. Each recording was divided into short segments, and a set of features was calculated for each segment. We then trained several classifiers to predict the level of a given non-native English speaker. We tested the trained classifiers on audio segments not included in the training dataset, comparing the predicted class of each segment with its labeled class and measuring accuracy, precision, and other metrics. The results were promising: for fluency and pronunciation we obtained prediction accuracies of 94% and 99.9%, the latter being the highest accuracy reported in the literature for such predictions.
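
The overall pipeline (segment each recording, compute per-segment features, train a classifier for the three levels, and evaluate on held-out segments) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and not the method reported here: the recordings are synthetic signals, the two features (RMS energy and zero-crossing rate) are generic stand-ins for the feature set, the nearest-centroid classifier is a deliberately simple substitute for the classifiers that were trained, and all function names and parameter values are hypothetical.

```python
import math
import random

LEVELS = ["low", "intermediate", "high"]


def split_into_segments(samples, segment_len):
    """Cut a recording into consecutive fixed-length segments, dropping any tail."""
    return [samples[i:i + segment_len]
            for i in range(0, len(samples) - segment_len + 1, segment_len)]


def segment_features(segment):
    """Illustrative features only (assumption): RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(segment) - 1)]


def train_centroids(features, labels):
    """Nearest-centroid training: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


def accuracy(centroids, features, labels):
    """Fraction of segments whose predicted class matches the labeled class."""
    hits = sum(predict(centroids, v) == y for v, y in zip(features, labels))
    return hits / len(labels)


def synthetic_recording(level, n_samples, rng):
    """Toy stand-in for speech (assumption): a noisy sine that depends on level."""
    idx = LEVELS.index(level)
    amp, freq = 0.3 + 0.3 * idx, 0.02 + 0.03 * idx
    return [amp * math.sin(2 * math.pi * freq * t) + rng.gauss(0, 0.02)
            for t in range(n_samples)]


def build_dataset(rng, recordings_per_level=4, n_samples=4000, segment_len=400):
    """Generate labeled per-segment feature vectors for every level."""
    features, labels = [], []
    for level in LEVELS:
        for _ in range(recordings_per_level):
            recording = synthetic_recording(level, n_samples, rng)
            for segment in split_into_segments(recording, segment_len):
                features.append(segment_features(segment))
                labels.append(level)
    return features, labels


rng = random.Random(0)
train_X, train_y = build_dataset(rng)   # segments used for training
test_X, test_y = build_dataset(rng)     # separate recordings, never seen in training
centroids = train_centroids(train_X, train_y)
held_out_accuracy = accuracy(centroids, test_X, test_y)
print(f"held-out segment accuracy: {held_out_accuracy:.3f}")
```

Because the synthetic classes are separated by construction, the accuracy printed by this sketch says nothing about real speech; it only shows where segmentation, feature extraction, training, and held-out evaluation sit relative to one another.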