Software systems are often built without an explicit model, so research has focused on inferring models automatically by applying machine learning to execution logs. However, the logs generated by a real software system can be very large, and the inference algorithm can exceed the capacity of a single computer.
This paper focuses on inference of
and explores the use of MapReduce to deal with large logs. The approach consists of two distributed algorithms that perform
. For each job, a distributed algorithm using MapReduce is developed. With the parallel data processing capacity of MapReduce, the problem of inferring behavioral models from large logs can be solved efficiently. The technique is implemented on top of Hadoop. Experiments on Amazon clusters show the efficiency and scalability of our approach.