Skip to main content
main-content

Über dieses Buch

​​​​Describing novel mathematical concepts for recommendation engines, Realtime Data Mining: Self-Learning Techniques for Recommendation Engines features a sound mathematical framework unifying approaches based on control and learning theories, tensor factorization, and hierarchical methods. Furthermore, it presents promising results of numerous experiments on real-world data.​ The area of realtime data mining is currently developing at an exceptionally dynamic pace, and realtime data mining systems are the counterpart of today's “classic” data mining systems. Whereas the latter learn from historical data and then use it to deduce necessary actions, realtime analytics systems learn and act continuously and autonomously. In the vanguard of these new analytics systems are recommendation engines. They are principally found on the Internet, where all information is available in realtime and an immediate feedback is guaranteed.

This monograph appeals to computer scientists and specialists in machine learning, especially from the area of recommender systems, because it conveys a new way of realtime thinking by considering recommendation tasks as control-theoretic problems. Realtime Data Mining: Self-Learning Techniques for Recommendation Engines will also interest application-oriented mathematicians because it consistently combines some of the most promising mathematical areas, namely control theory, multilevel approximation, and tensor factorization.

Inhaltsverzeichnis

Frontmatter

Chapter 1. Brave New Realtime World: Introduction

Abstract
The chapter offers a general introduction to methods of realtime analytics and sets out their advantages and disadvantages as compared with conventional analytics methods, which learn only from historical data. In particular, we stress the difficulties in the development of theoretically sound realtime analytics methods. We emphasize that such online learning does not conflict with conventional offline learning but, on the opposite, both complement each other. Finally, we give some methodical remarks.
Alexander Paprotny, Michael Thess

Chapter 2. Strange Recommendations? On the Weaknesses of Current Recommendation Engines

Abstract
Currently, most approaches to recommendation engines focus on traditional techniques such as collaborative filtering, basket analysis, and content-based recommendations. Recommendations are considered from a prediction point of view only, i.e., the recommendation task is reduced to the prediction of content that the user is going to select with highest probability anyway. In contrast, in this chapter we propose to view recommendations as control-theoretic problem by investigating the interaction of analysis and action. The corresponding mathematical framework is developed in the next chapters of the book.
Alexander Paprotny, Michael Thess

Chapter 3. Changing Not Just Analyzing: Control Theory and Reinforcement Learning

Abstract
We give a short introduction to reinforcement learning. This includes basic concepts like Markov decision processes, policies, state-value and action-value functions, and the Bellman equation. We discuss solution methods like policy and value iteration methods, online methods like temporal-difference learning, and state fundamental convergence results.
It turns out that RL addresses the problems from Chap. 2. This shows that, in principle, RL is a suitable instrument for solving all of these problems.
Alexander Paprotny, Michael Thess

Chapter 4. Recommendations as a Game: Reinforcement Learning for Recommendation Engines

Abstract
We describe the application of reinforcement learning to recommendation engines. At this, we introduce RE-specific empirical assumptions to reduce the complexity of RL in order to make it applicable to real-live recommendation problems. Especially, we provide a new approach for estimating transition probabilities of multiple recommendations based on that of single recommendations. The estimation of transition probabilities for single recommendations is left as an open problem that is covered in Chap. 5. Finally, we introduce a simple framework for testing online recommendations.
Alexander Paprotny, Michael Thess

Chapter 5. How Engines Learn to Generate Recommendations: Adaptive Learning Algorithms

Abstract
This chapter is mainly devoted to the question of estimating transition probabilities taking into account the effect of recommendations. It turned out that this is an extremely complex problem. The central result is a simple empirical assumption that allows reducing the complexity of the estimation in a way that is computationally suitable to most practical problems. The discussion of this approach gives a deeper insight into essential principles of realtime recommendation engines. Based on this assumption, we propose methods to estimate the transition probabilities and provide some first experimental results. Although the results look promising, more advanced techniques are highly desirable. Such techniques like hierarchical and factorization methods are presented in the following chapters.
Alexander Paprotny, Michael Thess

Chapter 6. Up the Down Staircase: Hierarchical Reinforcement Learning

Abstract
We address the question of how hierarchical, or multigrid, methods may figure in dynamic programming and reinforcement learning for recommendation engines.
After providing a general introduction, we approach the framework of hierarchical methods from both the historical analytical and algebraic viewpoints; we proceed to devising and justifying approaches to apply hierarchical methods to both the model-based as well as the model-free case. In regard to the latter, we set out from the multigrid reinforcement learning algorithms introduced by Ziv in [Ziv04] and extend these methods to finite-horizon problems.
Alexander Paprotny, Michael Thess

Chapter 7. Breaking Dimensions: Adaptive Scoring with Sparse Grids

Abstract
We introduce the concept of a sparse grid and show how this powerful approach to function space discretization may be employed to tackle high-dimensional machine learning problems of regression and classification. In particular, we address the issue of incremental computation of sparse grid regression coefficients so as to meet the requirements of realtime data mining. Conclusively, we present experimental results on real-world data sets.
Alexander Paprotny, Michael Thess

Chapter 8. Decomposition in Transition: Adaptive Matrix Factorization

Abstract
We introduce SVD/PCA-based matrix factorization frameworks and present applications to prediction-based recommendation. Furthermore, we devise incremental algorithms that enable to compute the considered factorizations adaptively in a realtime setting. Besides SVD and PCA-based frameworks, we discuss more sophisticated approaches like non-negative matrix factorization and Lanczos-based methods and assess their effectiveness by means of experiments on real-world data. Moreover, we address a compressive sensing-based approach to Netflix-like matrix completion problems and conclude the chapter by proposing a remedy to complexity issues in computing large elements of the low-rank matrices, which, as we shall see, is a recurring problem related to factorization-based prediction methods.
Alexander Paprotny, Michael Thess

Chapter 9. Decomposition in Transition II: Adaptive Tensor Factorization

Abstract
We consider generalizations of the previously described SVD-based factorization methods to a tensor framework and discuss applications to recommendation. In particular, we generalize the previously introduced incremental SVD algorithm to higher dimensions. Furthermore, we briefly address other tensor factorization frameworks like CANDECOMP/PARAFAC as well as hierarchical SVD and Tensor-Train-Decomposition.
Alexander Paprotny, Michael Thess

Chapter 10. The Big Picture: Toward a Synthesis of RL and Adaptive Tensor Factorization

Abstract
We explore the subject of uniting the control-theoretic with the factorization-based approach to recommendation, arguing that tensor factorization may be employed to vanquish combinatorial complexity impediments related to more sophisticated MDP models that take a history of previous states rather than one single state into account. Specifically, we introduce a tensor representation of transition probabilities of Markov-k-processes and devise a Tucker-based approximation architecture that relies crucially on the notion of an aggregation basis described in Chap. 6. As our method requires a partitioning of the set of state transition histories, we are left with the challenge of how to determine a suitable partitioning, for which we propose a genetic algorithm.
Alexander Paprotny, Michael Thess

Chapter 11. What Cannot Be Measured Cannot Be Controlled: Gauging Success with A/B Tests

Abstract
The robust measurement of the efficiency of recommendation algorithms is an extremely important factor in the development of recommendation engines. We provide some useful methodical remarks on this topic in this chapter, even though it is not directly connected to the problem of adaptive learning. We further propose a straightforward algorithm to calculate confidence intervals for REs. At the end, we discuss Simpson’s paradox which illustrates the importance of constant environment conditions for testing.
Alexander Paprotny, Michael Thess

Chapter 12. Building a Recommendation Engine: The XELOPES Library

Abstract
In this chapter we provide some ideas of implementing the adaptive algorithms described in this book based on the prudsys XELOPES library for BI. We start with the abstract CWM standard and then consider its application to data mining. Next we move to realtime data mining where the central idea is the introduction of agents. The agent framework is further specified for reinforcement learning, and based on RL we next propose a framework for adaptive recommendation engines. At the end, we briefly discuss the application of XELOPES for real recommendation engines.
Alexander Paprotny, Michael Thess

Chapter 13. Last Words: Conclusion

Abstract
We first discuss the requirements of a modern data mining system and show that the approach presented in this book fulfills most of them. However, the full realization of this approach is often thwarted by principal problems in the development of the required mathematical instruments. Especially, most of the computational methods developed by mathematicians over the last centuries are designed for engineering problems. We stress the differences to the requirements for data analysis problems and encourage the development of appropriate frameworks. Especially, control theory should play an important role here.
Alexander Paprotny, Michael Thess

ERRATUM

Without Abstract
Alexander Paprotny, Michael Thess

Backmatter

Weitere Informationen

Premium Partner

    Bildnachweise