ABSTRACT
The microeconomic framework for data mining [7] assumes that an enterprise chooses the decision that maximizes the overall utility over all customers, where the contribution of each customer is a function of the data available on that customer. In Catalog Segmentation, the enterprise wants to design k product catalogs of size r that maximize the overall number of catalog products purchased. However, there are many applications where a customer, once attracted to an enterprise, would purchase more products beyond the ones contained in the catalog. Therefore, in this paper, we investigate an alternative problem formulation, which we call Customer-Oriented Catalog Segmentation, where the overall utility is measured by the number of customers that have at least a specified minimum interest t in the catalogs. We formally introduce the Customer-Oriented Catalog Segmentation problem and discuss its complexity. We then investigate two different paradigms for designing efficient, approximate algorithms for the problem: greedy (deterministic) algorithms and randomized algorithms. Since greedy algorithms may be trapped in a local optimum and randomized algorithms crucially depend on a reasonable initial solution, we explore a combination of these two paradigms. Our experimental evaluation on synthetic and real data demonstrates that the new algorithms yield catalogs of significantly higher utility compared to classical Catalog Segmentation algorithms.
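The difference between the two utility measures can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: each customer is modeled as the set of products they are interested in, each customer receives the single catalog that overlaps most with that set, and "interest" is the size of that overlap. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Set

Customer = Set[int]   # assumed representation: product ids a customer is interested in
Catalog = Set[int]    # a catalog is a set of at most r product ids


def best_overlap(customer: Customer, catalogs: List[Catalog]) -> int:
    """Largest number of products of interest found in any single catalog."""
    return max(len(customer & catalog) for catalog in catalogs)


def classical_utility(customers: List[Customer], catalogs: List[Catalog]) -> int:
    """Classical measure: total number of catalog products of interest."""
    return sum(best_overlap(c, catalogs) for c in customers)


def customer_oriented_utility(
    customers: List[Customer], catalogs: List[Catalog], t: int
) -> int:
    """Customer-oriented measure: number of customers with interest >= t."""
    return sum(1 for c in customers if best_overlap(c, catalogs) >= t)


# Toy instance with k = 1 catalog of size r = 3 and threshold t = 1.
customers = [{1, 2, 3}, {1, 2, 3}, {4}, {5}, {6}]
catalog_a = [{1, 2, 3}]   # serves two customers very well
catalog_b = [{4, 5, 6}]   # serves three customers minimally

print(classical_utility(customers, catalog_a),
      customer_oriented_utility(customers, catalog_a, t=1))
print(classical_utility(customers, catalog_b),
      customer_oriented_utility(customers, catalog_b, t=1))
```

In this toy instance the classical measure prefers the first catalog (6 versus 3 products of interest), while the customer-oriented measure prefers the second (3 versus 2 customers reached), which is the kind of divergence that motivates the alternative formulation.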
- R. Agrawal. IBM synthetic data generator. 1994.
- V. Asodi and S. Safra. On the complexity of the catalog segmentation problem. Unpublished manuscript.
- T. Brijs, B. Goethals, G. Swinnen, K. Vanhoof, and G. Wets. A Data Mining Framework for Optimal Product Selection in Retail Supermarket Data: The Generalized PROFSET Model. In Proc. of SIGKDD 2000.
- U. Feige. A threshold of ln n for approximating set cover. J. ACM 45(4), pages 634-652, 1998.
- M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
- J. Kleinberg, C. Papadimitriou, and P. Raghavan. Segmentation problems. In Proc. of 13th Symposium on Theory of Computation, 1998.
- J. Kleinberg, C. Papadimitriou, and P. Raghavan. A Microeconomic View of Data Mining. In Journal of Data Mining and Knowledge Discovery, 1998.
- T. Y. Lin, Y. Y. Yao, and E. Louie. Mining values added association rules. In Proc. of PAKDD 2002.
- H. Mannila. Theoretical Framework for Data Mining. In SIGKDD Explorations, Jan. 2000.
- M. Steinbach, G. Karypis, and V. Kumar. Efficient Algorithms for Creating Product Catalogs. In Proc. of SDM 2001.
- R. C. W. Wong, A. W. C. Fu, and K. Wang. MPIS: Maximal-profit item selection with cross-selling considerations. In Proc. of ICDM 2003.
- K. Wang and M. Y. Su. Item selection by "hub-authority" profit ranking. In Proc. of SIGKDD 2002.
- D. Xu, Y. Ye, and J. Zhang. Approximate the 2-Catalog Segmentation Problem Using Semidefinite Programming Relaxations. In Optimization Methods and Software.
- K. Wang, S. Q. Zhou, and J. W. Han. Profit Mining: From Patterns to Actions. In Proc. of EDBT 2002.
Index Terms
- A microeconomic data mining problem: customer-oriented catalog segmentation