skip to main content
10.1145/1014052.1014119acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
Article

A microeconomic data mining problem: customer-oriented catalog segmentation

Authors Info & Claims
Published:22 August 2004Publication History

ABSTRACT

The microeconomic framework for data mining [7] assumes that an enterprise chooses a decision maximizing the overall utility over all customers where the contribution of a customer is a function of the data available on that customer. In Catalog Segmentation, the enterprise wants to design k product catalogs of size r that maximize the overall number of catalog products purchased. However, there are many applications where a customer, once attracted to an enterprise, would purchase more products beyond the ones contained in the catalog. Therefore, in this paper, we investigate an alternative problem formulation, that we call Customer-Oriented Catalog Segmentation, where the overall utility is measured by the number of customers that have at least a specified minimum interest t in the catalogs. We formally introduce the Customer-Oriented Catalog Segmentation problem and discuss its complexity. Then we investigate two different paradigms to design efficient, approximate algorithms for the Customer-Oriented Catalog Segmentation problem, greedy (deterministic) and randomized algorithms. Since greedy algorithms may be trapped in a local optimum and randomized algorithms crucially depend on a reasonable initial solution, we explore a combination of these two paradigms. Our experimental evaluation on synthetic and real data demonstrates that the new algorithms yield catalogs of significantly higher utility compared to classical Catalog Segmentation algorithms.

References

  1. R.Agrawal. IBM synthetic data generator. 1994.Google ScholarGoogle Scholar
  2. V.Asodi and S.Safra. On the complexity of the catalog segmentation problem. Unpublished manuscriptGoogle ScholarGoogle Scholar
  3. T.Brijs, B.Goethals, G.Swinnen, K.Vanhoof and G.Wets. A Data Mining Framework for Optimal Product Selection in Retail Supermarket Data: The Generalized PROFSET Model. In Proc. of SIGKDD 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. U.Feige. A threshold of ln $n$ for approximating set cover, J. ACM 45(4) pages 634 -- 652 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M.R.Garey and D.S.Johnson. Computers and Intractability, a guide to the Theory of NP-completeness. W.H. Freeman and company, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J.Kleinberg, C.Papadimitriou, and P.Raghavan. Segmentation problems. In Proc. of 13th Symposium on Theory of Computation, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J.Kleinberg, C.Papadimitriou, and P.Raghavan. A Microeconomic View of Data Mining. In Journal of Data Mining and Knowledge Discovery, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T.Y.Lin, Y.Y.Yao and E.Louie. Mining values added association rules. In Proc. of PAKDD 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. H.Mannila. Theoretical Framework for Data Mining. In SIGKDD Explorations, Jan. 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M.Steinbach, G.Karypis and V.Kumar. Efficient Algorithms for Creating Product Catalogs. In Proc. of SDM 2001.Google ScholarGoogle Scholar
  11. R.C.W.Wong, A.W.C.Fu and K.Wang. MPIS: Maximal-profit item selection with cross-selling considerations. In Proc. of ICDM 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. K.Wang and M.Y.Su. Item selection by "hub-authority" profit ranking. In Proc. of SIGKDD 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D.Xu, Y.Ye, and J.Zhang. Approximate the 2-Catalog Segmentation Problem Using Semidefinite Programming Relaxations. In Optimization Methods and Software.Google ScholarGoogle Scholar
  14. K.Wang, S.Q.Zhou and J.W.Han. Profit Mining: From Patterns to Actions. In Proc. of EBDT 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A microeconomic data mining problem: customer-oriented catalog segmentation

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2004
      874 pages
      ISBN:1581138881
      DOI:10.1145/1014052

      Copyright © 2004 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 22 August 2004

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader