research-article

Dynamic active probing of helpdesk databases

Authors:
Shenghuo Zhu (NEC Labs America)
Tao Li (Florida International University)
Zhiyuan Chen (University of Maryland Baltimore County)
Dingding Wang (Florida International University)
Yihong Gong (NEC Labs America)

Proceedings of the VLDB Endowment, Volume 1, Issue 1, pp. 748–760
https://doi.org/10.14778/1453856.1453937

Published: 01 August 2008

Abstract

Helpdesk databases store past interactions between customers and companies in order to improve customer service quality. A common use of a helpdesk database is to determine whether a recommendation exists for a new problem reported by a customer. However, customers often provide incomplete or even inaccurate information, and manually preparing a list of clarification questions does not scale to large databases. This paper investigates the problem of automatically generating a minimal number of questions needed to reach an appropriate recommendation, and proposes a novel dynamic active probing method. Compared to alternatives such as decision trees and case-based reasoning, this method has two distinctive features. First, it actively probes the customer for information that is useful in reaching a recommendation, and each answer the customer provides is immediately used to dynamically generate the next question. This feature ensures that all available information from the customer is used. Second, the method is based on a probabilistic model and uses a data augmentation method that avoids overfitting when estimating the probabilities in the model. This feature makes the method robust to databases that are incomplete or contain errors. Experimental results verify the effectiveness of the approach.
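
The abstract does not give the model or the question-selection rule, so the following is only an illustrative sketch of the general idea of probing dynamically with a probabilistic model, not the authors' algorithm. It assumes a naive Bayes model over past cases, picks the next question greedily by minimum expected posterior entropy, and uses Laplace smoothing as a simple stand-in for the paper's data augmentation step. All names (`train`, `posterior`, `next_question`, `CASES`) and the toy cases are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy helpdesk cases: (attribute answers, recommendation).
CASES = [
    ({"device": "printer", "symptom": "no_power"}, "check_cable"),
    ({"device": "printer", "symptom": "paper_jam"}, "open_tray"),
    ({"device": "laptop", "symptom": "no_power"}, "replace_battery"),
    ({"device": "laptop", "symptom": "slow"}, "add_memory"),
]

def train(cases, alpha=1.0):
    """Fit a naive Bayes model with add-alpha (Laplace) smoothing.

    Smoothing is an assumption made here to keep estimates away from
    zero; it is not the data augmentation method used in the paper.
    """
    rec_counts = Counter(rec for _, rec in cases)
    values = defaultdict(set)      # attribute -> observed values
    counts = defaultdict(Counter)  # (recommendation, attribute) -> value counts
    for attrs, rec in cases:
        for a, v in attrs.items():
            values[a].add(v)
            counts[(rec, a)][v] += 1

    def likelihood(rec, a, v):
        c = counts[(rec, a)]
        return (c[v] + alpha) / (sum(c.values()) + alpha * len(values[a]))

    return rec_counts, values, likelihood

def posterior(model, answers):
    """P(recommendation | answers so far), normalized to sum to 1."""
    rec_counts, _, likelihood = model
    total = sum(rec_counts.values())
    scores = {}
    for rec, n in rec_counts.items():
        p = n / total
        for a, v in answers.items():
            p *= likelihood(rec, a, v)
        scores[rec] = p
    z = sum(scores.values())
    return {rec: p / z for rec, p in scores.items()}

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def next_question(model, answers):
    """Return the unasked attribute with the lowest expected posterior
    entropy, or None when every attribute has been answered."""
    _, values, likelihood = model
    post = posterior(model, answers)
    best, best_h = None, float("inf")
    for a in sorted(values):
        if a in answers:
            continue
        expected_h = 0.0
        for v in values[a]:
            # Predictive probability of answer v given the current posterior.
            p_v = sum(post[r] * likelihood(r, a, v) for r in post)
            expected_h += p_v * entropy(posterior(model, {**answers, a: v}))
        if expected_h < best_h:
            best, best_h = a, expected_h
    return best
```

In a dialogue loop, one would call `next_question(model, answers)`, add the customer's reply to `answers`, and stop once `posterior(model, answers)` is concentrated enough on one recommendation. The stopping threshold is another choice the abstract leaves open.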

Index Terms

Dynamic active probing of helpdesk databases
1. Information systems
2. Theory of computation
  1. Models of computation
    1. Probabilistic computation

Published in
Proceedings of the VLDB Endowment Volume 1, Issue 1
August 2008
1216 pages
ISSN: 2150-8097
Editors: Peter Buneman, Beng Chin Ooi, Kenneth Ross, Gerald Weber
Publisher
VLDB Endowment