Published in: World Wide Web 1/2019

24-01-2018

Answering unique topic queries with dynamic threshold

Authors: Huixin Ma, Zhihui Yang, Yinan Jing, Zhenying He, X. Sean Wang

Abstract

Threshold queries are common when dealing with unstructured data such as text corpora. Users often need several exploratory attempts before they arrive at a final result. In this work, we propose an automatic sampling method that determines the threshold without any user interaction, and we introduce two optimization algorithms that reach the lower-bound time complexity in each sampling trial. We evaluate our methods in several experiments and demonstrate their effectiveness; the approach can be a powerful tool for ordinary users.

Metadata
Title
Answering unique topic queries with dynamic threshold
Authors
Huixin Ma
Zhihui Yang
Yinan Jing
Zhenying He
X. Sean Wang
Publication date
24-01-2018
Publisher
Springer US
Published in
World Wide Web / Issue 1/2019
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-018-0528-7
