Skip to main content

2017 | OriginalPaper | Buchkapitel

Top-k String Auto-Completion with Synonyms

verfasst von : Pengfei Xu, Jiaheng Lu

Erschienen in: Database Systems for Advanced Applications

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Auto-completion is one of the most prominent features of modern information systems. The existing solutions of auto-completion provide the suggestions based on the beginning of the currently input character sequence (i.e. prefix). However, in many real applications, one entity often has synonyms or abbreviations. For example, “DBMS” is an abbreviation of “Database Management Systems”. In this paper, we study a novel type of auto-completion by using synonyms and abbreviations. We propose three trie-based algorithms to solve the top-k auto-completion with synonyms; each one with different space and time complexity trade-offs. Experiments on large-scale datasets show that it is possible to support effective and efficient synonym-based retrieval of completions of a million strings with thousands of synonyms rules at about a microsecond per-completion, while taking small space overhead (i.e. 160–200 bytes per string). The implementation of algorithms is publicly available at http://​udbms.​cs.​helsinki.​fi/​?​projects/​autocompletion/​download.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
In this table, we use the denotation \({\texttt {ab}{{\underline{\mathtt{{c}}}}}}\) to represent a node with label “c” with parent node labeled “b”, in path root – a – b – c.
 
Literatur
1.
Zurück zum Zitat Burg, J.J., Ainsworth, J.D., Casto, B., Lang, S.: Experiments with the “oregon trail knapsack problem”. Electron. Notes Discrete Math. 1, 26–35 (1999)MathSciNetCrossRefMATH Burg, J.J., Ainsworth, J.D., Casto, B., Lang, S.: Experiments with the “oregon trail knapsack problem”. Electron. Notes Discrete Math. 1, 26–35 (1999)MathSciNetCrossRefMATH
2.
Zurück zum Zitat Cai, F., Rijke, M.: A survey of query auto completion in information retrieval. Found. Trends Inf. Retrieval 10(4), 273–363 (2016)CrossRef Cai, F., Rijke, M.: A survey of query auto completion in information retrieval. Found. Trends Inf. Retrieval 10(4), 273–363 (2016)CrossRef
3.
Zurück zum Zitat Chaudhuri, S., Kaushik, R.: Extending autocompletion to tolerate errors. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, pp. 707–718. ACM, New York (2009) Chaudhuri, S., Kaushik, R.: Extending autocompletion to tolerate errors. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, pp. 707–718. ACM, New York (2009)
5.
Zurück zum Zitat Hyvönen, E., Mäkelä, E.: Semantic autocompletion. In: Mizoguchi, R., Shi, Z., Giunchiglia, F. (eds.) ASWC 2006. LNCS, vol. 4185, pp. 739–751. Springer, Heidelberg (2006). doi:10.1007/11836025_72 CrossRef Hyvönen, E., Mäkelä, E.: Semantic autocompletion. In: Mizoguchi, R., Shi, Z., Giunchiglia, F. (eds.) ASWC 2006. LNCS, vol. 4185, pp. 739–751. Springer, Heidelberg (2006). doi:10.​1007/​11836025_​72 CrossRef
6.
Zurück zum Zitat Ji, S., Li, G., Li, C., Feng, J.: Efficient interactive fuzzy keyword search. In: Proceedings of the 18th International Conference on World Wide Web, WWW 2009, pp. 371–380. ACM, New York (2009) Ji, S., Li, G., Li, C., Feng, J.: Efficient interactive fuzzy keyword search. In: Proceedings of the 18th International Conference on World Wide Web, WWW 2009, pp. 371–380. ACM, New York (2009)
7.
Zurück zum Zitat Kolesar, P.J.: A branch and bound algorithm for the knapsack problem. Manage. Sci. 13(9), 723–735 (1967)CrossRef Kolesar, P.J.: A branch and bound algorithm for the knapsack problem. Manage. Sci. 13(9), 723–735 (1967)CrossRef
8.
Zurück zum Zitat LeFevre, J., Sankaranarayanan, J., Hacigümüs, H., Tatemura, J., Polyzotis, N., Carey, M.J.: MISO: souping up big data query processing with a multistore system. In: SIGMOD Conference, pp. 1591–1602. ACM (2014) LeFevre, J., Sankaranarayanan, J., Hacigümüs, H., Tatemura, J., Polyzotis, N., Carey, M.J.: MISO: souping up big data query processing with a multistore system. In: SIGMOD Conference, pp. 1591–1602. ACM (2014)
9.
Zurück zum Zitat Li, G., Ji, S., Li, C., Feng, J.: Efficient type-ahead search on relational data: a TASTIER approach. In: ACM SIGMOD, pp. 695–706 (2009) Li, G., Ji, S., Li, C., Feng, J.: Efficient type-ahead search on relational data: a TASTIER approach. In: ACM SIGMOD, pp. 695–706 (2009)
10.
Zurück zum Zitat Lu, J., Lin, C., Wang, W., Li, C., Xiao, X.: Boosting the quality of approximate string matching by synonyms. ACM Trans. Database Syst. 40(3), 15 (2015)MathSciNetCrossRef Lu, J., Lin, C., Wang, W., Li, C., Xiao, X.: Boosting the quality of approximate string matching by synonyms. ACM Trans. Database Syst. 40(3), 15 (2015)MathSciNetCrossRef
11.
Zurück zum Zitat Schnaitter, K., Polyzotis, N., Getoor, L.: Index interactions in physical design tuning: modeling, analysis, and applications. PVLDB 2(1), 1234–1245 (2009) Schnaitter, K., Polyzotis, N., Getoor, L.: Index interactions in physical design tuning: modeling, analysis, and applications. PVLDB 2(1), 1234–1245 (2009)
12.
Zurück zum Zitat Singh, R., Gulwani, S.: Learning semantic string transformations from examples. PVLDB 5(8), 740–751 (2012) Singh, R., Gulwani, S.: Learning semantic string transformations from examples. PVLDB 5(8), 740–751 (2012)
13.
Zurück zum Zitat Tsuruoka, Y., McNaught, J., Tsujii, J., Ananiadou, S.: Learning string similarity measures for gene/protein name dictionary look-up using logistic regression. Bioinformatics 23(20), 2768–2774 (2007)CrossRef Tsuruoka, Y., McNaught, J., Tsujii, J., Ananiadou, S.: Learning string similarity measures for gene/protein name dictionary look-up using logistic regression. Bioinformatics 23(20), 2768–2774 (2007)CrossRef
14.
Zurück zum Zitat Xiao, C., Qin, J., Wang, W., Ishikawa, Y., Tsuda, K., Sadakane, K.: Efficient error-tolerant query autocompletion. Proc. VLDB Endow. 6(6), 373–384 (2013)CrossRef Xiao, C., Qin, J., Wang, W., Ishikawa, Y., Tsuda, K., Sadakane, K.: Efficient error-tolerant query autocompletion. Proc. VLDB Endow. 6(6), 373–384 (2013)CrossRef
Metadaten
Titel
Top-k String Auto-Completion with Synonyms
verfasst von
Pengfei Xu
Jiaheng Lu
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-55699-4_13