ABSTRACT
The volume of natural language text data has grown rapidly over the past two decades, due to factors such as the growth of the Web, the low cost of publishing, and progress in digitizing printed texts. This growth, combined with the proliferation of natural language systems for searching and retrieving information, provides tremendous opportunities for studying areas where database systems and natural language processing systems overlap. This tutorial explores two areas of overlap that are especially relevant to the database community: (1) managing natural language text data in a relational database, and (2) developing natural language interfaces to databases. The tutorial presents state-of-the-art methods, related systems, research opportunities, and challenges covering both areas.
Index Terms
- Natural Language Data Management and Interfaces: Recent Development and Open Challenges