DOI: 10.1145/3038912.3052562 (WWW conference proceedings)
research-article

Almond: The Architecture of an Open, Crowdsourced, Privacy-Preserving, Programmable Virtual Assistant

Published: 03 April 2017

ABSTRACT

This paper presents the architecture of Almond, an open, crowdsourced, privacy-preserving and programmable virtual assistant for online services and the Internet of Things (IoT). Included in Almond is Thingpedia, a crowdsourced public knowledge base of natural language interfaces and open APIs. Our proposal addresses four challenges in virtual assistant technology: generality, interoperability, privacy, and usability. Generality is addressed by crowdsourcing Thingpedia, while interoperability is provided by ThingTalk, a high-level domain-specific language that connects multiple devices or services via open APIs. For privacy, user credentials and user data are managed by our open-source ThingSystem, which can be run on personal phones or home servers. Finally, we address usability by providing a natural language interface, whose capability can be extended via training with the help of a menu-driven interface.

We have created a fully working prototype, and crowdsourced a set of 187 functions across 45 different kinds of devices. Almond is the first virtual assistant that lets users specify trigger-action tasks in natural language. Despite the lack of real usage data, our experiment suggests that Almond can understand about 40% of the complex tasks when uttered by a user familiar with its capability.
