research-article

Almond: The Architecture of an Open, Crowdsourced, Privacy-Preserving, Programmable Virtual Assistant

Authors:
Giovanni Campagna, Stanford University, Stanford, CA, USA
Rakesh Ramesh, Stanford University, Stanford, CA, USA
Silei Xu, Stanford University, Stanford, CA, USA
Michael Fischer, Stanford University, Stanford, CA, USA
Monica S. Lam, Stanford University, Stanford, CA, USA

WWW '17: Proceedings of the 26th International Conference on World Wide Web, April 2017, Pages 341–350
https://doi.org/10.1145/3038912.3052562

Published: 3 April 2017

ABSTRACT

This paper presents the architecture of Almond, an open, crowdsourced, privacy-preserving and programmable virtual assistant for online services and the Internet of Things (IoT). Included in Almond is Thingpedia, a crowdsourced public knowledge base of natural language interfaces and open APIs. Our proposal addresses four challenges in virtual assistant technology: generality, interoperability, privacy, and usability. Generality is addressed by crowdsourcing Thingpedia, while interoperability is provided by ThingTalk, a high-level domain-specific language that connects multiple devices or services via open APIs. For privacy, user credentials and user data are managed by our open-source ThingSystem, which can be run on personal phones or home servers. Finally, we address usability by providing a natural language interface, whose capability can be extended via training with the help of a menu-driven interface.

We have created a fully working prototype, and crowdsourced a set of 187 functions across 45 different kinds of devices. Almond is the first virtual assistant that lets users specify trigger-action tasks in natural language. Despite the lack of real usage data, our experiment suggests that Almond can understand about 40% of the complex tasks when uttered by a user familiar with its capability.
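
To make the notion of a trigger-action task concrete, the following is a minimal Python sketch. It is not ThingTalk syntax and does not reflect Almond's actual implementation; the names `Rule` and `run_rules`, the event fields `kind` and `value`, and the example task itself are all hypothetical, chosen only to illustrate the "when X happens, do Y" pattern the abstract refers to.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical event representation: a plain dictionary of fields.
Event = Dict[str, object]


@dataclass
class Rule:
    """A hypothetical trigger-action rule: when `trigger` matches an event, run `action`."""
    trigger: Callable[[Event], bool]
    action: Callable[[Event], str]


def run_rules(rules: List[Rule], events: List[Event]) -> List[str]:
    """Apply every rule to every event and collect the results of fired actions."""
    results = []
    for event in events:
        for rule in rules:
            if rule.trigger(event):
                results.append(rule.action(event))
    return results


# Hypothetical task: "when the temperature drops below 60 F, send me a message".
rule = Rule(
    trigger=lambda e: e.get("kind") == "temperature" and e["value"] < 60,
    action=lambda e: f"notify: temperature is {e['value']} F",
)

events = [
    {"kind": "temperature", "value": 72},
    {"kind": "temperature", "value": 55},
    {"kind": "door", "value": "open"},
]

messages = run_rules([rule], events)
```

In this sketch only the second event satisfies the trigger, so `messages` holds a single notification. A real assistant would additionally need to translate the natural-language sentence into such a rule, which is the part the paper addresses with its natural language interface.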

Published in
WWW '17: Proceedings of the 26th International Conference on World Wide Web
April 2017, 1678 pages
ISBN: 9781450349130
General Chairs: Rick Barrett (W3Events), Rick Cummings (Murdoch University)
Program Chairs: Eugene Agichtein (Emory University), Evgeniy Gabrilovich (Google Research)
Copyright © 2017. Copyright is held by the International World Wide Web Conference Committee (IW3C2).
Publisher: International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland
Author Tags
conversational agents
crowdsourcing
internet of things
natural language programming
virtual assistant
Acceptance Rates
WWW '17 Paper Acceptance Rate: 164 of 966 submissions, 17%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%