
Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large Language Models (2305.09802v3)

Published 16 May 2023 in cs.HC and cs.AI

Abstract: Smart home assistants function best when user commands are direct and well-specified (e.g., "turn on the kitchen light"), or when a hard-coded routine specifies the response. In more natural communication, however, human speech is unconstrained, often describing goals (e.g., "make it cozy in here" or "help me save energy") rather than indicating specific target devices and actions to take on those devices. Current systems fail to understand these under-specified commands since they cannot reason about devices and settings as they relate to human situations. We introduce LLMs to this problem space, exploring their use for controlling devices and creating automation routines in response to under-specified user commands in smart homes. We empirically study the baseline quality and failure modes of LLM-created action plans with a survey of age-diverse users. We find that LLMs can reason creatively to achieve challenging goals, but they experience patterns of failure that diminish their usefulness. We address these gaps with Sasha, a smarter smart home assistant. Sasha responds to loosely-constrained commands like "make it cozy" or "help me sleep better" by executing plans to achieve user goals, e.g., setting a mood with available devices, or devising automation routines. We implement and evaluate Sasha in a hands-on user study, showing the capabilities and limitations of LLM-driven smart homes when faced with unconstrained user-generated scenarios.
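The abstract describes mapping a loosely-constrained goal (e.g., "make it cozy") to an action plan over whatever devices a home happens to expose. A minimal sketch of that idea, not the paper's actual implementation: the LLM call is stubbed with a canned JSON plan, and the device names, actions, and prompt format are illustrative assumptions.

```python
import json

# Hypothetical device registry; a real assistant would discover this
# from the smart home platform (device names/actions are assumptions).
DEVICES = {
    "living_room_lamp": {"actions": ["on", "off", "set_brightness", "set_color"]},
    "thermostat": {"actions": ["set_temperature"]},
    "tv": {"actions": ["on", "off"]},
}

def build_prompt(goal, devices):
    """Compose a planning prompt that lists the devices the home exposes."""
    return (
        f"User goal: {goal}\n"
        f"Available devices and actions: {json.dumps(devices)}\n"
        "Respond with a JSON list of {device, action, value} steps."
    )

def fake_llm(prompt):
    # Stand-in for a real LLM call; returns a plausible plan as JSON text.
    return json.dumps([
        {"device": "living_room_lamp", "action": "set_brightness", "value": 30},
        {"device": "thermostat", "action": "set_temperature", "value": 22},
    ])

def plan_for_goal(goal):
    """Ask the (stubbed) LLM for a plan, then keep only executable steps."""
    steps = json.loads(fake_llm(build_prompt(goal, DEVICES)))
    return [
        s for s in steps
        if s["device"] in DEVICES
        and s["action"] in DEVICES[s["device"]]["actions"]
    ]

plan = plan_for_goal("make it cozy in here")
```

The post-hoc filtering step reflects a point the abstract raises: LLM plans have characteristic failure modes (such as referencing unavailable devices), so a practical system validates each step against the real device inventory before execution.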
