
Thoughtful Things: Building Human-Centric Smart Devices with Small Language Models (2405.03821v1)

Published 6 May 2024 in cs.HC, cs.AI, and cs.SE

Abstract: Everyday devices like light bulbs and kitchen appliances are now embedded with so many features and automated behaviors that they have become complicated to actually use. While such "smart" capabilities can better support users' goals, the task of learning the "ins and outs" of different devices is daunting. Voice assistants aim to solve this problem by providing a natural language interface to devices, yet such assistants cannot understand loosely-constrained commands, they lack the ability to reason about and explain devices' behaviors to users, and they rely on connectivity to intrusive cloud infrastructure. Toward addressing these issues, we propose thoughtful things: devices that leverage lightweight, on-device LLMs to take actions and explain their behaviors in response to unconstrained user commands. We propose an end-to-end framework that leverages formal modeling, automated training data synthesis, and generative LLMs to create devices that are both capable and thoughtful in the presence of unconstrained user goals and inquiries. Our framework requires no labeled data and can be deployed on-device, with no cloud dependency. We implement two thoughtful things (a lamp and a thermostat) and deploy them on real hardware, evaluating their practical performance.

