TeamCAD -- A Multimodal Interface for Remote Computer Aided Design (2312.12309v1)
Abstract: Remote collaboration is a common reality of spatial design processes, but tools for computer aided design were made for single users. With TeamCAD, we introduce a user experience in which online remote collaboration feels more like working together at a table. Using speech and gesture recognition based on state-of-the-art machine learning applied to webcam and microphone input, TeamCAD plugs into existing software through APIs, keybindings, and mouse input. We share results from user studies conducted with graduate students from <removed for double blind review>. Our user tests were run on the Blender animation software and required simultaneous use of both modalities for the given tasks. We mitigated robustness and latency challenges in readily available voice recognition models. In these studies, our prototype proved to be an intuitive interface, providing a common denominator for collaborators with or without previous experience in three-dimensional modeling applications.