
BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI (2403.13947v2)

Published 20 Mar 2024 in cs.HC and cs.AI

Abstract: Today's video-conferencing tools support a rich range of professional and social activities, but their generic meeting environments cannot be dynamically adapted to align with distributed collaborators' needs. To enable end-user customization, we developed BlendScape, a rendering and composition system for video-conferencing participants to tailor environments to their meeting context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users' physical or digital backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an exploratory study with 15 end-users, we investigated whether and how they would find value in using generative AI to customize video-conferencing environments. Participants envisioned using a system like BlendScape to facilitate collaborative activities in the future, but required further controls to mitigate distracting or unrealistic visual elements. We implemented scenarios to demonstrate BlendScape's expressiveness for supporting environment design strategies from prior work and propose composition techniques to improve the quality of environments.
