- The paper introduces a server application that streams a comprehensive set of HoloLens 2 sensor data over TCP in real time, going beyond earlier acquisition methods.
- It pairs a C++ UWP server with a cross-platform Python client library and delivers 1080p video at 30 FPS with approximately 270 ms latency.
- Unity integration enables interactive AR/VR applications, and the authors plan to add further data streams such as spatial mapping and voice input.
Overview of "HoloLens 2 Sensor Streaming"
This paper details the implementation of a server application that streams sensor data from the Microsoft HoloLens 2 over TCP in real time. The system improves on existing HoloLens data acquisition methods by giving real-time access to a broad set of streams: four grayscale cameras, a depth sensor, IMU, RGB camera, microphone, and spatial input data such as head pose, eye tracking, and hand tracking.
The server runs directly on the HoloLens 2 and can also be integrated into Unity projects as a plugin. The plugin supports upstream communication, so a client system can send data back to an application built with the Unity Engine and potentially take on computation that would otherwise run on the headset. The system also goes beyond the Windows Device Portal, particularly in allowing simultaneous access to multiple sensor streams.
Technical Implementation
The server application is a C++ Universal Windows Platform (UWP) application. It serves the sensor streams on designated TCP ports and compresses video and audio through Microsoft Media Foundation, which keeps transmission smooth even at full video frame rates. The client configures each stream to run in one of several modes and can optionally request that device pose data be included.
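To make the idea of a framed TCP stream concrete, the following is a minimal sketch of how a client might read timestamped packets from one stream port. The header layout used here (a 64-bit timestamp followed by a 32-bit payload length, both little-endian) and the helper names `pack_packet`, `recv_exact`, and `recv_packet` are assumptions chosen for illustration; the summary does not specify the actual wire format used by the server.

```python
import socket
import struct

# Hypothetical header: 64-bit timestamp, then 32-bit payload length,
# both little-endian. The real server's wire format may differ.
HEADER = struct.Struct("<QI")


def pack_packet(timestamp: int, payload: bytes) -> bytes:
    """Build one framed packet (used here only to simulate a server)."""
    return HEADER.pack(timestamp, len(payload)) + payload


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes; TCP may deliver them in several pieces."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full packet arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_packet(sock: socket.socket):
    """Read one header, then the payload whose size the header announces."""
    timestamp, size = HEADER.unpack(recv_exact(sock, HEADER.size))
    return timestamp, recv_exact(sock, size)


if __name__ == "__main__":
    # A connected socket pair stands in for the headset and the client.
    server_side, client_side = socket.socketpair()
    server_side.sendall(pack_packet(1000, b"frame-bytes"))
    print(recv_packet(client_side))
    server_side.close()
    client_side.close()
```

Because TCP is a byte stream without message boundaries, the length prefix is what lets the client recover individual frames; a compressed video or audio payload would then be handed to a decoder.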
A Python library complements the server by receiving and decoding the data on the client side. It delivers decoded data as NumPy arrays, which supports real-time experiments and simplifies integration with libraries such as OpenCV and PyAV. The client library runs on Windows, Linux, and macOS and ships with example code, which makes the system adaptable to a range of research and application contexts.
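As an illustration of what "decoded data as NumPy arrays" can look like in practice, here is a minimal sketch that turns a raw buffer of 16-bit depth samples into a 2D array. The function name `depth_bytes_to_array`, the little-endian 16-bit sample layout, and the tiny frame size are assumptions for the example and are not taken from the library described in the paper.

```python
import numpy as np


def depth_bytes_to_array(buffer: bytes, width: int, height: int) -> np.ndarray:
    """Interpret a raw buffer as a (height, width) array of 16-bit depth samples.

    Assumes little-endian unsigned 16-bit values in row-major order.
    """
    expected = width * height * 2  # two bytes per sample
    if len(buffer) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buffer)}")
    # frombuffer creates a read-only view of the bytes without copying them.
    return np.frombuffer(buffer, dtype="<u2").reshape(height, width)


if __name__ == "__main__":
    # Synthetic 4x3 frame standing in for a decoded depth image.
    raw = np.arange(12, dtype="<u2").tobytes()
    frame = depth_bytes_to_array(raw, width=4, height=3)
    print(frame.shape, int(frame.max()))
```

Once the data is in this form, it can be passed directly to array-based libraries; OpenCV functions, for example, accept NumPy arrays as images.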
The paper reports specific performance figures, including 1080p video streaming at 30 FPS with a latency of approximately 270 ms. Compression keeps network bandwidth manageable: video and audio use lossy compression, while depth data is encoded losslessly as PNG. A table of per-stream bandwidth shows that the server can transmit all streams efficiently, with an average total bandwidth of approximately 19 Mbit/s.
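A short back-of-the-envelope calculation shows why compression matters at these rates. The sketch below compares the bit rate of uncompressed 1080p video at 30 FPS with the reported total of about 19 Mbit/s. The 3 bytes per pixel (24-bit color) figure and the function names are assumptions for illustration; the paper's own per-stream table is not reproduced here.

```python
REPORTED_TOTAL_MBIT_S = 19.0  # average total bandwidth stated in the summary


def raw_video_mbit_per_s(width: int, height: int, fps: int,
                         bytes_per_pixel: int = 3) -> float:
    """Bit rate of uncompressed video, assuming 24-bit color by default."""
    return width * height * bytes_per_pixel * 8 * fps / 1e6


def megabytes_per_minute(mbit_per_s: float) -> float:
    """Convert a bit rate in Mbit/s to data volume in MB per minute."""
    return mbit_per_s / 8 * 60


if __name__ == "__main__":
    raw = raw_video_mbit_per_s(1920, 1080, 30)
    print(f"raw 1080p30 video: {raw:.0f} Mbit/s")  # about 1493 Mbit/s
    print(f"ratio to reported total: {raw / REPORTED_TOTAL_MBIT_S:.0f}x")
    print(f"reported total: {megabytes_per_minute(REPORTED_TOTAL_MBIT_S)} MB/min")
```

Under these assumptions, raw video alone would need roughly 79 times the reported total for all streams combined, while 19 Mbit/s corresponds to about 142.5 MB of data per minute.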
Unity Integration and Use Cases
The server can also be used inside Unity projects, which brings the Unity Engine's tooling to augmented and virtual reality applications on the HoloLens 2. An additional IPC structure lets the Unity plugin communicate with client applications, enabling real-time, interactive mixed reality development. Through Unity, researchers and developers can create and manipulate objects in the augmented environment at run time, which broadens the use cases for HoloLens-based research and applications.
Future Work and Implications
The paper proposes extending the system with additional data streams, such as Spatial Mapping data, Scene Understanding, and Voice Input. Real-time access to this data has substantial implications for data processing and interactive application development, with potential uses in domains such as robotics, health care, and cognitive research.
Overall, this research provides a comprehensive, real-time data streaming system for the HoloLens 2 that can serve as a foundation for future work in augmented reality data management and application integration.