
HoloLens 2 Sensor Streaming (2211.02648v1)

Published 4 Nov 2022 in cs.MM

Abstract: We present a HoloLens 2 server application for streaming device data via TCP in real time. The server can stream data from the four grayscale cameras, depth sensor, IMU, front RGB camera, microphone, head tracking, eye tracking, and hand tracking. Each sent data frame has a timestamp and, optionally, the instantaneous pose of the device in 3D space. The server allows downloading device calibration data, such as camera intrinsics, and can be integrated into Unity projects as a plugin, with support for basic upstream capabilities. To achieve real time video streaming at full frame rate, we leverage the video encoding capabilities of the HoloLens 2. Finally, we present a Python library for receiving and decoding the data, which includes utilities that facilitate passing the data to other libraries. The source code, Python demos, and precompiled binaries are available at https://github.com/jdibenes/hl2ss.

Citations (17)

Summary

  • The paper introduces a server application that streams HoloLens 2 sensor data over TCP in real time, giving simultaneous access to more sensor streams than earlier acquisition methods.
  • It pairs a C++ UWP server with a Python client library that runs on Windows, Linux, and OS X, and delivers 1080p video at 30 FPS with approximately 270 ms latency.
  • The server can be embedded in Unity projects as a plugin for interactive AR/VR applications, and the authors propose adding further data streams such as spatial mapping and voice input.

Overview of "HoloLens 2 Sensor Streaming"

This paper details the implementation of a server application for real-time streaming of sensor data from the Microsoft HoloLens 2 over TCP. The system improves on existing HoloLens data acquisition methods by providing real-time access to a wide set of sensor streams: the four grayscale cameras, the depth sensor, the IMU, the front RGB camera, the microphone, and spatial input data such as head pose, eye tracking, and hand tracking.

The server runs directly on the HoloLens 2 and can be integrated into Unity projects as a plugin. The plugin supports basic upstream capabilities, so a client machine can send data back to the Unity application on the headset and take over computation that would otherwise run on the device. The system also goes beyond what the Windows Device Portal offers, particularly in allowing simultaneous access to multiple sensor streams.

Technical Implementation

The server application is written in C++ as a Universal Windows Platform (UWP) application. It exposes each sensor stream on its own TCP port and compresses video and audio with Microsoft Media Foundation, using the encoding capabilities of the HoloLens 2 so that video can be streamed at full frame rate. The client configures each stream to operate in one of several modes, which determine, among other things, whether the device pose is attached to each frame.
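The idea of a timestamped frame with an optional pose can be sketched in Python as follows. This is only an illustration: the byte layout (`HEADER`, `POSE`), the `Frame` class, and the `pack_frame` / `unpack_frame` helpers are hypothetical and are not taken from the paper or from the hl2ss source, whose actual wire format may differ.

```python
import struct
from dataclasses import dataclass
from typing import Optional, Tuple

# HYPOTHETICAL wire layout, for illustration only (not the real protocol):
#   uint64 timestamp | uint32 payload size | payload | optional 4x4 float32 pose
HEADER = struct.Struct("<QI")
POSE = struct.Struct("<16f")


@dataclass
class Frame:
    timestamp: int
    payload: bytes
    pose: Optional[Tuple[float, ...]]  # 16 floats, or None when not requested


def pack_frame(timestamp, payload, pose=None):
    """Serialize one frame; the pose is appended only when provided."""
    data = HEADER.pack(timestamp, len(payload)) + payload
    if pose is not None:
        data += POSE.pack(*pose)
    return data


def unpack_frame(buffer, with_pose):
    """Parse one frame from the start of buffer.

    Returns (frame, bytes_consumed), or (None, 0) if the buffer does not
    yet hold a complete frame (more TCP data must be received first).
    """
    if len(buffer) < HEADER.size:
        return None, 0
    timestamp, size = HEADER.unpack_from(buffer, 0)
    payload_end = HEADER.size + size
    total = payload_end + (POSE.size if with_pose else 0)
    if len(buffer) < total:
        return None, 0
    payload = bytes(buffer[HEADER.size:payload_end])
    pose = POSE.unpack_from(buffer, payload_end) if with_pose else None
    return Frame(timestamp, payload, pose), total
```

Because TCP is a byte stream without message boundaries, a receiver built on this sketch would append incoming bytes to a buffer and call `unpack_frame` repeatedly, discarding `bytes_consumed` bytes after each successful parse.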

A Python library complements the server by receiving and decoding the data on the client side. It returns decoded data as NumPy arrays, which simplifies passing the data to other libraries such as OpenCV and PyAV in real-time experiments. The client library runs on Windows, Linux, and OS X and ships with example code.
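Since every frame carries a timestamp, a client can pair frames from different streams by nearest timestamp. The sketch below shows one way to do this with the standard library only; `nearest_index`, `pair_frames`, and the `max_gap` tolerance are hypothetical names chosen for this example and are not part of the Python library described in the paper.

```python
from bisect import bisect_left


def nearest_index(sorted_timestamps, t):
    """Return the index of the timestamp closest to t (list must be sorted)."""
    i = bisect_left(sorted_timestamps, t)
    if i == 0:
        return 0
    if i == len(sorted_timestamps):
        return len(sorted_timestamps) - 1
    before, after = sorted_timestamps[i - 1], sorted_timestamps[i]
    # Prefer the earlier frame on ties.
    return i if (after - t) < (t - before) else i - 1


def pair_frames(video_ts, depth_ts, max_gap):
    """Pair each video frame with the nearest depth frame.

    Pairs whose timestamps differ by more than max_gap are dropped.
    Returns a list of (video_index, depth_index) tuples.
    """
    if not depth_ts:
        return []
    pairs = []
    for vi, t in enumerate(video_ts):
        di = nearest_index(depth_ts, t)
        if abs(depth_ts[di] - t) <= max_gap:
            pairs.append((vi, di))
    return pairs
```

The timestamps only need to share a unit and a clock; choosing `max_gap` as roughly half the frame period of the slower stream is a common starting point.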

Numerical Results and Performance Metrics

The paper reports 1080p video streaming at 30 FPS with a latency of approximately 270 ms. Compression keeps network bandwidth manageable: video and audio are compressed lossily, while depth data is encoded losslessly as PNG. A table of per-stream bandwidth figures shows that the server can transmit all of its streams at an average total of approximately 19 Mbit/s.
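To put these figures in perspective, the short calculation below converts them into more familiar units. The 1080p, 30 FPS, 270 ms, and 19 Mbit/s values come from the text above; the raw pixel format of 12 bits per pixel (as in 4:2:0 chroma subsampling) is an assumption made only for this illustration, and the function names are hypothetical.

```python
def raw_video_mbit_s(width, height, fps, bits_per_pixel):
    """Bandwidth of uncompressed video in Mbit/s (1 Mbit = 10**6 bits)."""
    return width * height * bits_per_pixel * fps / 1e6


def mbyte_per_minute(mbit_s):
    """Data volume in megabytes accumulated over one minute of streaming."""
    return mbit_s / 8 * 60


def latency_in_frames(latency_ms, fps):
    """How many frame periods elapse during the given latency."""
    return latency_ms / 1000 * fps


# ASSUMPTION: 12 bits per pixel for uncompressed 4:2:0 video.
raw = raw_video_mbit_s(1920, 1080, 30, 12)   # about 746.5 Mbit/s
per_minute = mbyte_per_minute(19)            # 142.5 MB per minute
lag = latency_in_frames(270, 30)             # about 8.1 frames

print(f"raw 1080p30 video: {raw:.1f} Mbit/s")
print(f"19 Mbit/s total:   {per_minute:.1f} MB per minute")
print(f"270 ms latency:    {lag:.1f} frames at 30 FPS")
```

Under that assumption, uncompressed 1080p video alone would need far more bandwidth than the roughly 19 Mbit/s reported for all streams combined, which is why on-device encoding matters for streaming at full frame rate.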

Unity Integration and Use Cases

The server can also be embedded in Unity projects, so that augmented and virtual reality applications built with the Unity Engine for the HoloLens 2 can stream sensor data while they run. An additional IPC structure lets client applications communicate with the Unity plugin, which enables real-time, interactive mixed reality applications. Through this interface, researchers and developers can create and manipulate objects in the augmented environment at run time, which broadens the range of HoloLens-based research and applications.

Future Work and Implications

The paper proposes extending the system with additional data streams, such as Spatial Mapping data, Scene Understanding, and Voice Input. Real-time access to these sensors could benefit data processing and interactive application development in domains such as robotics, health care, and cognitive research.

Overall, this work provides a comprehensive, real-time data streaming system for the HoloLens 2 that can serve as a foundation for future work on augmented reality data acquisition and application integration.
