Multiscreen Architecture Overview

Updated 3 April 2026

Multiscreen architecture is a design paradigm that distributes and synchronizes content across heterogeneous devices for immersive, collaborative interactions.
It employs decoupled, layered modules for data management, real-time synchronization, and adaptive UI rendering using protocols like Photon Realtime and WebSockets.
Practical applications include XR collaboration, dynamic web app refactoring, and intelligent environments that optimize spatial layout and multi-user coordination.

A multiscreen architecture refers to a system design paradigm in which digital content, controls, or computational workloads are distributed, synchronized, and coordinated across multiple heterogeneous output devices—such as monitors, tablets, smart TVs, projectors, augmented reality headsets, or smartphones. The objective is to enable new forms of collaboration, immersive analytics, media interaction, and adaptive user interfaces by leveraging the distinct affordances and spatial arrangements of each device. This approach underpins state-of-the-art systems across XR collaboration, web application refactoring, parameter-space exploration, software visualization, and intelligent environments.

1. Architectural Models and Core Components

Multiscreen architectures are typically structured in layered or modular fashion, with decoupled roles for data management, synchronization, device abstraction, and presentation. A canonical form, exemplified in the XR collaboration domain (Porcino et al., 2022), organizes the system as:

Data & Query Tier: Centralized storage (e.g., RDF/SPARQL, SQL) exposed via a translator API supporting multiple query languages and visual DSLs.
Core Synchronization Tier: A single authoritative shared server manages world state, ingests and translates domain data (e.g., geospatial → Unity coordinates), and provides real-time event distribution using protocols such as Photon Engine, WebSockets, or MQTT.
Multi-Device Presentation Tier: Application logic and UI rendering occur separately on each device class—e.g., AR clients (Unity/HoloLens), shared tabletop displays (React/Leaflet), tablet adapters—linked via a device-agnostic protocol bus.

The functional decomposition into modules—such as SharedServer, QueryTranslatorAPI, DisplayAdapter, ARClientApp—facilitates extensibility: new device types are integrated by implementing adapter interfaces that subscribe to “worldUpdate” messages and render payloads without impacting the core synchronization logic.

This decoupled, tiered approach is also reflected in web application refactoring systems (Virtual Splitter (Sarkis et al., 2015)) and intelligent room orchestrators (AmI-Solertis in (Leonidis et al., 2021)), which combine semantic/structural analysis, annotation and splitting phases, and runtime mirroring/synchronization via lightweight message buses.

2. Synchronization Protocols and Data Distribution

Efficient, low-latency synchronization across devices is central. Two dominant communication models appear:

Client–Server Authority: The shared state (map, object transforms, UI event log) is authored and moderated centrally; clients (AR headsets, shared tabletop, tablets) subscribe to updates via persistent connections. This is realized with Photon Realtime (UDP/TCP multicast), WebSockets with JSON payloads, or MQTT-like event buses for responsive sub-100 ms roundtrip.
Collaborative State Propagation: In multi-user visual analytics (Eichner et al., 2019), user actions and view manipulations mutate a shared session model, which downstream triggers for layout recomputation and provenance graph updates across all displays.

Protocol usage:

Protocol / Tech Stack	Typical Role	Example Reference
Photon Realtime	AR object/event sync, sub-100 ms latency	(Porcino et al., 2022)
WebSocket (JSON)	Map/query/state updates, UI coordination	(Porcino et al., 2022, Hansen et al., 2024)
TUIO/UDP	Tangible multi-touch stream to shared server	(Porcino et al., 2022)
MQTT-like event bus	Media, control flow, sensor fusion	(Leonidis et al., 2021)
Mutation Summary/postMessage	DOM diff/application mirroring	(Sarkis et al., 2015)

Latency models are explicitly formulated as:

$L_a = L_{network} + L_{serialization} + L_{processing}$

with $L_{network}\approx 20$ – $50\,\mathrm{ms}$ , $L_{serialization}\approx 1$ – $5\,\mathrm{ms}$ , $L_{processing}\approx 5$ – $20\,\mathrm{ms}$ typical for XR local Wi-Fi (Porcino et al., 2022). Throughput requirements scale linearly with object count and update frequency, e.g., $B \approx N \times f \times S_{msg}$ .

In multi-browser/projector visualizations (Hansen et al., 2024), a dedicated Collaboration Service distributes projection matrices and view states. Each client applies its own spatial transformation, yielding a seamless combined field-of-view:

$S_i(t) = P_i \cdot S_{main}(t)$

where $P_i$ encodes each device's unique calibration.

3. Content Distribution, Segmentation, and Layout

Robust segmentation, assignment, and spatial layout are required for distributing complex UIs or data views across heterogeneous screens.

Functional Segmentation and Contextual Assignment: MSoS (Sarkis et al., 2015) segments web pages into semantically coherent blocks—e.g., interactive vs. multimedia—using a hybrid of structural and visual cue analysis. The algorithm recursively merges DOM nodes by functional label, auto-tunes segmentation granularity via area-based threshold $L_{network}\approx 20$ 0, and preserves functional isolation (video content stays separate from controls). The resulting annotated DOM is suitable for automated distribution by systems like Virtual Splitter (Sarkis et al., 2015).
Dynamic Web Application Splitting: Virtual Splitter (Sarkis et al., 2015) implements semantic and visual region-based mapping (see $L_{network}\approx 20$ 1, $L_{network}\approx 20$ 2), annotates the DOM, then generates two synchronized “master” and “slave” web apps. Instrumentation ensures that all dynamic content and user events are proxied via window messaging, keeping logic centralized yet interaction distributed.
Interactive Visual Layouts: In visual analytics (Eichner et al., 2019), assignments are formalized as a layered graph $L_{network}\approx 20$ 3, with a layout function

$L_{network}\approx 20$ 4

solved by combinatorial optimization to maximize spatial/temporal stability and visibility, under non-overlap constraints. GUI actions (drag, link, adjust degree-of-interest) directly rewrite this model and trigger re-optimization.

Table: Segmentation and Assignment Strategies

Approach	Algorithmic Mechanism	Target Content
MSoS (Sarkis et al., 2015)	Logical tree + functional labeling, adaptive $L_{network}\approx 20$ 5	Web page elements
Virtual Splitter (Sarkis et al., 2015)	Semantic/class or region-based mapping, DOM annotation	HTML/CSS/JS apps
Visual Analysis (Eichner et al., 2019)	Layered graph, quality function $L_{network}\approx 20$ 6	Visualization views

4. Interaction Patterns and Multi-user Coordination

Multiscreen environments afford hybrid interaction mechanisms and usage modalities:

Brushing & Linking: Selections on 2D maps trigger anchor spawns in 3D AR, and vice versa, bridging visual spaces (Porcino et al., 2022).
Grab-and-Transfer: Immersive devices support pickup and cross-device "drop" of objects (e.g., data cards), facilitating workflow partitioning.
Focus + Context: Systematic separation of global (shared display), local (HMD or personal device), and contextual (wall/table) views enables multi-focal collaborative analytics (Porcino et al., 2022, Eichner et al., 2019).
Meta-Analysis: Detailed action provenance and time-branching in session logs afford post-hoc investigation of analytical workflows (Eichner et al., 2019).
Seamless UI Migration: Object presence detection may be used to automatically migrate private UIs back onto a mobile device if it is physically removed (Leonidis et al., 2021).

Coordination is frequently enforced by policies such as display control tokens, ensuring single-writer consistency for shared surfaces and arbitrated transition of content placement (Leonidis et al., 2021).

5. Practical Applications and Performance Considerations

Deployed multiscreen systems span application domains including:

Collaborative AR/XR Analysis: Domain-specific implementations for maritime decision support integrate AR HMDs, shared maps, and tablets, with sub-centimeter alignment using physical anchors (QR codes) and real-time query integration (Porcino et al., 2022).
Immersive Software Visualization: Spatially-calibrated, multi-projector domes (e.g., ARENA2) for collaborative “software cities” employ synchronized browser instances, leveraging custom device projection matrices and optimized rendering split (Hansen et al., 2024).
Web Application Refactoring: Chrome extension–based distributive orchestration of arbitrary HTML/JavaScript applications, with robust synchronization and proxying for stateful, dynamic UIs (Sarkis et al., 2015).
Intelligent Living Environments: Unified event buses coordinate video feeds, contextual overlays, sensor-driven UI placement, and adaptive hibernation for attention and energy mediation in smart living rooms (Leonidis et al., 2021).

Performance strategies employed include offloading computationally expensive transforms to core servers, aggressive caching of query results, spatial culling and LOD for AR clients, and network-efficient mutation batching.

6. Extensibility, Usability, and Design Trade-offs

Multiscreen architectures are systematically extensible via standardized device adapter interfaces and well-defined APIs (e.g., connect/subscribe hooks or annotation conventions) (Porcino et al., 2022, Sarkis et al., 2015). This enables:

Device/Surface Agnosticism: Integration of new UIs or output modalities without significant restructuring.
Context-aware Placement: Placement functions $L_{network}\approx 20$ 7 allow adaptive distribution of roles/content according to user attention state, device affordances, or content type (Leonidis et al., 2021).
Attention and Hibernation Policies: Secondary displays enter hibernation after inactivity and may be awakened in response to context triggers, mitigating distraction and power consumption (Leonidis et al., 2021).
User Empowerment and Consistency: Control-token arbitration, styleguide enforcement, and personalizable behaviors promote usability and ensure spatial coherence as UI fragments migrate between surfaces.

Centralized synchronization simplifies consistency but introduces single points of failure. Reliance on commercial synchronization engines (e.g., Photon) may complicate air-gapped or offline deployments, though self-hosted alternatives can be substituted if necessary (Porcino et al., 2022).

Observed limitations include occasional UI distortion in non-planar display geometries (Hansen et al., 2024), and the need for calibration and scale tuning in analytic environments.

The methodologies and empirical findings described across the referenced systems (Porcino et al., 2022, Hansen et al., 2024, Sarkis et al., 2015, Eichner et al., 2019, Sarkis et al., 2015, Leonidis et al., 2021) collectively define the technical and practical contours of multiscreen architecture: robust functional/task segmentation, hybrid event/data synchronization, spatially-aware layout, hybrid interaction patterns, and architectures that are open to device evolution and adaptable to analytic or entertainment contexts.