AerialDB: Federated UAV Edge Datastore
- AerialDB is a federated, peer-to-peer spatio-temporal edge datastore designed for real-time processing, storage, and retrieval of UAV-generated high-volume data streams.
- It employs deterministic replica placement and distributed multi-dimensional indexing to achieve low latency, fault tolerance, and efficient query processing.
- The system supports critical applications such as disaster management, urban monitoring, and environmental analytics through decentralized, robust execution.
AerialDB is a federated, peer-to-peer spatio-temporal edge datastore designed for real-time processing, storage, and retrieval of data generated by fleets of unmanned aerial vehicles (UAVs) operating in dynamic and highly mobile settings. Architecturally, AerialDB targets deployments where UAVs collect high-volume data streams—including images and video—that must be offloaded to ground-based edge infrastructure due to the limited storage and processing capabilities of onboard UAV computers. The system natively supports time series data, efficient spatio-temporal query primitives, decentralized execution, and robust fault tolerance, making it well-suited for scenarios such as disaster management, large-scale environmental monitoring, and rapid field analytics (Jaiswal et al., 10 Aug 2025).
1. System Architecture and Data Flow
AerialDB is structured as a federated, containerized framework composed of two principal tiers:
- UAV Tier: Each UAV in the fleet batches collected sensor streams into data "shards." Each shard carries metadata comprising its shardID, spatial bounding box coordinates, and temporal range. Because UAVs are constrained by onboard computational resources, they offload shards at near-real-time intervals to a "parent edge" server, which is determined by spatial Voronoi partitioning.
- Edge Tier: Ground-based edge servers, each running a local InfluxDB instance, serve as data ingestion points, storage nodes, query processors, and participants in a peer-to-peer network. Every edge server exposes a service endpoint for UAV uploads, stores and indexes received shards, and replicates shards selectively to other edges for fault tolerance and load balancing.
Communication between UAVs and edge nodes, as well as inter-edge coordination, is over cellular or dedicated networks. The architectural diagram in the source outlines a mesh of edges, each associated with a Voronoi cell covering a unique geographic region, enabling efficient locality-aware data placement and retrieval.
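The parent-edge selection described above can be sketched as a nearest-site lookup, since a point lies in the Voronoi cell of whichever site is closest to it. This is a minimal illustration, not the system's actual code: the `ShardMeta` field names, the `parent_edge` helper, and the use of great-circle (haversine) distance are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardMeta:
    """Hypothetical shard metadata record (field names are assumptions)."""
    shard_id: str
    bbox: tuple      # (min_lat, min_lon, max_lat, max_lon)
    t_start: float   # seconds since epoch
    t_end: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def parent_edge(lat, lon, edges):
    """Return the edge whose Voronoi cell contains (lat, lon).

    `edges` maps an edge id to its (lat, lon) site. Picking the nearest
    site is equivalent to testing Voronoi cell membership.
    """
    return min(edges, key=lambda e: haversine_km(lat, lon, *edges[e]))
```

A UAV would call `parent_edge` with its current position before each upload, so the parent changes naturally as the UAV crosses cell boundaries.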
2. Content-Based Replica Placement and Indexing
AerialDB deploys a lightweight, deterministic replica placement and distributed indexing algorithm for both data and metadata:
- Replica Placement: Each data shard is triply replicated. Placement is determined by three independent hash functions applied to: (1) spatial attributes (ℋₛ(lat, long)), (2) temporal attributes (ℋₜ(time)), and (3) the shardID (ℋᵢ(shardID)). Spatial placement assigns a shard to an edge by Voronoi partitioning. Temporal placement divides time into fixed-width buckets (e.g., 5 minutes) and maps each bucket to an edge. Identifier placement hashes the shardID modulo the edge count. Because the three placements are independent, correlated data from multiple UAVs is dispersed across edges, which reduces hotspots and improves fault tolerance.
- Shard Indexing: Metadata for spatio-temporal predicates (bounding box, time interval, shardID) is sliced into partitions and indexed across multiple edges using identical hash functions. This over-replication enables rapid distributed shard lookup—queries with spatial, temporal, or identifier predicates are efficiently routed to only relevant edge nodes.
The result is a decentralized, multi-dimensional index which both narrows the query search space and preserves system resilience.
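The three placement functions can be sketched as follows. This is an illustrative approximation under stated assumptions: the function names, the planar nearest-site test standing in for the Voronoi lookup, the SHA-256-based hash, and the `bucket % edge_count` mapping for time are choices made for the example, and the sketch does not handle the case where two functions select the same edge, which the text above does not describe.

```python
import hashlib


def stable_hash(value: str) -> int:
    """Process-independent hash (Python's built-in hash() is salted per run)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def spatial_edge(lat, lon, edges):
    """Spatial placement: nearest edge site, i.e. the enclosing Voronoi cell.

    Uses squared planar distance on (lat, lon) as a simplification.
    """
    return min(edges, key=lambda e: (edges[e][0] - lat) ** 2 + (edges[e][1] - lon) ** 2)


def temporal_edge(t_start, edge_ids, bucket_s=300):
    """Temporal placement: fixed-width time bucket (default 5 minutes) mapped to an edge."""
    bucket = int(t_start // bucket_s)
    return edge_ids[bucket % len(edge_ids)]


def id_edge(shard_id, edge_ids):
    """Identifier placement: hash of the shardID modulo the edge count."""
    return edge_ids[stable_hash(shard_id) % len(edge_ids)]


def place_replicas(shard_id, lat, lon, t_start, edges):
    """Return the (spatial, temporal, identifier) replica hosts for one shard.

    `edges` maps edge id -> (lat, lon). Any node can recompute this result
    from the shard metadata alone, so no placement table has to be shared.
    """
    edge_ids = sorted(edges)  # every node must agree on the ordering
    return (
        spatial_edge(lat, lon, edges),
        temporal_edge(t_start, edge_ids),
        id_edge(shard_id, edge_ids),
    )
```

Because placement depends only on the shard's content and the edge list, a querying node can apply the same functions to a predicate to find which edges hold the relevant index entries.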
3. Distributed Execution Engine and Query Processing
The distributed execution engine in AerialDB is locality-aware and peer-coordinated:
- Query Decomposition: Incoming queries (with user-specified spatial and/or temporal predicates) are parsed by a coordinator edge, which leverages the distributed index to locate matching shardIDs and their edge hosts.
- Parallel Execution: For each identified shard, the engine selects one replica and dispatches a narrowly scoped sub-query, exploiting parallelism across multiple edge servers. The routing policy (e.g., "MinShards") aims to optimize for network and computational load.
- Result Aggregation: Sub-query responses are aggregated by the coordinator and returned to the client.
The engine operates without a centralized leader; metadata and heartbeats are exchanged among all edge nodes. Because both data and index entries are triply replicated, the system tolerates up to two simultaneous edge failures while maintaining service continuity with minimal latency increase.
Fault mitigation, such as failure detection and automatic alternate replica querying, is seamlessly managed by the same decentralized logic.
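The decomposition and replica-selection steps can be sketched as below. The source names a "MinShards" routing policy but this summary does not define it, so the greedy least-loaded choice here is a stand-in assumption rather than that policy. The index layout (a dict of shard metadata with a `replicas` list) and the function names are likewise hypothetical.

```python
from collections import defaultdict


def matches(meta, bbox, t_range):
    """True if the shard's bounding box and time range overlap the query's."""
    a_lat0, a_lon0, a_lat1, a_lon1 = meta["bbox"]
    b_lat0, b_lon0, b_lat1, b_lon1 = bbox
    spatial = (a_lat0 <= b_lat1 and b_lat0 <= a_lat1
               and a_lon0 <= b_lon1 and b_lon0 <= a_lon1)
    temporal = meta["t_start"] <= t_range[1] and t_range[0] <= meta["t_end"]
    return spatial and temporal


def plan_query(index, bbox, t_range):
    """Map each matching shard to one replica host.

    Returns {edge_id: [shard_id, ...]}; each entry becomes one sub-query
    that the coordinator dispatches in parallel and later aggregates.
    """
    load = defaultdict(int)   # sub-queries assigned to each edge so far
    plan = defaultdict(list)
    for shard_id, meta in sorted(index.items()):
        if not matches(meta, bbox, t_range):
            continue
        # Assumed policy: pick the replica host with the fewest assigned
        # shards, breaking ties by edge id for determinism.
        host = min(meta["replicas"], key=lambda e: (load[e], e))
        load[host] += 1
        plan[host].append(shard_id)
    return dict(plan)
```

For example, two overlapping shards that both list `e0` as a replica are split across `e0` and a second host rather than both landing on `e0`.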
4. Performance Characteristics and Scalability
AerialDB achieves high scalability and performance via optimized data layout and parallelism:
- Insertion Latency: Empirical measurements indicate median shard insertion latencies of 0.15–0.35 seconds in 100-UAV deployments—about 10× faster than cloud-only InfluxDB (0.9–3.2 seconds).
- Query Latency: Distributed query processing yields up to 100× lower latency than cloud baselines for spatio-temporal predicates, illustrated by quartile plots and microbenchmarks that compare predicate filtering implementations.
- Horizontal Scaling: Experiments with up to 400 drones and 80 edge nodes show maintenance of near-real-time performance, with query latency scaling sub-linearly due to effective index partitioning and replica placement.
- Load Balancing: The "MinShards" selection policy and locality-aware coordinator election distribute computational load, minimize hotspots, and optimize network usage.
These metrics confirm AerialDB’s capability to maintain low latency under high concurrency and operational scale.
5. Fault Tolerance and Robustness
AerialDB implements robust mechanisms to ensure reliability under adverse conditions:
- Triple Replica Redundancy: Each shard’s data and index entries are triply replicated using orthogonally hashed placement, providing safety against up to two simultaneous edge node failures.
- Adaptive Query Routing: Upon edge failure (detected via liveness heartbeats), queries are rerouted to alternate replica hosts; index lookups are broadcast in degenerate scenarios.
- Decentralized State Management: All edges are symmetric in function, eliminating any single point of failure.
The system’s graceful degradation ensures uninterrupted access and analytics even in challenging environments encountered in disaster response or critical infrastructure monitoring scenarios.
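Heartbeat-based liveness checking and rerouting to an alternate replica can be sketched as follows. This is a minimal illustration under assumptions: the timeout value, the data layout, and the function names are hypothetical, and the broadcast fallback for index lookups mentioned above is not modeled.

```python
def live_edges(last_heartbeat, now, timeout_s=5.0):
    """Edges whose most recent heartbeat is no older than `timeout_s` seconds.

    `last_heartbeat` maps edge id -> timestamp of the last heartbeat seen.
    """
    return {e for e, t in last_heartbeat.items() if now - t <= timeout_s}


def pick_replica(replicas, alive):
    """Return the first live replica host in preference order, or None.

    With three replicas per shard, a result is still available after any
    two of the hosts have failed.
    """
    for edge in replicas:
        if edge in alive:
            return edge
    return None
```

With heartbeats `{"e0": 100.0, "e1": 92.0, "e2": 99.0}` observed at time 101, `e1` is treated as failed, so a shard whose replicas are `["e1", "e2", "e0"]` is served from `e2`.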
6. Real-World Applications in Disaster Management and Beyond
AerialDB is architected for real-world use cases characterized by mobile, high-volume data sources and unreliable network connectivity:
- Disaster Response: Drones rapidly survey affected regions, offloading video and sensor data to ground edge servers for immediate analysis (e.g., damage assessment, survivor identification). Fast spatio-temporal queries enable situational awareness without dependence on cloud resources.
- Urban and Traffic Monitoring: Fleets monitor safety, traffic conditions, or urban events, delivering real-time analytics to local authorities using AerialDB as a high-throughput back end.
- Environmental Analytics: Large-scale deployment allows pervasive aerial sensing with robust, federated storage that supports complex query patterns for environmental research.
The peer-to-peer, edge-centric design alleviates cloud dependency, supporting field-deployable analytics and localized decision-making.
7. Technical Challenges and Solutions
Key technical challenges addressed by AerialDB include:
| Challenge | Principal Technique/Resolution | Resulting Property |
|---|---|---|
| Handling UAV mobility & sporadic connectivity | Local parent edge selection via Voronoi partitioning | Robust data association |
| Distributed metadata management | Content-based, stateless hash functions | Decentralization, consistency |
| Load balancing and hotspot avoidance | Orthogonal hash-based replica and index placement | Scalability, efficiency |
| Query routing and sub-query decomposition | Distributed index with in-memory slicing | Minimal search scope |
| Edge/node failure recovery | Triple replica redundancy; heartbeat-driven routing | Graceful degradation |
Collectively, these solutions yield a system capable of performant, large-scale real-time spatio-temporal analytics resilient to failures and topology changes.
Conclusion
AerialDB establishes an advanced federated edge datastore paradigm tailored for mobile UAV fleets and spatio-temporal analytics in dynamic, distributed real-world scenarios (Jaiswal et al., 10 Aug 2025). Its decentralized replica placement, distributed multi-dimensional indexing, rapid query decomposition, and robust execution engine collectively deliver high-throughput, low-latency storage and retrieval that outperforms cloud baselines by orders of magnitude for query workloads. The resultant system is well-positioned for disaster management, dense urban analytics, environmental sensing, and other domains requiring real-time aerial data integration and robust, peer-to-peer infrastructure.