
Edge Computing Architectures

Updated 1 March 2026
  • Edge computing architectures are decentralized frameworks that distribute compute, storage, and control to reduce latency and enhance resilience in IoT, AI, and autonomous systems.
  • They employ diverse models such as layered, peer-to-peer, cluster-based, and federated topologies to optimize resource allocation, scheduling, and fault tolerance.
  • Advanced resource management strategies—including ILP, heuristic, and game-theoretic methods—enable responsive autoscaling and cost-effective deployment in heterogeneous environments.

Edge computing architectures are engineered to provide low-latency, scalable, and context-aware processing by decentralizing compute, storage, and control functions toward the network periphery. This pattern is critical for applications with stringent latency, reliability, data privacy, and bandwidth constraints—such as AI-enhanced IoT, autonomy, and health monitoring. The field encompasses a spectrum of architectural and resource management designs, ranging from dual-node pairs to large-scale, federated, or decentralized platforms, unified by their prioritization of in situ computation and resilience in heterogeneous network environments.

1. Taxonomies and Core Architectural Models

Edge computing architectures are classified along several axes, reflecting data flow, administrative control, and tenancy patterns. The main families are:

  • Layered/Hierarchical Architectures: Three (or more) tiers from end devices (sensors, actuators), to intermediate edge nodes/gateways or micro-data centers, and then to centralized cloud data centers. Vertical dataflow is typical, with upward data movement (for aggregation, analytics) and downward control/configuration (Hong et al., 2018, Makaya et al., 2024, Gupta et al., 11 Oct 2025).
  • Peer-to-Peer (P2P) and Decentralized Models: End-devices cooperate for message dissemination, load-sharing, and resource pooling, using overlays for multi-hop, topology-agnostic communication. Examples include hybrid edge/P2P networks employing gossip or directed protocols for robust delivery and dynamic capacity (Serena et al., 2021, Tourani et al., 2020, Tošić et al., 2019).
  • Cluster- and Aggregation-based Topologies: Edge sites are organized into clusters, typically with cluster heads acting as resource aggregators, in-network processors, or failover coordinators (Hong et al., 2018, Seisa et al., 2022).
  • Hybrid and Federated Platforms: Dynamically combine hierarchical and P2P strategies, often spanning cloud, edge gateways, and progressively more resource-constrained devices (Makaya et al., 2024, Carpio et al., 2021).

For mission-critical IoT, architectures have emerged with explicit dual-node configurations (primary and secondary) for local failover and latency-bound processing, optimizing both availability and worst-case response (Ramu, 2023).
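A minimal sketch of this dual-node failover policy, with hypothetical node names and a simple mean-latency health check; the 150 ms threshold follows the latency bound cited later (Ramu, 2023). This is an illustration of the rerouting idea, not the paper's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    healthy: bool = True
    recent_latencies_ms: list = field(default_factory=list)

def observed_latency(node: Node) -> float:
    """Mean of recent latency samples (ms); inf if unhealthy or no samples."""
    if not node.healthy or not node.recent_latencies_ms:
        return float("inf")
    return sum(node.recent_latencies_ms) / len(node.recent_latencies_ms)

def select_node(primary: Node, secondary: Node, threshold_ms: float = 150.0) -> Node:
    """Route to the primary unless its observed latency breaches the
    threshold or it is unhealthy; then fail over to the secondary."""
    if observed_latency(primary) <= threshold_ms:
        return primary
    return secondary

# Example: the primary breaches the 150 ms bound, so requests reroute.
primary = Node("edge-a", recent_latencies_ms=[180.0, 200.0])
secondary = Node("edge-b", recent_latencies_ms=[60.0, 70.0])
assert select_node(primary, secondary).name == "edge-b"
```

In practice the synchronizer described below keeps the secondary's state current enough that this handoff is transparent to the application.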

Serverless computing and microservices are increasingly prevalent, abstracting physical resources, supporting fine-grained auto-scaling, and enabling loosely-coupled deployments at the edge (Carpio et al., 2021, Gupta et al., 11 Oct 2025).

2. Architectural Elements and Component Interactions

A typical edge computing system comprises:

  • Edge Devices: Sensors and actuators with direct environmental interactions, executing deterministic or event-driven code (e.g., sensor preprocessing, emergency actuation) (Makaya et al., 2024, Hong et al., 2018).
  • Edge Nodes/Gateways: Local compute nodes, often equipped with containers or lightweight VMs, handling stream analytics, ML inference, and data filtering; may host orchestration agents and manage local resource pools (Carpio et al., 2020, Makaya et al., 2024).
  • Cloud Tiers or Remote Orchestration: Provide persistent state, global optimization, archival analytics, and central model training. Serve as control-plane authorities in hierarchical architectures (Hong et al., 2018, Makaya et al., 2024).
  • Data Synchronization Modules: Dual-node and federated schemes require background synchronizers to maintain consistency subject to application semantics (eventual, strong, or causal consistency) (Ramu, 2023, Marpu et al., 9 Jul 2025).
  • Rerouting and Autoscaling Logic: Latency and load monitors implement policy-driven task handoff, failover, or scaling when observed metrics (e.g., real-time latency, SLA violations) cross configured thresholds (Ramu, 2023, Gupta et al., 11 Oct 2025).
  • Offloading and Task Scheduling Engines: Deciding between local, peer, or cloud execution based on current system state, workload characterization, and predictive models (Cicconetti et al., 2021, Liang et al., 2020).

Communication typically uses REST, MQTT, or P2P overlays for control/data plane separation. Pervasive edge computing frameworks envision Named Data Networking (NDN) not only for content distribution but as the substrate for service discovery, invocation, and migration, binding code/data/hardware from multiple domains (Tourani et al., 2020).
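The offloading engines above can be sketched as a greedy completion-time estimator: each candidate site (local, peer, cloud) is scored by upload delay plus execution delay, and the task runs wherever the estimate is lowest. The target dictionaries and their fields are illustrative assumptions, not an API from the cited works:

```python
def completion_time_s(task_cycles, input_bits, target):
    """Estimated completion time: upload delay plus execution delay."""
    tx_s = input_bits / target["uplink_bps"]   # inf bandwidth => 0 for local
    exec_s = task_cycles / target["cpu_hz"]
    return tx_s + exec_s

def choose_target(task_cycles, input_bits, targets):
    """Greedy offloading decision: run where the estimate is lowest."""
    return min(targets, key=lambda t: completion_time_s(task_cycles, input_bits, t))

targets = [
    {"name": "local", "cpu_hz": 1e9,  "uplink_bps": float("inf")},
    {"name": "peer",  "cpu_hz": 2e9,  "uplink_bps": 50e6},
    {"name": "cloud", "cpu_hz": 10e9, "uplink_bps": 10e6},
]
# A compute-heavy task with a small input is worth shipping to the cloud.
assert choose_target(2e9, 8e6, targets)["name"] == "cloud"
```

Real schedulers add queuing delay, energy cost, and predictive load models to this estimate, as described in Section 3.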

3. Resource Management, Placement, and Scheduling

Resource allocation in edge/fog architectures is treated as a multi-objective optimization problem: minimizing latency, energy, and resource usage while meeting constraints on capacity, availability, or context affinity (Hong et al., 2018, Makaya et al., 2024). Formally, service placements are modeled as binary integer programs:

\min_{x_{ij}}\; \alpha \sum_{i,j} L_{ij}\, x_{ij} + \beta \sum_{i,j} E_{ij}\, x_{ij}

subject to per-node capacity and exclusivity:

\sum_{j} x_{ij} = 1 \quad \forall i; \qquad \sum_{i} r_i\, x_{ij} \le R_j \quad \forall j; \qquad x_{ij} \in \{0, 1\}
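For the small-scale deployments where exact solvers are viable, this program can even be solved by exhaustive enumeration; the sketch below does exactly that in plain Python (a stand-in for a real ILP solver, with illustrative cost matrices):

```python
from itertools import product

def place_services(L, E, r, R, alpha=1.0, beta=1.0):
    """Enumerate assignments of service i to node j, keeping the cheapest
    feasible one. L[i][j]: latency cost; E[i][j]: energy cost;
    r[i]: service demand; R[j]: node capacity."""
    n_services, n_nodes = len(L), len(L[0])
    best_cost, best_assign = float("inf"), None
    for assign in product(range(n_nodes), repeat=n_services):
        load = [0.0] * n_nodes
        for i, j in enumerate(assign):
            load[j] += r[i]
        if any(load[j] > R[j] for j in range(n_nodes)):
            continue  # violates the capacity constraint sum_i r_i x_ij <= R_j
        cost = sum(alpha * L[i][j] + beta * E[i][j]
                   for i, j in enumerate(assign))
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign

# Two services, two nodes; unit capacities force one service per node.
cost, assign = place_services(
    L=[[1.0, 5.0], [4.0, 2.0]],
    E=[[1.0, 1.0], [1.0, 1.0]],
    r=[1, 1], R=[1, 1])
assert assign == (0, 1) and cost == 5.0
```

The exponential enumeration makes the scaling limitation concrete: beyond a handful of services and nodes, the heuristic and game-theoretic methods listed below take over.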

Algorithms for task placement and auto-scaling span:

  • Exact ILP or multi-dimensional knapsack solvers, suitable for small-scale deployments (Hong et al., 2018, Makaya et al., 2024).
  • Greedy, heuristic, or delayed-offer strategies, using contextual and attribute constraints (battery, geographic, sensor types, network state) for real-time scheduling (Makaya et al., 2024).
  • Game-theoretic and auction-based methods, in decentralized P2P settings, for efficient and fair sharing of volunteer resources (Hong et al., 2018).
  • Hybrid proactive/reactive scaling, combining load forecasting (LSTM, ARIMA, SVM, GA ensembles) with policy fallback to metrics-driven rules (e.g., Kubernetes HPA, best-fit bin-packing) (Gupta et al., 11 Oct 2025).
  • Specialized protocols for autoscale triggers, responsive to SLA targets, request queue length, or user-perceived QoE (Gupta et al., 11 Oct 2025).
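The hybrid proactive/reactive scaling pattern above can be sketched as follows: a moving-average forecast stands in for the LSTM/ARIMA/GA predictors, and the fallback applies an HPA-style utilization rule; the capacity and utilization parameters are illustrative assumptions:

```python
import math

def forecast_load(history, window=3):
    """Moving-average forecast (a simple stand-in for the LSTM/ARIMA/GA
    predictors cited above)."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def desired_replicas(history, current_util, replicas,
                     per_replica_capacity, target_util=0.7):
    """Hybrid policy: size proactively from the forecast, but never below
    what a reactive utilization-threshold rule (HPA-style) demands."""
    proactive = math.ceil(forecast_load(history) /
                          (per_replica_capacity * target_util))
    reactive = math.ceil(replicas * current_util / target_util)
    return max(1, proactive, reactive)

# Forecast of 120 req/s at 50 req/s per replica (70% target) -> 4 replicas;
# the reactive rule agrees (3 replicas at 90% utilization -> 4).
assert desired_replicas([100, 120, 140], current_util=0.9,
                        replicas=3, per_replica_capacity=50) == 4
```

Taking the maximum of the two signals means the forecaster can pre-warm capacity before a burst, while the reactive rule catches forecast errors.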

Application-level failover is enabled in architectures such as the dual-node local edge, which reroutes requests when latency thresholds are breached (Ramu, 2023). Distributed execution frameworks support federated learning or distributed gradient aggregation for edge AI, permitting model or task partitioning by layer and resource profile (Wang et al., 2019, Liang et al., 2020).

4. Performance Modeling and Empirical Benchmarks

Architecture evaluation leverages both mathematical models and empirical studies:

  • Latency Models: Round-trip time is decomposed into processing, propagation, queuing, and transmission delay; the dual-node local edge maintains thresholds below 150 ms (sub-70 ms for fast-paced applications) (Ramu, 2023).
  • Scalability and Traffic Offloading: Peer density, gossip protocol parameters (e.g., broadcast probability, neighbor count), and edge server placement are key determinants of coverage, message redundancy, and latency (Serena et al., 2021).
  • Auto-scaling: Proactive autoscalers, such as ARIMA/LSTM/GA-based predictors, reduce forecast error and cold-start latency, sustaining SLA compliance under bursty or periodic workloads (Gupta et al., 11 Oct 2025).
  • Microservice Systems: Modular Docker/Kafka–based edge deployments exhibit tight latency and utilization profiles (edge-only median ~0.1–0.3 s up to 500 users) and significant gains over cloud-only or unchecked edge-cloud synchronization (Carpio et al., 2020).
  • Containerized Decentralization: Blockchain-inspired and consensus-driven orchestration achieve effective self-balancing and resource convergence (standard deviation under 5% after 20–30s even at 100 nodes) with modest communications overhead (Tošić et al., 2019).
  • AI Accelerator Benchmarks: Specialized hardware (Edge TPU, Jetson, VPUs) delivers up to 100× higher performance-per-watt and cost-normalized throughput than traditional x86/cloud, with auto-adaptive split processing to trade off bandwidth against latency (Liang et al., 2020).
  • Near-memory and neuromorphic architectures: NM-Carus and NM-Caesar accelerate TinyML inference by 28–54× (energy gain up to 36×) relative to CPU-only, indicating the efficacy of compute-in-memory in edge microcontroller design (Caon et al., 2024). Memristive neuromorphic circuits achieve TOPS/W levels of efficiency for analog ML at the sensor interface (Krestinskaya et al., 2018).
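The latency decomposition in the first bullet lends itself to a direct budget check; the sketch below sums the four delay components and tests them against the 150 ms bound, with illustrative payload and link-rate numbers:

```python
def round_trip_ms(proc_ms, prop_ms, queue_ms, payload_bits, link_bps):
    """RTT decomposed into processing, propagation, queuing, and
    transmission delay (payload size over link rate)."""
    tx_ms = payload_bits / link_bps * 1000.0
    return proc_ms + prop_ms + queue_ms + tx_ms

def meets_budget(rtt_ms, budget_ms=150.0):
    return rtt_ms <= budget_ms

# A 1 Mb payload on a 16 Mb/s link adds 62.5 ms of transmission delay,
# leaving the total comfortably inside the 150 ms budget.
rtt = round_trip_ms(proc_ms=20.0, prop_ms=5.0, queue_ms=10.0,
                    payload_bits=1e6, link_bps=16e6)
assert rtt == 97.5 and meets_budget(rtt)
```

The same arithmetic shows why cloud round trips fail the sub-70 ms budget for fast-paced applications: propagation alone over a WAN often exceeds it.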

5. Application Domains and Use Case Profiles

Edge computing architectures have been engineered for the following mission- and performance-critical domains:

  • Healthcare Monitoring: Real-time alerts on wearables (e.g., arrhythmia detection via 1D-CNN/MoE) achieve sub-100 ms inference, guarantee data sovereignty, and avoid WAN dependency (Marpu et al., 9 Jul 2025, Ramu, 2023).
  • Autonomous Vehicles: Inference stacks (camera→DSP→CNN→MoE) maintain ≤8 ms perception loop, with local backup in dual-node or hybrid edge-cloud orchestrations for safety-critical fallback (Ramu, 2023, Marpu et al., 9 Jul 2025).
  • Smart Infrastructure: Edge nodes act as anomaly detectors using local LSTM or autoencoders, pushing only summarized alerts to the cloud, drastically reducing network utilization (Makaya et al., 2024).
  • Collaborative Robotics: Layered schemes (device/edge/fog/cloud) support SLAM, planning, and control partitioning, trading compute proximity against orchestration overhead according to mobility and latency profiles (Seisa et al., 2022).
  • Distributed Learning and Inference: Federated and split-learning systems dynamically partition DNNs and ML tasks across heterogeneous edge resources, optimizing for privacy, bandwidth, and device capability (Wang et al., 2019).
  • Indoor Environmental Sensing: Centralized (GPU/MQTT) and distributed parallel (ARM/MPI) IoT architectures demonstrate similar high accuracy (F-score ~0.95–0.97) with the distributed approach yielding ~37% lower power (Gamazo-Real et al., 2024).

6. Challenges, Limitations, and Future Research

Current architectures face several open challenges:

  • Heterogeneity and Scheduling: Standard Kubernetes and edge orchestrators are not performance- or network-aware across ARM/x86, constrained devices, or geo-distributed links (Carpio et al., 2021). Lack of real-time profiling impedes optimal resource use.
  • Consistency Models and Synchronization: Background eventual consistency is tractable, but sub-ms latency and advanced models (e.g., causal consistency for time-series IoT) remain unaddressed (Ramu, 2023).
  • Security and Privacy: Physically distributed edge units, P2P overlays, and D2D links enlarge attack surfaces; there is a need for end-to-end trust frameworks, hardware root of trust, and privacy-preserving ML protocols (Gupta et al., 11 Oct 2025, Rahimi et al., 2020, Marpu et al., 9 Jul 2025).
  • Multi-resource Optimization: Most container balancing or migration algorithms optimize a single resource (CPU); multidimensional objectives (RAM, I/O, SLAs) require further algorithmic development (Tošić et al., 2019).
  • Resilience and Adaptivity: Mobility, intermittent connectivity, and dynamic topology changes drive the need for robust re-registration, checkpointing, and adaptive control loops in both P2P and federated models (Makaya et al., 2024, Tourani et al., 2020).
  • Benchmarks and Standardization: Comprehensive field trials, open microbenchmarks, and the standardization of APIs, naming, and orchestration semantics are needed for meaningful cross-system comparison (Wang et al., 2019).

Ongoing research includes network/compute-aware schedulers, in situ federated learning, integrated benchmarking, and economic models for incentivizing resource contribution in pervasive or democratized edge ecosystems (Tourani et al., 2020, Makaya et al., 2024).

7. Comparative Perspectives

  • Traditional vs Modern Edge Approaches: Dual-node designs, hybrid edge-P2P overlays, federated hierarchies, and serverless microservice stacks all outperform traditional single-node or cloud-centric deployments in mission-critical low-latency, privacy, and resilience metrics (Ramu, 2023, Serena et al., 2021, Gupta et al., 11 Oct 2025, Carpio et al., 2020).
  • Performance-Availability Trade-offs: Local, redundancy-enhanced edge architectures (dual-node, cluster) provide sub-100 ms failover and avoid cloud latency/bandwidth bottlenecks at the cost of increased hardware and sync complexity (Ramu, 2023).
  • Scalability and Flexibility: Modular microservices, P2P overlays, and federated frameworks favor horizontal scaling and resource harvesting; platforms such as EdgeSphere demonstrate cross-domain, context-aware scheduling in three-tier models (Makaya et al., 2024).
  • Programmability and Hardware Acceleration: Recent advances in near-memory compute, neuromemristive circuits, and container-based design lower the power and area barrier for always-on, in situ AI at the network edge (Caon et al., 2024, Krestinskaya et al., 2018).
  • Future Architectures: Next-generation edge will likely combine network-aware scheduling, AI-driven orchestration, and security primitives, leveraging both lightweight, ephemeral microservices and persistent, functionally specialized hardware (Gupta et al., 11 Oct 2025, Caon et al., 2024, Marpu et al., 9 Jul 2025).

Edge computing architectures are thus a multidimensional domain, continuously evolving toward federated, resilient, and context-adaptive forms, driven by application demands and foundational advances in both hardware and distributed systems theory.
