FaaS Computing: Principles & Applications

Updated 12 December 2025
  • FaaS computing is a serverless paradigm that decomposes applications into small, stateless functions triggered by events or invocations.
  • It delivers fine-grained elasticity with automatic scaling and pay-per-use billing, optimizing resource management and performance.
  • The model is applied in diverse environments—from cloud to edge and HPC—enabling data-intensive workflows and cost-effective computing.

Function-as-a-Service (FaaS) computing is a serverless paradigm in which applications are decomposed into small, stateless functions that execute in ephemeral, isolated environments in response to events or explicit invocations. FaaS abstracts provisioning, scaling, and resource management away from the developer, enabling fine-grained elasticity, transparent scaling, and pay-per-use billing. The model has evolved beyond cloud-centric offerings to encompass federated, geo-distributed, multi-cloud, edge, and HPC environments, delivering high-throughput and low-latency workflow execution for data- and compute-intensive applications (Li et al., 28 Mar 2024, Pawlik et al., 2019, Copik et al., 2020, Ríos-Monje et al., 2023, Jin et al., 2022, Jindal et al., 2021, Malekabbasi et al., 23 May 2024, Palma et al., 31 May 2024, Jin et al., 2023, Chard et al., 2019, Yussupov et al., 2020).

1. FaaS Principles and Evolution

FaaS computing executes independent, stateless user functions on demand in short-lived containers or processes that are automatically provisioned and scaled by the provider. Key characteristics include (a minimal handler sketch follows the list):

  • Stateless execution: Each function runs independently and keeps no persistent local state; external services (datastores, object stores) hold durable state.
  • Event-driven triggers: Functions are invoked via HTTP requests, message-queue arrivals, object events, or custom triggers.
  • Automatic scaling: The platform dynamically provisions or deallocates resources in response to workload, scaling to zero when idle.
  • Granular, pay-per-use billing: Users are billed for execution time and resources consumed, rather than for provisioned capacity.

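To make these characteristics concrete, below is a minimal sketch of a stateless, event-driven handler for a Python cloud-function runtime. The event shape, bucket name, and trigger are illustrative assumptions, not any specific provider's contract.

```python
import json
import boto3  # assumed to be available in the runtime; used only for external state

# Clients created outside the handler are reused across warm invocations.
s3 = boto3.client("s3")

def handler(event, context):
    """Stateless entry point: all inputs arrive in `event`, and any durable
    state must go to an external store (here, S3)."""
    record = json.loads(event["body"])  # hypothetical HTTP-trigger payload
    result = {"id": record["id"], "total": sum(record["values"])}
    # Persist the result externally; the execution environment is ephemeral.
    s3.put_object(
        Bucket="results-bucket",  # hypothetical bucket name
        Key=f"results/{record['id']}.json",
        Body=json.dumps(result),
    )
    return {"statusCode": 200, "body": json.dumps(result)}
```
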
Early commercial platforms such as AWS Lambda, Azure Functions, and Google Cloud Functions established the basic model (Yussupov et al., 2020), which has subsequently proliferated to open-source and private platforms (OpenFaaS, OpenWhisk, Fission, Knative) and been extended to support container-native, WebAssembly, GPU-accelerated, and blockchain-integrated execution models (Palma et al., 31 May 2024, Karanjai et al., 11 Apr 2024, Zhao et al., 2023).

More recent work generalizes the FaaS abstraction to the federated and edge computing paradigm—enabling workflows to execute across diverse, heterogeneous clusters, edge servers, and supercomputers, subject to latency, data locality, privacy, and resource heterogeneity constraints (Li et al., 28 Mar 2024, Jin et al., 2022, Jin et al., 2023, Malekabbasi et al., 23 May 2024).

2. Programming Models, System Architectures, and Interfaces

The FaaS programming model centers on packaging user logic as functions with explicit input/output signatures, typically written in high-level languages and invoked asynchronously. Platform abstractions and APIs vary:

  • Core API features: Register, update, delete, and invoke functions via REST/HTTP endpoints; manage configuration (memory size, timeout, triggers).
  • Dynamic task graphs: Modern scientific and data-processing workflows are expressed as dynamic DAGs in Python (e.g., UniFaaS’s @function decorator and futures model) to handle complex dependencies and runtime-driven composition (Li et al., 28 Mar 2024, Chard et al., 2019); a toy futures sketch follows this list.
  • Remote I/O abstraction: Facilities like RemoteFile/GlobusFile/RsyncFile objects decouple code from data location, enabling transparent wide-area and cross-cluster data staging (Li et al., 28 Mar 2024, Ríos-Monje et al., 2023).
  • Container and Wasm runtimes: Functions are sandboxed via containers (Docker, Singularity, Shifter) or WebAssembly (Wasm); container platforms offer strong isolation and code portability, while Wasm enables extremely rapid cold starts and lightweight deployment on constrained edge devices (Palma et al., 31 May 2024).
  • Edge and geo-distributed brokers: Decentralized routing (DisGB/GeoBroker) and geo-aware function placement direct invocations to the closest, least-loaded edge or cloud node, optimizing latency and resiliency (Malekabbasi et al., 23 May 2024).
  • Federated orchestration: Control planes treat each registered endpoint or resource as an element of a federated pool, supporting resource-aware scheduling, elasticity, and policy-driven placement (Li et al., 28 Mar 2024, Jin et al., 2022, Jindal et al., 2021).

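As an illustration of the futures-based dynamic-DAG style described above, the toy decorator below mimics the pattern with a local thread pool: calling a decorated function returns a future immediately, and passing futures as arguments expresses dependency edges at runtime. This is a sketch in the spirit of UniFaaS's @function-and-futures model, not its actual API.

```python
from concurrent.futures import ThreadPoolExecutor, Future

_executor = ThreadPoolExecutor(max_workers=8)

def function(fn):
    """Toy @function decorator: invocation returns a Future, and Future
    arguments are resolved before the body runs, forming a dynamic DAG."""
    def submit(*args):
        def resolve_and_run():
            # Wait for upstream tasks (dependency edges) before running.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return fn(*resolved)
        return _executor.submit(resolve_and_run)
    return submit

@function
def load(path):  # stage-in step; `path` could be a RemoteFile-style handle
    return list(range(10))

@function
def transform(xs):
    return [x * x for x in xs]

@function
def reduce_sum(xs):
    return sum(xs)

# The DAG load -> transform -> reduce_sum is composed dynamically from futures.
total = reduce_sum(transform(load("data/input.bin")))
print(total.result())  # 285
```
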
System architectures typically decompose into control, data, and management planes, with components dedicated to monitoring, profiling, scheduling, data management, and function execution (Li et al., 28 Mar 2024, Chard et al., 2019, Jin et al., 2022, Ríos-Monje et al., 2023).

3. Scheduling, Resource Management, and Performance Modeling

FaaS platforms employ sophisticated scheduling and resource management to minimize latency, respect heterogeneity, and deliver elasticity:

  • “Observe–predict–decide” cycles: Real-time load, queue, execution, and data-transfer statistics are collected and fed into regression models (random forest, polynomial, deep neural networks) to predict function execution and transfer times (Li et al., 28 Mar 2024, Jindal et al., 2022); a minimal predict-and-place sketch follows this list.
  • Scheduling algorithms: Strategies include
    • Capacity-aware partitioning: Tasks are distributed in proportion to each resource pool’s worker count.
    • Locality-aware scheduling: Preference for data proximity, minimizing cross-site transfer.
    • Dynamic Heterogeneity-Aware (DHA): HEFT-inspired prioritization augments static cost models with dynamic estimation; delay and re-scheduling mechanisms provide elasticity in response to runtime resource changes (Li et al., 28 Mar 2024).
  • Data management and I/O optimization: Wide-area transfer is negotiated via high-performance tools (Globus, Rsync), with concurrency limits and locality checks; failed transfers are retried up to thresholds (Li et al., 28 Mar 2024, Chard et al., 2019).
  • Concurrency and function capacity estimation: Tools like FnCapacitor simulate function concurrency under SLOs, modeling 95th-percentile latency as a function of concurrency, allocated memory, and runtime parameters; DNN-based approaches achieve >75% prediction accuracy (Jindal et al., 2022).
  • GPU-aware scheduling: For ML inference, FaaS schedulers co-optimize GPU model caching (LRU-based), memory utilization, and parallelism via locality-aware assignment, reducing ML inference latency by up to 48× (Zhao et al., 2023).

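A minimal sketch of the predict-and-place idea: fit a regressor on observed (input size, endpoint load) → runtime samples, then place each task on the endpoint with the lowest predicted completion time, adding a transfer-cost term when data is not local. The features, endpoint names, and bandwidth constant are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic observations standing in for collected execution statistics:
# features are (input_size_MB, endpoint_load), targets are runtimes in seconds.
rng = np.random.default_rng(0)
X = rng.uniform([1, 0], [100, 1], size=(200, 2))
y = 0.05 * X[:, 0] * (1 + X[:, 1]) + rng.normal(0, 0.1, 200)
model = RandomForestRegressor(n_estimators=50).fit(X, y)

def place(task_size_mb, endpoints, data_site, bw_mb_s=100.0):
    """endpoints: {name: current load in [0, 1]}; data_site: where inputs live."""
    def predicted_completion(name):
        exec_t = model.predict([[task_size_mb, endpoints[name]]])[0]
        # Locality-aware term: pay a wide-area transfer cost if data is remote.
        transfer_t = 0.0 if name == data_site else task_size_mb / bw_mb_s
        return exec_t + transfer_t
    return min(endpoints, key=predicted_completion)

print(place(50, {"edge-1": 0.8, "cluster-A": 0.2, "cloud": 0.5}, data_site="edge-1"))
```
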
Platform-specific scaling behaviors, cold-start latencies, and rate/concurrency limits are systematically benchmarked in SeBS and other suites, exposing non-linearities, resource underutilization, and cost–performance trade-offs (Copik et al., 2020, Pawlik et al., 2019).
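
The measurement idea behind such benchmarks can be sketched as a simple cold-versus-warm probe of an HTTP-triggered function; real suites like SeBS additionally control for network jitter, payload size, and provider-side variance. The endpoint below is hypothetical.

```python
import time
import statistics
import requests

URL = "https://example.com/api/my-function"  # hypothetical endpoint

def timed_invoke():
    # Wall-clock latency of one synchronous invocation, as seen by the client.
    t0 = time.perf_counter()
    requests.post(URL, json={"ping": 1}, timeout=30)
    return time.perf_counter() - t0

cold = timed_invoke()                        # first call after idle: likely cold
warm = [timed_invoke() for _ in range(10)]   # repeats usually hit warm containers
print(f"cold ~ {cold:.3f}s, warm median ~ {statistics.median(warm):.3f}s")
```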

4. Scalability, Elasticity, and Workflow Composition

FaaS supports both massive parallelism and fine-grained dynamic elasticity:

  • Scalability: Empirical studies show near-ideal scaling up to thousands of concurrent endpoints/workers for sufficiently large tasks; network and control-plane overheads dominate as task duration decreases (Li et al., 28 Mar 2024, Copik et al., 2020, Chard et al., 2019).
  • Elasticity: Platforms dynamically scale resources in response to queue lengths and workload bursts, with per-endpoint (heterogeneous) scaling policies (Li et al., 28 Mar 2024, Jin et al., 2022, Palma et al., 31 May 2024).
  • Workflow expressivity: FaaS workflows represent complex, multi-stage pipelines as DAGs, supporting data-driven branching, nested dependencies, and mixed data-processing patterns. Multi-event triggers, such as joins over sets of input events, further extend expressivity and reduce unnecessary invocations (Carl et al., 27 May 2025, Kulkarni et al., 27 Sep 2025); a join-trigger sketch follows this list.
  • Performance metrics: Makespan, data transfer volume, resource utilization, and monetary cost per workflow are primary metrics in benchmarking studies; scheduling and orchestration overheads can be kept to sub-10 ms per task even with full model-driven prediction (Li et al., 28 Mar 2024, Kulkarni et al., 27 Sep 2025).
  • Hybrid and federated models: Function Delivery Networks, edge adapters, and blockchain-managed registries (DeFaaS) enable cross-platform, multi-cloud, and geo-distributed workflow deployment, enforcing SLOs, energy efficiency, and policy compliance (Jindal et al., 2021, Karanjai et al., 11 Apr 2024, Malekabbasi et al., 23 May 2024).
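
A toy sketch of such a multi-event join trigger: events are buffered per correlation key, and the function fires only once all required event types have arrived, avoiding the per-event invocations a single-trigger model would incur. The event types and key scheme are illustrative assumptions.

```python
from collections import defaultdict

REQUIRED = {"image_uploaded", "metadata_ready"}  # hypothetical event types

class JoinTrigger:
    def __init__(self, required, fn):
        self.required, self.fn = set(required), fn
        self.pending = defaultdict(dict)  # correlation key -> {event_type: payload}

    def on_event(self, key, event_type, payload):
        self.pending[key][event_type] = payload
        if self.required <= self.pending[key].keys():
            # All required inputs are present: invoke the function exactly once.
            self.fn(self.pending.pop(key))

trigger = JoinTrigger(REQUIRED, fn=lambda inputs: print("invoke:", sorted(inputs)))
trigger.on_event("job-42", "image_uploaded", b"...")
trigger.on_event("job-42", "metadata_ready", {"width": 640})  # join fires here
```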

5. Application Domains and Benchmarking

FaaS computing has been adopted across diverse domains:

  • Scientific computing: High-throughput screening, distributed montage assembly, large-scale image analytics, neuroimaging pipelines, real-time beamline/instrument data analysis, and ML inference serve as primary use cases for federated and supercomputing-class FaaS architectures (Li et al., 28 Mar 2024, Chard et al., 2019, Ríos-Monje et al., 2023).
  • Edge/IoT/fog computing: Video analytics, federated learning, smart-parking, and geo-distributed event-processing highlight local data-processing and privacy needs; FaaS platforms in these contexts employ decentralized orchestration, content-based function/data placement, and Wasm-based lightweight execution (Malekabbasi et al., 23 May 2024, Jin et al., 2022, Cheng et al., 2019, Palma et al., 31 May 2024).
  • Cloud-native microservices: Workloads include bag-of-tasks, I/O- and CPU-bound services, memory-intensive model inference, and stateful or workflow-coordinated chains (Pawlik et al., 2019, Copik et al., 2020, Kulkarni et al., 27 Sep 2025).
  • Benchmarks and tools: Standardized suites like SeBS enable cross-provider evaluation (AWS Lambda, Azure Functions, Google Cloud Functions) for latency, throughput, cold/warm start, cost, and reliability. Metrics such as efficiency, resource utilization, and evicted-container half-lives are formalized for consistent assessment (Copik et al., 2020, Pawlik et al., 2019).

Performance engineering studies emphasize the importance of workload/memory sizing, concurrency tuning, data locality, cold-start mitigation (e.g., via pre-warming, code-pruning), and adaptive placement to maximize cost-efficiency and SLA adherence (Liu et al., 2022, Li et al., 28 Mar 2024, Copik et al., 2020).
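
A back-of-the-envelope sketch of the memory-sizing trade-off under pay-per-use billing: more memory typically brings more CPU and shorter runs, so billed GB-seconds do not grow linearly with the memory setting. The per-GB-second and per-request rates mirror commonly published cloud prices but are illustrative here, as is the memory-to-latency curve.

```python
# Illustrative pay-per-use rates, not any provider's authoritative pricing.
PRICE_PER_GB_S = 0.0000166667
PRICE_PER_REQ = 0.0000002

def monthly_cost(invocations, duration_s, memory_gb):
    """Billed cost = GB-seconds consumed plus a flat per-request fee."""
    return invocations * (duration_s * memory_gb * PRICE_PER_GB_S + PRICE_PER_REQ)

# Hypothetical measured latencies at each memory setting (GB -> seconds).
measured = {0.125: 2.40, 0.5: 0.70, 1.0: 0.38, 2.0: 0.35}
for mem, dur in measured.items():
    print(f"{int(mem * 1024)} MB: ${monthly_cost(1_000_000, dur, mem):,.2f}/month")
```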

6. Advanced Topics: Security, Energy Efficiency, and Decentralization

  • Security and privacy: S-FaaS employs Intel SGX enclaves, transitive attestation, and enclaved metering to provide formal guarantees of integrity, privacy, and accountable billing, with negligible (<6.3%) latency overhead on real workloads (Alder et al., 2018).
  • Energy-aware scheduling: Function Delivery Networks reduce energy consumption by placing functions on low-power edge clusters; for example, placing workloads like JSON parsing on Jetson Nano boards reduced total CPU energy by 17× while respecting SLO constraints (Jindal et al., 2021).
  • Blockchain and decentralization: DeFaaS registers function metadata and access control on a blockchain registry, executes cross-cloud via decentralized API gateways, and persists signed receipts on IPFS, trading additional lookup/invocation latency (50–200 ms per blockchain call) for transparency and multi-cloud, fault-tolerant operation (Karanjai et al., 11 Apr 2024).

7. Limitations, Open Challenges, and Best Practices

Despite its flexibility, FaaS computing presents significant design and operational challenges:

  • Cold-start latency: Container or VM startup, code-loading, and library import times can dominate execution, motivating code-pruning optimizers (FaaSLight: up to 78.95% code-loading-latency reduction), pre-warming, and AOT snapshots (Liu et al., 2022, Li et al., 28 Mar 2024, Kulkarni et al., 27 Sep 2025); a keep-warm sketch follows this list.
  • Multi-tenancy and resource unpredictability: Provider-side rate limits, noisy neighbors, and infrastructure heterogeneity introduce non-trivial performance variance; fine-grained profiling and adaptive models are recommended (Pawlik et al., 2019, Copik et al., 2020, Jindal et al., 2022).
  • Stateful function support: Classic FaaS imposes stateless constraints, but edge and networked computing motivate hybrid models (external, in-edge, in-function, in-client) with distinct trade-offs for latency, scalability, and consistency; orchestration and migration protocols for stateful FaaS remain active research topics (Cicconetti et al., 2022).
  • Data movement bottlenecks: Distributed data staging and transfer dominate makespan in wide-area workflows; data-locality-aware scheduling and transparent caching/replication are vital for efficiency (Li et al., 28 Mar 2024, Chard et al., 2019, Jin et al., 2022).
  • Platform selection and interoperability: The landscape of FaaS platforms is highly heterogeneous; systematic frameworks (e.g., FaaStener) guide platform selection across business, technical, and operational dimensions (Yussupov et al., 2020).
  • Custom scheduling and orchestration: Optimal performance/cost requires calibration of memory allocation, concurrency, partitioning thresholds, and batch sizes per workload; real-time profiling and auto-tuners are strongly advised (Li et al., 28 Mar 2024, Kulkarni et al., 27 Sep 2025, Jindal et al., 2022).
  • Extensibility and future work: Open research spans global cache directories, SLA-driven and cost-aware orchestration, multi-cloud and federated placement, formal modeling of data/locality-driven performance, and integration of decentralized control and programmable network overlays (Li et al., 28 Mar 2024, Karanjai et al., 11 Apr 2024, Malekabbasi et al., 23 May 2024, Cicconetti et al., 2022).

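As a minimal illustration of client-side pre-warming, the sketch below pings a function slightly more often than a typical idle-eviction window so at least one warm container stays resident. This is a workaround sketch rather than a platform feature (providers also offer provisioned concurrency), and the endpoint and interval are assumptions.

```python
import threading
import requests

def keep_warm(url, stop, interval_s=300.0):
    """Periodically invoke the function so the platform keeps a container warm.
    The interval must undercut the provider's idle-eviction window."""
    while not stop.wait(interval_s):
        try:
            requests.post(url, json={"warmup": True}, timeout=10)
        except requests.RequestException:
            pass  # best effort: a missed ping just risks one cold start

stop = threading.Event()
threading.Thread(
    target=keep_warm,
    args=("https://example.com/api/my-function", stop),  # hypothetical endpoint
    daemon=True,
).start()
```
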
Function-as-a-Service computing, through ongoing methodological and architectural advances, provides a unified abstraction for fine-grained, event-driven, and elastically scalable execution of compute tasks across the continuum from low-power edge devices to geo-distributed cloud supercomputers, supporting the diverse needs of modern data- and AI-driven scientific and enterprise workflows (Li et al., 28 Mar 2024, Kulkarni et al., 27 Sep 2025, Copik et al., 2020, Jin et al., 2022, Palma et al., 31 May 2024, Jindal et al., 2021, Karanjai et al., 11 Apr 2024, Jin et al., 2023, Carl et al., 27 May 2025, Pawlik et al., 2019, Zhao et al., 2023, Alder et al., 2018, Cicconetti et al., 2022, Cheng et al., 2019).
