T-Statefun: Transactional Extension for Flink StateFun

Updated 26 December 2025

T-Statefun is a transactional extension of Apache Flink StateFun that integrates exactly-once processing and atomic multi-function transactions into cloud applications.
It layers transactional coordination logic and runtime extensions on top of unmodified Flink clusters, using ingress routers, coordinator functions, and RPC bridges to remote functions.
As a feasibility study, it reports higher throughput and lower latency than comparable serverless systems on transactional benchmarks, and it exposes a programmable transactional API for event-driven microservices.

T-Statefun is a transactional extension of Apache Flink StateFun that demonstrates how a generic streaming dataflow system can support serverless, stateful functions with first-class transactional guarantees. Developed as a feasibility study, T-Statefun seeks to address the core challenges of building scalable, consistent cloud applications—specifically, exactly-once processing, atomic multi-function transactions, and seamless developer programmability—by layering coordination logic and runtime extensions atop unmodified Flink clusters (Psarakis, 19 Dec 2025).

1. Background and Motivation

The evolution of cloud-native architecture has shifted application composition towards event-driven microservices and Function-as-a-Service (FaaS) deployments. In these paradigms, individual services are stateless, with state relegated to externalized data stores or message buses. This architecture introduces significant complexity for two classical problems:

Exactly-once processing and idempotency: Guaranteeing each function's side-effects occur exactly once, even in the face of failures and retries.
Atomic multi-function transactions: Ensuring sets of state mutations execute atomically and in isolation, compensating correctly for failures.

While monolithic transactional databases natively provide ACID semantics, cloud-native decomposition delegates both problems to application code. Applications then rely on SAGA patterns or distributed 2PC protocols, which are either brittle or impose high performance penalties. T-Statefun addresses these difficulties by extending Flink StateFun, which already supports exactly-once state management and message-driven stateful functions, with a transactional API and runtime suitable for SFaaS (Serverless Stateful Functions as a Service) (Psarakis, 19 Dec 2025).

2. System Architecture and Workflow

T-Statefun augments an unmodified Flink cluster by integrating transactional coordination logic and several runtime mechanisms:

Major architectural components:

Component	Responsibility
Ingress router (Kafka ⇒ Flink)	Dispatches function invocations to proper operator partitions
Stateful Function operator	Maintains per-key state (RocksDB) + exactly-once checkpoint
Coordinator Function	Orchestrates transactions: prepare, commit, abort, retry
RPC bridge to remote function	Ships state + request over Protobuf RPC; handles replay/failure
Egress (Flink ⇒ Kafka)	Publishes final result or abort to clients

Function-invocation events, keyed by function type and instance, enter via Kafka. Each stateful function instance executes in a specific operator. The transactional "coordinator functions" (registered within StateFun) implement workflow-level orchestration—either by executing a 2PC plan for serializable isolation or a SAGA plan for weaker, eventually consistent semantics. Each function invocation is processed with bundled per-function state, with atomic commit, snapshotting, and deterministic recovery rooted in Flink’s existing dataflow and checkpointing model (Psarakis, 19 Dec 2025).
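As a rough illustration of keyed routing, the sketch below maps a (function type, key) address to one of a fixed number of partitions with a stable hash. The names route and NUM_PARTITIONS and the choice of CRC32 are assumptions made for this example; Flink assigns keys to key groups with its own hashing scheme, which this does not reproduce.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical parallelism, chosen only for the example


def route(function_type: str, key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a function address to a partition index with a stable hash."""
    address = f"{function_type}/{key}".encode("utf-8")
    # CRC32 is deterministic across processes, unlike Python's built-in hash() on str.
    return zlib.crc32(address) % num_partitions


if __name__ == "__main__":
    for key in ("alice", "bob", "carol"):
        print(key, "->", route("account_function", key))
```

The property that matters is that the same address always lands on the same partition, so all invocations for one function instance see the same local state.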

The coordinator functions are also responsible for distributed deadlock detection (using Chandy-Misra-Haas wait-for graphs), linearizable locking, and early reply optimizations (replying to the client immediately after commit is broadcast), minimizing the need for invasive changes to the Flink runtime.
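The prepare/commit/abort orchestration can be sketched with an in-memory model. This is a minimal illustration of two-phase commit in general, not T-Statefun's coordinator: the Participant class and run_2pc function are hypothetical, and locking, retries, and failure recovery are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical in-memory participant holding one account balance."""
    balance: int
    staged: dict = field(default_factory=dict)  # tid -> pending balance

    def prepare(self, tid: int, delta: int) -> bool:
        new_balance = self.balance + delta
        if new_balance < 0:
            return False  # vote abort: insufficient credit
        self.staged[tid] = new_balance  # stage the write without applying it
        return True

    def commit(self, tid: int) -> None:
        self.balance = self.staged.pop(tid)

    def abort(self, tid: int) -> None:
        self.staged.pop(tid, None)  # discard any staged write


def run_2pc(tid: int, plan: list) -> bool:
    """Phase 1 collects votes; phase 2 commits only if every vote is yes."""
    votes = [participant.prepare(tid, delta) for participant, delta in plan]
    if all(votes):
        for participant, _ in plan:
            participant.commit(tid)
        return True
    for participant, _ in plan:
        participant.abort(tid)
    return False


if __name__ == "__main__":
    debtor, creditor = Participant(100), Participant(0)
    print(run_2pc(1, [(debtor, -30), (creditor, 30)]), debtor.balance, creditor.balance)
```

A transfer that would overdraw the debtor leaves both balances untouched, because no participant applies its staged write until every vote is in.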

3. Transactional API and Programming Model

T-Statefun exposes its transactional workflow API to developers via "coordinator functions" written in Python. These orchestrate multi-function transactions using either two-phase commit (2PC) or SAGA workflows. The API encapsulates transactional primitives, abstracting details such as locking, commit protocols, and compensation logic.

Core coordination API:

context.tpc_invocation(type, key, message)
context.send_on_success(type, key, message)
context.send_on_failure(type, key, message)
context.send_on_retryable(type, key, message) (2PC only)
context.saga_invocation_pair(type, key, message, compensating_message)

Example (2PC workflow):

def serializable_transfer(context, message: Transfer):
    # Both invocations join one 2PC transaction: either both apply or neither does.
    sub = SubtractCredit(amount=message.amount)
    context.tpc_invocation("account_function", message.debtor, sub)
    add = AddCredit(amount=message.amount)
    context.tpc_invocation("account_function", message.creditor, add)

Example (SAGA workflow):

def sagas_transfer(context, message: Transfer):
    dec = SubtractCredit(amount=message.amount)
    inc = AddCredit(amount=message.amount)
    # Each call pairs a forward message with the message that compensates for it.
    context.saga_invocation_pair("account_function", message.debtor, dec, inc)
    context.saga_invocation_pair("account_function", message.creditor, inc, dec)

Participant functions interact with local Flink state solely via context.get() and context.put(), signaling aborts with exceptions. The coordinator manages orchestration, deadlocks, and compensation automatically (Psarakis, 19 Dec 2025).
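A participant in this style might look like the sketch below. FakeContext is a hypothetical stand-in written for this example. The source names context.get() and context.put() but does not give their exact signatures, so the argument-free get and single-argument put here are assumptions, as is the InsufficientCredit exception.

```python
class InsufficientCredit(Exception):
    """Raised by the participant to signal that the transaction must abort."""


class FakeContext:
    """Hypothetical stand-in for the function context; only get/put are modelled."""

    def __init__(self, initial_balance: int = 0):
        self._state = initial_balance

    def get(self):
        return self._state

    def put(self, value) -> None:
        self._state = value


def subtract_credit(context, amount: int) -> None:
    balance = context.get()
    if balance < amount:
        # State is left untouched; the exception is the abort signal.
        raise InsufficientCredit(f"balance {balance} < {amount}")
    context.put(balance - amount)


def add_credit(context, amount: int) -> None:
    context.put(context.get() + amount)


if __name__ == "__main__":
    ctx = FakeContext(50)
    subtract_credit(ctx, 20)
    print(ctx.get())
```

The participant contains only business logic; it never mentions locks, transaction IDs, or compensation.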

4. Implementation Details

T-Statefun requires minimal changes to Flink StateFun, with key extensions in three areas:

Per-operator embedded wrappers: The Protobuf RPC between Flink and remote functions is extended to transport transaction IDs, protocol phase ("PREPARE", "COMMIT", "ABORT"), and request context for idempotent replay.
Coordinator Function Library: Two core coordinator types are registered at runtime: tpc_coordinator for distributed 2PC (two-phase locking and Chandy-Misra-Haas deadlock detection), and saga_coordinator for buffering SAGA pairs and managing compensations.
Deadlock Management: A 2PC prepare that conflicts with a key-level lock held by another transaction receives an ABORT_RETRYABLE reply, which lets the coordinator construct a wait-for graph. The deadlock detector aborts the transaction with the highest TID in a detected cycle and reschedules it.
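The victim-selection rule can be sketched over a centralized wait-for graph. Note the substitution: this example uses a plain depth-first search for cycle detection rather than the Chandy-Misra-Haas probe protocol named above, and the functions find_cycle and pick_victim are hypothetical names. The graph maps each waiting TID to the TIDs that hold the locks it needs.

```python
from typing import Optional


def find_cycle(wait_for: dict) -> list:
    """Return the TIDs on one cycle of the wait-for graph, or [] if none exists."""
    visiting, done = [], set()

    def dfs(tid):
        if tid in done:
            return []
        if tid in visiting:
            # Back edge: the cycle is the part of the current path from tid onward.
            return visiting[visiting.index(tid):]
        visiting.append(tid)
        for blocker in wait_for.get(tid, ()):
            cycle = dfs(blocker)
            if cycle:
                return cycle
        visiting.pop()
        done.add(tid)
        return []

    for tid in wait_for:
        cycle = dfs(tid)
        if cycle:
            return cycle
    return []


def pick_victim(wait_for: dict) -> Optional[int]:
    """Choose the highest TID on a detected cycle, or None if there is no deadlock."""
    cycle = find_cycle(wait_for)
    return max(cycle) if cycle else None


if __name__ == "__main__":
    # T1 waits for T2, T2 for T3, T3 for T1; T4 waits for T1 but is not on the cycle.
    print(pick_victim({1: {2}, 2: {3}, 3: {1}, 4: {1}}))
```

If TIDs are assigned in increasing order, aborting the highest TID sacrifices the youngest transaction on the cycle, so older transactions are not repeatedly starved.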

T-Statefun inherits exactly-once messaging and strong snapshotting guarantees from Flink, ensuring transactional correctness—even across failures—by leveraging deterministic replay of Kafka-ingested events and idempotent transaction processing keyed by TID (Psarakis, 19 Dec 2025).
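The idea of idempotent processing keyed by TID can be sketched as follows. The Envelope and IdempotentHandler classes are hypothetical and are not T-Statefun's Protobuf schema; they only show how carrying a transaction ID and protocol phase with each request lets a handler recognize a replayed message and return its earlier reply without repeating the side effect.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    PREPARE = "PREPARE"
    COMMIT = "COMMIT"
    ABORT = "ABORT"


@dataclass(frozen=True)
class Envelope:
    """Hypothetical request wrapper carrying the transaction metadata."""
    tid: int
    phase: Phase
    payload: int


class IdempotentHandler:
    def __init__(self):
        self.total = 0
        self._replies = {}  # (tid, phase) -> cached reply

    def handle(self, env: Envelope) -> str:
        key = (env.tid, env.phase)
        if key in self._replies:
            return self._replies[key]  # replayed message: no second side effect
        if env.phase is Phase.COMMIT:
            self.total += env.payload  # the side effect happens once per TID
        reply = f"{env.phase.value}:{env.tid}:ok"
        self._replies[key] = reply
        return reply


if __name__ == "__main__":
    handler = IdempotentHandler()
    commit = Envelope(tid=7, phase=Phase.COMMIT, payload=10)
    print(handler.handle(commit), handler.handle(commit), handler.total)
```

In a real deployment the reply cache would have to live in checkpointed state so that it survives the same failures as the data it protects; the in-memory dictionary here is only for illustration.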

5. Performance Evaluation

T-Statefun was evaluated against Beldi (AWS Lambda + DynamoDB, SFaaS with 2PC logging) and Boki (Beldi with improved locking) on three canonical benchmarks: YCSB-T (two-key transfers), DeathStar Travel (hotel/flight reservations), and TPC-C (NewOrder/Payment).

On YCSB-T (10k-key workload), T-Statefun achieves up to 2,000 TPS with 99th-percentile latency below 50 ms, outperforming Beldi and Boki by a factor of 10.
Component-level latency (YCSB-T, 100 TPS):

System	Function exec	Networking	State access
T-Statefun	2.7 ms (2.2%)	92 ms (74.3%)	29 ms (23.5%)
Beldi	1.0 ms (0.7%)	56.6 ms (38.4%)	89.6 ms (60.9%)

T-Statefun’s latency is dominated by networking and state-access overhead, but it still outperforms DynamoDB-backed approaches because state lives in the Flink operators rather than in a remote database, which avoids database round-trips in the execution path (Psarakis, 19 Dec 2025).
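The percentage shares in the table can be recomputed from the millisecond values. The short script below does so; the dictionary layout and the shares function are choices made for this example. The recomputed shares agree with the table to within about 0.1 percentage points, which is consistent with the millisecond values having been rounded.

```python
# Component latencies in milliseconds, copied from the table above.
LATENCY_MS = {
    "T-Statefun": {"function": 2.7, "network": 92.0, "state": 29.0},
    "Beldi": {"function": 1.0, "network": 56.6, "state": 89.6},
}


def shares(components: dict) -> dict:
    """Return each component's share of the total latency, in percent."""
    total = sum(components.values())
    return {name: 100.0 * ms / total for name, ms in components.items()}


if __name__ == "__main__":
    for system, components in LATENCY_MS.items():
        total = sum(components.values())
        breakdown = ", ".join(f"{k} {v:.1f}%" for k, v in shares(components).items())
        print(f"{system}: total {total:.1f} ms ({breakdown})")
```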

6. Limitations and Lessons Learned

T-Statefun surfaced several structural limitations:

Programmability Overhead: Workflow construction is powerful but involves heavy boilerplate; each workflow requires explicit coordinator definition, participant enumeration, and error management. This complexity motivated the higher-level Stateflow DSL, which compiles standard object-oriented code into orchestrated workflows.
Contention Bottlenecks: Fine-grained two-phase locking in the coordinator degrades throughput by 10× under even moderate contention and increases deadlock-management costs. This finding led to the design of Styx and its deterministic transaction protocol, which avoids locks in most cases.
State/Compute Disaggregation: Routing each invocation through an RPC bridge external to Flink operators results in state serialization and network overhead that eclipses the actual business logic for high-throughput applications. The successor, Styx, co-locates state and logic in-memory inside the streaming engine, eliminating this penalty (Psarakis, 19 Dec 2025).

7. Practical Use Cases and Evolution

T-Statefun demonstrates transactional cloud-native application development for workflows such as banking transfers (2PC) and inventory reservation (SAGA). For example, a banking transfer is orchestrated as a 2PC transaction across two "account" functions, while an inventory reservation is orchestrated as a SAGA that runs compensations on partial completion or failure.
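The compensation behaviour of a SAGA can be sketched for the inventory case. This is a generic illustration under stated assumptions: run_saga, reserve, and release are hypothetical names, the steps run sequentially, and compensations run in reverse order. The source does not specify the order in which T-Statefun's saga_coordinator executes or compensates its buffered pairs.

```python
def run_saga(steps: list) -> bool:
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Compensate in reverse order for every step that already succeeded.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


def reserve(stock: dict, item: str):
    def action():
        if stock[item] <= 0:
            raise RuntimeError(f"{item} out of stock")
        stock[item] -= 1
    return action


def release(stock: dict, item: str):
    def compensation():
        stock[item] += 1
    return compensation


if __name__ == "__main__":
    stock = {"widget": 1, "gadget": 0}
    ok = run_saga([
        (reserve(stock, "widget"), release(stock, "widget")),
        (reserve(stock, "gadget"), release(stock, "gadget")),
    ])
    print(ok, stock)
```

Unlike 2PC, the widget reservation is briefly visible to other transactions before it is released, which is the weaker isolation that a SAGA accepts in exchange for not holding locks.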

These concrete use cases exhibit the high-level workflow API for transactional state manipulation, while exposing the complexity encapsulated within coordinator functions. The experience gained directly resulted in the creation of:

Stateflow, a Python DSL that compiles declarative object-oriented code into dataflow IR with transactional semantics automatically injected.
Styx, a custom streaming runtime supporting lock-free deterministic transactions, local state, coroutine-driven execution, and transactional state migration for elastic scaling.

The cumulative trajectory pioneered by T-Statefun, and the subsequent evolution in Stateflow and Styx, illustrates a progression towards democratizing scalable, consistent, transactionally safe cloud application development while minimizing distributed systems complexity (Psarakis, 19 Dec 2025).


References

Psarakis (19 Dec 2025). Democratizing Scalable Cloud Applications: Transactional Stateful Functions on Streaming Dataflows.
