
Value-Change Logging

Updated 3 May 2026
  • Value-change logging is a systematic process of recording dynamic updates in variables, configurations, and memory states to ensure atomicity and crash recovery.
  • It underpins applications in transactional storage, change-data-capture pipelines, process mining, and configuration diagnosability with practical protocols and formal invariants.
  • Recent advances demonstrate reduced latency in NVM systems, lock-free watermark CDC, and enhanced diagnostic logging through tailored formal models and LLM-based augmentation.

Value-change logging refers to the systematic recording of changes in the values of variables, attributes, configuration parameters, memory contents, or database rows, together with the context in which the change occurs. This logging paradigm underpins core requirements in transactional storage, process mining, configuration debugging, and change-data-capture (CDC) pipelines. Recent research across systems, databases, and software engineering has advanced mechanism design, formal models, and implementation strategies for atomic and high-fidelity value-change capture.

1. Atomic Value-Change Logging in Persistent Memory

Efficient and correct value-change logging in non-volatile memory (NVM) systems faces the challenge posed by hardware-induced store reordering. CPU out-of-order execution and the effects of the cache coherence protocol can reorder memory writes, so that multi-word updates may reach persistent storage non-atomically, risking data corruption or inconsistency after crashes. Prior techniques (undo/redo logging, torn bits, two-phase commit) typically required two flush/fence cycles per change, incurring high latency costs.

The "validity-bit" protocol, as formalized in the Persistent Cache Store Order (PCSO) model, leverages the fact that within a single cache line, stores reach NVM in program order. By designating the last store as a dedicated validity flag (1 bit), and issuing a single clflushopt and sfence per cache line, it becomes sufficient to atomically publish multi-word changes: if the validity bit is observed on recovery, all payload stores must have also occurred. This reduces the round trips to NVM by half, with up to 2× reduction in per-update latency (from approximately 400–800 ns to 200 ns on representative hardware), and enables atomic multi-word updates in persistent memory systems (Cohen et al., 2017).
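
The recovery argument can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not the authors' implementation: the cache line is modeled as eight words (seven payload words plus one word holding the validity flag), a crash is modeled as persisting only a prefix of the program-order stores (the PCSO guarantee for a single line), and the names write_entry, persist_prefix, and recover are hypothetical. Real code would issue clflushopt and sfence through compiler intrinsics, which this model does not attempt to express.

```python
PAYLOAD_WORDS = 7  # hypothetical layout: 7 payload words + 1 validity word per line


def write_entry(payload, valid_flag):
    """Return the program-order store sequence for one log entry.

    Each store is (word_index, value); the validity flag is stored last.
    """
    assert len(payload) == PAYLOAD_WORDS
    stores = [(i, word) for i, word in enumerate(payload)]
    stores.append((PAYLOAD_WORDS, valid_flag))
    return stores


def persist_prefix(line, stores, n):
    """Model a crash after the first n stores reached NVM.

    Assumption (PCSO): stores to a single cache line persist in program
    order, so the persisted state is always a prefix of the sequence.
    """
    persisted = list(line)
    for index, value in stores[:n]:
        persisted[index] = value
    return persisted


def recover(line, valid_flag):
    """Return the payload if the validity flag is set, otherwise None."""
    if line[PAYLOAD_WORDS] == valid_flag:
        return line[:PAYLOAD_WORDS]
    return None


# Try every possible crash point for one entry written to a zeroed line.
old_line = [0] * (PAYLOAD_WORDS + 1)
payload = [11, 12, 13, 14, 15, 16, 17]
stores = write_entry(payload, valid_flag=1)
outcomes = [
    recover(persist_prefix(old_line, stores, n), valid_flag=1)
    for n in range(len(stores) + 1)
]
```

In this model every crash point yields either no entry or the complete payload, which is the all-or-nothing property the protocol relies on. The guarantee depends entirely on the prefix assumption: if stores within a line could persist out of order, a set flag would no longer imply a complete payload.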

2. Value-Change Logging for Change-Data-Capture (CDC)

In multi-database environments and microservices ecosystems, keeping distributed node states consistent requires recording and propagating value changes with low latency and high integrity. Traditional dual-write or distributed transaction schemes struggle with feasibility and operational complexity. Modern CDC frameworks capture value changes by extracting row-level change events from transaction logs (e.g., binlog, replication slots).

The DBLog system introduces a watermark-based CDC architecture in which two logical watermarks per chunk delineate a capture window: a low watermark ($w_{\min}$) records the LSN just before a chunked SELECT begins, and a high watermark ($w_{\max}$) records the LSN just after chunk execution. By interleaving log event processing with table snapshots, DBLog guarantees strict global event ordering, completeness, and deduplication. The protocol is lock-free: only two single-row UPDATEs are issued per chunk for watermarks, chunked SELECTs are executed under read-committed isolation, and source database overhead remains negligible (latency increases under 5%). In production, this approach enables end-to-end latencies of ~100–300 ms for value propagation and supports high concurrency with low operational impact (Andreakis et al., 2020).
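
The interleaving of chunk rows and log events can be sketched in Python as follows. This is an illustrative model rather than DBLog's code: the event dictionaries, the "kind" field, and the function name merge_chunk_with_log are hypothetical, and the sketch assumes that deduplication works by discarding any chunk row whose primary key also appears in a log event between the two watermarks, since that log event carries the newer state.

```python
def merge_chunk_with_log(log_events, chunk_rows):
    """Interleave one chunk's SELECT result with the ordered change log.

    log_events: list of dicts in log order. Watermark events have
        kind "low" or "high"; change events have kind "change" plus
        "pk" and "value".
    chunk_rows: dict mapping primary key -> value from the chunked SELECT.
    """
    output = []
    pending = dict(chunk_rows)  # chunk rows not yet superseded by the log
    in_window = False
    for event in log_events:
        if event["kind"] == "low":
            in_window = True
        elif event["kind"] == "high":
            # Surviving chunk rows are emitted before any later log event.
            for pk, value in pending.items():
                output.append(("snapshot", pk, value))
            pending = {}
            in_window = False
        else:
            if in_window:
                # The log already carries a newer state for this key.
                pending.pop(event["pk"], None)
            output.append(("log", event["pk"], event["value"]))
    return output


log = [
    {"kind": "change", "pk": "k1", "value": 1},
    {"kind": "low"},
    {"kind": "change", "pk": "k2", "value": 20},
    {"kind": "high"},
    {"kind": "change", "pk": "k3", "value": 30},
]
chunk = {"k1": 1, "k2": 2, "k3": 3}
merged = merge_chunk_with_log(log, chunk)
```

Here the stale chunk row for k2 is dropped because a change to k2 was logged inside the window, while the rows for k1 and k3 are emitted at the high watermark, ahead of the later change to k3.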

3. Data-Aware Value-Change Logging in Process Mining

Process mining and object-centric event analysis require logs that unambiguously represent not only static object or event attributes, but also dynamic or time-varying attribute histories. Conventional log formats (e.g., XES) lack native support for tracking evolving attribute values linked to specific objects or events.

The Data-aware OCEL (DOCEL) format extends classical object-centric event logs by introducing an explicit dynamic attribute relation. Each update is represented by a tuple $(a, e, o, v)$ in the Dyn table, indicating that event $e$ updated dynamic attribute $a$ of object $o$ to value $v$ at timestamp $\tau(e)$. The full history of each dynamic attribute for any object is thus recovered by projecting the relation and sorting by event time. Formal invariants enforce uniqueness (only one update per $(a, e, o)$), disjoint static/dynamic attribute naming, and event–object linkage, enabling lossless reconstruction of value-change sequences for advanced data-aware analyses (Goossens et al., 2022).
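
A minimal Python sketch of the projection and of the uniqueness invariant is given below. The in-memory representation (a list of 4-tuples for the Dyn relation and a dictionary for the timestamp function) and the names check_unique_updates and attribute_history are assumptions made for illustration; DOCEL itself does not prescribe them.

```python
def check_unique_updates(dyn):
    """Return True if no (attribute, event, object) triple occurs twice."""
    seen = set()
    for attribute, event, obj, _value in dyn:
        key = (attribute, event, obj)
        if key in seen:
            return False
        seen.add(key)
    return True


def attribute_history(dyn, tau, attribute, obj):
    """Project Dyn onto one attribute of one object, ordered by event time.

    dyn: iterable of (attribute, event, object, value) tuples.
    tau: dict mapping event id -> timestamp.
    """
    updates = [
        (tau[event], value)
        for a, event, o, value in dyn
        if a == attribute and o == obj
    ]
    return sorted(updates, key=lambda pair: pair[0])


# Hypothetical sample data: two objects, three events.
dyn = [
    ("status", "e2", "order1", "paid"),
    ("status", "e1", "order1", "created"),
    ("weight", "e1", "order1", 2.5),
    ("status", "e3", "order2", "created"),
]
tau = {"e1": 1, "e2": 5, "e3": 7}

history = attribute_history(dyn, tau, "status", "order1")
```

For the sample data, the history of status on order1 comes back in timestamp order regardless of the order in which the tuples were stored, and the uniqueness check passes because no triple repeats.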

4. Configuration and Program Value-Change Logging for Diagnosability

In complex configurable systems, silent or latent misconfigurations are a major source of failures but are often poorly diagnosed due to insufficient runtime logging of configuration value changes and their impact. ConfLogger introduces a methodology for systematic configuration value-change logging via static taint analysis and LLM-guided log generation.

Using program dependence graphs (PDG), ConfLogger identifies code statements that are data- or control-dependent on configuration "source" APIs (getter calls to configuration engines). For each configuration-sensitive sink (e.g., a branch condition), candidate code blocks are extracted, and an LLM is prompted to generate or augment log statements in the SLF4J format, so that runtime values, violated constraints, and corrective guidance are captured at the point of use. In an empirical evaluation across eight large-scale Java systems, ConfLogger diagnoses all 30 silent misconfiguration cases and reaches 74% logging-point coverage, higher than LLM-only baselines. It also improves variable-logging precision, recall, and F1 by 8.6%, 79.3%, and 26.2%, respectively, and improves user troubleshooting accuracy by over 250% (Shan et al., 28 Aug 2025).
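
The taint step can be pictured as reachability over the dependence graph. The Python sketch below is a toy model and not ConfLogger's analysis: the PDG is reduced to a dictionary of dependence edges with no distinction between data and control dependence, statement identifiers such as "s1" are hypothetical, and config_sensitive_sinks is an invented name.

```python
from collections import deque


def config_sensitive_sinks(pdg, sources, sinks):
    """Return the sinks reachable from configuration sources.

    pdg: dict mapping a statement id to the statements that depend on it
        (data or control dependence, merged for simplicity).
    sources: statement ids that read configuration values.
    sinks: candidate statement ids, e.g. branch conditions.
    """
    reached = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for successor in pdg.get(node, ()):
            if successor not in reached:
                reached.add(successor)
                queue.append(successor)
    return sorted(reached & set(sinks))


# Hypothetical program: s1 reads a configuration value, s2 derives from it,
# s4 branches on the derived value; s3 and s5 never touch configuration.
pdg = {"s1": ["s2"], "s2": ["s4"], "s3": ["s5"], "s4": []}
tainted = config_sensitive_sinks(pdg, sources=["s1"], sinks=["s4", "s5"])
```

Only s4 is reported, since s5 depends solely on s3. In the described pipeline, the code around such a sink is what would be handed to the LLM for log generation.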

5. Formal Models and Invariants for Value-Change Logs

Formal correctness in value-change logging is generally enforced through a combination of hardware-aware memory models, CDC invariants, and log schema properties:

  • In NVM logging, atomicity and crash consistency rest on the PCSO model together with the invariant that a cache line is considered "valid" in NVM if and only if its validity bit matches the current sentinel. Atomicity follows because either all prior payload stores and the final validity bit are persisted, or the entry is treated as absent (Cohen et al., 2017).
  • In watermark CDC, invariants regarding LSN range coverage ($w_{\min}$, $w_{\max}$) guarantee that emitted changes comprise either log events or selected rows, and that no state transition over the capture interval is omitted or duplicated (Andreakis et al., 2020).
  • In DOCEL, attribute set disjointness, temporal coherence of event timestamps, and single-update-per-dynamic-attribute-per-event properties enable unambiguous value history reconstruction and reliable dynamic analysis (Goossens et al., 2022).
  • In configuration logging, PDG-based taint analysis identifies the configuration value-flows that may affect program control or behavior, supporting comprehensive diagnosability (Shan et al., 28 Aug 2025).

6. Applications and Extensions

Value-change logging underpins several advanced use cases, including the transactional storage, CDC, process mining, and configuration diagnosability settings described above. The underlying formal models and mechanisms are amenable to adaptation for other runtime value domains, including feature flag tracing and dynamic system state analytics. Extensions may include hybrid dynamic-static analysis feedback, cross-domain tainting, or automated test generation to maximize diagnostic coverage.

7. Performance, Limitations, and Practical Notes

State-of-the-art value-change logging mechanisms are generally designed for negligible runtime impact, high concurrency, and robustness to partial failures:

  • In NVM, single-flush protocols effectively halve latency per log write.
  • In CDC, chunked SELECTs and lock-free watermark updates avoid long-duration locks, and chunk size parameters allow tuning for minimal OLTP interference.
  • In program configuration logging, empirically measured logging-point coverage and analysis runtime demonstrate practical tractability, though Java-centricity (in ConfLogger) and LLM hallucination risk are limitations.
  • In process mining, the DOCEL model ensures no dynamic update is lost or ambiguously timed, although richer query semantics or real-time streaming support remain open areas.

Future work in these domains includes language-agnostic static analysis, cross-system formal specification of logging contracts, and unified APIs for multi-granularity value-change tracking.

References:

  • Efficient Logging in Non-Volatile Memory by Exploiting Coherency Protocols (Cohen et al., 2017)
  • DBLog: A Watermark Based Change-Data-Capture Framework (Andreakis et al., 2020)
  • Enhancing Data-Awareness of Object-Centric Event Logs (Goossens et al., 2022)
  • ConfLogger: Enhance Systems' Configuration Diagnosability through Configuration Logging (Shan et al., 28 Aug 2025)
