Power Measurement Toolkit (PMT)

Updated 12 January 2026

The Power Measurement Toolkit (PMT) is a system for measuring and analyzing energy consumption across diverse computing environments.
Designed for high-performance computing, PMT supports multiple hardware backends and APIs and offers low-overhead measurement for energy-aware applications in HPC, embedded systems, and data centers.
PMT supports green computing research by giving researchers tools to measure, model, and optimize the energy usage of their systems.

The Power Measurement Toolkit (PMT) is a class of scientific libraries, device APIs, and modeling protocols for collecting, modeling, and analyzing energy consumption in heterogeneous computing platforms. PMT systems provide precise measurement, time-resolved logging, and direct integration pathways for energy-aware applications, notably in high-performance computing (HPC), embedded systems, and data center environments. PMT abstracts heterogeneous sensor backends, supports both hardware-native and PMC-based power estimation, offers low-overhead operation suitable for live application instrumentation, and has substantially enabled green computing research and workload optimization (Corda et al., 2022, Simsek et al., 2023, Mazzola et al., 30 Jun 2025, Mazzola et al., 2024).

1. Toolkit Architectures and Software Layers

PMT implementations adopt a layered architecture; core software is predominantly written in C++ or other low-level languages and runs on Linux. The key layers are:

Application Layer: User-facing measurement hooks, callable via C++ API or Python bindings (decorator or session-based).
User API: Abstract base class (pmt::pmt), supporting creation for multiple backends such as NVML (NVIDIA GPUs), RAPL (Intel/AMD CPUs), ROCm SMI (AMD GPUs), and external hardware sensors (e.g., PowerSensor2).
Sampling Engine: Background thread per monitored device, orchestrating periodic sensor reads.
Vendor/PMC Backends: Direct calls to hardware APIs (MSR for RAPL, NVML, ROCm SMI) or synthetic models via performance counter sampling (PMCs).
Hardware Support: Underlying SoC, CPU, GPU, accelerators, power rails/counters, with extendability to FPGAs, sysfs sources (Corda et al., 2022, Mazzola et al., 30 Jun 2025, Mazzola et al., 2024).
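
The layering above can be illustrated with a minimal Python sketch: an abstract sensor interface, one concrete backend, and a sampling engine that polls the sensor from a background thread. All names here (Sensor, ConstantSensor, Sampler) are hypothetical and chosen for illustration; they are not PMT's actual classes, and the fake backend stands in for NVML, RAPL, or similar.

```python
import threading
import time
from abc import ABC, abstractmethod


class Sensor(ABC):
    """Hypothetical stand-in for the abstract user API layer."""

    @abstractmethod
    def read_watts(self) -> float:
        """Return one instantaneous power reading in watts."""


class ConstantSensor(Sensor):
    """Fake backend that always reports the same power (replaces a real vendor API)."""

    def __init__(self, watts: float):
        self._watts = watts

    def read_watts(self) -> float:
        return self._watts


class Sampler:
    """Sampling engine: one background thread polling one sensor periodically."""

    def __init__(self, sensor: Sensor, period_s: float):
        self._sensor = sensor
        self._period_s = period_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples = []  # list of (timestamp_s, watts)

    def _run(self):
        while not self._stop.is_set():
            self.samples.append((time.monotonic(), self._sensor.read_watts()))
            self._stop.wait(self._period_s)  # sleep, but wake early on stop()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


sampler = Sampler(ConstantSensor(42.0), period_s=0.01)
sampler.start()
time.sleep(0.05)  # application-layer work being measured
sampler.stop()
print(len(sampler.samples), "samples collected")
```

Adding a backend in this sketch means subclassing Sensor and implementing read_watts(); the sampling engine does not need to change.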

In recent extensions, PMT systems can be fully in-kernel (Runmeter LKM), operate at context switch or tick-level granularity, enable moving-window aggregation of PMCs, and evaluate real-time linear models in kernel space (fixed-point arithmetic) (Mazzola et al., 30 Jun 2025, Mazzola et al., 2024).
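
Kernel code generally avoids floating point, which is why a linear model evaluated in kernel space would use fixed-point arithmetic. The sketch below shows the idea in Python using a Q16.16 format; the format choice, function names, and coefficient values are assumptions made for illustration and are not taken from Runmeter.

```python
FRAC_BITS = 16          # Q16.16: 16 fractional bits (assumed format)
ONE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Quantize a real number to the nearest Q16.16 integer."""
    return int(round(x * ONE))


def from_fixed(x: int) -> float:
    """Convert a Q16.16 integer back to a float (for display only)."""
    return x / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values using integer operations only."""
    return (a * b) >> FRAC_BITS


def model_fixed(intercept_fx: int, weights_fx, rates_fx) -> int:
    """Evaluate intercept + sum(w_i * r_i) entirely in integer arithmetic."""
    acc = intercept_fx
    for w, r in zip(weights_fx, rates_fx):
        acc += fixed_mul(w, r)
    return acc


# Hypothetical coefficients and counter rates.
intercept, weights, rates = 1.5, [0.25, 0.002], [4.0, 1000.0]
power_fx = model_fixed(
    to_fixed(intercept),
    [to_fixed(w) for w in weights],
    [to_fixed(r) for r in rates],
)
power = from_fixed(power_fx)
reference = intercept + sum(w * r for w, r in zip(weights, rates))
print(power, reference)  # fixed-point result is close to, not equal to, the float result
```

The small difference between the two results comes from quantizing the weights; more fractional bits reduce it at the cost of a narrower integer range.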

2. Backend Interfaces, Supported Hardware, and Integration

PMT discovers available hardware via its Hardware Abstraction Layer (HAL) and instantiates device-specific drivers. The major supported backends, with their sampling rates and characteristics, are summarized below.

Device Type	Interface / API	Lowest Sampling Period	Notes
CPU	RAPL	500 ms	Package+DRAM domains
CPU	LIKWID	~100 ms	Fallback if RAPL unavailable
CPU	sysfs,/class	User-settable	ARM/Odroid, other SoCs
GPU	NVML	10 ms	Device power only
GPU	ROCm SMI	10 ms	AMD Radeon/Instinct
Ext. meter	PowerSensor2	1 ms	±1% accuracy
Ext. meter	PowerSensor3	50 µs	PCIe/USB/SOC/FPGA/SSD

PMT supports multi-device enumeration, thread-safe sampling across MPI ranks, per-function measurement session management, customizable sampling frequencies, and direct logging to CSV/JSON/binary formats (Corda et al., 2022, Simsek et al., 2023, Vlugt et al., 24 Apr 2025).
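
One simple way to picture hardware discovery is a probe table that maps each backend to a file whose presence suggests the backend is usable. The sketch below is an assumption about how such a step could look, not PMT's actual HAL: the probe table and function name are hypothetical, while the paths are the usual Linux locations for the powercap RAPL interface and the NVIDIA and AMD kernel driver device nodes.

```python
import os

# Hypothetical probe table: backend name -> path whose presence suggests support.
PROBES = {
    "rapl": "/sys/class/powercap/intel-rapl:0/energy_uj",
    "nvml": "/dev/nvidiactl",
    "rocm": "/dev/kfd",
}


def discover_backends(probes=PROBES, exists=os.path.exists):
    """Return the sorted names of backends whose probe path is present.

    `exists` is injectable so the logic can be exercised without real hardware.
    """
    return sorted(name for name, path in probes.items() if exists(path))


# On a real machine this lists whatever is installed (possibly nothing).
print(discover_backends())

# Simulated machine exposing only sysfs entries: RAPL is the only match.
print(discover_backends(exists=lambda path: path.startswith("/sys")))
```

A real implementation would also need to check permissions and actually open the vendor library, since a device node can exist without being readable.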

3. Measurement Methodology and Mathematical Models

PMT collects instantaneous power readings $P(t)$ at discrete timepoints $t_i$ and integrates to compute energy:

Continuous:

$E = \int_{t_0}^{t_1} P(t) dt$

Discrete (Riemann sum):

$E \approx \sum_{i=1}^N P_i \Delta t_i,\quad \Delta t_i = t_i - t_{i-1}$

Cumulative Counter (where available):

$E = C_\text{cum}(t_1) - C_\text{cum}(t_0)$

Average Power:

$\bar{P} = \frac{E}{t_1 - t_0}$
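
The formulas above translate directly into code. The following Python sketch implements the discrete sum, the cumulative-counter difference, and the average power; the function names and the sample values are illustrative assumptions.

```python
def energy_riemann(timestamps_s, power_w):
    """E ~= sum_i P_i * (t_i - t_{i-1}), with samples indexed 0..N."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power readings must have equal length")
    return sum(
        power_w[i] * (timestamps_s[i] - timestamps_s[i - 1])
        for i in range(1, len(timestamps_s))
    )


def energy_from_counter(counter_start_j, counter_end_j):
    """E = C_cum(t_1) - C_cum(t_0) for backends exposing a cumulative counter."""
    return counter_end_j - counter_start_j


def average_power(energy_j, t0_s, t1_s):
    """Mean power over the interval [t0, t1]."""
    return energy_j / (t1_s - t0_s)


# Made-up samples at a 500 ms period.
t = [0.0, 0.5, 1.0, 1.5, 2.0]
p = [100.0, 100.0, 120.0, 120.0, 100.0]

energy = energy_riemann(t, p)
print(energy)                            # 220.0 J
print(average_power(energy, t[0], t[-1]))  # 110.0 W
```

Note that this form of the sum uses the reading at the end of each interval, so the first sample only contributes its timestamp.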

For PMC-based PMT (modern toolkit variants), offline profiling selects a subset of PMCs $X_{d, f}$ with highest linear correlation to measured power at each DVFS state, then trains a non-negative linear model:

Per-subsystem power:

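$P_d(X_{d, f}, f) = L_d(f) + \sum_{i=1}^{|X_{d, f}|} w_{d, f, i} \cdot \frac{x_i}{T}$

Here $x_i$ is the count of the $i$-th selected PMC accumulated over a sampling window of length $T$, $w_{d, f, i} \ge 0$ is its trained weight for subsystem $d$ at frequency $f$, and $L_d(f)$ is the frequency-dependent constant term.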

Full-system:

$P_{tot}(f) = \sum_{d \in D^*} P_d(X_{d, f_d}, f_d)$

where $D^*$ is the set of modeled subsystems and $f_d$ is the current frequency of subsystem $d$.

Calibration—via external meters or startup offset estimation—is used to correct sensor drift or systematic offset. Robustness to stochastic jitter is provided by windowed aggregation (Mazzola et al., 30 Jun 2025, Mazzola et al., 2024).
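
To make the PMC model and the windowed aggregation concrete, the Python sketch below sums per-tick counter deltas over a moving window and then evaluates the per-subsystem and full-system formulas. The class name, weights, intercepts, and counter values are all hypothetical; a real deployment would obtain the weights from offline training at each DVFS state.

```python
from collections import deque


def subsystem_power(intercept_w, weights, counts, window_s):
    """P_d = L_d(f) + sum_i w_i * x_i / T for one subsystem at one frequency."""
    return intercept_w + sum(w * x / window_s for w, x in zip(weights, counts))


class MovingWindow:
    """Keeps the last n per-tick counter deltas and reports their sums."""

    def __init__(self, n_ticks, n_counters):
        self._ticks = deque(maxlen=n_ticks)  # oldest tick is dropped automatically
        self._n = n_counters

    def push(self, deltas):
        self._ticks.append(list(deltas))

    def totals(self):
        return [sum(tick[i] for tick in self._ticks) for i in range(self._n)]


# Hypothetical models for one DVFS state: subsystem -> (L_d, weights).
models = {"cpu": (2.0, [1e-9, 4e-9]), "gpu": (5.0, [2e-9])}
windows = {"cpu": MovingWindow(4, 2), "gpu": MovingWindow(4, 1)}

# One stale tick, evicted once four newer ticks have arrived.
windows["cpu"].push([9e9, 9e9])
windows["gpu"].push([9e9])
for _ in range(4):
    windows["cpu"].push([1e7, 5e6])
    windows["gpu"].push([2.5e7])

T = 4 * 0.01  # four ticks of 10 ms each
p_tot = sum(
    subsystem_power(L, w, windows[d].totals(), T) for d, (L, w) in models.items()
)
print(p_tot)  # about 15 W: 5 W for the CPU model plus 10 W for the GPU model
```

Summing over several ticks before evaluating the model smooths out tick-to-tick jitter in the counters, at the cost of slower response to sudden load changes.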

4. APIs for Measurement, Logging, and Analysis

PMT exposes instrumentation points suitable for both static region-averaged measurement and continuous time-series logging.

C++ API

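#include <iostream>
// The PMT header for your installation must also be included.

auto sensor = pmt::nvml::NVML::create();  // NVML backend (NVIDIA GPUs)
auto start = sensor->read();              // sensor state before the region
// ... measured region ...
auto end = sensor->read();                // sensor state after the region
std::cout << "Energy [J]: " << sensor->joules(start, end) << "\n";
std::cout << "Power [W]: "  << sensor->watts(start, end)  << "\n";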

Python API

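import time

import pmt

# Stacked decorators: one measurement per backend.
@pmt.measure("nvml")
@pmt.measure("rapl")
def work():
    time.sleep(5)  # stand-in for the measured workload

results = work()  # holds whatever the decorated call returns
print(results)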
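
As a rough illustration of how a measurement decorator can bracket a function call, the sketch below reads a sensor before and after the call and reports the difference. It is not PMT's implementation: the measure() function, the FakeCounterSensor class, and the shape of the returned dictionary are all assumptions, and the fake sensor simply models a cumulative energy counter at constant power.

```python
import functools
import time


class FakeCounterSensor:
    """Hypothetical backend: cumulative energy counter at a constant power draw."""

    def __init__(self, watts):
        self._watts = watts
        self._t0 = time.monotonic()

    def read(self):
        """Return (timestamp_s, cumulative_energy_j)."""
        now = time.monotonic()
        return now, self._watts * (now - self._t0)


def measure(sensor):
    """Decorator factory: bracket each call with two sensor reads."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            t_start, e_start = sensor.read()
            value = func(*args, **kwargs)
            t_end, e_end = sensor.read()
            seconds = t_end - t_start
            joules = e_end - e_start
            watts = joules / seconds if seconds > 0 else 0.0
            return {"result": value, "seconds": seconds,
                    "joules": joules, "watts": watts}

        return wrapper

    return decorator


@measure(FakeCounterSensor(watts=50.0))
def work():
    time.sleep(0.02)  # stand-in for the measured workload
    return "done"


report = work()
print(report)
```

Because each decorator adds two sensor reads per call, the per-call cost grows with the number of stacked decorators.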

Session-based measurement (multi-devices)

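pmt_init();                                          // initialize the toolkit
int sess = pmt_create_session("mykernel");           // named measurement session
pmt_register_devices(sess, {cpu_dev, gpu_dev});      // device handles obtained elsewhere
pmt_start(sess);
/* ... measured region ... */
pmt_stop(sess);
pmt_export(sess, "energy_log.csv", PMT_FORMAT_CSV);  // write the samples as CSV
pmt_finalize();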

Continuous logging (CSV/JSON):

rank, session, func, device, id, timestamp_ns, power_W, energy_J
0,SPH-EXA,MomentumEnergy,GPU,0,1673478912345678,210.5,15.875

Post-processing utilities: Python/Pandas and Matplotlib recipes for normalization, device breakdown, energy-delay product (EDP) computation, and performance-per-watt analysis (Corda et al., 2022, Simsek et al., 2023).
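
A per-function device breakdown of such a log can be computed with a few lines of Python. The sketch below uses only the standard library rather than Pandas to stay self-contained; the column names follow the CSV header shown above, while the rows, the second function name (Density), and the helper name are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up log excerpt in the column layout shown above.
LOG = """rank, session, func, device, id, timestamp_ns, power_W, energy_J
0,SPH-EXA,MomentumEnergy,GPU,0,1000000000,210.5,0.0
0,SPH-EXA,MomentumEnergy,GPU,0,1010000000,209.5,2.095
0,SPH-EXA,Density,GPU,0,1020000000,180.0,3.895
"""


def mean_power_by_function(text):
    """Return {(func, device): mean power in watts} for a CSV log."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [power total, sample count]
    # skipinitialspace handles the spaces after the commas in the header row.
    for row in csv.DictReader(io.StringIO(text), skipinitialspace=True):
        key = (row["func"], row["device"])
        sums[key][0] += float(row["power_W"])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


breakdown = mean_power_by_function(LOG)
for (func, device), watts in sorted(breakdown.items()):
    print(f"{func:15s} {device:4s} {watts:7.1f} W")
```

The same grouping key can be extended with rank or device id when logs from several MPI ranks are merged.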

5. Performance, Accuracy, and Overhead Analysis

PMT overhead is dictated by operating mode, backend latency, and measurement granularity. Key performance metrics:

Mode	Typical Overhead	Accuracy	Minimum ΔE granularity
C++ measure-mode	~1 ms/region	5–10% vs ext. meter	NVML @10 ms: ∼2 J, RAPL @500 ms: ∼25 J
Python decorator	~10 ms/call	Similar	Overhead grows linearly with the number of stacked backends
In-kernel Runmeter	0.2–0.7% CPU	7.5% Power MAPE	Sub-ms responsive, ~1.3% energy error
PowerSensor2/3	<1%	±1% (PS2), ±2–4 W (PS3)	1 ms (PS2), 50 µs (PS3)
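
The granularity figures in the table are consistent with $\Delta E_{\min} \approx \bar{P}\,\Delta t_{\min}$: the smallest energy increment a backend can resolve is roughly the average power multiplied by its shortest sampling period. As a worked check, assuming a GPU drawing about 200 W and a CPU package drawing about 50 W (illustrative values chosen to match the table, not taken from the cited papers):

$\Delta E_\text{NVML} \approx 200\ \text{W} \times 10\ \text{ms} = 2\ \text{J}$

$\Delta E_\text{RAPL} \approx 50\ \text{W} \times 500\ \text{ms} = 25\ \text{J}$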

Reported errors: for GPU kernel measurements on a TITAN RTX, PMT and PowerSensor2 agree to within a 3% systematic offset; in PMC model deployments, the overall energy error across CPU and GPU is below 1.3% (Corda et al., 2022, Simsek et al., 2023, Mazzola et al., 30 Jun 2025, Vlugt et al., 24 Apr 2025, Mazzola et al., 2024).
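
The table reports both a power MAPE and a smaller energy error for the in-kernel model. The Python sketch below computes both metrics on made-up numbers and shows one plausible reason for the gap: over- and under-predictions of instantaneous power partly cancel when integrated into energy. The function names and data are illustrative assumptions.

```python
def mape(measured, predicted):
    """Mean absolute percentage error between two equal-length series."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


def energy_error_pct(measured_j, predicted_j):
    """Relative error of the integrated energy, in percent."""
    return 100.0 * abs(predicted_j - measured_j) / measured_j


# Made-up power samples (W) at a fixed 1 s period, so energy is just the sum.
measured = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]

power_mape = mape(measured, predicted)
energy_err = energy_error_pct(sum(measured), sum(predicted))
print(round(power_mape, 2), round(energy_err, 2))
```

In this example the per-sample errors of +1 W and -2 W mostly offset each other in the total, so the energy error comes out well below the power MAPE.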

6. Guidelines for Deployment and Best Practices

Measurement mode selection: Use region-bracketing for workload-average metrics; dump-mode for time-series and event correlation.
Sampling configuration: Set periods to backend limits (NVML: 10 ms; RAPL: 500 ms; PowerSensor3: as low as 50 µs), balancing resolution against system load.
API usage: Minimize sensor instance re-creation; stack Python decorators judiciously.
Calibration: Periodically validate readings against a reference external hardware meter to detect drift.
Energy-aware analysis: Compute EDP ( $E \times T$ ), per-watt performance (GFLOP/s/W), function-level breakdowns for optimization.
Cross-platform extension: Add backend by subclassing pmt::pmt, implementing read(), and registering with factory creation.
In-kernel integration: Deploy fixed-point PMC models for closed-loop DVFS, per-task scheduling, and power capping as in Runmeter (Corda et al., 2022, Mazzola et al., 2024, Mazzola et al., 30 Jun 2025).
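
The energy-aware metrics named in the guidelines are simple to compute once energy and runtime are known. The Python sketch below evaluates EDP and GFLOP/s/W for two made-up configurations; the function names and all numbers are illustrative assumptions.

```python
def edp(energy_j, runtime_s):
    """Energy-delay product E x T in joule-seconds (lower is better)."""
    return energy_j * runtime_s


def gflops_per_watt(flop_count, runtime_s, energy_j):
    """Performance per watt: (flop / s / 1e9) divided by average power (E / T)."""
    gflops = flop_count / runtime_s / 1e9
    avg_power_w = energy_j / runtime_s
    return gflops / avg_power_w


# Two made-up configurations of the same kernel (4e11 floating-point operations).
config_a = {"energy_j": 100.0, "runtime_s": 2.0}
config_b = {"energy_j": 90.0, "runtime_s": 2.5}

edp_a = edp(**config_a)
edp_b = edp(**config_b)
print(edp_a, edp_b)
print(gflops_per_watt(4e11, config_a["runtime_s"], config_a["energy_j"]))
```

Configuration B uses less energy, but A has the lower EDP because B's longer runtime outweighs its energy saving; which metric to optimize depends on whether runtime matters for the workload.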

7. Impact, Limitations, and Application Domains

PMT has shifted energy profiling from coarse system-level accounting to fine-grained, multi-device, per-kernel analysis, directly supporting exascale simulation codes, embedded system prototyping, and real-time scheduling under energy constraints. Example studies include the SPH-EXA framework instrumented for GPU-centric astrophysics, data-driven PMC modeling for real-time DVFS, and Kernel Tuner applications for beamformer energy/performance tradeoffs (Simsek et al., 2023, Mazzola et al., 30 Jun 2025, Vlugt et al., 24 Apr 2025).

Limitations stem from backend API resolution, hardware counter compatibility, sensor calibration drift, and the assumption that subsystem power contributions are independent, which reduces accuracy in tightly interacting CPU/GPU phases (Mazzola et al., 30 Jun 2025, Mazzola et al., 2024). A plausible implication is the need for periodic retraining and external calibration, especially as hardware platforms evolve.

PMT is foundational in sustainable computing research, enabling metrics-driven workload design, energy-aware optimization loops, and aggregation platforms for green HPC and embedded systems. For further details, reference implementation codebases are available as open-source repositories (Corda et al., 2022, Simsek et al., 2023, Mazzola et al., 2024).


References

Corda et al. (2022). PMT: Power Measurement Toolkit.

Simsek et al. (2023). Accurate Measurement of Application-level Energy Consumption for Energy-Aware Large-Scale Simulations.

Mazzola et al. (2025). Data-Driven Power Modeling and Monitoring via Hardware Performance Counter Tracking.

Mazzola et al. (2024). Data-Driven Power Modeling and Monitoring via Hardware Performance Counters Tracking.

Vlugt et al. (2025). PowerSensor3: A Fast and Accurate Open Source Power Measurement Tool.
