- The paper presents BatteryLife, the largest and most diverse dataset and benchmark for battery life prediction, and introduces CyclePatch, a novel modeling technique.
- The BatteryLife dataset comprises over 90,000 samples from 998 batteries, offering significantly increased size and diversity across various aging conditions.
- The novel CyclePatch technique models degradation patterns across battery cycles and achieves state-of-the-art prediction performance on the BatteryLife benchmark.
The paper introduces BatteryLife, a comprehensive dataset and benchmark for battery life prediction (BLP), addressing limitations in existing datasets regarding size, diversity, and inconsistent benchmarks. The authors integrate 16 datasets, increasing the sample size by 2.4 times compared to the previous largest dataset. BatteryLife encompasses batteries from 8 formats, 80 chemical systems, 12 operating temperatures, and 646 charge/discharge protocols, including zinc-ion, sodium-ion, and industry-tested large-capacity lithium-ion batteries.
The paper identifies three key challenges in the field:
- Limited dataset size, hindering comprehensive insights into modern battery life data.
- Restricted data diversity in existing datasets, raising concerns about the generalizability of findings.
- Inconsistent and limited benchmarks, obscuring the effectiveness of baselines.
To address these challenges, the authors propose BatteryLife, a dataset that is 2.4 times larger than BatteryML, with more than 90,000 samples from 998 batteries. It is also far more diverse, covering 4 times as many formats, 16 times as many chemical systems, 2.4 times as many operating temperatures, and 3.4 times as many charge/discharge protocols as BatteryML.
The paper introduces CyclePatch, a plug-in technique for modeling degradation test data. CyclePatch builds on the observation that voltage and current time series exhibit similar patterns across cycles within a protocol, so it treats each cycle as a token to capture the recurring patterns in degradation tests. Cycling data from batteries with different life labels show discriminative characteristics, both between cycles with the same cycle number and in the variance across cycles. CyclePatch employs an intra-cycle encoder to model the interactions among variables within each cycle and generate an informed representation for each cycle; an inter-cycle encoder is then applied to these representations to learn patterns across cycle tokens.
The main contributions of this work are:
- BatteryLife is the largest battery life dataset, offering more than 90,000 samples from 998 batteries.
- BatteryLife is the most diverse battery life dataset, containing lab-tested Li-ion, Na-ion, and Zn-ion batteries, as well as industry-tested large-capacity Li-ion batteries.
- BatteryLife provides a comprehensive benchmark for BLP, offering fair comparisons of popular baselines, and introduces CyclePatch as a plug-in technique for BLP.
The paper defines battery life as the cycle number at which the state of health (SOH) becomes no larger than 80%, where SOH is defined as:
$$\mathrm{SOH} = \frac{Q_i}{Q_0}$$
where:
- SOH is the state of health
- $Q_i$ is the capacity of the $i$th cycle
- $Q_0$ is the initial capacity
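The life definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `battery_life`, its arguments, and the convention that `capacities[i - 1]` holds the capacity of cycle `i` are hypothetical and not taken from the paper.

```python
import numpy as np

def battery_life(q0, capacities, threshold=0.8):
    """Return the 1-based cycle number at which SOH first becomes <= threshold.

    q0         : initial capacity Q_0 (assumed to be given separately)
    capacities : capacities Q_1, Q_2, ... of consecutive cycles (assumed layout)
    Returns None if SOH never reaches the threshold in the recorded data.
    """
    soh = np.asarray(capacities, dtype=float) / q0   # SOH_i = Q_i / Q_0
    reached = np.flatnonzero(soh <= threshold)       # indices where SOH <= 80%
    return int(reached[0]) + 1 if reached.size else None
```

A battery whose recorded capacities never reach the threshold yields `None` here; how such batteries are handled in the dataset is not covered by this sketch.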
The cycling data pattern is affected by aging conditions. The aging factors considered in this work are battery format, anode, cathode, electrolyte, charge protocols, discharge protocols, operating temperature, nominal capacity, and manufacturer.
The paper defines the problem as: given the input $X_{1:S}$ for any $S \le 100$, predict the battery life denoted by $y \in \mathbb{R}^{1}$. Here $X_{i:N} = [X_i, X_{i+1}, \cdots, X_N] \in \mathbb{R}^{3 \times T}$ represents the voltage, current, and capacity variables across $T$ time steps from the $i$th cycle to the $N$th cycle, where $X_i \in \mathbb{R}^{3 \times T_i}$ is the cycling data of the $i$th cycle with $T_i$ time steps.
The paper splits BatteryLife into four parts: Li-ion, Zn-ion, Na-ion, and CALB. The data statistics of each part are summarized in Table 2 of the paper.
The popular benchmark methods that the paper considers are: Transformer encoder, LSTM (long short-term memory), BiLSTM (bidirectional long short-term memory), GRU (gated recurrent unit), BiGRU (bidirectional gated recurrent unit), CNN (convolutional neural network), MLP (multilayer perceptron), DLinear, PatchTST, Autoformer, iTransformer, and MICN.
CyclePatch segments the cycling time series into basic units that have recurring patterns throughout degradation tests. The cycle token is computed by:
$$[X_1, X_2, X_3, \cdots, X_S] = \mathrm{Segment}(X_{1:S})$$
$$\hat{X}_i = W\,\mathrm{flatten}(X_i) + b$$
where:
- $X_i$ is the $i$th cycle
- $W \in \mathbb{R}^{D_1 \times 900}$
- $b \in \mathbb{R}^{D_1}$
- $\mathrm{flatten}(X_i) \in \mathbb{R}^{900}$
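A minimal NumPy sketch of the cycle-token computation follows. It assumes the segmentation step has already produced one array per cycle and that every cycle has been resampled to 300 time steps, so that the 3 variables flatten to 900 values; the resampling length, the function name `cycle_tokens`, and the array layout are assumptions made for illustration, not details confirmed by the text.

```python
import numpy as np

def cycle_tokens(cycles, W, b):
    """Embed each cycle as one token: X_hat_i = W flatten(X_i) + b.

    cycles : array of shape (S, 3, 300), one (3 x 300) block per cycle
             (assumed fixed length, so flatten(X_i) has 900 entries)
    W      : array of shape (D1, 900)
    b      : array of shape (D1,)
    Returns an array of shape (S, D1), one token per cycle.
    """
    flat = cycles.reshape(cycles.shape[0], -1)   # (S, 900): flatten each cycle
    return flat @ W.T + b                        # same linear map applied to every cycle
```

Because the same `W` and `b` are shared by all cycles, the number of tokens simply follows the number of usable cycles `S`.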
The computation at the lth layer of the intra-cycle encoder is given by:
$$\hat{z}_i^{l} = W_2^{l}\,\sigma\left(W_1^{l} z_i^{l-1} + b_1^{l}\right) + b_2^{l}$$
$$z_i^{l} = \mathrm{LN}\left(\hat{z}_i^{l} + z_i^{l-1}\right)$$
where:
- $W_1^{l} \in \mathbb{R}^{D_2 \times D_1}$
- $W_2^{l} \in \mathbb{R}^{D_1 \times D_2}$
- $b_1^{l} \in \mathbb{R}^{D_2}$
- $b_2^{l} \in \mathbb{R}^{D_1}$
- LN denotes layer normalization
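One layer of the intra-cycle encoder can be sketched as a residual feed-forward block. Several choices here are assumptions rather than facts from the text: the activation σ is taken to be ReLU, layer normalization is used without learnable scale and shift, and the bias shapes are chosen so that the additions are well defined (`b1` of length `D2`, `b2` of length `D1`). The function names are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance (no learnable parameters)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def intra_cycle_layer(z, W1, b1, W2, b2):
    """Apply z_hat = W2 sigma(W1 z + b1) + b2, then LN(z_hat + z).

    z  : cycle tokens of shape (D1,) or (S, D1)
    W1 : (D2, D1), b1 : (D2,)
    W2 : (D1, D2), b2 : (D1,)
    """
    hidden = np.maximum(z @ W1.T + b1, 0.0)   # sigma assumed to be ReLU
    z_hat = hidden @ W2.T + b2                # project back to D1
    return layer_norm(z_hat + z)              # residual connection, then layer norm
```

Stacking several such layers keeps the token dimension at `D1`, which is what allows the residual addition in every layer.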
An inter-cycle encoder is then applied to extract key patterns across cycle token embeddings:
$$v = f(H)$$
$$\hat{y} = \mathrm{Projection}(v)$$
where:
- $H$ collects the cycle token embeddings to which the inter-cycle encoder is applied
- $f(\cdot)$ represents the inter-cycle encoder
- $v$ captures both intra-cycle and inter-cycle information
- $\hat{y}$ is the final prediction
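The data flow of this last step can be illustrated with deliberately simple stand-ins. The text does not specify the form of the inter-cycle encoder or the projection, so the sketch below substitutes mean pooling over cycle tokens for the encoder and a single linear map for the projection; both are placeholders chosen for brevity, and `predict_life`, `w_proj`, and `b_proj` are hypothetical names.

```python
import numpy as np

def predict_life(H, w_proj, b_proj):
    """Map cycle token embeddings to a scalar life prediction.

    H      : array of shape (S, D1), one embedding per cycle
    w_proj : array of shape (D1,), weights of the placeholder projection
    b_proj : scalar bias of the placeholder projection
    """
    v = H.mean(axis=0)                   # placeholder for v = f(H): mean pooling over cycles
    return float(w_proj @ v + b_proj)    # placeholder for y_hat = Projection(v)
```

In practice the pooling step would be replaced by a sequence model over the cycle tokens; the sketch only shows where such a model sits in the pipeline.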
The paper conducts experiments to answer four research questions:
- RQ1: How do the benchmark methods perform on different domains?
- RQ2: How do the main components of the CyclePatch framework affect the performance?
- RQ3: How adaptable are benchmark methods when applied across aging conditions in each domain?
- RQ4: How transferable is the model pretrained on the Li-ion domain to other domains?
The paper employs two metrics to evaluate model performance: MAPE (mean absolute percentage error) and α-accuracy.
The MAPE is computed as:
$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{y_i}$$
where:
- $y_i$ is the ground truth battery life of the $i$th sample
- $\hat{y}_i$ is the predicted battery life of the $i$th sample
- N is the number of samples in the testing set
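As a quick illustration, MAPE as defined above can be computed with NumPy as follows. The function name `mape` and the example values are made up for this sketch; the result is a fraction rather than a percentage, matching the scale of the values reported in the summary.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / y_true))

# Two test batteries with true lives of 100 and 200 cycles:
# errors are 10/100 and 20/200, so the MAPE is 0.1.
example = mape([100, 200], [110, 180])
```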
The α-accuracy is computed as:
$$\alpha\text{-accuracy} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}_{|y_i - \hat{y}_i| \le \alpha y_i}(\hat{y}_i)$$
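The α-accuracy is the fraction of test samples whose prediction falls within a relative tolerance α of the true life. A minimal sketch, with the function name `alpha_accuracy` and the example values chosen purely for illustration:

```python
import numpy as np

def alpha_accuracy(y_true, y_pred, alpha):
    """Fraction of samples with |y_i - y_hat_i| <= alpha * y_i."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    hits = np.abs(y_true - y_pred) <= alpha * y_true   # indicator for each sample
    return float(np.mean(hits))

# With alpha = 0.15: errors of 10, 50, and 10 cycles against tolerances
# of 15, 30, and 60 cycles give two hits out of three.
example = alpha_accuracy([100, 200, 400], [110, 150, 390], alpha=0.15)
```

Unlike MAPE, this metric is insensitive to how far a miss lies outside the tolerance band, so the two metrics complement each other.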
The paper finds that CyclePatch methods achieve the best performance across all domains. The best MAPE values are 0.179, 0.515, 0.255, and 0.141 for the Li-ion, Zn-ion, Na-ion, and CALB datasets, respectively. Techniques successful in other time series fields cannot be naively applied to BLP. Model performance initially improves as the number of usable cycles increases and then plateaus. All components of CyclePatch contribute significantly to its performance. Models generally perform worse on unseen aging conditions than on seen ones. Domain adaptation significantly improves model performance on the Zn-ion and Na-ion datasets.
In future studies, the authors plan to incorporate more datasets into BatteryLife and focus on developing more transferable models for BLP.