
Multidimensional Multi-scale Entropy (MMSE)

Updated 6 April 2026
  • Multidimensional Multi-scale Entropy (MMSE) is an information-theoretic metric that extends single-dimensional entropy to assess joint complexity across multiple scales and channels.
  • It employs normalization, coarse-graining, embedding, and Chebyshev norm match counting to produce a robust entropy scalar that mitigates noise and captures cross-metric interactions.
  • MMSE finds practical use in software reliability, physiological signal analysis, and emotion recognition, providing actionable insights into system aging and regime shifts.

Multidimensional Multi-scale Entropy (MMSE) is an information-theoretic metric developed to quantify the joint temporal complexity of multidimensional time series over multiple scales. Originally introduced as an indicator of software aging, MMSE extends classical single-dimensional sample entropy to simultaneously handle multivariate signals and their scale-dependent irregularities. The core motivation is to construct robust, noise-tolerant indicators of system state or complexity that account for cross-metric interactions and are stable across a range of real-world fluctuation regimes. MMSE is widely applicable in domains where multichannel or multimodal observations are fundamental, including software reliability engineering, physiological signal analysis, and emotion recognition.

1. Mathematical Formulation and Algorithmic Definition

Let $X \in \mathbb{R}^{N \times p}$ denote a window of $p$ concurrent time series of length $N$. The MMSE computation proceeds as follows (Chen et al., 2015, Xiao et al., 2021, Tung et al., 2018):

  1. Normalization: Each metric (column) is linearly rescaled to the interval $[0,1]$:

$$X'_{j,i} = \frac{X_{j,i} - \min_k X_{k,i}}{\max_k X_{k,i} - \min_k X_{k,i}}$$

for $j = 1, \ldots, N$, $i = 1, \ldots, p$.

  2. Similarity Tolerance ($r$): Compute the $p \times p$ covariance $\Sigma = \mathrm{Cov}(X')$ and set the matching tolerance for high-dimensional comparisons proportional to the total variance, e.g. $r = 0.15\,\mathrm{tr}(\Sigma)$.
  3. Coarse-graining: For each scale $\tau = 1, \ldots, s$, the series is coarse-grained:

$$y^{(\tau)}_{j,i} = \frac{1}{\tau} \sum_{k=(j-1)\tau+1}^{j\tau} X'_{k,i}$$

for $j = 1, \ldots, \lfloor N/\tau \rfloor$, $i = 1, \ldots, p$.

  4. Embedding: For a fixed embedding dimension $m$, form overlapping vectors

$$v^{(\tau)}_j = \left( y^{(\tau)}_{j,1}, \ldots, y^{(\tau)}_{j,p},\; y^{(\tau)}_{j+1,1}, \ldots, y^{(\tau)}_{j+m-1,p} \right)$$

for $j = 1, \ldots, \lfloor N/\tau \rfloor - m + 1$.

  5. Counting Matches: Using the Chebyshev ($L^\infty$) norm, compute for each $j$ the fraction of other vectors within tolerance,

$$B^m_j(r) = \frac{\#\{\, k \neq j : \| v^{(\tau)}_j - v^{(\tau)}_k \|_\infty \le r \,\}}{\lfloor N/\tau \rfloor - m}$$

and average:

$$B^m(r) = \frac{1}{\lfloor N/\tau \rfloor - m + 1} \sum_j B^m_j(r).$$

  6. Sample Entropy per Scale: Repeat the embedding and counting at dimension $m+1$ and set

$$\mathrm{SampEn}(\tau) = -\ln \frac{B^{m+1}(r)}{B^m(r)}.$$

  7. MMSE Scalar (Composed Entropy): Combine the per-scale entropies with the Euclidean norm:

$$\mathrm{MMSE} = \sqrt{\sum_{\tau=1}^{s} \mathrm{SampEn}(\tau)^2}.$$

This pipeline yields a single scalar per window that reflects the overall multiscale, multidimensional entropy of the input window.
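The first and third steps of this pipeline can be sketched in NumPy (an illustrative sketch; the function names and the toy window are mine, not from the cited papers):

```python
import numpy as np

def normalize(X):
    """Step 1: min-max rescale each metric (column) to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)  # guard flat columns

def coarse_grain(X, tau):
    """Step 3: average non-overlapping blocks of length tau in each column."""
    n = (len(X) // tau) * tau                 # drop the ragged tail
    return X[:n].reshape(-1, tau, X.shape[1]).mean(axis=1)

X = np.arange(12.0).reshape(6, 2)             # toy window: N = 6, p = 2
Xn = normalize(X)
print(Xn[:, 0])                               # [0.  0.2 0.4 0.6 0.8 1. ]
print(coarse_grain(Xn, 2))                    # 3 coarse-grained rows at tau = 2
```

Each scale shortens the window to $\lfloor N/\tau \rfloor$ rows, which is why the number of usable scales is limited by the window length.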

2. Motivation and Relation to Other Entropy Measures

The necessity for “multidimensional” and “multi-scale” features arises from empirical limitations of single-metric or single-scale entropy approaches. Single-channel entropy may miss complex cross-metric dependencies, while traditional entropy at a single timescale is sensitive to high-frequency noise or transient patterns. By integrating over coarse-grained scales $\tau = 1, \ldots, s$, MMSE filters noise and captures slow, global trends. By embedding all $p$ metrics as state-vectors, MMSE probes joint state irregularity, and is thus highly sensitive to subtle system-wide changes due to aging, coordination loss, or multi-source heterogeneity (Chen et al., 2015, Xiao et al., 2021).

MMSE generalizes univariate multi-scale entropy (MSE) and multivariate sample entropy (MSampEn), as confirmed by parallel definitions in multichannel physiological settings (Xiao et al., 2021, Tung et al., 2018). The distance measure (Chebyshev/$L^\infty$ norm) and tolerance $r$ ensure robust detection of high-dimensional divergence.
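A minimal sketch of the Chebyshev-norm match counting (the helper name and toy vectors are illustrative assumptions, not from the cited papers):

```python
import numpy as np

def match_fraction(V, r):
    """Fraction of ordered pairs (j, k), j != k, of embedded vectors in V
    whose Chebyshev (L-infinity) distance is at most the tolerance r."""
    n = len(V)
    d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=-1)  # pairwise L-inf
    return ((d <= r).sum() - n) / (n * (n - 1))  # drop self-matches

# three 2-dimensional embedded vectors; only the first two are r-close
V = np.array([[0.00, 0.00], [0.02, 0.01], [0.50, 0.50]])
print(match_fraction(V, r=0.05))  # 2 of 6 ordered pairs match -> 0.333...
```

Because the $L^\infty$ norm takes the worst coordinate gap, a single divergent channel is enough to break a match, which is what makes the count sensitive to high-dimensional divergence.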

3. Algorithmic Implementation and Computational Complexity

A high-level pseudocode, organizing the main steps, is summarized below (aligning with (Chen et al., 2015, Xiao et al., 2021, Tung et al., 2018)):

  1. Min-max normalize each of the $p$ metrics in the window to $[0,1]$.
  2. Compute the covariance of the normalized window and derive the tolerance $r$.
  3. For each scale $\tau = 1, \ldots, s$: coarse-grain, embed at dimensions $m$ and $m+1$, count Chebyshev-norm matches, and compute the scale's sample entropy.
  4. Compose the per-scale entropies into the MMSE scalar via the Euclidean norm.

The dominant computational cost is the match counting, which is quadratic in the coarse-grained window length: each embedded vector is compared pairwise within its window, giving $O(\lfloor N/\tau \rfloor^2 \, m p)$ work per scale. Coarse-graining is only $O(Np)$. Summed over scales, a single window can require millions of Chebyshev distance computations (Chen et al., 2015, Xiao et al., 2021), motivating optimizations for large datasets (e.g., windowing, indexing, or down-sampling).
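Putting the pipeline together, a compact NumPy sketch (assumed details: the tolerance rule $r = \alpha\,\mathrm{tr}(\Sigma)$ and the zero-entropy fallback when no matches occur; neither is fixed by the cited papers):

```python
import numpy as np

def _match_fraction(V, r):
    """Fraction of ordered pairs of embedded vectors within L-inf tolerance r."""
    n = len(V)
    d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=-1)
    return ((d <= r).sum() - n) / (n * (n - 1))

def _sampen(Y, m, r):
    """Multivariate sample entropy of one coarse-grained window Y (n x p)."""
    n = len(Y)
    embed = lambda dim: np.stack([Y[j:j + dim].ravel() for j in range(n - dim)])
    B_m, B_m1 = _match_fraction(embed(m), r), _match_fraction(embed(m + 1), r)
    if B_m <= 0 or B_m1 <= 0:
        return 0.0                  # no matches: log undefined, fall back (assumed)
    return -np.log(B_m1 / B_m)

def mmse(X, m=2, scales=5, alpha=0.15):
    """MMSE of a window X (N x p): normalize, derive tolerance, coarse-grain,
    compute per-scale sample entropy, compose via the Euclidean norm."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    Xn = (X - lo) / np.where(hi > lo, hi - lo, 1.0)               # step 1
    r = alpha * np.trace(np.atleast_2d(np.cov(Xn, rowvar=False)))  # step 2 (assumed)
    ents = []
    for tau in range(1, scales + 1):                               # steps 3-6
        n = (len(Xn) // tau) * tau
        Y = Xn[:n].reshape(-1, tau, Xn.shape[1]).mean(axis=1)
        ents.append(_sampen(Y, m, r))
    return float(np.sqrt(np.sum(np.square(ents))))                 # step 7

flat = np.ones((400, 3))                    # perfectly regular window
print(mmse(flat))                           # 0.0: no irregularity at any scale
rng = np.random.default_rng(0)
print(mmse(rng.standard_normal((400, 3))))  # finite, non-negative
```

The broadcasting in `_match_fraction` materializes the full pairwise distance matrix, so memory as well as time grows quadratically with the window length; this is exactly the cost that windowing or down-sampling optimizations target.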

4. Theoretical Properties

Three core properties are established and theoretically justified (Chen et al., 2015):

  1. Monotonicity: Under a monotonically increasing failure probability, the underlying entropy proxy grows monotonically as the system degrades. MMSE empirically inherits such monotonicity under mild stationarity.
  2. Stability: Summing squared sample entropies across scales (Euclidean norm) filters out local spikes and increases robustness against transient or high-frequency noise.
  3. Integration: Embedding all $p$ metrics into a joint vector space ensures sensitivity to cross-channel interactions and hidden joint irregularities.

These properties distinguish MMSE as an indicator with provable discrimination capability vis-à-vis system health and aging, justifying its use in both anomaly detection and complexity quantification.
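The stability property can be illustrated with schematic numbers (not data from the cited studies): composing per-scale entropies with the Euclidean norm damps the effect of a spike at a single scale.

```python
import numpy as np

per_scale = np.full(10, 1.0)         # baseline per-scale entropies, s = 10
spiked = per_scale.copy()
spiked[3] += 1.0                     # a transient doubles one scale's reading

compose = lambda e: float(np.sqrt(np.sum(np.square(e))))
print(round(compose(per_scale), 3))  # 3.162 (sqrt(10))
print(round(compose(spiked), 3))     # 3.606 (sqrt(13)): ~14% shift, not 100%
```

A single-scale indicator would have doubled; the composed scalar moves by roughly 14%, which is the sense in which the Euclidean composition filters transient spikes.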

5. Tuning, Parameter Selection, and Practical Guidelines

Key parameters and practical recommendations synthesized from empirical studies (Chen et al., 2015, Xiao et al., 2021, Tung et al., 2018):

  • Embedding dimension ($m$): $m = 2$ or $m = 3$ suffices for most engineering and physiological time series.
  • Number of scales ($s$): a moderate number of scales is typical for engineering windows; for separation across physiological regimes, larger $s$ can be used provided the window is long enough, since the coarse-grained series shortens to $\lfloor N/\tau \rfloor$.
  • Tolerance ($r$): on the order of $0.15$ times the signal variability, in line with standard sample-entropy practice; physiological data may require tuning within roughly $0.1$–$0.25$ for stable separation.
  • Metric selection ($p$): In software, reduce 70+ counters to a small informative subset by PCA and variable selection; in physiological signals, group multichannel data into functional regions.
  • Window size ($N$): windows of several hundred samples are effective for engineering time series; shorter windows can suffice for simple AR models with short embedding.

Parameter sensitivity is dominated by the requirement that the coarse-grained length $\lfloor N/\tau \rfloor$ remain large relative to the embedding dimension; for high embedding dimensions or many channels, either long records or channel-selection methods are necessary to avoid estimator degeneracy.
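These constraints suggest a quick feasibility check before choosing $s$ (a heuristic helper with an assumed minimum-sample floor, not a rule from the cited papers):

```python
def max_usable_scale(N, m, min_points=100):
    """Largest scale tau at which the coarse-grained length floor(N / tau)
    still leaves at least `min_points` embedded vectors at dimension m + 1.
    The floor of 100 vectors is an illustrative heuristic."""
    tau = 1
    while N // (tau + 1) - (m + 1) >= min_points:
        tau += 1
    return tau

print(max_usable_scale(N=1000, m=2))  # 9: scale 10 would leave only 97 vectors
```

Such a check makes the trade-off explicit: doubling the number of scales roughly doubles the window length needed to keep each scale's entropy estimate well defined.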

6. Comparative Evaluation and Empirical Performance

MMSE has been extensively compared to scalar multi-scale entropy (MSE) and to more recent generalizations such as Variational Embedding Multiscale Sample Entropy (VEMSE) (Xiao et al., 2021):

  • Synthetic signals: MMSE separates autoregressive (AR) model classes once windows are sufficiently long; VEMSE achieves comparable separation from shorter records and under heavier noise.
  • Noise robustness: MMSE performance degrades under strong noise, but to a lesser extent than scalar MSE.
  • Computational efficiency: MMSE requires quadratic time in window length per scale, while VEMSE is 20–50% faster in benchmarks.
  • Real-world data: In software aging detection (Helix-Server, AntVision), MMSE in the CHAOS framework delivered 5-fold higher detection precision and 3 orders of magnitude improvement in ahead-time-to-failure vs. single-metric or Hölder-based alternatives (Chen et al., 2015).
  • Physiological analysis: In wind and heart rate data, MMSE correctly ordered regime complexity, although with overlapping error bars at large scales. For EEG emotion recognition, MMSE features did not achieve statistical significance for arousal or valence separation, whereas permutation-entropy variants (e.g., MMPE) did (Tung et al., 2018).

A synthesized comparison is given below:

| Domain / Task | MMSE Efficacy | Quantitative Notes |
| --- | --- | --- |
| Software aging (Chen et al., 2015) | High | 5× precision; orders-of-magnitude gain in ahead-time-to-failure |
| Wind/physiological (Xiao et al., 2021) | Moderate | Good regime ordering; error bars overlap at large scales |
| Emotion EEG (Tung et al., 2018) | Not significant | MMSE features not selected (p > 0.1); other entropy variants outperformed |

7. Extensions, Variants, and Domain-Specific Adaptations

The literature presents multiple extensions of MMSE to address limitations in short data settings, high embedding dimensions, and channel heterogeneity:

  • Variational Embedding MMSE (VEMSE): Assigns varying embedding dimensions per channel, improving effectiveness for short records, high channel counts, mixed signal quality, or significant data heterogeneity (Xiao et al., 2021).
  • Composite/Refined Coarse-Graining: Averaging across block offsets to avoid undefined logarithms and increase estimator stability in nonstationary signals (Tung et al., 2018).
  • Parameter Adaptations: Channel grouping and PCA-driven selection are standard to manage computational and estimator complexity, especially for large channel counts.

A plausible implication is that, while MMSE is robust for moderate channel counts and record lengths, cutting-edge research focuses on reducing its sensitivity to curse-of-dimensionality effects and adapting embedding choices dynamically per channel or signal condition.


In summary, Multidimensional Multi-scale Entropy provides a proven, theoretically sound, and practically tunable framework for measuring joint multivariate complexity over multiple scales. Its adoption in critical software monitoring, physiological regime analysis, and hybrid multimodal recognition tasks demonstrates both broad flexibility and the importance of careful parameterization and empirical validation (Chen et al., 2015, Xiao et al., 2021, Tung et al., 2018).
