mmcmcBayes: Region-Level DMR Detection in EWAS

Updated 5 February 2026
  • mmcmcBayes is an R package that uses a region-centric, multistage Bayesian MCMC framework to detect differentially methylated regions (DMRs) in epigenomic studies.
  • It models regional methylation patterns with a flexible alpha-skew generalized normal distribution and adaptively splits genomic regions based on Bayes factors.
  • The package offers comprehensive tools for summarizing, comparing, and visualizing DMRs, serving as an effective alternative to CpG-level aggregation methods.

mmcmcBayes is an R package designed for region-level detection of differentially methylated regions (DMRs) in epigenome-wide association studies (EWAS). It implements a multistage Markov chain Monte Carlo (MCMC) framework that directly models regional methylation patterns, employing a flexible skewed distribution for summary statistics and adaptively splitting genomic regions based on Bayesian evidence measures. The package provides functions for summarizing, comparing, and visualizing detected regions, and is positioned as a region-level alternative to conventional CpG-aggregation strategies (Yang et al., 4 Feb 2026).

1. Objective and Rationale

The principal objective of mmcmcBayes is to detect DMRs—contiguous blocks of CpG sites with consistent, group-specific differences in DNA methylation levels. In contrast to methods such as bumphunter, DMRcate, DSS, and bsseq, which typically conduct CpG-level tests followed by spatial aggregation, mmcmcBayes treats regions as fundamental units. This region-centric approach models sample-wise summaries over candidate regions, testing region-level hypotheses directly.

Key motivations include the ability to:

  • Accommodate spatial correlation of CpGs within genomic regions without imposing ad hoc thresholds for grouping.
  • Capture skewed or multimodal regional methylation patterns, often encountered in heterogeneous biological samples.
  • Evaluate evidence for differential methylation via Bayes factors, eliminating dependence on p-value permutation calibration procedures.

A DMR identified by mmcmcBayes is a region where the joint methylation distribution differs between two groups, commonly corresponding to regulatory elements or disease biomarkers (Yang et al., 4 Feb 2026).

2. Statistical Modeling Framework

Data Summarization

For each subject $j$, group $m$ (with $m=0$ for cancer and $m=1$ for control), and candidate region $k$ at stage $\ell$, the input is the mean $\beta$-value across the CpGs in the segment, mapped to an M-value by a logit-type transform. A small offset $c$ keeps the result finite at $\beta = 0$ and $\beta = 1$:

$y_{jmk}^\ell = \log\left(\frac{\beta_{jmk}^\ell + c}{1 - \beta_{jmk}^\ell + c}\right),\quad c = 10^{-6}$
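The summarization step can be sketched in a few lines of base R. This is an illustration only: `beta_to_m` is a hypothetical helper name rather than a package function, and the toy beta-values are made up.

```r
# Offset logit-type transform from beta-values to M-values.
# `beta_to_m` is a hypothetical helper name, not a package function.
beta_to_m <- function(beta, c = 1e-6) {
  stopifnot(is.numeric(beta), all(beta >= 0 & beta <= 1, na.rm = TRUE))
  log((beta + c) / (1 - beta + c))
}

# Toy region: 3 CpGs (rows) by 2 samples (columns); values are made up.
beta_region <- matrix(c(0.10, 0.20, 0.30,
                        0.80, 0.90, 1.00),
                      nrow = 3,
                      dimnames = list(NULL, c("s1", "s2")))

# One summary per sample: mean beta over the region's CpGs, then transform.
region_mean_beta <- colMeans(beta_region)
y <- beta_to_m(region_mean_beta)
```

Because the offset appears in both numerator and denominator, `beta_to_m(0)` and `beta_to_m(1)` are finite, whereas a plain logit would return `-Inf` and `Inf`.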

Distributional Model

The alpha-skew generalized normal (ASGN) distribution, with four parameters (location $\mu$, scale $\sigma > 0$, skewness $\alpha \in \mathbb{R}$, shape $\kappa > 0$), is employed to model $y_{jmk}^\ell$:

$f(x; \mu, \sigma, \alpha, \kappa) = \frac{\kappa}{\alpha\,\sigma\,\Gamma(1/\kappa)}\exp\left(-\left|\frac{x-\mu}{\alpha\,\sigma}\right|^{\kappa}\right)\left[1 + \operatorname{erf}\left(\frac{\alpha(x-\mu)}{\alpha\,\sigma\sqrt{2}}\right)\right]$

Parameter roles:

| Parameter | Effect | Description |
| --- | --- | --- |
| $\mu$ | Central location | Distribution shift |
| $\sigma$ | Scale | Distribution width |
| $\alpha$ | Skewness ($\alpha > 0$ right, $\alpha < 0$ left) | Asymmetry extent |
| $\kappa$ | Shape | Tail weight |

For each segment $k$ at stage $\ell$ and group $m$, the $y_{jmk}^\ell$ are assumed i.i.d. $\text{ASGN}(\mu_{mk}^\ell, \sigma_{mk}^\ell, \alpha_{mk}^\ell, \kappa)$, with priors $\alpha_{mk}^\ell \sim N(\mu_a, \sigma_a^2)$, $\mu_{mk}^\ell \sim N(\mu_n, \sigma_n^2)$, and $(\sigma_{mk}^\ell)^2 \sim \text{IG}(A_d, B_d)$. Hyperparameters are either user-specified or weakly informative; for stages $\ell > 1$, posterior means from stage $\ell - 1$ are used to inform them (Yang et al., 4 Feb 2026).
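To make the prior structure concrete, the following base-R sketch draws parameter values from the three priors. The hyperparameter values are invented for illustration, `draw_prior` is a hypothetical helper, and the inverse-gamma is assumed to use a shape/scale parameterization, which the text does not specify.

```r
set.seed(1)

# Hypothetical hyperparameter values, chosen only for illustration.
hyper <- list(mu_a = 0, sigma_a = 1,    # prior on skewness alpha
              mu_n = 0, sigma_n = 10,   # prior on location mu
              A_d = 2, B_d = 1)         # prior on variance sigma^2

# Draw n parameter triples from the priors.
# ASSUMPTION: IG(A_d, B_d) is shape/scale, so sigma^2 = 1 / Gamma(shape, rate).
draw_prior <- function(n, h) {
  data.frame(
    alpha  = rnorm(n, mean = h$mu_a, sd = h$sigma_a),
    mu     = rnorm(n, mean = h$mu_n, sd = h$sigma_n),
    sigma2 = 1 / rgamma(n, shape = h$A_d, rate = h$B_d)
  )
}

prior_draws <- draw_prior(1000, hyper)
```

Inspecting `summary(prior_draws)` gives a quick check of whether a chosen set of hyperparameters is as weakly informative as intended on the M-value scale.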

3. Bayesian Evidence Evaluation

Hypotheses and Bayes Factor

For region $k$ at stage $\ell$:

  • $H_0$: All samples share one ASGN.
  • $H_1$: Cancer and control groups have distinct ASGN distributions.

The Bayes factor is:

$\mathrm{BF}_k^\ell = \frac{p(\text{data}_{k}^\ell \mid H_1)}{p(\text{data}_{k}^\ell \mid H_0)}$

In practice, marginal likelihoods are approximated by plugging in posterior mean MCMC estimates:

$\mathrm{BF}_k^\ell \approx \frac{\prod_{j} f(y_{j0k}^\ell \mid \hat\alpha_{0k}^\ell, \hat\mu_{0k}^\ell, \hat\sigma_{0k}^\ell, \kappa)}{\prod_{j} f(y_{j1k}^\ell \mid \hat\alpha_{1k}^\ell, \hat\mu_{1k}^\ell, \hat\sigma_{1k}^\ell, \kappa)}$

Interpretation: $\mathrm{BF} > 1$ signals evidence for differential methylation. The thresholds for splitting regions and declaring DMRs are user-defined, one per stage (default: (0.5, 0.8, 1.05)), with lower values producing finer segmentation and higher values yielding more conservative calls (Yang et al., 4 Feb 2026).
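A plug-in ratio of this kind is easiest to compute on the log scale, since products of many densities underflow quickly. The sketch below follows the $H_1$-versus-$H_0$ definition under loud assumptions: `dnorm()` stands in for the ASGN density, sample means and standard deviations stand in for posterior means from MCMC, and `plugin_log_bf` is a hypothetical name, not a package function.

```r
# Plug-in log likelihood ratio for H1 (separate groups) versus H0 (pooled).
# ASSUMPTIONS: dnorm() replaces the ASGN density, and sample means/SDs
# replace posterior means from MCMC. Not the package's implementation.
plugin_log_bf <- function(y0, y1) {
  loglik <- function(y, mu, sigma) {
    sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
  }
  pooled <- c(y0, y1)
  ll_h1 <- loglik(y0, mean(y0), sd(y0)) + loglik(y1, mean(y1), sd(y1))
  ll_h0 <- loglik(pooled, mean(pooled), sd(pooled))
  ll_h1 - ll_h0
}

set.seed(42)
y_cancer  <- rnorm(19, mean = 1.0, sd = 0.5)  # simulated region-level M-values
y_control <- rnorm(19, mean = 0.0, sd = 0.5)

log_bf <- plugin_log_bf(y_cancer, y_control)
bf <- exp(log_bf)  # compare against the threshold for the current stage
```

With a clear mean shift between the simulated groups, `log_bf` is positive, so `bf` exceeds 1 and the region would pass a threshold such as 1.05.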

4. Multistage Region-Splitting Strategy

The detection algorithm proceeds in $L$ stages:

  1. Initial stage: the whole chromosome or block is treated as one segment.
  2. For each segment $k$ at stage $\ell$:
    • Compute mean M-values for each group.
    • Fit the ASGN under $H_0$ and $H_1$ (with asgn_func); obtain posterior means.
    • Compute $\mathrm{BF}_k^\ell$.
    • If $\mathrm{BF}_k^\ell \ge$ the threshold for stage $\ell$:
      • If $\ell < L$, split into num_splits subregions for the next stage.
      • If $\ell = L$, declare the segment a final DMR.
    • If $\mathrm{BF}_k^\ell <$ the threshold, take no further action on the segment.
  3. Iteration stops when stage $L$ is reached or no segments qualify.
  4. Output is a table of DMRs (chromosome, CpG range, CpG count, BF, stage).

The region-splitting mechanism adaptively refines DMR candidates, avoiding arbitrary aggregation of CpG-level signals and focusing resolution according to statistical evidence (Yang et al., 4 Feb 2026).
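The control flow of the stages above can be sketched in base R as follows. This is not the package's implementation: `multistage_split` is a hypothetical function, `score_fn(start, end)` is a stand-in for the ASGN/MCMC Bayes factor, and splitting into equal-sized pieces by CpG index is an assumption.

```r
# Sketch of the multistage splitting loop over CpG index ranges.
# ASSUMPTIONS: `score_fn` replaces the MCMC-based Bayes factor, and
# segments are split into (nearly) equal pieces by CpG index.
multistage_split <- function(n_cpg, score_fn, thresholds, num_splits = 2L) {
  max_stages <- length(thresholds)
  segments <- data.frame(start = 1L, end = as.integer(n_cpg))
  dmrs <- NULL
  for (stage in seq_len(max_stages)) {
    next_segments <- NULL
    for (i in seq_len(nrow(segments))) {
      s <- segments$start[i]
      e <- segments$end[i]
      bf <- score_fn(s, e)
      if (bf < thresholds[stage]) next      # below threshold: no action
      n <- e - s + 1L
      if (stage == max_stages) {            # final stage: declare a DMR
        dmrs <- rbind(dmrs, data.frame(start = s, end = e, n_cpg = n,
                                       bf = bf, stage = stage))
      } else {                              # otherwise split for next stage
        k <- min(as.integer(num_splits), n)
        starts <- s + (0:(k - 1L)) * (n %/% k)
        ends <- c(starts[-1L] - 1L, e)      # last piece absorbs the remainder
        next_segments <- rbind(next_segments,
                               data.frame(start = starts, end = ends))
      }
    }
    if (is.null(next_segments)) break       # nothing qualified for refinement
    segments <- next_segments
  }
  dmrs
}

# Toy signal: 100 CpGs with a planted difference at positions 41 to 60.
delta <- c(rep(0, 40), rep(1, 20), rep(0, 40))
toy_score <- function(s, e) 1 + mean(abs(delta[s:e]))

dmrs <- multistage_split(100, toy_score,
                         thresholds = c(0.5, 0.8, 1.05), num_splits = 5)
```

On this toy signal the loop returns five adjacent stage-3 segments that together cover CpGs 41 to 60, which illustrates how resolution is refined only where the score supports it.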

5. Package Implementation and Workflow

Installation and Dependencies

mmcmcBayes is available from CRAN:

```r
install.packages("mmcmcBayes")
```
Dependencies: R (≥ 4.0), coda (CRAN), and stats4 and graphics (both distributed with R).

Core Functions

| Function | Purpose |
| --- | --- |
| mmcmcBayes | Main region-level DMR detection (returns DMR data.frame) |
| asgn_func | Fits ASGN via MCMC; returns posteriors for $(\alpha, \mu, \sigma^2)$ |
| summarize_dmrs | Summarizes detected DMRs (counts, region sizes, Bayes factors) |
| compare_dmrs | Computes overlaps between two DMR results |
| plot_dmr_region | Plots group mean M-values across CpGs in region |

Example Workflow

  1. Prepare two methylation data.frames (CpGs by sample, sorted by genomic position).
  2. Run detection:

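```r
library(mmcmcBayes)

# Detect DMRs, then summarize and plot the first four regions
rst <- mmcmcBayes(cancer_data, normal_data)
dmr_summary <- summarize_dmrs(rst)  # named to avoid masking base::summary()
plot_dmr_region(rst, cancer_data, normal_data, dmr_index = 1:4)
```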

Parameters such as max_stages, num_splits, MCMC control, and thresholds are adjustable (Yang et al., 4 Feb 2026).

6. Empirical Performance and Application

Simulation Study

  • In simulations (chr6 baseline plus Gaussian noise, 10 synthetic DMRs per replicate, varying lengths and effect sizes), increasing num_splits at max_stages=2 raises the FDR with negligible change in sensitivity.
  • At max_stages=3, sensitivity increases up to ~50 splits while the FDR stays below ~10%; beyond that point both FDR and computational time rise sharply.
  • The default parameters (max_stages=3, num_splits=50, bf_thresholds=c(0.5, 0.8, 1.05)) achieve sensitivity of ~80–90% and FDR of ~5–10% (Yang et al., 4 Feb 2026).

Real Data

  • Application to Illumina 450K lung cancer data (chr6; 36,438 CpGs; 19 samples per group) detected 1,514 DMRs at stage 3, with a median region size of 15 CpGs and Bayes factors ranging from 1.05 to ~1.96. Visualization showed heterogeneous methylation patterns across DMRs (Yang et al., 4 Feb 2026).

7. Recommendations and Usage Considerations

mmcmcBayes provides a region-level Bayesian framework that models distributional shifts via the ASGN and guides segmentation with Bayes factors. Recommended default settings (Illumina-type data):

  • max_stages = 3, num_splits = 50
  • bf_thresholds = c(0.5, 0.8, 1.05)
  • MCMC: nburn=5000, niter=10000, thin=1

Best practices:

  • Pre-sort CpGs by chromosome and position.
  • Apply the analysis in parallel by chromosome if necessary.
  • Use summarize_dmrs and plot_dmr_region to inspect Bayes factor distributions and DMR configuration.
  • mmcmcBayes provides a complement to CpG-level tools, especially in the presence of skewed or multimodal methylation distributions where region-level differences are not well-captured by sitewise aggregation (Yang et al., 4 Feb 2026).
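The first two best practices, sorting and per-chromosome processing, can be sketched in base R. The column names `chr` and `pos` and the toy values are assumptions for illustration, not an input format required by the package.

```r
# Toy CpG annotation plus beta-values for two samples.
# ASSUMPTION: `chr` and `pos` are illustrative column names only.
cpg <- data.frame(
  chr = c("chr6", "chr1", "chr6", "chr1"),
  pos = c(500L, 200L, 100L, 50L),
  s1  = c(0.2, 0.4, 0.6, 0.8),
  s2  = c(0.3, 0.5, 0.7, 0.9)
)

# Sort by chromosome, then by position within chromosome.
cpg_sorted <- cpg[order(cpg$chr, cpg$pos), ]

# One data frame per chromosome; each can then be analysed independently,
# for example with lapply() or parallel::mclapply().
by_chr <- split(cpg_sorted, cpg_sorted$chr)
n_per_chr <- vapply(by_chr, nrow, integer(1))
```

Chromosome names sort lexically here (so "chr10" would precede "chr2"), which is harmless for this purpose because only the position order within each chromosome matters once the data are split.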