Analysis of Bonferroni and Benjamini–Hochberg Procedures in Controlling False Discoveries
The pervasive challenge of controlling Type I errors in multiple hypothesis testing necessitates rigorous methodologies, especially in contexts involving high-throughput data such as microarrays. This paper critically examines the Bonferroni procedure and compares it with the less conservative Benjamini–Hochberg (BH) method. Contrary to the popular belief that the Bonferroni approach is excessively conservative, the paper demonstrates that its perceived limitations stem from how its parameters are traditionally constrained rather than from inherent methodological shortcomings.
Context and Objective
Microarray analyses and similar high-throughput technologies often involve testing thousands of hypotheses simultaneously. The classical Bonferroni method, despite its popularity, has been criticized as too conservative, especially when control of the family-wise error rate (FWER) is the primary concern. Alternatives such as the false discovery rate (FDR), introduced by Benjamini and Hochberg, offer less stringent control, prioritizing exploratory analysis over strict Type I error regulation. The present paper revisits the Bonferroni method, advocating its utility and stability in controlling the per-family error rate (PFER), that is, the expected number of false discoveries.
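As a concrete reference point, both procedures are simple enough to sketch in a few lines of Python. The following is a minimal sketch, not code from the paper; the function names and default levels are illustrative:

```python
import numpy as np

def bonferroni_reject(pvals, alpha=0.05):
    """Reject H_i whenever p_i <= alpha / m.

    At level alpha <= 1 this bounds the FWER; read as a PFER bound,
    alpha may exceed 1 (at most alpha false discoveries are expected).
    """
    pvals = np.asarray(pvals)
    m = len(pvals)
    return pvals <= alpha / m

def bh_reject(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure, controlling the FDR at q
    under independence (or positive regression dependence)."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    sorted_p = pvals[order]
    # Find the largest k with p_(k) <= k * q / m ...
    below = sorted_p <= np.arange(1, m + 1) * q / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max() + 1
        # ... and reject the hypotheses with the k smallest p-values.
        reject[order[:k]] = True
    return reject
```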
Methodological Insights
The authors present an extended interpretation of the Bonferroni procedure, one that controls the PFER rather than the probability of at least one false discovery. This reinterpretation aims to dispel the misconception of excessive conservatism. The paper shows that, when the level is adjusted appropriately, the Bonferroni procedure's outcomes are highly correlated with those of the BH procedure. Using simulated data, the authors demonstrate that both procedures can achieve comparable levels of FDR and PFER, with Bonferroni offering potentially greater stability, in the sense of a lower variance in the total number of discoveries, particularly in correlated datasets.
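The PFER bound behind this reinterpretation follows from a one-line expectation calculation (a standard argument, reproduced here rather than quoted from the paper). Testing each of the $m$ hypotheses at level $\gamma/m$ gives

$$\mathrm{PFER} = \mathbb{E}[V] = \sum_{i \in \mathcal{I}_0} \Pr\!\left(p_i \le \frac{\gamma}{m}\right) \le m_0 \cdot \frac{\gamma}{m} \le \gamma,$$

where $V$ is the number of false discoveries and $\mathcal{I}_0$ indexes the $m_0$ true null hypotheses. Two points are worth noting: $\gamma$ need not be below 1, which is precisely the parameter flexibility the extended interpretation exploits, and the bound holds under arbitrary dependence among the p-values, whereas the BH guarantee requires independence or positive dependence.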
Simulation Studies and Results
The paper's simulations involve both independent (SIM43) and correlated (SIM43CORR) datasets; the latter introduces moderate pairwise correlations, reflective of real-world microarray data. Results indicate that while the BH procedure often yields more rejections, the Bonferroni method can achieve similar power under appropriately chosen parameters. The number of true discoveries serves as a proxy for the power of a multiple testing procedure (MTP), and its standard deviation is generally lower for Bonferroni than for BH, suggesting superior stability, especially in correlated scenarios. Scatterplots further confirm the strong correlation between the outputs of the two procedures, affirming the comparable efficacy of the Bonferroni procedure under the extended interpretation.
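A toy version of such a stability comparison might look as follows, reusing the bonferroni_reject and bh_reject sketches above. The problem sizes, effect size, equicorrelation value, and levels are illustrative placeholders, not the paper's SIM43/SIM43CORR settings:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m, m1, rho, reps = 1000, 100, 0.3, 200   # hypotheses, false nulls, correlation, replications
mu = np.concatenate([np.full(m1, 3.0), np.zeros(m - m1)])  # mean shift for the false nulls

disc_bonf, disc_bh = [], []
for _ in range(reps):
    # Equicorrelated Gaussian statistics: a shared factor induces pairwise correlation rho.
    z = mu + np.sqrt(rho) * rng.standard_normal() + np.sqrt(1 - rho) * rng.standard_normal(m)
    p = stats.norm.sf(z)                                     # one-sided p-values
    disc_bonf.append(bonferroni_reject(p, alpha=5.0).sum())  # extended level: PFER <= 5
    disc_bh.append(bh_reject(p, q=0.05).sum())

for name, d in [("Bonferroni", disc_bonf), ("BH", disc_bh)]:
    print(f"{name}: mean discoveries {np.mean(d):.1f}, SD {np.std(d):.1f}")
```

The SD line is the quantity of interest here: the paper's claim is that the fixed-threshold Bonferroni count fluctuates less across replications than the data-adaptive BH count, especially under correlation.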
Implications and Conclusions
The implications of this research are multifaceted. Practically, the Bonferroni method remains a viable option for researchers who wish to control the expected number of Type I errors rather than bound the probability of making any. This is particularly advantageous when datasets exhibit complex dependency structures, where stability of the error rates becomes crucial. From a theoretical standpoint, the paper encourages a re-evaluation of traditional perspectives on multiple testing procedures, urging a shift towards a more nuanced application of classical methods like Bonferroni.
Future Prospects
The paper posits that practitioners should choose between Bonferroni and BH based on their experimental objectives and tolerance for false positives. Additionally, the stability and simplicity of the extended Bonferroni procedure could inspire new methodological innovations and further research into its applicability across diverse fields. The pursuit of stable and reliable error-control techniques remains a pertinent frontier in statistical genomics and beyond.
In summary, this analysis makes a compelling case for the continued relevance of the Bonferroni procedure in modern biological data analysis. By elucidating its stability and its flexibility in the choice of level, the research sets the stage for renewed methodological applications and considerations in controlling false discoveries.