Data-Driven Logistic Regression Ensembles With Applications in Genomics

Published 17 Feb 2021 in stat.ME and stat.ML | (2102.08591v6)

Abstract: Advances in data-collection technologies in genomics have significantly increased the need for tools designed to study the genetic basis of many diseases. Effective statistical methods should excel in both prediction accuracy and biomarker identification. We introduce a novel approach to high-dimensional binary classification that integrates regularization with ensembling techniques. Our method constructs compact ensembles of interpretable models derived by optimizing a global objective function. In medical genomics applications, our approach identifies critical biomarkers overlooked by competing methods. We develop a variable importance ranking system to help researchers prioritize promising genes. We establish the method's asymptotic properties and provide an efficient computational algorithm. Through extensive simulations across complex scenarios and analysis of genomics datasets for cancer, multiple sclerosis, and psoriasis, we demonstrate strong predictive performance. Based on our numerical experiments, we offer practical guidelines for determining the optimal ensemble size.

