
Advancing Gene Selection in Oncology: A Fusion of Deep Learning and Sparsity for Precision Gene Selection (2403.01927v1)

Published 4 Mar 2024 in q-bio.GN, cs.CV, q-bio.QM, and q-bio.TO

Abstract: Gene selection plays a pivotal role in oncology research for improving outcome prediction accuracy and facilitating cost-effective genomic profiling for cancer patients. This paper introduces two gene selection strategies for deep learning-based survival prediction models. The first strategy uses a sparsity-inducing method, while the second uses importance-based gene selection to identify relevant genes. Our overall approach leverages the power of deep learning to model complex biological data structures, while sparsity-inducing methods ensure the selection process focuses on the most informative genes, minimizing noise and redundancy. Through comprehensive experimentation on diverse genomic and survival datasets, we demonstrate that our strategy not only identifies gene signatures with high predictive power for survival outcomes but also streamlines the process for low-cost genomic profiling. The implications of this research are profound, as it offers a scalable and effective tool for advancing personalized medicine and targeted cancer therapies. By pushing the boundaries of gene selection methodologies, our work contributes significantly to the ongoing efforts in cancer genomics, promising improved diagnostic and prognostic capabilities in clinical settings.
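As a rough illustration of what a sparsity-inducing selection step can look like, the sketch below fits an L1-penalized linear model with proximal gradient descent (ISTA) and keeps the features whose weights stay non-zero. This is not the paper's method: the abstract describes deep learning-based survival models, whereas this example swaps in a plain least-squares objective on synthetic data to keep it short. The function names, the penalty strength `lam`, and the data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink each weight toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def select_genes_l1(X, y, lam=0.1, n_iter=500):
    """Hypothetical helper: L1-penalized least squares solved with ISTA.

    Returns the indices of features with non-zero weight and the weights.
    A linear model stands in for the deep survival model (an assumption).
    """
    n, p = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth loss gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n          # gradient of (1/2n) * ||Xw - y||^2
        w = soft_threshold(w - step * grad, step * lam)
    return np.flatnonzero(w), w

# Synthetic "expression matrix": 200 samples, 50 genes, 3 truly informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_w = np.zeros(50)
true_w[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = X @ true_w + 0.1 * rng.standard_normal(200)

selected, w = select_genes_l1(X, y, lam=0.2)
print("selected gene indices:", selected)
```

The L1 penalty drives most weights exactly to zero, so the surviving indices act as the selected gene signature. In a setting closer to the paper's, the same penalty would presumably be applied to the input-layer weights of a survival network rather than to a linear regressor.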
