eipy: An Open-Source Python Package for Multi-modal Data Integration using Heterogeneous Ensembles (2401.09582v2)
Abstract: In this paper, we introduce eipy, an open-source Python package for developing effective, multi-modal heterogeneous ensembles for classification. eipy provides a rigorous and user-friendly framework for comparing and selecting the best-performing multi-modal data integration and predictive modeling methods by systematically evaluating their performance using nested cross-validation. The package is designed to leverage scikit-learn-like estimators as components to build multi-modal predictive models. An up-to-date user guide for eipy, including API reference and tutorials, is maintained at https://eipy.readthedocs.io. The main repository for this project can be found on GitHub at https://github.com/GauravPandeyLab/eipy.
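The workflow the abstract describes (per-modality base predictors, a stacked ensemble on top, and nested cross-validation for evaluation) can be sketched with scikit-learn alone. This is a minimal illustration under stated assumptions, not eipy's API: the synthetic data, the two-modality split, the choice of base predictors, and names such as `modalities` and `base_features` are all hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.naive_bayes import GaussianNB

# Assumption: synthetic data split column-wise into two "modalities"
# that describe the same samples.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, random_state=0
)
modalities = {"modality_a": X[:, :10], "modality_b": X[:, 10:]}

# Heterogeneous base predictors, trained separately on every modality.
base_predictors = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "nb": GaussianNB(),
}


def base_features(train_idx, test_idx):
    """Build meta-features (base-predictor probabilities) for one outer fold."""
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
    train_cols, test_cols = [], []
    for X_mod in modalities.values():
        for estimator in base_predictors.values():
            # Inner CV: out-of-fold predictions on the outer-training samples,
            # so the stacker never sees predictions made on fitted data.
            oof = cross_val_predict(
                clone(estimator),
                X_mod[train_idx],
                y[train_idx],
                cv=inner,
                method="predict_proba",
            )[:, 1]
            # Refit on all outer-training samples to score the outer test fold.
            fitted = clone(estimator).fit(X_mod[train_idx], y[train_idx])
            train_cols.append(oof)
            test_cols.append(fitted.predict_proba(X_mod[test_idx])[:, 1])
    return np.column_stack(train_cols), np.column_stack(test_cols)


# Outer CV: estimates the performance of the whole stacked pipeline.
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in outer.split(X, y):
    meta_train, meta_test = base_features(train_idx, test_idx)
    stacker = LogisticRegression(max_iter=1000).fit(meta_train, y[train_idx])
    scores.append(
        roc_auc_score(y[test_idx], stacker.predict_proba(meta_test)[:, 1])
    )

mean_auc = float(np.mean(scores))
print(f"Mean outer-fold AUC: {mean_auc:.3f}")
```

The inner folds supply the stacker's training data and the outer folds are used only for scoring, which is what makes the resulting comparison between ensemble methods unbiased by the stacking step. Swapping `LogisticRegression` for another meta-learner is how alternative ensemble methods would be compared in this sketch.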