
FlowCyt: A Comparative Study of Deep Learning Approaches for Multi-Class Classification in Flow Cytometry Benchmarking (2403.00024v2)

Published 28 Feb 2024 in cs.LG and q-bio.QM

Abstract: This paper presents FlowCyt, the first comprehensive benchmark for multi-class single-cell classification in flow cytometry data. The dataset comprises bone marrow samples from 30 patients, with each cell characterized by twelve markers. Ground truth labels identify five hematological cell types: T lymphocytes, B lymphocytes, Monocytes, Mast cells, and Hematopoietic Stem/Progenitor Cells (HSPCs). Experiments utilize supervised inductive learning and semi-supervised transductive learning on up to 1 million cells per patient. Baseline methods include Gaussian Mixture Models, XGBoost, Random Forests, Deep Neural Networks, and Graph Neural Networks (GNNs). GNNs demonstrate superior performance by exploiting spatial relationships in graph-encoded data. The benchmark allows standardized evaluation of clinically relevant classification tasks, along with exploratory analyses to gain insights into hematological cell phenotypes. This represents the first public flow cytometry benchmark with a richly annotated, heterogeneous dataset. It will empower the development and rigorous assessment of novel methodologies for single-cell analysis.
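The abstract refers to graph-encoded data and to semi-supervised transductive learning but does not say how the graph is built. As a minimal sketch only, the snippet below assumes a k-nearest-neighbour graph over cells in marker space (Euclidean distance) and a random labelled/unlabelled node mask. The function names `knn_edges` and `transductive_mask`, the value of `k`, the labelled fraction, and the synthetic Gaussian data are all illustrative assumptions, not the authors' pipeline; only the count of twelve markers comes from the abstract.

```python
import math
import random


def knn_edges(cells, k):
    """Return directed edges (i, j) linking each cell to its k nearest
    neighbours by Euclidean distance in marker space (brute force, O(n^2))."""
    edges = []
    for i, ci in enumerate(cells):
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(cells) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def transductive_mask(n_cells, labelled_fraction, rng):
    """Mark a random subset of nodes as labelled. In a transductive setup the
    unlabelled nodes stay in the graph during training and are the ones
    whose classes get predicted."""
    n_labelled = int(n_cells * labelled_fraction)
    labelled = set(rng.sample(range(n_cells), n_labelled))
    return [i in labelled for i in range(n_cells)]


rng = random.Random(0)
N_MARKERS = 12  # twelve markers per cell, as stated in the abstract
N_CELLS = 50    # toy size; the benchmark uses up to 1 million cells per patient

# Synthetic stand-in for one patient's cells (assumption: no real data here).
cells = [[rng.gauss(0.0, 1.0) for _ in range(N_MARKERS)] for _ in range(N_CELLS)]

edges = knn_edges(cells, k=5)              # hypothetical choice of k
mask = transductive_mask(N_CELLS, 0.2, rng)  # hypothetical 20% labelled nodes

print(len(edges), "edges;", sum(mask), "labelled nodes")
```

In the inductive setting, by contrast, the model would be trained on graphs from some patients and evaluated on graphs from patients it never saw. The brute-force neighbour search above is only practical for toy sizes; at the benchmark's scale an approximate nearest-neighbour index would be the more realistic choice.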
