A consensus-constrained parsimonious Gaussian mixture model for clustering hyperspectral images (2403.03349v2)

Published 5 Mar 2024 in stat.ME, cs.CV, and eess.IV

Abstract: The use of hyperspectral imaging to investigate food samples has grown due to the improved performance and lower cost of instrumentation. Food engineers use hyperspectral images to classify the type and quality of a food sample, typically using classification methods. In order to train these methods, every pixel in each training image needs to be labelled. Typically, computationally cheap threshold-based approaches are used to label the pixels, and classification methods are trained based on those labels. However, threshold-based approaches are subjective and cannot be generalized across hyperspectral images taken in different conditions and of different foods. Here a consensus-constrained parsimonious Gaussian mixture model (ccPGMM) is proposed to label pixels in hyperspectral images using a model-based clustering approach. The ccPGMM utilizes information that is available on some pixels and specifies constraints on those pixels belonging to the same or different clusters while clustering the rest of the pixels in the image. A latent variable model is used to represent the high-dimensional data in terms of a small number of underlying latent factors. To ensure computational feasibility, a consensus clustering approach is employed, where the data are divided into multiple randomly selected subsets of variables and constrained clustering is applied to each data subset; the clustering results are then consolidated across all data subsets to provide a consensus clustering solution. The ccPGMM approach is applied to simulated datasets and real hyperspectral images of three types of puffed cereal: corn, rice, and wheat. Improved clustering performance and computational efficiency are demonstrated when compared to other current state-of-the-art approaches.
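
The abstract combines two ingredients. First, within each cluster the p spectral bands are modelled through a small number q of latent factors; this is the mixture-of-factor-analyzers form on which parsimonious Gaussian mixture models are built. A standard presentation of that density is shown below (illustrative notation, not necessarily the paper's):

```latex
% Per-pixel spectrum x in R^p, G clusters, q << p latent factors per cluster.
f(\mathbf{x}) \;=\; \sum_{g=1}^{G} \pi_g \,
  \phi\!\left(\mathbf{x} \mid \boldsymbol{\mu}_g,\;
  \boldsymbol{\Lambda}_g \boldsymbol{\Lambda}_g^{\top} + \boldsymbol{\Psi}_g\right),
\qquad
\mathbf{x} = \boldsymbol{\mu}_g + \boldsymbol{\Lambda}_g \mathbf{u} + \boldsymbol{\epsilon},
\quad \mathbf{u} \sim N(\mathbf{0}, \mathbf{I}_q),
\quad \boldsymbol{\epsilon} \sim N(\mathbf{0}, \boldsymbol{\Psi}_g),
```

where \Lambda_g is a p × q loading matrix and the parsimonious family arises from constraining \Lambda_g and \Psi_g to be shared or component-specific (and \Psi_g to be isotropic or not).

Second, to keep computation feasible, clustering is run on randomly selected subsets of the spectral bands and the resulting partitions are consolidated into a consensus solution. The sketch below illustrates one generic way to do that consolidation, via a co-association matrix. It uses scikit-learn's GaussianMixture as a stand-in for the constrained PGMM and omits the must-link/cannot-link constraints on partially labelled pixels, so it is an assumption-laden illustration rather than the paper's ccPGMM algorithm.

```python
# Minimal sketch, assuming: a plain Gaussian mixture per band subset (not the
# constrained PGMM), and co-association + average linkage as the consensus
# step. Constraint information from partially labelled pixels is omitted.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.mixture import GaussianMixture

def consensus_cluster(X, n_clusters=3, n_subsets=10, subset_size=20, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    coassoc = np.zeros((n, n))
    for _ in range(n_subsets):
        # Draw a random subset of spectral bands and cluster only those bands.
        bands = rng.choice(p, size=min(subset_size, p), replace=False)
        labels = GaussianMixture(
            n_components=n_clusters, covariance_type="diag", random_state=0
        ).fit_predict(X[:, bands])
        # Count how often each pair of pixels lands in the same cluster.
        coassoc += labels[:, None] == labels[None, :]
    # Turn the disagreement frequency into a distance and cut a dendrogram.
    dist = 1.0 - coassoc / n_subsets
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Toy usage: 300 "pixels" with 50 "bands" from three well-separated groups.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, size=(100, 50)) for m in (0.0, 2.0, 4.0)])
print(consensus_cluster(X, n_clusters=3))
```

In the paper itself, both the consolidation across band subsets and the constraint handling remain inside the model-based ccPGMM framework; the co-association-plus-linkage combination used here is just one common cluster-ensemble strategy.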
