latrend: A Framework for Clustering Longitudinal Data (2402.14621v1)
Abstract: Clustering of longitudinal data is used to explore common trends among subjects over time for a numeric measurement of interest. Various R packages have been introduced over the years for identifying clusters of longitudinal patterns, summarizing the variability in trajectories between subjects in terms of one or more trends. We introduce the R package "latrend" as a framework for the unified application of methods for longitudinal clustering, enabling comparisons between methods with minimal coding. The package also serves as an interface to commonly used packages for clustering longitudinal data, including "dtwclust", "flexmix", "kml", "lcmm", "mclust", "mixAK", and "mixtools". This enables researchers to easily compare different approaches, implementations, and method specifications. Furthermore, researchers can build upon the standard tools provided by the framework to quickly implement new clustering methods, enabling rapid prototyping. We demonstrate the functionality and application of the latrend package on a synthetic dataset based on the therapy adherence patterns of patients with sleep apnea.