Classification of Infant Sleep/Wake States: Cross-Attention among Large Scale Pretrained Transformer Networks using Audio, ECG, and IMU Data (2306.15808v1)
Abstract: Infant sleep is critical to brain and behavioral development. Prior studies on infant sleep/wake classification have largely relied either on expensive and burdensome polysomnography (PSG) tests in the laboratory or on wearable devices that collect single-modality data. To ease data collection and improve detection accuracy, we used a multi-modal wearable device, LittleBeats (LB), to collect audio, electrocardiogram (ECG), and inertial measurement unit (IMU) data from a cohort of 28 infants. We employed a 3-branch (audio/ECG/IMU) large-scale transformer-based neural network (NN) to demonstrate the potential of such multi-modal data. We pretrained each branch independently on its respective modality, then finetuned the model by fusing the pretrained transformer layers with cross-attention. We show that multi-modal data significantly improves sleep/wake classification (accuracy = 0.880) compared with use of a single modality (accuracy = 0.732). Our approach to multi-modal mid-level fusion may be adaptable to a diverse range of architectures and tasks, expanding future directions of infant behavioral research.
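The cross-attention fusion mentioned in the abstract can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify which layers are fused or how queries and keys are paired, so here each modality's tokens simply attend to the concatenated tokens of the other two modalities with single-head scaled dot-product attention, and the result is added back as a residual. The embedding size `DIM`, the token counts, the random weights, and the helper names (`cross_attention`, `fuse`, `random_matrix`) are all hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product attention: `queries` attend to `context`."""
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def fuse(branches, params):
    """Assumed fusion rule: each modality attends to the tokens of all other
    modalities, and the attended output is added to its own tokens."""
    fused = {}
    for name, tokens in branches.items():
        context = [t for other, seq in branches.items() if other != name for t in seq]
        attended, _ = cross_attention(tokens, context, *params[name])
        fused[name] = [[a + b for a, b in zip(t, u)] for t, u in zip(tokens, attended)]
    return fused


rng = random.Random(0)
DIM = 4  # hypothetical shared embedding size

# Stand-ins for the outputs of three independently pretrained branches.
branches = {
    "audio": random_matrix(5, DIM, rng),
    "ecg": random_matrix(3, DIM, rng),
    "imu": random_matrix(2, DIM, rng),
}
# One (W_q, W_k, W_v) triple per modality; random here, learned in practice.
params = {name: tuple(random_matrix(DIM, DIM, rng) for _ in range(3)) for name in branches}

fused = fuse(branches, params)
```

Each entry of `fused` keeps the token count and width of its own branch, so a pooled sleep/wake classifier head could in principle be attached to any or all of the fused sequences; that head is omitted here.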