Pretext Training Algorithms for Event Sequence Data (2402.10392v1)

Published 16 Feb 2024 in cs.LG and cs.AI

Abstract: Pretext training followed by task-specific fine-tuning has been a successful approach in vision and language domains. This paper proposes a self-supervised pretext training framework tailored to event sequence data. We introduce a novel alignment verification task that is specialized to event sequences, building on good practices in masked reconstruction and contrastive learning. Our pretext tasks unlock foundational representations that are generalizable across different downstream tasks, including next-event prediction for temporal point process models, event sequence classification, and missing event interpolation. Experiments on popular public benchmarks demonstrate the potential of the proposed method across different tasks and data domains.
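
The abstract does not spell out how the alignment verification pretext task is formulated. The sketch below is a minimal, hedged illustration (in PyTorch) of one plausible form of such an objective: a binary head judges whether a prefix and a candidate suffix come from the same event sequence. The encoder choice (GRU), module names, dimensions, and the prefix/suffix pairing scheme are assumptions made for illustration, not the paper's actual implementation.

```python
# Hedged sketch, NOT the paper's code: one plausible alignment-verification
# pretext task for event sequences. All names and sizes are assumptions.
import torch
import torch.nn as nn

class EventEncoder(nn.Module):
    """Embeds (inter-arrival time, event type) pairs and summarizes the sequence."""
    def __init__(self, num_types: int, d_model: int = 64):
        super().__init__()
        self.type_emb = nn.Embedding(num_types, d_model)
        self.time_proj = nn.Linear(1, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, times: torch.Tensor, types: torch.Tensor) -> torch.Tensor:
        # times: (B, L, 1) inter-arrival times; types: (B, L) integer event marks
        x = self.time_proj(times) + self.type_emb(types)
        _, h = self.rnn(x)      # final hidden state as the sequence summary
        return h.squeeze(0)     # (B, d_model)

class AlignmentVerifier(nn.Module):
    """Scores whether a prefix and a candidate suffix belong to the same sequence."""
    def __init__(self, encoder: EventEncoder, d_model: int = 64):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(
            nn.Linear(2 * d_model, d_model),
            nn.ReLU(),
            nn.Linear(d_model, 1),
        )

    def forward(self, prefix, suffix) -> torch.Tensor:
        z_p = self.encoder(*prefix)   # prefix = (times, types)
        z_s = self.encoder(*suffix)
        return self.head(torch.cat([z_p, z_s], dim=-1)).squeeze(-1)  # logits (B,)

# Training step (illustrative): positives pair a prefix with its true suffix;
# negatives pair it with a suffix drawn from another sequence in the batch.
# loss = nn.BCEWithLogitsLoss()(verifier(prefix, suffix), labels)
```

Under this reading, the same encoder pretrained on alignment verification (alongside masked reconstruction and contrastive objectives) could be reused for the downstream tasks named in the abstract, with only a task-specific head fine-tuned.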

