Investigating Low Data, Confidence Aware Image Prediction on Smooth Repetitive Videos using Gaussian Processes (2307.11259v2)
Abstract: The ability to predict future states is crucial to informed decision-making while interacting with dynamic environments. With cameras providing a prevalent and information-rich sensing modality, the problem of predicting future states from image sequences has garnered significant attention. Current state-of-the-art methods typically train large parametric models for their predictions. Though often accurate, these models rarely provide interpretable confidence metrics around their predictions, and they rely on large training datasets to converge to useful solutions. In this paper, we focus on the problem of predicting future images of an image sequence, with interpretable confidence bounds, from very little training data. To approach this problem, we use non-parametric models to take a probabilistic approach to image prediction. We generate probability distributions over sequentially predicted images and propagate uncertainty through time to produce a confidence metric for our predictions. Gaussian Processes are used for their data efficiency and their ability to readily incorporate new training data online. Our method's predictions are evaluated on a smooth fluid simulation environment. We showcase the capabilities of our approach on real-world data by predicting pedestrian flows and weather patterns from satellite imagery.
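The abstract's core idea — treating prediction as Gaussian Process regression so that every forecast comes with a variance, and propagating that variance through successive predictions — can be illustrated with a minimal sketch. The code below is not the paper's implementation: it fits an exact GP with an RBF kernel (NumPy only, hypothetical hyperparameters) to a single smooth 1-D signal, standing in for one pixel's intensity over time; the posterior variance is the interpretable confidence metric, which grows as the query point moves away from the training data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel: k(x, x') = s^2 * exp(-(x - x')^2 / (2 l^2))
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Exact GP regression: posterior mean and variance at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    # Cholesky solve is the standard numerically stable route for K^-1 y.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Stand-in data: one pixel's smooth, repetitive signal as a function of frame index.
t = np.linspace(0.0, 2.0 * np.pi, 20)
y = np.sin(t)

# Predict one step past the observed frames; var quantifies the confidence.
mu, var = gp_posterior(t, y, np.array([2.0 * np.pi + 0.3]))
```

In the paper's multi-step setting, each predicted frame (with its variance) would feed back in as input for the next prediction, so uncertainty accumulates over the horizon; this sketch shows only a single query to keep the mechanics visible.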
- W. Yu, Y. Lu, S. Easterbrook, and S. Fidler, “Efficient and information-preserving future frame prediction and beyond,” in International Conference on Learning Representations, 2020. [Online]. Available: https://openreview.net/forum?id=B1eY_pVYvB
- W. Byeon, Q. Wang, R. K. Srivastava, and P. Koumoutsakos, “Contextvp: Fully context-aware video prediction,” 2018.
- C. Finn, I. Goodfellow, and S. Levine, “Unsupervised learning for physical interaction through video prediction,” 2016.
- D. Helbing, “A fluid dynamic model for the movement of pedestrians,” 1998.
- D. Low, “Statistical physics - following the crowd,” Nature, vol. 407, pp. 465–466, Oct. 2000.
- P. Wang and P. Luh, “Fluid-based analysis of pedestrian crowd at bottlenecks,” 2023.
- K. Kashinath, M. Mustafa, A. Albert, J.-L. Wu, C. Jiang, S. Esmaeilzadeh, K. Azizzadenesheli, R. Wang, A. Chattopadhyay, A. Singh, and et al., “Physics-informed machine learning: case studies for weather and climate modelling,” Philosophical Transactions A: Mathematical, Physical and Engineering Sciences, vol. 379, no. 2194, Art. no. 20200093, Apr. 2021.
- A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” International Journal of Robotics Research (IJRR), 2013.
- G. J. Brostow, J. Shotton, J. Fauqueur, and R. Cipolla, “Segmentation and recognition using structure from motion point clouds,” in Computer Vision – ECCV 2008, D. Forsyth, P. Torr, and A. Zisserman, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008, pp. 44–57.
- P. Dollar, C. Wojek, B. Schiele, and P. Perona, “Pedestrian detection: A benchmark,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 304–311.
- K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” 2015.
- Y. Wang, L. Jiang, M.-H. Yang, L.-J. Li, M. Long, and L. Fei-Fei, “Eidetic 3d lstm: A model for video prediction and beyond,” in ICLR, 2019.
- S. Aigner and M. Korner, “Futuregan: Anticipating the future frames of video sequences using spatio-temporal 3d convolutions in progressively growing gans,” arXiv: Computer Vision and Pattern Recognition, 2018.
- D. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, pp. 533–536, 1986.
- S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
- X. Chen, W. Wang, J. Wang, and W. Li, “Learning object-centric transformation for video prediction,” in Proceedings of the 25th ACM International Conference on Multimedia, ser. MM ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 1503–1512. [Online]. Available: https://doi.org/10.1145/3123266.3123349
- N. Wichers, R. Villegas, D. Erhan, and H. Lee, “Hierarchical long-term video prediction without supervision,” 2018.
- J. Walker, K. Marino, A. Gupta, and M. Hebert, “The pose knows: Video forecasting by generating pose futures,” 2017.
- I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” 2014.
- D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” 2014.
- M. Talebizadeh and A. Moridnejad, “Uncertainty analysis for the forecast of lake level fluctuations using ensembles of ann and anfis models,” Expert Systems with Applications, vol. 38, no. 4, pp. 4126–4135, 2011. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0957417410010328
- M. Raissi, P. Perdikaris, and G. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686–707, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0021999118307125
- C. M. Jiang, S. Esmaeilzadeh, K. Azizzadenesheli, K. Kashinath, M. Mustafa, H. A. Tchelepi, P. Marcus, Prabhat, and A. Anandkumar, “Meshfreeflownet: A physics-constrained deep continuous space-time super-resolution framework,” 2020.
- D. Greenfeld, M. Galun, R. Kimmel, I. Yavneh, and R. Basri, “Learning to optimize multigrid pde solvers,” 2019.
- Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, “Fourier neural operator for parametric partial differential equations,” 2021.
- N. U. Shinde, J. Johnson, S. Herbert, and M. C. Yip, “Object-centric representations for interactive online learning with non-parametric methods,” 2023.
- J. Ko, D. J. Klein, D. Fox, and D. Haehnel, “Gaussian processes and reinforcement learning for identification and control of an autonomous blimp,” in Proceedings 2007 IEEE International Conference on Robotics and Automation, 2007, pp. 742–747.
- B. Wilcox and M. C. Yip, “Solar-gp: Sparse online locally adaptive regression using gaussian processes for bayesian robot model learning and control,” IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 2832–2839, 2020.
- M. P. Deisenroth and C. E. Rasmussen, “Pilco: A model-based and data-efficient approach to policy search,” in Proceedings of the 28th International Conference on International Conference on Machine Learning, ser. ICML’11. Madison, WI, USA: Omnipress, 2011, pp. 465–472.
- M. Turchetta, F. Berkenkamp, and A. Krause, “Safe exploration for interactive machine learning,” 2019.
- D. Yang, L. Li, K. Redmill, and Ü. Özgüner, “Top-view trajectories: A pedestrian dataset of vehicle-crowd interaction from controlled experiments and crowded campus,” 2019.
- Denver7, “Tracking hurricane ian’s explosive growth through satellite imagery,” Sept. 2022. [Online]. Available: https://www.youtube.com/watch?v=Fw8VWSn9Lps
- S. Vinga, “Convolution integrals of normal distribution functions,” Jan. 2004.