
Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction (2403.18447v1)

Published 27 Mar 2024 in cs.CL, cs.CV, cs.LG, and cs.RO

Abstract: Large language models (LLMs) have demonstrated impressive abilities in context understanding and text generation. Inspired by the recent success of language foundation models, in this paper we propose LMTraj (Language-based Multimodal Trajectory predictor), which recasts the trajectory prediction task as a question-answering problem. Departing from traditional numerical regression models, which treat the trajectory coordinate sequence as continuous signals, we treat the coordinates as discrete signals, like text prompts. Specifically, we first transform the input space of trajectory coordinates into the natural language space. Here, the entire time-series trajectories of pedestrians are converted into a text prompt, and scene images are described as text through image captioning. The transformed numerical and image data are then wrapped into a question-answering template for use in an LLM. Next, to guide the LLM in understanding and reasoning about high-level knowledge, such as scene context and social relationships between pedestrians, we introduce auxiliary multi-task question answering. We then train a numerical tokenizer on the prompt data. We encourage the tokenizer to separate the integer and decimal parts cleanly, and leverage it to capture correlations between consecutive numbers in the LLM. Lastly, we train the LLM using the numerical tokenizer and all of the question-answer prompts. Here, we propose a beam-search-based most-likely prediction and a temperature-based multimodal prediction to implement deterministic and stochastic inference, respectively. Applying LMTraj, we show that the language-based model can be a powerful pedestrian trajectory predictor that outperforms existing numerical-based prediction methods. Code is publicly available at https://github.com/inhwanbae/LMTrajectory.
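
The coordinate-to-text step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the template wording, the function names `trajectory_to_prompt` and `answer_to_trajectory`, and the parameters `pedestrian_id`, `horizon`, and `decimals` are all hypothetical, chosen only to show how an observed trajectory might be written into a question-answering prompt and how a generated text answer might be parsed back into coordinates.

```python
import re


def trajectory_to_prompt(history, pedestrian_id=0, horizon=12, decimals=2):
    """Render an observed (x, y) history as a question-answering prompt.

    The template text is an assumption for illustration, not the paper's.
    """
    # Fixed-precision formatting keeps the integer and decimal parts
    # of every coordinate in a predictable textual layout.
    coords = ", ".join(
        f"({x:.{decimals}f}, {y:.{decimals}f})" for x, y in history
    )
    return (
        f"question: What trajectory does pedestrian {pedestrian_id} "
        f"follow for the next {horizon} frames? "
        f"context: Pedestrian {pedestrian_id} moved along the trajectory "
        f"[{coords}] for {len(history)} frames."
    )


def answer_to_trajectory(answer):
    """Parse every '(x, y)' pair found in a text answer back into floats."""
    pattern = r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)"
    return [(float(x), float(y)) for x, y in re.findall(pattern, answer)]


if __name__ == "__main__":
    observed = [(1.0, 2.5), (1.25, 2.75)]
    print(trajectory_to_prompt(observed, pedestrian_id=3))
    print(answer_to_trajectory("[(1.50, 3.00), (-0.25, 3.25)]"))
```

In a full pipeline, the prompt would be passed to a language model and the generated answer string parsed with `answer_to_trajectory`; the deterministic and stochastic inference modes mentioned in the abstract would differ only in how that answer is decoded.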

Authors (3)
  1. Inhwan Bae (8 papers)
  2. Junoh Lee (4 papers)
  3. Hae-Gon Jeon (23 papers)
Citations (5)