On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence (2304.06798v1)

Published 13 Apr 2023 in cs.AI, cs.CL, and cs.CV

Abstract: Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet to see an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performance on seven tasks across multiple geospatial subdomains including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that involve only the text modality, such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, these task-agnostic LLMs can outperform task-specific fully-supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing foundation models still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing an FM for GeoAI is to address the multimodal nature of geospatial tasks. After discussing the distinct challenges of each geospatial data modality, we suggest the possibility of a multimodal foundation model which can reason over various types of geospatial data through geospatial alignments. We conclude this paper by discussing the unique risks and challenges of developing such a model for GeoAI.

Authors (14)
  1. Gengchen Mai (46 papers)
  2. Weiming Huang (10 papers)
  3. Jin Sun (67 papers)
  4. Suhang Song (2 papers)
  5. Deepak Mishra (78 papers)
  6. Ninghao Liu (98 papers)
  7. Song Gao (72 papers)
  8. Tianming Liu (161 papers)
  9. Gao Cong (54 papers)
  10. Yingjie Hu (26 papers)
  11. Chris Cundy (18 papers)
  12. Ziyuan Li (32 papers)
  13. Rui Zhu (138 papers)
  14. Ni Lao (31 papers)
Citations (100)
