
Eyeballing Combinatorial Problems: A Case Study of Using Multimodal Large Language Models to Solve Traveling Salesman Problems (2406.06865v1)

Published 11 Jun 2024 in cs.AI

Abstract: Multimodal LLMs (MLLMs) have demonstrated proficiency in processing diverse modalities, including text, images, and audio. These models leverage extensive pre-existing knowledge, enabling them to address complex problems with minimal to no specific training examples, as evidenced in few-shot and zero-shot in-context learning scenarios. This paper investigates the use of MLLMs' visual capabilities to 'eyeball' solutions for the Traveling Salesman Problem (TSP) by analyzing images of point distributions on a two-dimensional plane. Our experiments aimed to validate the hypothesis that MLLMs can effectively 'eyeball' viable TSP routes. The results from zero-shot, few-shot, self-ensemble, and self-refine zero-shot evaluations show promising outcomes. We anticipate that these findings will inspire further exploration into MLLMs' visual reasoning abilities to tackle other combinatorial problems.
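The sketch below is a minimal, hypothetical illustration of the kind of pipeline the abstract describes (render a 2D point set as an image, ask a multimodal model to "eyeball" a tour, score the answer); it is not the authors' code. It assumes the OpenAI Python client with GPT-4o as the backing MLLM, and the prompt wording, instance size, and answer format are illustrative assumptions rather than the paper's protocol.

```python
# Hypothetical sketch (not the paper's code): plot a random TSP instance,
# send the image to a multimodal model, and score the returned tour.
import base64
import io
import math
import random
import re

import matplotlib.pyplot as plt
from openai import OpenAI


def make_instance(n=10, seed=0):
    """Random points on a 100x100 plane (instance size is an assumption)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n)]


def render(points):
    """Plot numbered points and return the PNG as a base64 string."""
    fig, ax = plt.subplots(figsize=(4, 4))
    xs, ys = zip(*points)
    ax.scatter(xs, ys)
    for i, (x, y) in enumerate(points):
        ax.annotate(str(i), (x, y))
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    return base64.b64encode(buf.getvalue()).decode()


def tour_length(points, order):
    """Total Euclidean length of the closed tour given as an index order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def eyeball_tour(points, model="gpt-4o"):  # model choice is an assumption
    client = OpenAI()
    prompt = (
        "The image shows numbered cities on a plane. Propose a short "
        "traveling salesman tour that visits every city exactly once, "
        "as a comma-separated list of city indices, e.g. 0,3,1,..."
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{render(points)}"}},
            ],
        }],
    )
    # Naive parsing of the model's reply; the paper's answer format may differ.
    order = [int(t) for t in re.findall(r"\d+", resp.choices[0].message.content)]
    return order, tour_length(points, order)


if __name__ == "__main__":
    pts = make_instance()
    order, length = eyeball_tour(pts)
    print("proposed tour:", order, "length:", round(length, 1))
```

A self-ensemble variant, in the spirit of the abstract, would simply call eyeball_tour several times (or sample several completions) and keep the shortest valid tour; a self-refine variant would feed the previous tour and its length back to the model and ask for an improvement.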

Authors (8)
  1. Mohammed Elhenawy (34 papers)
  2. Ahmed Abdelhay (3 papers)
  3. Taqwa I. Alhadidi (11 papers)
  4. Shadi Jaradat (6 papers)
  5. Ahmed Jaber (11 papers)
  6. Sebastien Glaser (6 papers)
  7. Andry Rakotonirainy (14 papers)
  8. Huthaifa I Ashqar (3 papers)
Citations (2)