Extent of performance impact from ChatGPT’s image processing capabilities on physics assessments

Determine the extent to which enabling GPT-4’s image processing and interpretation capabilities improves its performance on undergraduate physics assessments that include figures, diagrams, or graphical information, relative to its text-only performance.

Background

In the paper, all GPT-4 prompting took place between November 2023 and February 2024, a period during which image analysis and processing were not available. The authors repeatedly note difficulties with diagrammatic and graphical questions, which required workarounds such as describing diagrams in text or converting them into code-like representations.

Later in the paper, the authors acknowledge that ChatGPT has since acquired image processing and interpretation capabilities, but they explicitly state that it is not clear how much this will improve its grades. Quantifying this effect is important for understanding how vulnerable assessment formats that rely on diagrams or graphical information are.

References

Finally, we note that the advent of image processing and interpretation capability by ChatGPT could enhance the grades that it can score, although the extent to which this will help is not clear (cf., Polverini & Gregorcic 2024).

Can ChatGPT pass a physics degree? Making a case for reformation of assessment of undergraduate degrees (2412.01312 - Pimbblet et al., 2 Dec 2024) in Section 5, “Does GPT-4 Pass the Degree?”