
Combining SSR with light fine-tuning to improve fidelity

Determine whether, and to what extent, combining the semantic similarity rating (SSR) approach with light fine-tuning strategies such as calibration or prompt optimization improves fidelity to human survey outcomes in consumer concept testing, relative to zero-shot SSR. SSR maps free-text purchase-intent statements generated by a large language model to 5-point Likert response distributions via embedding similarity to predefined anchor statements.


Background

The paper introduces semantic similarity rating (SSR), a method that elicits free-text purchase-intent responses from LLMs and translates them into Likert-scale distributions by computing embedding similarity to anchor statements. Across 57 personal care product concept surveys (9,300 human responses), SSR aligns closely with human results, producing realistic response distributions and attaining approximately 90% of human test–retest reliability, without any fine-tuning on survey data.
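The following is a minimal sketch of the SSR idea as described above: a free-text response is compared to one anchor statement per Likert point, and the cosine similarities are converted into a probability distribution. The anchor wording, embedding model, and softmax temperature here are illustrative assumptions, not the paper's exact choices.

```python
# Minimal SSR sketch: map one free-text purchase-intent response to a
# distribution over 5 Likert points via embedding similarity to anchors.
# Anchor wording, embedding model, and temperature are illustrative only.
import numpy as np
from sentence_transformers import SentenceTransformer

ANCHORS = [
    "I would definitely not buy this product.",   # Likert 1
    "I would probably not buy this product.",     # Likert 2
    "I might or might not buy this product.",     # Likert 3
    "I would probably buy this product.",         # Likert 4
    "I would definitely buy this product.",       # Likert 5
]

model = SentenceTransformer("all-MiniLM-L6-v2")

def ssr_distribution(response_text: str, temperature: float = 0.05) -> np.ndarray:
    """Return a probability vector over Likert points 1-5 for one response."""
    embs = model.encode([response_text] + ANCHORS, normalize_embeddings=True)
    resp, anchors = embs[0], embs[1:]
    sims = anchors @ resp                      # cosine similarities (unit vectors)
    logits = sims / temperature                # sharpen or soften the mapping
    probs = np.exp(logits - logits.max())      # numerically stable softmax
    return probs / probs.sum()

# Example: one LLM-generated free-text response
print(ssr_distribution("It sounds nice, but I already have something similar."))
```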

The authors note that reference statement sets were manually optimized and that no training data or fine-tuning was used, emphasizing generality and low cost. They explicitly raise the question of whether hybrid methods—combining SSR with lightweight calibration or prompt optimization—could further increase fidelity beyond zero-shot SSR, motivating investigation into such combinations.
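To make the open question concrete, one plausible form of "light fine-tuning" is to calibrate a single SSR parameter, such as the softmax temperature, against a small held-out set of human Likert distributions while keeping the rest of the pipeline zero-shot. The sketch below is hypothetical; the paper does not prescribe this procedure.

```python
# Hypothetical hybrid: calibrate the SSR temperature on a small amount of
# human survey data, leaving anchors and prompts unchanged.
import numpy as np

def fit_temperature(anchor_sims, human_dists, grid=np.linspace(0.01, 1.0, 100)):
    """Pick the temperature whose SSR distributions best match human ones.

    anchor_sims: (n_responses, 5) cosine similarities to the anchor statements
    human_dists: (n_surveys, 5) observed human Likert distributions; for
                 simplicity, match against their mean distribution.
    """
    target = human_dists.mean(axis=0)
    best_t, best_err = None, np.inf
    for t in grid:
        logits = anchor_sims / t
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        err = np.abs(probs.mean(axis=0) - target).sum()  # L1 gap to human marginal
        if err < best_err:
            best_t, best_err = t, err
    return best_t
```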

References

Finally, there is an open question about combining SSR with light fine-tuning approaches. Although we deliberately avoided training data here to demonstrate generality, hybrid methods where SSR is used in tandem with calibration or prompt optimization may achieve even higher fidelity.

LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings (2510.08338 - Maier et al., 9 Oct 2025) in Discussion and Conclusion (final paragraph)