Consistent Difficulty Calibration Across LSC Editions

Establish a fully consistent, standardized protocol for calibrating task difficulty across different editions of the ACM Lifelog Search Challenge to enable fair year-over-year comparisons of interactive lifelog retrieval systems under the live competitive format.

Background

The paper compares interactive lifelog retrieval systems across LSC'22, LSC'23, and LSC'24, noting that task difficulty can vary year-to-year, especially for subjective or open-ended tasks. Although average scores per task type are used to mitigate variability, achieving a consistent difficulty calibration remains challenging.
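
As a rough illustration of the averaging step mentioned above, the following minimal sketch groups scores by edition and task type and takes the mean of each group. It is not the authors' scoring procedure: the record layout, the task-type labels (`type_a`, `type_b`), the system names, and all score values are hypothetical placeholders rather than LSC results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (edition, task_type, system, score).
# All values are illustrative placeholders, not actual LSC scores.
records = [
    ("LSC'22", "type_a", "system_1", 80.0),
    ("LSC'22", "type_a", "system_2", 60.0),
    ("LSC'23", "type_a", "system_1", 70.0),
    ("LSC'23", "type_b", "system_1", 50.0),
    ("LSC'23", "type_b", "system_2", 30.0),
]


def mean_score_per_task_type(rows):
    """Return the mean score for each (edition, task_type) pair."""
    grouped = defaultdict(list)
    for edition, task_type, _system, score in rows:
        grouped[(edition, task_type)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


averages = mean_score_per_task_type(records)
for (edition, task_type), avg in sorted(averages.items()):
    print(f"{edition} {task_type}: {avg:.1f}")
```

Averaging in this way smooths out variation between individual tasks, but it does not correct for a whole task type being harder in one edition than in another, which is the gap a calibration protocol would need to address.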

The authors note that making difficulty fully comparable across editions may require re-running historical systems on current tasks with the same users and interfaces, which is not feasible in the live competition setup. They therefore explicitly identify the need for an agreed-upon calibration protocol as an open challenge.

References

Although we mitigate this by averaging scores across multiple tasks and reporting per-task-type performance, a fully consistent difficulty calibration across years remains an open challenge.

The State-of-the-Art in Lifelog Retrieval: A Review of Progress at the ACM Lifelog Search Challenge Workshop 2022-24 (2506.06743 - Tran et al., 7 Jun 2025) in Section Conclusion, Subsection Limitations