Explain the MSMARCO metric discrepancy for Jasper embeddings
Determine the cause of the MSMARCO evaluation discrepancy observed for the Jasper text embedding model (jasper_en_vision_language_v1), namely why Normalized Discounted Cumulative Gain (NDCG) and Mean Reciprocal Rank (MRR) are high while Mean Average Precision (MAP) is very low, given that the teacher models stella_en_1.5B_v5 and NV-Embed-v2 do not exhibit this behavior.
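For intuition on how these metrics can diverge, note that MRR only rewards the rank of the first relevant hit and NDCG@10 compares the top 10 results against an ideal top 10, while MAP divides the summed precision over relevant hits by the total number of judged-relevant documents, whether retrieved or not. One way the observed pattern can arise in principle (illustrative only, not a claim about what happened here) is a query set with many judged-relevant documents per query combined with a shallow candidate list. A minimal sketch using pytrec_eval, the standard trec_eval wrapper; the query and document IDs are hypothetical:

    # pip install pytrec_eval
    import pytrec_eval

    # Hypothetical query with 100 judged-relevant documents d0..d99.
    qrels = {"q1": {f"d{i}": 1 for i in range(100)}}

    # A run that retrieves only 10 documents, all of them relevant,
    # ranked d0 (highest score) through d9.
    run = {"q1": {f"d{i}": float(10 - i) for i in range(10)}}

    evaluator = pytrec_eval.RelevanceEvaluator(
        qrels, {"map", "recip_rank", "ndcg_cut.10"}
    )
    print(evaluator.evaluate(run)["q1"])
    # -> {'map': 0.1, 'recip_rank': 1.0, 'ndcg_cut_10': 1.0}

Here MRR and NDCG@10 are perfect while MAP is only 0.1, because average precision sums the precision at each of the 10 relevant hits (1.0 each) and divides by all 100 judged-relevant documents.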
References
After the release of the Jasper model, an enthusiastic user (raghavlite, https://huggingface.co/raghavlite) pointed out that its MSMARCO NDCG/MRR scores are perfect while its MAP score is very low. Jasper is distilled from stella_en_1.5B_v5 and NV-Embed-v2, and neither teacher's MSMARCO scores show this pattern. As of now, we still haven't been able to figure out what happened.
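A natural first step in investigating is to re-run the evaluation and inspect the per-metric results directly. A hedged sketch, assuming the reported scores come from the MTEB harness and that the checkpoint loads via sentence-transformers; the "NovaSearch/" organization prefix below is an assumption, since only the model name jasper_en_vision_language_v1 is given above:

    # pip install mteb sentence-transformers
    import mteb
    from sentence_transformers import SentenceTransformer

    # Assumed Hub ID; only the model name is given above,
    # the org prefix is a guess.
    model = SentenceTransformer(
        "NovaSearch/jasper_en_vision_language_v1", trust_remote_code=True
    )

    tasks = mteb.get_tasks(tasks=["MSMARCO"])  # MTEB's MS MARCO retrieval task
    evaluation = mteb.MTEB(tasks=tasks)

    # Writes per-task JSON results (NDCG/MRR/MAP at several cutoffs)
    # under ./results, which can then be compared against equivalent
    # runs of stella_en_1.5B_v5 and NV-Embed-v2.
    results = evaluation.run(model, output_folder="results")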