MAP@100: Ranking Quality Metric

Updated 9 March 2026

MAP@100 is a metric that quantifies ranking quality by calculating average precision over only the top 100 results, integrating both relevance and rank order.
It is applied in query evaluation and large-scale image retrieval, and deep learning pipelines can optimize it directly through differentiable surrogates such as smooth histogram binning.
Analytic baselines and statistical significance tests using MAP@100 provide a rigorous framework for benchmarking and validating improvements in ranking algorithms.

Mean Average Precision at 100 (MAP@100) is a widely used metric in information retrieval and recommender systems, designed to evaluate the quality of ranking algorithms when only the top 100 results are of interest. MAP@100 integrates both the relevance and ranking positions of retrieved items, and its application spans query-based evaluation, large-scale image retrieval, and statistical significance testing of ranking improvements (Manzhos et al., 4 Nov 2025, Revaud et al., 2019).

1. Formal Definitions and Notational Framework

Let $N$ denote the total number of candidate items and $R$ the number of relevant items among them ( $R\leq N$ ). For a retrieved list truncated at position $k=100$ :

$\mathrm{rel}(i)\in\{0,1\}$ indicates relevance of item at position $i$ .
$\mathrm{P}@i = \frac{1}{i} \sum_{j=1}^i \mathrm{rel}(j)$ is the precision at position $i$ .
$M=\min(R,100)$ is the normalization denominator.
The $k$ th harmonic number is $H_k = \sum_{i=1}^k 1/i$ ; $H_k^{(2)} = \sum_{i=1}^k 1/i^2$ .

For a single ranking, the Average Precision at 100 is

$AP@100 = \frac{1}{M} \sum_{i=1}^{100} \mathrm{P}@i \cdot \mathrm{rel}(i) .$

For a set $U$ of $Q$ queries or users, each with its own $AP@100_u$ ,

$MAP@100 = \frac{1}{Q}\sum_{u=1}^Q AP@100_u.$

This definition ensures that MAP@100 is sensitive to both the number and distribution of relevant items within the top 100 ranks (Manzhos et al., 4 Nov 2025, Revaud et al., 2019).
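The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `ap_at_k` and `map_at_k` are hypothetical, and returning 0 for a query with no relevant items is an assumed convention that the text does not specify.

```python
import numpy as np

def ap_at_k(rel, num_relevant, k=100):
    """AP@k for one ranked list.

    rel: 0/1 relevance labels in ranked order (position 1 first).
    num_relevant: R, the total number of relevant items for this query.
    """
    rel = np.asarray(rel, dtype=float)[:k]      # truncate at the cutoff
    m = min(num_relevant, k)                    # M = min(R, k)
    if m == 0:
        return 0.0                              # assumed convention, not from the text
    ranks = np.arange(1, rel.size + 1)
    precision = np.cumsum(rel) / ranks          # P@i for each position i
    return float(np.sum(precision * rel) / m)   # only relevant positions contribute

def map_at_k(rel_lists, num_relevant_list, k=100):
    """Mean of AP@k over all queries or users."""
    aps = [ap_at_k(r, n, k) for r, n in zip(rel_lists, num_relevant_list)]
    return float(np.mean(aps))
```

For example, `ap_at_k([1, 0, 1, 0], num_relevant=2)` evaluates $(1 + 2/3)/2 \approx 0.833$: the two relevant items sit at positions 1 and 3, where the precision is 1 and 2/3.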

2. Random Baseline: Expectation and Variance under Uniform Rankings

MAP@100 values are easier to interpret against analytic baselines under the random-ranking model, in which the $R$ relevant items occupy uniformly random positions among the $N$ candidates (sampling without replacement).

The expectation of $AP@100$ is

$E[AP@100] = \frac{R}{N\,M} \left(\frac{R-1}{N-1}\cdot 100 + \frac{N-R}{N-1} H_{100}\right),$

where $M = \min(R,100)$ (Manzhos et al., 4 Nov 2025). This establishes the expected MAP@100 achievable by chance.
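As a sanity check on the closed form, the sketch below evaluates it for a general cutoff $k$ and compares it with exhaustive enumeration on a tiny instance. The helper names are hypothetical, and the brute-force routine is only feasible for very small $N$; it is included to illustrate the random-ranking model, not as a practical tool.

```python
from itertools import permutations

def harmonic(k):
    """k-th harmonic number H_k."""
    return sum(1.0 / i for i in range(1, k + 1))

def expected_ap_at_k(n, r, k=100):
    """Closed-form E[AP@k] under uniformly random rankings.

    Assumes 1 <= r <= n, n >= 2 and k <= n.
    """
    m = min(r, k)
    return (r / (n * m)) * ((r - 1) / (n - 1) * k + (n - r) / (n - 1) * harmonic(k))

def brute_force_expected_ap(n, r, k):
    """Average AP@k over every ordering of r relevant and n - r irrelevant items."""
    labels = [1] * r + [0] * (n - r)
    m = min(r, k)
    total, count = 0.0, 0
    for perm in permutations(labels):           # each arrangement is equally likely
        hits, ap = 0, 0.0
        for i, rel in enumerate(perm[:k], start=1):
            hits += rel
            if rel:
                ap += hits / i                  # P@i at a relevant position
        total += ap / m
        count += 1
    return total / count
```

With $N=5$, $R=2$, $k=3$, both routes give $0.425$, which follows from summing $E[\mathrm{P}@i\cdot\mathrm{rel}(i)] = \frac{R}{N}\left(\frac{N-R}{N-1}\cdot\frac{1}{i} + \frac{R-1}{N-1}\right)$ over $i=1,2,3$ and dividing by $M=2$.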

The variance has the form

$\mathrm{Var}(AP@100) = \frac{1}{M^2} \frac{R}{N} \Big[100(C + 2(E-F) + 99 G) + H_{100}(B-2(E - 100 F)) + H^{(2)}_{100}(A-D) + H_{100}^2 D \Big],$

with explicit expressions for $A, B, C, D, E, F, G$ as functions of $N$ and $R$ . For $Q$ independent queries,

$E[MAP@100] = \frac{1}{Q} \sum_{u=1}^Q E[AP@100_u],\qquad \mathrm{Var}(MAP@100) = \frac{1}{Q^2} \sum_{u=1}^Q \mathrm{Var}(AP@100_u),$

and for homogeneous queries $(N_u = N, R_u = R)$ ,

$E[MAP@100] = E[AP@100],\quad \mathrm{Var}(MAP@100) = \frac{1}{Q} \mathrm{Var}(AP@100).$

These baselines are fundamental for statistical testing and for contextualizing observed system performance (Manzhos et al., 4 Nov 2025).
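Because the coefficients $A$ through $G$ are not written out here, one way to obtain a numerical baseline is Monte Carlo simulation of the random-ranking model. The sketch below is such an approximation under stated assumptions: the function names, the trial count, and the seed are illustrative choices, and the estimates carry sampling error that the closed forms do not.

```python
import numpy as np

def simulate_ap_at_k(n, r, k=100, trials=20000, seed=0):
    """Monte Carlo estimate of the mean and variance of AP@k under random rankings."""
    rng = np.random.default_rng(seed)
    m = min(r, k)
    ranks = np.arange(1, k + 1)
    aps = np.empty(trials)
    for t in range(trials):
        # 0-based positions of the r relevant items in a uniformly random ranking
        pos = rng.choice(n, size=r, replace=False)
        rel = np.zeros(k)
        rel[pos[pos < k]] = 1.0                  # keep only hits inside the top k
        aps[t] = np.sum(np.cumsum(rel) / ranks * rel) / m
    return float(aps.mean()), float(aps.var(ddof=1))

def map_baseline(ap_means, ap_vars):
    """Combine per-query AP moments into E[MAP] and Var(MAP) for independent queries."""
    q = len(ap_means)
    return float(np.sum(ap_means) / q), float(np.sum(ap_vars) / q**2)
```

For heterogeneous queries, `simulate_ap_at_k` would be called once per distinct $(N_u, R_u)$ pair and the results passed to `map_baseline`, mirroring the $1/Q$ and $1/Q^2$ factors in the formulas above.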

3. MAP@100 for Batched Image Retrieval and Listwise Optimization

In deep image retrieval systems, MAP@100 is computed as follows. Let $d_i \in \mathbb{R}^C$ be normalized descriptors and $S_{i}^q = d_q^\top d_i$ the cosine similarity. The database $\{I_i\}_{i=1}^N$ is searched for relevant items corresponding to query $q$ . The definitions proceed as:

$Y_i^q\in\{0,1\}$ : relevance label for query $q$ , database item $i$ .
$N^q = \sum_{i=1}^{N} Y_i^q$ : total relevant images.

Truncated average precision at $K=100$ ,

$AP@100(S^q, Y^q) = \sum_{k=1}^{100} P_k(S^q, Y^q) \cdot \Delta r_k(S^q, Y^q),$

where $P_k$ is the precision at rank $k$ and $\Delta r_k$ is the increase in recall from rank $k-1$ to rank $k$ . Over $Q$ queries,

$MAP@100 = \frac{1}{Q} \sum_{q=1}^Q AP@100(S^q, Y^q).$

This formulation aligns with practical retrieval and learning pipelines (Revaud et al., 2019).
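A batched version of this computation might look like the following NumPy sketch. The function name and argument layout are assumptions made for illustration. Note that the recall increment here divides by $N^q$, as in the formula of this section, which differs from the $\min(R,100)$ normalizer of Section 1 whenever a query has more than 100 relevant items.

```python
import numpy as np

def map_at_k_from_descriptors(queries, database, labels, k=100):
    """MAP@k from L2-normalized descriptors.

    queries: (Q, C) array, database: (N, C) array, labels: (Q, N) array in {0, 1}.
    Returns the mean over queries and the per-query AP@k values.
    """
    queries = np.asarray(queries, dtype=float)
    database = np.asarray(database, dtype=float)
    labels = np.asarray(labels)
    sims = queries @ database.T                               # (Q, N) cosine similarities
    k = min(k, sims.shape[1])
    order = np.argsort(-sims, axis=1, kind="stable")[:, :k]   # top-k indices per query
    rel = np.take_along_axis(labels, order, axis=1).astype(float)
    precision = np.cumsum(rel, axis=1) / np.arange(1, k + 1)  # P_k at each rank
    n_rel = labels.sum(axis=1)                                # N^q per query
    delta_recall = rel / np.maximum(n_rel, 1)[:, None]        # recall increment per rank
    ap = np.sum(precision * delta_recall, axis=1)
    return float(ap.mean()), ap
```

A full sort is used for clarity; for large databases a partial selection of the top $k$ scores would avoid sorting all $N$ items.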

4. Differentiable Surrogates: Smooth Histogram-Binning for AP@100

Classic AP@100 calculation is non-differentiable due to explicit sorting and indicator functions. A smooth surrogate is constructed by soft-binning the score axis $[-1,1]$ into $M$ bins of width $\Delta=2/(M-1)$ , each centered at $b_m = 1 - (m-1)\Delta$ . In this section and the next, $M$ denotes the number of bins, not the normalizer $\min(R,100)$ of Section 1. The kernel

$\delta(x, m) = \max(1 - |x-b_m|/\Delta, 0)$

provides a soft assignment, and soft histograms for all and relevant items are accumulated:

$C_{q, m}^{all} = \sum_{i=1}^N \delta(S_i^q, m)$ ,
$C_{q, m}^{rel} = \sum_{i=1}^N \delta(S_i^q, m) Y_i^q$ .

Approximated precision and recall per bin are

$\hat P_{q,m} = \frac{\sum_{m'=1}^m C_{q,m'}^{rel}}{\sum_{m'=1}^m C_{q,m'}^{all}}, \qquad \Delta \hat r_{q,m} = \frac{C_{q,m}^{rel}}{N^q},$

yielding quantized AP,

$AP_Q(S^q, Y^q) = \sum_{m=1}^M \hat P_{q,m} \cdot \Delta \hat r_{q,m}.$

This AP surrogate is differentiable w.r.t. network parameters, enabling direct end-to-end optimization (Revaud et al., 2019).
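The soft-binning steps can be sketched for a single query as follows. This is an illustrative NumPy version under stated assumptions: the name `quantized_ap`, the default of 20 bins, and the small constant guarding against division by zero are choices made here, not details taken from the source. Every operation is a sum, product, or piecewise-linear function, so the same code written in an automatic-differentiation framework would yield gradients with respect to the scores.

```python
import numpy as np

def quantized_ap(scores, labels, num_bins=20):
    """Histogram-binned AP surrogate for one query.

    scores: (N,) cosine similarities in [-1, 1]; labels: (N,) in {0, 1}.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    delta = 2.0 / (num_bins - 1)                       # bin width
    centers = 1.0 - delta * np.arange(num_bins)        # b_m, from +1 down to -1
    # Triangular kernel: (num_bins, N) soft assignment of each score to each bin
    weights = np.maximum(1.0 - np.abs(scores[None, :] - centers[:, None]) / delta, 0.0)
    c_all = weights.sum(axis=1)                        # soft count of all items per bin
    c_rel = weights @ labels                           # soft count of relevant items per bin
    precision = np.cumsum(c_rel) / np.maximum(np.cumsum(c_all), 1e-16)
    delta_recall = c_rel / max(labels.sum(), 1.0)
    return float(np.sum(precision * delta_recall))
```

When all relevant items score 1 and all irrelevant items score -1, the surrogate evaluates to 1, as expected for a perfect ranking. Items that fall into the same bin are treated as tied, so the surrogate can differ from the sorted AP when scores cluster.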

5. Computation, Training, and Memory Considerations

Sorting-based $AP@100$ requires $O(N \log N)$ operations per query. The histogram-binning approximation bypasses explicit sorting at a computational cost of $O(NM)$ per query, where $M$ is the number of bins. For batched training, $N=B$ (batch size), yielding $O(B^2 M)$ operations per batch. Memory usage is dominated by the $B \times B$ similarity matrix and the $B \cdot C$ descriptor entries, for a total footprint of $O(B^2 + B C)$ . For example, with $B=4096$ and descriptor dimension $C=2048$ , the similarity matrix holds about 16.8 million entries (64 MiB in 32-bit floats) and the descriptors about 8.4 million entries (32 MiB), well within typical GPU memory budgets (Revaud et al., 2019).

Training with this surrogate involves:

Forward-passing all $B$ images to obtain descriptors.
Computing the similarity matrix, the AP surrogates, and the loss gradients with respect to the descriptors ( $O(B^2 M)$ memory and compute).
Backpropagating gradients by recomputing each image’s forward pass individually, eliminating the need to store all activations.

This staged procedure makes efficient use of GPU memory and is reported to provide 2–3 $\times$ speed-ups over alternative approaches such as hard-negative mining (Revaud et al., 2019).

6. Statistical Significance and Null Model Interpretations

MAP@100 values are conventionally interpreted relative to random-ranking baselines. Compute the mean ( $\mu_0 = E[MAP@100]$ ) and standard deviation ( $\sigma_0 = \sqrt{\mathrm{Var}(MAP@100)}$ ) for the random model. Given an observed $MAP@100_\mathrm{obs}$ , the standardized $z$ -score is:

$z = \frac{MAP@100_\mathrm{obs} - \mu_0}{\sigma_0} .$

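Under the null (random) hypothesis, $z$ is approximately standard normal when $Q$ is large enough for the average over independent queries to be close to Gaussian. $|z|>1.96$ corresponds to two-sided significance at $p<0.05$ ; for a one-sided test of improvement over chance, the same level is reached at $z>1.645$ . This framework enables researchers to rigorously assess whether observed ranking gains exceed those explainable by chance, with analytic baselines for mean and fluctuation scale (Manzhos et al., 4 Nov 2025).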
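The test can be sketched with the standard library alone. The function name `map_z_test` is hypothetical, and the baseline moments $\mu_0$ and $\sigma_0$ are assumed to have been obtained separately, either from the closed forms or by simulation.

```python
import math

def map_z_test(map_obs, mu0, sigma0):
    """z-score of an observed MAP@100 against the random-ranking baseline.

    Returns (z, one-sided p-value for improvement over chance, two-sided p-value),
    using the normal approximation.
    """
    z = (map_obs - mu0) / sigma0
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))    # P(Z >= z)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))     # P(|Z| >= |z|)
    return z, p_one_sided, p_two_sided
```

In practice $\sigma_0$ is often very small for large $Q$, so even modest systems produce large $z$ values; the score is most informative when comparing against chance on small or heavily skewed test sets.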

7. Practical Relevance and Context Among Metrics

MAP@100 is preferred in scenarios where only the top-ranked results are critical, such as web search, recommender system outputs, and image retrieval tasks. Compared to untruncated MAP, MAP@100 more closely models user-facing scenarios where lower-ranked results are rarely examined. The differentiable surrogates developed for deep learning pipelines facilitate direct optimization of retrieval objectives, outperforming proxy loss functions or heuristic approaches (Revaud et al., 2019). The closed-form random baselines further enhance MAP@100’s interpretability and robustness for benchmarking systems (Manzhos et al., 4 Nov 2025).

References:

(Manzhos et al., 4 Nov 2025): "Average Precision at Cutoff k under Random Rankings: Expectation and Variance"
(Revaud et al., 2019): "Learning with Average Precision: Training Image Retrieval with a Listwise Loss"
