First-Proposal Bias
- First-proposal bias is a systematic tendency for initial information, such as a prior rejection, to disproportionately sway subsequent evaluations.
- Experimental evidence in academic peer review shows that signaling prior rejection leads to significantly lower scores, with measurable effects such as a mean score drop of 0.78 on a 10-point scale.
- This bias spans diverse domains—from algorithmic object detection to group decision-making—and can be mitigated through masking prior outcomes, calibration techniques, and inverse propensity scoring.
First-proposal bias is a systematic tendency for decisions, rankings, or learning processes to be disproportionately influenced by the initial exposure, availability, or ordering of candidate proposals. It is especially pronounced when information about a proposal's prior status is revealed, when feedback or labels are sparse, or when evaluation protocols favor the first or most visible options. The phenomenon appears across scientific peer review, recommender systems, object detection, statistical modeling, and collaborative decision processes, with significant implications for fairness, robustness, and interpretability in both human and automated systems.
1. Experimental Evidence in Academic Peer Review
First-proposal bias has been demonstrated quantitatively in conference peer review, most notably among novice reviewers. In a randomized controlled trial simulating machine learning conference review conditions, 133 novice reviewers (master's students, junior PhD students, and recent graduates from top US universities) each received anonymized papers to review. When reviewers were notified that a paper had previously been rejected at a similar venue, they assigned significantly lower overall scores to these resubmissions than to control (non-signaled) papers (mean difference Δ = -0.78 on a 10-point Likert scale, 95% CI = [-1.30, -0.24], p = 0.036).
The effect was most pronounced for the "quality" criterion (Δ = -0.46, Cohen's d = -0.23, p = 0.005) and also significant for "clarity" and "significance"; the "originality" score was not statistically affected (p = 0.105), and reviewer confidence did not change (p = 0.902). A relative effect size analysis showed that only 42% of the test reviews (those notified of resubmission) scored above their control counterparts. This establishes a clear, content-independent negative shift: knowledge of prior rejection increases the probability of a more negative evaluation, irrespective of actual merit (Stelmakh et al., 2020).
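The shape of this analysis can be reproduced with a standard two-sample comparison. The sketch below runs on simulated scores; the group sizes, means, and spreads are assumptions for illustration, not the study's data, and the study's exact test may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated 10-point overall scores (assumed, NOT the study's data): control
# reviews vs. reviews where prior rejection was signaled (-0.78 mean shift).
control  = np.clip(rng.normal(5.5, 1.8, size=66), 1, 10)
signaled = np.clip(rng.normal(5.5 - 0.78, 1.8, size=67), 1, 10)

# Welch's t-test for the difference in means.
t_stat, p_value = stats.ttest_ind(signaled, control, equal_var=False)
delta = signaled.mean() - control.mean()

# Percentile-bootstrap 95% CI for the mean difference.
boot = [rng.choice(signaled, signaled.size).mean()
        - rng.choice(control, control.size).mean() for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"delta = {delta:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}], p = {p_value:.3f}")
```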
2. Mechanisms and Psychological Underpinnings
First-proposal bias is associated with several cognitive phenomena:
- Anchoring: Reviewers or decision-makers fixate on the first information received (e.g., rejection signal), disproportionately influencing their judgments.
- Outcome Bias: Knowledge of past outcomes (e.g., rejection) colors subsequent assessment, regardless of current content.
- Social Proof: If a proposal was previously rejected, reviewers may infer broad, if implicit, consensus that it is weaker, reducing their willingness to endorse it.
- Confirmation Bias: The revealed prior status invites selective emphasis on weaknesses, leading to harsher evaluations.
These biases operate even in analytic, high-stakes tasks such as paper review, and they remain robust under randomization and when reviewers are given clear criteria. The numerical magnitude (sub-point changes) is small, but it is sufficient to dramatically alter acceptance probabilities at top conferences: a one-point score shift changes acceptance odds by a factor of roughly 5 to 6 (Stelmakh et al., 2020).
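To see why a sub-point shift matters, the odds arithmetic can be worked through directly; in the sketch below, the 5.5x-per-point odds factor and the 25% baseline acceptance rate are illustrative assumptions, not figures from the study.

```python
# Hypothetical worked example: if a one-point score change multiplies
# acceptance odds by ~5.5, a 0.78-point drop multiplies odds by:
factor = 5.5 ** 0.78                     # ~= 3.8

p0 = 0.25                                # assumed baseline acceptance rate
odds0 = p0 / (1 - p0)                    # baseline odds
odds1 = odds0 / factor                   # odds after the score drop
p1 = odds1 / (1 + odds1)                 # back to a probability

print(f"odds shrink by ~{factor:.1f}x; acceptance {p0:.0%} -> {p1:.1%}")
```

Under these assumptions, a 0.78-point drop cuts a paper's acceptance probability from 25% to roughly 8%.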
3. Manifestations in Broader Scientific Review and Ranking
First-proposal bias also persists in proposal ranking systems outside the conference context. In ALMA (Atacama Large Millimeter/submillimeter Array) proposal review, first-time principal investigators (PIs) receive systematically poorer Stage 1 rankings than experienced PIs, as quantified by Anderson-Darling k-sample test p-values below 10^-5 across cycles, indicating a significant distributional shift. The Stage 2 face-to-face panel discussions do not correct this bias; the cumulative distribution functions of ranks remain similar before and after panel deliberation. The bias magnitude is on par with known regional effects (Europe and North America outperform Chile and East Asia) and much larger than gender differences, which are marginal or inconsistent (Carpenter, 2019).
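The distributional comparison behind this finding can be illustrated with SciPy's k-sample Anderson-Darling test. The rank samples below are synthetic stand-ins rather than ALMA data; note that SciPy clips the returned significance level to the range [0.001, 0.25].

```python
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(1)

# Synthetic stand-ins for normalized Stage 1 ranks (lower = better): the
# first-time-PI distribution is shifted toward worse ranks by an assumed 0.08.
ranks_experienced = rng.uniform(0.0, 1.0, size=800)
ranks_first_time  = np.clip(rng.uniform(0.0, 1.0, size=300) + 0.08, 0.0, 1.0)

# Anderson-Darling k-sample test for a distributional difference.
result = anderson_ksamp([ranks_first_time, ranks_experienced])
print(result.statistic, result.significance_level)  # p clipped to [0.001, 0.25]
```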
4. First-Proposal Bias in Algorithmic and Data Protocols
Object Detection and Evaluation Protocols
Evaluation protocols that reward overlap with annotated objects create systemic first-proposal bias. When object proposal algorithms are scored only on partially annotated datasets, methods tailored to the specific annotated categories (such as concatenated detector outputs) achieve artificially high recall by concentrating proposals on those categories, ignoring genuine category independence. This "gameability" is measurable: methods trained on increasing numbers of categories show recall growth only if they are category-dependent, whereas category-independent methods remain flat. Thus, partially annotated datasets amplify first-proposal bias and overestimate true generalization capacity (Chavali et al., 2015).
A summary comparison:
| Protocol/Dataset | Amplifies First-Proposal Bias | True Category Independence |
|---|---|---|
| Partially Annotated Evaluation | Yes | No |
| Fully Annotated Evaluation | No (or much less) | Yes |
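The mechanics of such recall-based scoring are easy to state in code. The sketch below (helper names are ours) shows a standard IoU recall computation; on a partially annotated dataset, the `annotations` list simply omits unlabeled categories, which is exactly what makes the protocol gameable.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def recall_at_iou(proposals, annotations, thresh=0.5):
    """Fraction of annotated boxes covered by at least one proposal.
    Unannotated objects contribute nothing, so concentrating proposals
    on the annotated categories inflates this number."""
    hits = sum(any(iou(p, gt) >= thresh for p in proposals)
               for gt in annotations)
    return hits / len(annotations)
```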
Open-World, Few-Shot, and Open-Vocabulary Detection
In few-shot object detection, the proposal distribution learned in base training (over labeled base classes) fails to generalize to novel classes, resulting in misaligned, inconsistent candidate regions for new objects. The "first-proposal bias" here stems from treating unlabeled (novel) objects as background, which skews proposal offsets and statistical distributions. Proposal Distribution Calibration (PDC) tackles this by sampling proposals for novel instances matching base class statistics, with dedicated losses (classification, localization, contrastive) to bridge the gap (up to +6.7 AP50 improvement for 1-shot Pascal VOC splits) (Li et al., 2022).
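A schematic of the calibration idea (not PDC's actual implementation; the function, parameter names, and Gaussian offset model are our assumptions): sample proposals around each novel-class ground-truth box so that their offset statistics match those collected over base-class proposals during base training.

```python
import numpy as np

def calibrated_proposals(gt_box, offset_mean, offset_std, n=64, rng=None):
    """Sample n proposals around a novel-class ground-truth box.

    gt_box: [cx, cy, w, h]; offset_mean/offset_std: per-coordinate statistics
    of (proposal - ground truth) offsets observed over base-class proposals,
    expressed relative to box size as in standard box encodings.
    """
    rng = rng or np.random.default_rng()
    offsets = rng.normal(offset_mean, offset_std, size=(n, 4))
    scale = np.array([gt_box[2], gt_box[3], gt_box[2], gt_box[3]])
    return np.asarray(gt_box) + offsets * scale
```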
In open-vocabulary object detection, model confidence is disproportionately high for base classes, with proposals for novel classes under-scored due to feature and training imbalance. MEDet applies region-level mining and post-hoc class-wise logit adjustment (using cluster statistics) to suppress overrepresented base classes, balancing predictions for novel categories (Chen et al., 2022).
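MEDet's adjustment is derived from feature-cluster statistics; as a generic stand-in, a post-hoc class-wise logit adjustment that penalizes overrepresented classes can be sketched as follows (the frequency-based prior and the names are assumptions).

```python
import numpy as np

def classwise_logit_adjustment(logits, class_counts, tau=1.0):
    """Subtract a log-prior penalty per class so that frequent (base) classes
    are suppressed and rare (novel) classes are boosted at prediction time.

    logits: (num_proposals, num_classes); class_counts: assumed per-class
    frequencies (e.g., cluster sizes); tau: adjustment strength.
    """
    prior = class_counts / class_counts.sum()
    return logits - tau * np.log(prior + 1e-12)
```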
THPN in open-world proposals uses hybrid scoring (classification and localization-based heads) and open-world self-training, explicitly designed to mitigate first-proposal/label bias. The tunable architecture allows control over the trade-off between known and unknown object recall, efficiently improving OOD detection in highly label-biased or sparsely labeled environments (Inkawhich et al., 2022).
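The hybrid-scoring idea can be caricatured as a blend of the two heads' outputs; the geometric form and the `alpha` knob below are our illustration, not THPN's exact formula.

```python
def hybrid_objectness(cls_score, loc_score, alpha=0.5):
    """Blend classification-based and localization-based objectness.
    alpha -> 1 trusts the (label-biased) classifier; alpha -> 0 trusts the
    class-agnostic localization head, favoring unknown-object recall."""
    return (cls_score ** alpha) * (loc_score ** (1.0 - alpha))
```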
5. Selection Bias and Preference Elicitation
Selection bias in initial ("first-proposal") preference elicitation interactions leads to persistent overrepresentation and compounding error in conversational recommender systems. Early exposure to particular topics or genres biases the system toward them, amplifying downstream item recommendation error. Inverse Propensity Scoring (IPS) robustly mitigates these effects in semi-simulated and synthetic experiments, significantly improving test set error and ranking performance over naïve approaches (Gupta et al., 2024); a minimal IPS estimator is sketched after the table below.
| Stage | Bias Manifestation | Impact if Ignored | Debiasing Methods |
|---|---|---|---|
| Early preference elicitation (first proposals) | Topic over-/under-representation | Recommendations locked in to early bias | IPS, exposure models |
| All stages | Compounding feedback/learning bias | Reinforcement of initial bias | Exploration, unbiased evaluation |
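The IPS estimator referenced above can be written in a few lines; the exposure propensities are assumed to be known, and the numbers are purely illustrative.

```python
import numpy as np

def ips_estimate(rewards, propensities):
    """Inverse propensity scoring: reweight observed feedback by the inverse
    probability of exposure, correcting for the overrepresentation of early
    (first-proposal) topics in the logged interactions."""
    return np.mean(rewards / propensities)

# Illustrative (assumed) numbers: topic A was exposed 4x as often as topic B,
# so naive averaging overweights feedback on A; IPS undoes the imbalance.
rewards      = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
propensities = np.array([0.8, 0.8, 0.8, 0.2, 0.2])
print(ips_estimate(rewards, propensities))
```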
6. First-Proposal Bias in Group Decision-Making
In collective decision scenarios modeled as drift-diffusion processes or sequential evidence accumulation, the first agents to make a decision typically possess the strongest initial bias, and their choices overwhelmingly reflect this bias regardless of the objective evidence or correct answer. In contrast, late-deciding individuals act as if unbiased and are statistically more accurate. In the asymptotic regime of large group size, the probability that an agent without extreme bias makes the earliest decision decays to zero. This demonstrates that fast group decisions encode initial predispositions, whereas slow decisions reflect evidence (Linn et al., 2023).
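A minimal drift-diffusion simulation of this effect (all parameters, including the fraction of strongly biased agents, are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n_agents, n_trials = 50, 500
dt, drift, noise, thresh = 0.01, 0.2, 1.0, 1.0

biased_first = 0
for _ in range(n_trials):
    # Initial conditions encode individual bias: 5 of 50 agents start close
    # to a decision threshold (strong bias); the rest start near zero.
    x = rng.uniform(-0.3, 0.3, n_agents)
    strong = rng.choice(n_agents, size=5, replace=False)
    x[strong] = rng.choice([-0.8, 0.8], size=5)
    while True:
        x += drift * dt + noise * np.sqrt(dt) * rng.normal(size=n_agents)
        hit = np.abs(x) >= thresh
        if hit.any():                      # someone decided: stop the trial
            biased_first += hit[strong].any()
            break

# Strongly biased agents are 10% of the group but decide first far more often.
print(f"P(first decider is strongly biased) ~= {biased_first / n_trials:.0%}")
```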
7. Mitigation Strategies and Policy Implications
Mitigating first-proposal bias requires structural changes and targeted debiasing interventions:
- Review and Ranking: Masking resubmission status in peer review, reviewer training, cautious adoption of review-sharing policies, and attention to reviewer-experience effects (Stelmakh et al., 2020, Carpenter, 2019).
- Data and Algorithmic Protocols: Use fully (or near fully) annotated datasets, cross-dataset generalization tests, diagnostic measures of bias capacity, proposal calibration using base-class statistics and clustering (Chavali et al., 2015, Li et al., 2022, Chen et al., 2022, Inkawhich et al., 2022).
- Recommender Systems: Application of inverse propensity scoring, active exploration in preference elicitation, unbiased evaluation sets (Gupta et al., 2024).
- Experimental Design: Statistical debiasing at the equilibrium level (e.g., directional-derivative corrections in first-price pacing equilibria (FPPE) for A/B testing) (Liao et al., 2024).
- Collaborative Decision Processes: Downweighting early decisions in group consensus or sequential decision aggregation (Linn et al., 2023).
Summary
First-proposal bias is a pervasive, quantifiable phenomenon shaping outcomes wherever initial exposure, prior status, or presentation order interacts with human or algorithmic evaluation, learning, or decision processes. Its manifestations are robust across domains and can materially influence fairness, robustness, and accuracy. Empirical and theoretical work demonstrates that bias-aware protocols, debiasing methods, and critical evaluation practices are essential for reliable scientific, algorithmic, and collaborative decision-making.