Papers

Topics

Authors

Recent

View all

Assistant

AI Research Assistant

Well-researched responses based on relevant abstracts and paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses.

Gemini 2.5 Flash

Gemini 2.5 Flash 73 tok/s

Gemini 2.5 Pro 40 tok/s Pro

GPT-5 Medium 32 tok/s Pro

GPT-5 High 28 tok/s Pro

GPT-4o 75 tok/s Pro

Kimi K2 184 tok/s Pro

GPT OSS 120B 466 tok/s Pro

Claude Sonnet 4.5 35 tok/s Pro

2000 character limit reached

Potential fitting biases resulting from grouping data into variable width bins (1209.2690v1)

Published 12 Sep 2012 in physics.data-an, astro-ph.CO, astro-ph.IM, and hep-ex

Abstract: When reading peer-reviewed scientific literature describing any analysis of empirical data, it is natural and correct to proceed with the underlying assumption that experiments have made good faith efforts to ensure that their analyses yield unbiased results. However, particle physics experiments are expensive and time consuming to carry out, thus if an analysis has inherent bias (even if unintentional), much money and effort can be wasted trying to replicate or understand the results, particularly if the analysis is fundamental to our understanding of the universe. In this note we discuss the significant biases that can result from data binning schemes. As we will show, if data are binned such that they provide the best comparison to a particular (but incorrect) model, the resulting model parameter estimates when fitting to the binned data can be significantly biased, leading us to too often accept the model hypothesis when it is not in fact true. When using binned likelihood or least squares methods there is of course no a priori requirement that data bin sizes need to be constant, but we show that fitting to data grouped into variable width bins is particularly prone to produce biased results if the bin boundaries are chosen to optimize the comparison of the binned data to a wrong model. The degree of bias that can be achieved simply with variable binning can be surprisingly large. Fitting the data with an unbinned likelihood method, when possible to do so, is the best way for researchers to show their analyses are not biased by binning effects. Failing that, equal bin widths should be employed as a cross-check of the fitting analysis whenever possible.