Papers

Topics

Authors

Recent

View all

Detailed Answer

Quick Answer

Concise responses based on abstracts only

Detailed Answer

Well-researched responses based on abstracts and relevant paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses

Gemini 2.5 Flash

Gemini 2.5 Flash 48 tok/s

Gemini 2.5 Pro 48 tok/s Pro

GPT-5 Medium 26 tok/s Pro

GPT-5 High 19 tok/s Pro

GPT-4o 107 tok/s Pro

Kimi K2 205 tok/s Pro

GPT OSS 120B 473 tok/s Pro

Claude Sonnet 4 37 tok/s Pro

2000 character limit reached

Rao-Blackwellized Stochastic Gradients for Discrete Distributions (1810.04777v3)

Published 10 Oct 2018 in stat.ML and cs.LG

Abstract: We wish to compute the gradient of an expectation over a finite or countably infinite sample space having $K \leq \infty$ categories. When $K$ is indeed infinite, or finite but very large, the relevant summation is intractable. Accordingly, various stochastic gradient estimators have been proposed. In this paper, we describe a technique that can be applied to reduce the variance of any such estimator, without changing its bias---in particular, unbiasedness is retained. We show that our technique is an instance of Rao-Blackwellization, and we demonstrate the improvement it yields on a semi-supervised classification problem and a pixel attention task.

Citations (30)

View on Semantic Scholar