
Association Rule Mining using Maximum Entropy (1501.02143v1)

Published 9 Jan 2015 in cs.DB

Abstract: Recommendations based on behavioral data may be faced with ambiguous statistical evidence. We consider the case of association rules, relevant, e.g., for query and product recommendations. For example, suppose that a customer belongs to categories A and B, each of which is known to have positive correlation with buying product C. How do we estimate the probability that she will buy product C? For rare terms or products there may not be enough data to directly produce such an estimate; perhaps we never directly observed a connection between A, B, and C. What can we do when there is no support for estimating the probability by simply computing the observed frequency? In particular, what is the right thing to do when A and B give rise to very different estimates of the probability of C? We consider the use of maximum entropy probability estimates, which give a principled way of extrapolating probabilities of events that do not even occur in the data set. Focusing on the basic case of three variables, our main technical contributions are that (under mild assumptions): 1) there exists a simple, explicit formula that gives a good approximation of maximum entropy estimates, and 2) maximum entropy estimates based on a small number of samples are provably tightly concentrated around the true maximum entropy frequency that arises if we let the number of samples go to infinity. Our empirical work demonstrates the surprising precision of maximum entropy estimates across a range of real-life transaction data sets. In particular, we observe that the average absolute error of maximum entropy estimates is a factor of 3 to 14 smaller than that of independence or extrapolation estimates when the data used to make the estimates has low support. We believe that the same principle can be used to synthesize probability estimates in many settings.
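To make the three-variable setting concrete, the following is a minimal sketch of a maximum entropy estimate of P(C | A, B) computed from pairwise statistics only. It does not reproduce the explicit approximation formula mentioned in the abstract, which is not stated here; it uses iterative proportional fitting instead, a standard way to obtain the maximum entropy joint distribution that matches a consistent set of marginals. The function names and all numbers below are hypothetical and chosen for illustration only.

```python
from itertools import product

CELLS = list(product((0, 1), repeat=3))  # all (a, b, c) combinations


def pairwise_marginals(joint):
    """Return the (A,B), (A,C) and (B,C) marginal tables of a 2x2x2 joint."""
    ab = [[sum(joint[a][b][c] for c in (0, 1)) for b in (0, 1)] for a in (0, 1)]
    ac = [[sum(joint[a][b][c] for b in (0, 1)) for c in (0, 1)] for a in (0, 1)]
    bc = [[sum(joint[a][b][c] for a in (0, 1)) for c in (0, 1)] for b in (0, 1)]
    return ab, ac, bc


def maxent_joint(ab, ac, bc, sweeps=500):
    """Iterative proportional fitting, started from the uniform distribution.

    Each sweep rescales the joint so that it matches the target (A,B),
    then (A,C), then (B,C) marginals. Assumes the targets are consistent.
    """
    p = [[[0.125, 0.125] for _b in (0, 1)] for _a in (0, 1)]
    for _ in range(sweeps):
        cur = pairwise_marginals(p)[0]
        for a, b, c in CELLS:
            if cur[a][b] > 0:
                p[a][b][c] *= ab[a][b] / cur[a][b]
        cur = pairwise_marginals(p)[1]
        for a, b, c in CELLS:
            if cur[a][c] > 0:
                p[a][b][c] *= ac[a][c] / cur[a][c]
        cur = pairwise_marginals(p)[2]
        for a, b, c in CELLS:
            if cur[b][c] > 0:
                p[a][b][c] *= bc[b][c] / cur[b][c]
    return p


def prob_c_given_ab(p):
    """P(C=1 | A=1, B=1) under the joint distribution p."""
    return p[1][1][1] / (p[1][1][0] + p[1][1][1])


def bern(p1, x):
    """Probability that a Bernoulli(p1) variable takes the value x."""
    return p1 if x == 1 else 1.0 - p1


# Hypothetical ground truth: A and B are each positively correlated with C
# and are conditionally independent given C.
p_c = (0.8, 0.2)            # P(C=0), P(C=1)
p_a1_given_c = (0.2, 0.6)   # P(A=1 | C=c)
p_b1_given_c = (0.1, 0.5)   # P(B=1 | C=c)
true_joint = [[[p_c[c] * bern(p_a1_given_c[c], a) * bern(p_b1_given_c[c], b)
                for c in (0, 1)] for b in (0, 1)] for a in (0, 1)]

# Only the pairwise tables are passed to the estimator.
ab, ac, bc = pairwise_marginals(true_joint)
fitted = maxent_joint(ab, ac, bc)
print("P(C=1)              =", p_c[1])
print("maxent P(C=1 | A,B) =", prob_c_given_ab(fitted))
print("true   P(C=1 | A,B) =", prob_c_given_ab(true_joint))
```

In this toy setup the true joint has no three-way interaction, so the maximum entropy fit should agree with it closely; on real transaction data the pairwise tables would be estimated from counts and the fit would only approximate the true frequency.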

Citations (1)
