Preference-Perceptron: Interactive Preference Learning
- Preference-Perceptron is a class of algorithms that extend perceptron learning to interactive preference elicitation and collaborative filtering, integrating user choice feedback.
- It employs both neural and linear models with online updates and MILP-based query selection to manage high-dimensional and combinatorial data.
- The method offers scalability, fast convergence, and robust theoretical guarantees, proving effective in applications such as trip planning and PC configuration.
The Preference-Perceptron is a class of algorithms that extends perceptron-style online learning to problems of interactive preference elicitation or collaborative filtering, where user preferences inform the weight updates. These algorithms bridge the gap between classical logistic regression-based recommendation and more expressive neural or linear models capable of accommodating noisy, partial-preference feedback, especially in high-dimensional or combinatorial domains (Chakraborty, 2024, Dragone et al., 2017).
1. Model Definitions and Problem Settings
Preference-Perceptron algorithms are designed to infer personalized utility functions for users based on their observed preferences across items, configurations, or combinatorial candidates.
- In classical collaborative filtering, the system seeks to learn a function (possibly via an MLP, i.e., a multilayer perceptron) that predicts an individual user's preference (binary or real-valued) given a feature representation of an item or configuration (Chakraborty, 2024).
- In combinatorial preference elicitation, each round $t$ consists of a context $x^t$, a set of queried candidates $Q^t$, and partial feedback in the form of a single chosen item $\bar{y}^t \in Q^t$ (Dragone et al., 2017).
For each domain:
- The true (latent) user utility is assumed linear, $u^*(x, y) = \langle w^*, \phi(x, y) \rangle$, with a hybrid feature map $\phi$.
- Feature vectors may encode categorical, Boolean, and real-valued attributes, often with one-hot and numerical encoding, and context-specific features.
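Such a hybrid feature map can be made concrete with a small sketch. The attribute names, vocabularies, and scaling constant below are illustrative (not taken from the cited papers); the pattern is one-hot encoding for categorical attributes plus scaled numerical encoding for real-valued ones:

```python
from typing import Dict, List

def one_hot(value: str, vocab: List[str]) -> List[float]:
    """One-hot encode a categorical value against a fixed vocabulary."""
    return [1.0 if v == value else 0.0 for v in vocab]

# Hypothetical attribute vocabularies for a PC-configuration domain.
CPU_VOCAB = ["i5", "i7", "ryzen5"]
GPU_VOCAB = ["none", "mid", "high"]

def phi(item: Dict) -> List[float]:
    """Map one configuration to a flat hybrid feature vector."""
    return (one_hot(item["cpu"], CPU_VOCAB)
            + one_hot(item["gpu"], GPU_VOCAB)
            + [item["ram_gb"] / 64.0])  # scale numerical attribute to [0, 1]

features = phi({"cpu": "i7", "gpu": "mid", "ram_gb": 32})
print(features)  # 7-dimensional hybrid feature vector
```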
2. Model Architectures
Feed-forward Neural Preference-Perceptron
This instantiation generalizes logistic regression with an MLP for user-item pairs:
- Input: Feature vector $\phi(y)$ for item $y$ (optionally concatenated with user features).
- Network: For layers $l = 1, \ldots, L$, activations are recursively defined as $h^{(l)} = \sigma(W^{(l)} h^{(l-1)} + b^{(l)})$, where $\sigma$ is a nonlinearity (ReLU, sigmoid).
- Output: Scalar preference score $s = w_{\text{out}}^\top h^{(L)} + b_{\text{out}}$, passed through a sigmoid head for binary feedback, $\hat{p} = \sigma(s)$, or used directly as a linear (regression) head.
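A minimal sketch of such a feed-forward scorer follows. It uses plain NumPy rather than any particular framework, and the layer sizes and initialization scale are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights and zero biases for layer sizes [d_in, h1, ..., 1]."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """ReLU hidden layers, sigmoid output head -> preference probability."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)   # ReLU activation
    W, b = params[-1]
    score = h @ W + b                    # scalar preference score
    return 1.0 / (1.0 + np.exp(-score))  # sigmoid head for binary feedback

params = init_mlp([7, 16, 1])
p = forward(params, rng.normal(size=7))
print(float(p))  # predicted preference probability in (0, 1)
```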
Online Linear Preference-Perceptron (Choice Perceptron)
For set-wise queries over combinatorial spaces:
- At round $t$ with current weights $w^t$, after observing user choice $\bar{y}^t$ from $Q^t$, update: $w^{t+1} = w^t + \eta\, \delta^t$, where $\delta^t = \phi(x^t, \bar{y}^t) - \frac{1}{|Q^t| - 1} \sum_{y \in Q^t \setminus \{\bar{y}^t\}} \phi(x^t, y)$.
Intuitively, this increases the utility of the chosen configuration while penalizing the average of the rejected ones (Dragone et al., 2017).
- Estimated utility at each iteration: $\hat{u}^t(x, y) = \langle w^t, \phi(x, y) \rangle$.
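The set-wise update above can be sketched in a few lines. This is a schematic rendering of the chosen-minus-average-of-rejected rule; variable names and the learning rate are illustrative:

```python
import numpy as np

def choice_perceptron_update(w, phis, chosen_idx, eta=1.0):
    """One Choice-Perceptron-style step: move w toward the chosen item's
    features and away from the average of the rejected items' features."""
    phis = np.asarray(phis)                    # feature vectors of the query set
    chosen = phis[chosen_idx]
    rejected = np.delete(phis, chosen_idx, axis=0)
    delta = chosen - rejected.mean(axis=0)     # update direction
    return w + eta * delta

w = np.zeros(3)
# Query set of three candidates (rows are feature vectors phi(x, y)).
Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w = choice_perceptron_update(w, Q, chosen_idx=0, eta=0.5)
print(w)  # utility of the chosen configuration increases
```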
3. Mathematical Formulation and Optimization
Preference-Perceptron algorithms employ several key optimization and learning rules:
- Feature Normalization and Scaling: real-valued attributes are standardized or min-max scaled so that heterogeneous features contribute on comparable scales.
- Loss Functions
- Cross-entropy for binary preferences: $\mathcal{L}(p, \hat{p}) = -\left[ p \log \hat{p} + (1 - p) \log(1 - \hat{p}) \right]$
- Squared error for real-valued preferences: $\mathcal{L}(p, \hat{p}) = (p - \hat{p})^2$
- Gradient-Based Updates: $\theta \leftarrow \theta - \eta\, \nabla_\theta \mathcal{L}$, with batch or mini-batch variants.
- Query-Selection via MILP (Combinatorial Setting)
- At each round, maximize a weighted sum of estimated utility and feature diversity over candidate sets drawn from the feasible space $\mathcal{Y}$, e.g. $\max_{y_1, \ldots, y_k \in \mathcal{Y}} \sum_{i=1}^{k} \langle w^t, \phi(x^t, y_i) \rangle + \mu \sum_{i < j} \lVert \phi(x^t, y_i) - \phi(x^t, y_j) \rVert_1$, subject to distinctness and optimality constraints (Dragone et al., 2017).
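Since the exact MILP encoding depends on the domain's feasibility constraints, the following sketch shows only a simpler enumerative analogue of the same objective: greedily picking candidates that trade off estimated utility against L1 diversity from already-selected items. The trade-off weight `mu` and the candidate pool are illustrative assumptions:

```python
import numpy as np

def select_query_set(w, candidates, k=3, mu=0.5):
    """Greedy stand-in for MILP query selection: repeatedly pick the
    candidate maximizing estimated utility plus L1 distance (diversity)
    to the items already placed in the query set."""
    candidates = np.asarray(candidates)
    chosen = []
    for _ in range(k):
        best_i, best_score = None, -np.inf
        for i, phi_y in enumerate(candidates):
            if i in chosen:
                continue
            score = float(w @ phi_y)              # estimated utility
            if chosen:                            # diversity bonus
                score += mu * sum(np.abs(phi_y - candidates[j]).sum()
                                  for j in chosen)
            if score > best_score:
                best_i, best_score = i, score
        chosen.append(best_i)
    return chosen

rng = np.random.default_rng(2)
cands = rng.normal(size=(20, 6))   # enumerated candidate feature vectors
w = rng.normal(size=6)             # current weight estimate
Q_idx = select_query_set(w, cands, k=3)
print(Q_idx)  # indices of a high-utility, diverse query set
```

In the constructive setting the candidate set cannot be enumerated, which is precisely why the papers turn to MILP solvers over the feasible space instead of a loop like this one.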
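For the cross-entropy loss with a logistic output listed above, the gradient step has a familiar closed form. The sketch below covers the linear/logistic case (the MLP case replaces the gradient with backpropagation); batch size, learning rate, and the synthetic data are illustrative:

```python
import numpy as np

def sgd_step(w, X, p, eta=0.1, lam=0.0):
    """One mini-batch gradient step for logistic preference prediction.
    X: (batch, d) feature matrix; p: (batch,) binary preference labels."""
    p_hat = 1.0 / (1.0 + np.exp(-(X @ w)))        # sigmoid predictions
    grad = X.T @ (p_hat - p) / len(p) + lam * w   # CE gradient + L2 term
    return w - eta * grad

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 4))                      # mini-batch of size 32
p = (X @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(float)
w = np.zeros(4)
for _ in range(200):
    w = sgd_step(w, X, p, eta=0.5)
print(w)  # weights align with the preference-generating direction
```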
4. Learning Dynamics and Theoretical Guarantees
Preference-Perceptron algorithms admit rigorous regret analysis under reasonable user models and query strategies in set-wise feedback settings.
- User Model: The choice probabilities are non-decreasing in true utility.
- Query Informativeness ($\alpha$) and Affirmativeness ($\beta$): these definitions guarantee that, in expectation, learning steps are sufficiently informative and the update direction is controlled.
- Regret Bound: under sufficiently informative and affirmative queries, the expected average regret decays on the order of $O(1/\sqrt{T})$ over $T$ rounds.
This yields convergence in average regret for combinatorial constructive preference elicitation (Dragone et al., 2017).
5. Training and Practical Deployment
- Stochastic Optimization: Both single-sample and mini-batch gradient updates are used; mini-batch sizes of $10$–$100$ are effective (Chakraborty, 2024).
- Hyperparameters: Initial learning rate $\eta$, regularization strength $\lambda$, mini-batch size, network depth (MLP), activation functions, and early stopping criteria form the critical hyperparameter set.
- Convergence Monitoring: Early stopping via a validation-loss plateau, iteration limits, or small weight-update norms is commonly used.
- Implementation Considerations: Query selection in constructive settings requires MILP solvers for practical, efficient computation of diverse and high-utility candidate sets (Dragone et al., 2017).
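A common pattern for the convergence-monitoring criteria above is patience-based early stopping on a validation loss. This is a generic sketch; the patience, tolerance, and toy loss curve are illustrative:

```python
def train_with_early_stopping(step_fn, val_loss_fn, max_iters=1000,
                              patience=10, min_delta=1e-4):
    """Run step_fn() until the validation loss stops improving.
    Stops after `patience` iterations without an improvement > min_delta."""
    best, stale = float("inf"), 0
    for it in range(max_iters):
        step_fn()
        loss = val_loss_fn()
        if loss < best - min_delta:
            best, stale = loss, 0    # meaningful improvement: reset patience
        else:
            stale += 1
        if stale >= patience:
            return it + 1, best      # iterations used, best validation loss
    return max_iters, best

# Toy usage: a "validation loss" that improves linearly, then plateaus.
state = {"t": 0}
def step(): state["t"] += 1
def val_loss(): return max(0.0, 1.0 - 0.03 * state["t"])
iters, best = train_with_early_stopping(step, val_loss)
print(iters, best)
```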
6. Applications, Empirical Findings, and Scalability
Preference-Perceptron strategies have been empirically validated in several domains:
- Synthetic Testbeds: On high-dimensional Boolean and hybrid feature domains, Preference-Perceptron matches or improves on the regret of Bayesian EVPI and max-margin (SetMargin) approaches while being one to two orders of magnitude faster for set queries of size up to 4.
- PC Configuration: Handles seven categorical and one numerical attribute with compatibility constraints, achieving lower regret and ≈5× faster evaluation than SetMargin.
- Trip-Planning: With up to 127 features, the method remains scalable where alternatives are infeasible due to feature blow-up or MILP timeouts for very large query sets (Dragone et al., 2017).
In collaborative filtering, the MLP Preference-Perceptron augments classical logistic regression by learning more complex, nonlinear boundaries, leveraging backpropagation and advanced optimization (feature scaling, normalization, regularization, learning rate decay, gradient checking) to fit user preference data (Chakraborty, 2024).
7. Distinction from Related Algorithms and Summary
- Classical Perceptron vs. Preference-Perceptron: The latter is specialized for preference-based, partial-information feedback (binary choices, real-valued ratings, setwise selection) and adapts through either backpropagation in neural nets or linear updates from setwise choice.
- Interpretation: The Preference-Perceptron label applies because the perceptron or MLP receives direct preference (binary or scalar) signals as targets, fitting weights accordingly by perceptron-style or backpropagation learning rules (Chakraborty, 2024, Dragone et al., 2017).
- Collaborative Filtering and Constructive Elicitation: Choice Perceptron uniquely supports constructive (synthesized) object spaces and hybrid features, and it provides formal learning guarantees and state-of-the-art empirical performance.
The Preference-Perceptron thus subsumes a family of algorithms instrumental both for collaborative filtering in recommender systems and interactive preference elicitation over hybrid combinatorial spaces, furnishing scalable, theoretically grounded, and empirically validated frameworks for preference modeling in complex domains.