- The paper introduces a novel mixture-of-experts framework that leverages a point-to-set Mahalanobis metric to effectively integrate multiple source domains.
- It mitigates negative transfer by assigning confidence weights to individual domain-specific classifiers through a meta-training procedure.
- Experimental results on the Amazon reviews and SANCL datasets demonstrate significant gains, including up to a 13% error reduction on POS tagging.
Analyzing "Multi-Source Domain Adaptation with Mixture of Experts"
This paper introduces a methodology for unsupervised multi-source domain adaptation built on a mixture-of-experts (MoE) framework. The approach addresses the challenge of leveraging multiple distinct source domains by formulating a point-to-set metric that models the relationship between each target example and the individual source domains.
Core Methodology
Traditionally, domain adaptation has been framed as single-source-to-target transfer. The authors instead focus on the multi-source setting, which is particularly effective when the target domain does not align precisely with any single source but shares characteristics with several. To tackle the negative transfer that can arise when aggregating data from diverse sources, they introduce a point-to-set Mahalanobis distance metric in the models' hidden representation space. This metric yields confidence weights for combining the predictions of the multiple domain-specific classifiers (the experts).
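To make the mechanism concrete, here is a minimal sketch of a point-to-set Mahalanobis metric and the resulting expert combination. It assumes each source domain is summarized by the mean of its encoded examples and parameterizes the metric matrix as M = UᵀU so it stays positive semi-definite; the class and function names are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

class PointToSetMahalanobis(torch.nn.Module):
    """Learnable point-to-set Mahalanobis metric (sketch)."""

    def __init__(self, dim: int):
        super().__init__()
        # M = U^T U keeps the metric positive semi-definite.
        self.U = torch.nn.Parameter(torch.eye(dim))

    def forward(self, h_target, source_means):
        # h_target: (batch, dim); source_means: (K, dim), one mean per source.
        diff = h_target.unsqueeze(1) - source_means.unsqueeze(0)  # (batch, K, dim)
        proj = diff @ self.U.T                                    # apply U to each difference
        return proj.pow(2).sum(-1).clamp_min(1e-12).sqrt()        # (batch, K) distances

def moe_predict(dist, expert_logits):
    """Combine expert predictions, weighting nearer source domains more."""
    alpha = F.softmax(-dist, dim=-1)             # (batch, K) confidence weights
    probs = F.softmax(expert_logits, dim=-1)     # (batch, K, C) per-expert predictions
    return (alpha.unsqueeze(-1) * probs).sum(1)  # (batch, C) mixture prediction
```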
The authors learn this metric with a meta-training procedure that requires no labeled target data. Each source domain is cyclically designated as a meta-target, with the remaining domains acting as meta-sources; minimizing the loss of the MoE predictions on the held-out meta-target teaches the metric domain relationships that generalize to the true, unseen target.
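A sketch of one meta-training pass under the same assumptions, reusing `metric` (a `PointToSetMahalanobis`) and `moe_predict` from the sketch above; `domains` and `experts` are hypothetical containers, and the actual training loop would differ in details such as batching and regularization.

```python
def meta_train_step(domains, encoder, experts, metric, optimizer):
    """One meta-training pass: each source domain takes a turn as meta-target.

    `domains` maps a domain name to a labeled batch (x, y), and `experts[name]`
    is that domain's classifier head -- both are hypothetical interfaces.
    """
    total_loss = 0.0
    for meta_target, (x, y) in domains.items():
        meta_sources = [d for d in domains if d != meta_target]
        h = encoder(x)                                                  # (batch, dim)
        means = torch.stack([encoder(domains[d][0]).mean(0)             # (K-1, dim)
                             for d in meta_sources])
        dist = metric(h, means)                                         # (batch, K-1)
        logits = torch.stack([experts[d](h) for d in meta_sources], 1)  # (batch, K-1, C)
        pred = moe_predict(dist, logits)                                # (batch, C)
        total_loss = total_loss + F.nll_loss(pred.clamp_min(1e-12).log(), y)
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```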
Experimental Results
The experimental evaluation spans sentiment analysis on a multi-domain Amazon reviews dataset and part-of-speech (POS) tagging on the SANCL dataset. In both cases, the proposed MoE approach outperformed the baselines, which include traditional single-source adaptation protocols and unified multi-source models. Numerically, the MoE model achieved a 7% error reduction on the Amazon review sentiment analysis task and a 13% error reduction on the SANCL POS tagging task compared to the baselines.
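For context, such figures are typically relative error reductions, computed against the baseline error rate:

$$\text{error reduction} = \frac{e_{\text{base}} - e_{\text{MoE}}}{e_{\text{base}}}$$

Under that reading, a 13% reduction from, say, a 10% baseline error rate corresponds to an MoE error rate of $(1 - 0.13) \times 10\% = 8.7\%$ (the 10% figure is purely illustrative).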
A notable strength of the model is its capacity to handle negative transfer. For instance, in the POS tagging experiments, which included Twitter data that differs markedly from the target domains, MoE learned to discount the irrelevant source while still drawing on the others. These findings were supported both by quantitative performance metrics and by visualizations of the α confidence-weight distributions across source domains.
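This down-weighting behavior falls directly out of the softmax over negative distances: a source domain that sits far from a target example in the learned metric space receives a near-zero α weight. A toy illustration (the distance values are made up):

```python
import torch
import torch.nn.functional as F

# Distances from one target example to three source domains; the third
# (a Twitter-like outlier) is far away in the learned metric space.
dist = torch.tensor([1.0, 1.2, 8.0])
alpha = F.softmax(-dist, dim=0)
print(alpha)  # ~[0.550, 0.450, 0.0005] -- the distant domain is nearly ignored
```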
Theoretical and Practical Implications
This work contributes to the field of domain adaptation by facilitating the intelligent aggregation of information from multiple heterogeneous sources. Theoretically, the introduction of a mixture-of-experts framework enriches the landscape of adaptation strategies, showcasing how point-to-set metrics can encapsulate complex domain relationships. Practically, the approach's capability to handle negative transfer scenarios holds promise for developing more robust AI systems that can leverage diverse data sources without risking performance degradation.
Future Directions
This approach could be extended to a wider array of NLP tasks and potentially to other fields. The meta-training scheme for learning domain relations could also be explored with other deep learning encoders or on more intricate datasets, and the adversarial training component opens avenues for future work on representation alignment in multi-domain settings.
This paper represents a meaningful advancement in handling multi-source domain adaptation, offering vital insights for researchers seeking to optimize AI model performance across diverse and complex data environments.