
SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills (2306.16176v1)

Published 28 Jun 2023 in cs.CL

Abstract: Traditional multitask learning methods can typically exploit common knowledge only along the task dimension or only along the language dimension, losing either cross-language or cross-task knowledge. This paper proposes a general multilingual multitask model, named SkillNet-X, which enables a single model to tackle many different tasks from different languages. To this end, we define several language-specific skills and task-specific skills, each of which corresponds to a skill module. SkillNet-X sparsely activates the subset of skill modules relevant to the target task or the target language. Acting as knowledge transit hubs, skill modules can absorb task-related knowledge and language-related knowledge consecutively. Building on the Transformer, we modify the multi-head attention layer and the feed-forward network layer to accommodate skill modules. We evaluate SkillNet-X on eleven natural language understanding datasets in four languages. Results show that SkillNet-X performs better than task-specific baselines and two multitask learning baselines (i.e., a dense joint model and a Mixture-of-Experts model). Furthermore, skill pre-training further improves the performance of SkillNet-X on almost all datasets. To investigate the generalization of our model, we conduct experiments on two new tasks and find that SkillNet-X significantly outperforms the baselines.
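
To make the sparse-activation idea concrete, below is a minimal sketch (not the authors' released code) of a feed-forward layer with one skill module per language and per task, where only the modules matching the target language and task are run. The class name `SkillFFN`, the specific skill names, and the rule of averaging the active modules' outputs are illustrative assumptions based only on the abstract.

```python
# Minimal sketch of a sparsely activated skill layer (assumptions: one FFN
# "skill" module per language/task, active outputs averaged).
import torch
import torch.nn as nn


class SkillFFN(nn.Module):
    def __init__(self, d_model, d_hidden, languages, tasks):
        super().__init__()
        # One feed-forward skill module per language and per task.
        self.skills = nn.ModuleDict({
            name: nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for name in list(languages) + list(tasks)
        })

    def forward(self, x, active_skills):
        # Only the modules named in `active_skills` (e.g. the target language
        # and the target task) are computed; all other skills stay inactive.
        outputs = [self.skills[name](x) for name in active_skills]
        return torch.stack(outputs, dim=0).mean(dim=0)


# Example: activate the Chinese language skill and a (hypothetical) NER task skill.
layer = SkillFFN(d_model=768, d_hidden=3072,
                 languages=["en", "zh", "es", "de"], tasks=["ner", "nli"])
h = torch.randn(2, 16, 768)               # (batch, seq_len, d_model)
out = layer(h, active_skills=["zh", "ner"])
```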

Authors (9)
  1. Zhangyin Feng (14 papers)
  2. Yong Dai (33 papers)
  3. Fan Zhang (686 papers)
  4. Duyu Tang (65 papers)
  5. Xiaocheng Feng (54 papers)
  6. Shuangzhi Wu (29 papers)
  7. Bing Qin (186 papers)
  8. Yunbo Cao (43 papers)
  9. Shuming Shi (126 papers)
