
MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing (2401.03190v1)

Published 6 Jan 2024 in cs.CL, cs.AI, and cs.CV

Abstract: LLMs are known for encoding a vast amount of factual knowledge, but this knowledge often becomes outdated due to the ever-changing nature of external information. A promising solution to this challenge is the use of model editing methods to update knowledge efficiently. However, the majority of existing model editing techniques are limited to monolingual frameworks and thus fail to address the crucial issue of cross-lingual knowledge synchronization for multilingual models. To tackle this problem, we propose a simple yet effective method that trains multilingual patch neurons to store cross-lingual knowledge. It can be easily adapted to existing approaches to enhance their cross-lingual editing capabilities. To evaluate our method, we conduct experiments using both the XNLI dataset and a self-constructed XFEVER dataset. Experimental results demonstrate that our proposed method achieves improved performance on cross-lingual editing tasks without requiring excessive modifications to the original methodology, showcasing its user-friendly characteristics. Code will be released soon.
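The abstract does not include implementation details, but the general patch-neuron idea it refers to can be illustrated with a minimal sketch: a small number of trainable neurons are appended to a frozen transformer feed-forward layer, and only those new parameters are trained on the (multilingual) edit examples. The PyTorch sketch below is an assumption-based illustration of that pattern; the class and parameter names (`PatchedFFN`, `patch_key`, `patch_val`) are hypothetical and do not come from the paper or its released code.

```python
# Illustrative sketch of a patch-neuron-augmented FFN layer (not the authors' code).
# The original FFN is frozen; edited knowledge is stored only in the patch parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchedFFN(nn.Module):
    def __init__(self, ffn: nn.Module, d_model: int, n_patch: int = 1):
        super().__init__()
        self.ffn = ffn  # original feed-forward block, kept frozen
        for p in self.ffn.parameters():
            p.requires_grad = False
        # Each patch neuron acts like one extra key-value pair in the FFN:
        # a key vector (input projection), a bias, and a value vector (output projection).
        self.patch_key = nn.Parameter(torch.empty(n_patch, d_model))
        self.patch_bias = nn.Parameter(torch.zeros(n_patch))
        self.patch_val = nn.Parameter(torch.zeros(n_patch, d_model))  # zero init: no effect before training
        nn.init.normal_(self.patch_key, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        base = self.ffn(x)
        # Patch activation: GELU(x . k + b), then project back with the value vector.
        act = F.gelu(x @ self.patch_key.T + self.patch_bias)   # (batch, seq, n_patch)
        return base + act @ self.patch_val                      # (batch, seq, d_model)
```

In a setup like this, only `patch_key`, `patch_bias`, and `patch_val` would be optimized on edit examples written in multiple languages, leaving the base model untouched; initializing `patch_val` to zero keeps the patched layer identical to the original before any editing. How the paper selects which layers to patch and how the multilingual training signal is constructed is described in the full text, not reproduced here.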

Authors (3)
  1. Nianwen Si (5 papers)
  2. Hao Zhang (947 papers)
  3. Weiqiang Zhang (6 papers)
Citations (5)