Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 88 tok/s
Gemini 2.5 Pro 49 tok/s Pro
GPT-5 Medium 15 tok/s
GPT-5 High 16 tok/s Pro
GPT-4o 105 tok/s
GPT OSS 120B 471 tok/s Pro
Kimi K2 202 tok/s Pro
2000 character limit reached

Model Editing by Standard Fine-Tuning (2402.11078v3)

Published 16 Feb 2024 in cs.LG, cs.AI, and cs.CL

Abstract: Standard fine-tuning is considered not as effective as specialized methods for model editing due to its comparatively poor performance. However, it is simple, agnostic to the architectural details of the model being edited, and able to leverage advances in standard training techniques with no additional work (e.g., black-box PEFT for computational efficiency), making it an appealing choice for a model editor. In this work, we show that standard fine-tuning alone can yield competitive model editing performance with two minor modifications. First, we optimize the conditional likelihood rather than the full likelihood. Second, in addition to the typical practice of training on randomly paraphrased edit prompts to encourage generalization, we also train on random or similar unedited facts to encourage locality. Our experiments on the ZsRE and CounterFact datasets demonstrate that these simple modifications allow standard fine-tuning to match or outperform highly specialized editors in terms of edit score.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (15)
  1. Knowledge neurons in pretrained transformers. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8493–8502.
  2. Editing factual knowledge in language models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6491–6506.
  3. Transformer feed-forward layers are key-value memories. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5484–5495.
  4. Model editing can hurt general abilities of large language models. arXiv preprint arXiv:2401.04700.
  5. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations.
  6. Locating and editing factual associations in gpt. Advances in Neural Information Processing Systems, 35:17359–17372.
  7. Mass-editing memory in a transformer. In The Eleventh International Conference on Learning Representations.
  8. Fast model editing at scale. In International Conference on Learning Representations.
  9. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.
  10. Direct preference optimization: Your language model is secretly a reward model. In Thirty-seventh Conference on Neural Information Processing Systems.
  11. Nils Reimers and Iryna Gurevych. 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992.
  12. Editable neural networks. In International Conference on Learning Representations.
  13. Ben Wang and Aran Komatsuzaki. 2021. GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. https://github.com/kingoflolz/mesh-transformer-jax.
  14. A comprehensive study of knowledge editing for large language models. arXiv preprint arXiv:2401.01286.
  15. Can we edit factual knowledge by in-context learning? arXiv preprint arXiv:2305.12740.
Citations (5)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Ai Generate Text Spark Streamline Icon: https://streamlinehq.com

Paper Prompts

Sign up for free to create and run prompts on this paper using GPT-5.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube