Can I understand what I create? Self-Knowledge Evaluation of Large Language Models (2406.06140v1)
Abstract: Large language models (LLMs) have achieved remarkable progress in linguistic tasks, necessitating robust evaluation frameworks to understand their capabilities and limitations. Inspired by Feynman's principle of understanding through creation, we introduce a self-knowledge evaluation framework that is easy to implement, evaluating models on their ability to comprehend and respond to self-generated questions. Our findings, based on testing multiple models across diverse tasks, reveal significant gaps in models' self-knowledge abilities. Further analysis indicates these gaps may be due to misalignment with human attention mechanisms. Additionally, fine-tuning on self-generated math tasks may enhance a model's math performance, highlighting the framework's potential for efficient, insightful model evaluation and its possible contribution to improving LLMs.
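The abstract describes the framework only at a high level: the model first generates questions, then answers them, and the gap between the two reveals its self-knowledge. A minimal sketch of that generate-then-answer loop is below, under stated assumptions: `generate`, `judge`, and `self_knowledge_eval` are illustrative placeholders supplied by the caller, not the paper's released code or scoring protocol.

```python
from typing import Callable, List, Tuple


def self_knowledge_eval(
    generate: Callable[[str], str],
    topics: List[str],
    judge: Callable[[str, str], bool],
) -> float:
    """Score a model on answering its own generated questions.

    `generate` wraps the model under test (prompt in, text out);
    `judge` decides whether an answer is acceptable for a question.
    Both are caller-supplied placeholders, not the paper's code.
    """
    results: List[Tuple[str, str, bool]] = []
    for topic in topics:
        # Step 1: ask the model to create a question about the topic.
        question = generate(f"Write one self-contained question about {topic}.")
        # Step 2: ask the same model to answer its own question.
        answer = generate(f"Answer the following question:\n{question}")
        # Step 3: check whether the model understood what it created.
        results.append((question, answer, judge(question, answer)))
    return sum(ok for _, _, ok in results) / max(len(results), 1)


if __name__ == "__main__":
    # Toy demo with a dummy "model" and an always-accepting judge.
    echo_model = lambda prompt: f"[model output for: {prompt[:40]}...]"
    accuracy = self_knowledge_eval(
        echo_model, ["algebra", "geometry"], lambda q, a: True
    )
    print(f"self-knowledge accuracy: {accuracy:.2f}")
```

In practice, `generate` would call the evaluated LLM and `judge` could be exact-match checking, a rubric, or another model acting as grader; the paper's exact scoring choices are not specified in the abstract.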
- Zhiquan Tan (20 papers)
- Lai Wei (67 papers)
- Jindong Wang (150 papers)
- Xing Xie (220 papers)
- Weiran Huang (53 papers)