
Learning Rate Perturbation: A Generic Plugin of Learning Rate Schedule towards Flatter Local Minima (2208.11873v1)

Published 25 Aug 2022 in cs.LG

Abstract: The learning rate is one of the most important hyper-parameters and has a significant influence on neural network training. Learning rate schedules are widely used in practice to adjust the learning rate according to pre-defined schedules for fast convergence and good generalization. However, existing learning rate schedules are heuristic algorithms and lack theoretical support. As a result, practitioners usually choose learning rate schedules through multiple ad-hoc trials, and the obtained schedules are sub-optimal. To boost the performance of such sub-optimal learning rate schedules, we propose a generic learning rate schedule plugin, called LEArning Rate Perturbation (LEAP), which can be applied to various learning rate schedules to improve model training by introducing a certain perturbation to the learning rate. We find that, with this simple yet effective strategy, the training process exponentially favors flat minima over sharp minima while convergence is guaranteed, which leads to better generalization. In addition, we conduct extensive experiments showing that training with LEAP can improve the performance of various deep learning models on diverse datasets using various learning rate schedules (including a constant learning rate).
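To make the core idea concrete, below is a minimal sketch of a learning-rate-perturbation wrapper around an ordinary schedule. It assumes a multiplicative noise factor drawn around 1.0; the exact perturbation distribution, magnitude, and scheduling used by LEAP are defined in the paper, and the `cosine_lr` base schedule here is purely illustrative.

```python
import numpy as np

def perturbed_lr(base_lr, sigma=0.1, rng=None):
    """Illustrative learning-rate perturbation: scale the scheduled
    learning rate by a random multiplicative factor around 1.0.

    Sketch only; the actual LEAP formulation may differ."""
    rng = rng or np.random.default_rng()
    # Log-normal multiplicative noise keeps the learning rate positive.
    return base_lr * float(np.exp(rng.normal(0.0, sigma)))

def cosine_lr(step, total_steps, lr_max=0.1):
    """Hypothetical base schedule: cosine decay from lr_max to 0."""
    return 0.5 * lr_max * (1.0 + np.cos(np.pi * step / total_steps))

# Example: apply the perturbation on top of the base schedule each step.
for step in range(5):
    lr = perturbed_lr(cosine_lr(step, total_steps=100))
    # ... use `lr` for this optimization step ...
```

Because the perturbation is multiplicative and zero-mean in log space, the expected schedule roughly tracks the base schedule while individual steps are jittered, which is one plausible way to realize the "plugin" behavior described in the abstract.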

Authors (7)
  1. Hengyu Liu (30 papers)
  2. Qiang Fu (159 papers)
  3. Lun Du (50 papers)
  4. Tiancheng Zhang (8 papers)
  5. Ge Yu (63 papers)
  6. Shi Han (74 papers)
  7. Dongmei Zhang (193 papers)
Citations (3)
