Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling (LARS) Optimizer (2102.03018v1)

Published 5 Feb 2021 in cs.LG

Abstract: Increasing the batch size of a deep learning model is a challenging task. Although a larger batch size helps utilize the full available system memory during the training phase, it most often results in a significant loss of test accuracy. LARS addresses this issue by introducing an adaptive learning rate for each layer of a deep learning model. However, it is unclear how popular distributed machine learning systems such as SystemML or MLlib perform with this optimizer. In this work, we apply the LARS optimizer to a deep learning model implemented in SystemML. We run experiments with various batch sizes and compare the performance of the LARS optimizer with Stochastic Gradient Descent. Our experimental results show that the LARS optimizer performs significantly better than Stochastic Gradient Descent for large batch sizes, even with the distributed machine learning framework SystemML.
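
For context, the layer-wise adaptive rate scaling rule evaluated in the paper derives a per-layer "trust ratio" from the norms of the layer's weights and gradients, so each layer gets its own effective learning rate. The snippet below is a minimal NumPy sketch of that update, not the paper's SystemML (DML) implementation; the function name, hyperparameter values, and the epsilon term are illustrative assumptions.

```python
import numpy as np

def lars_step(w, grad, velocity, lr=0.1, trust_coef=0.001,
              weight_decay=5e-4, momentum=0.9, eps=1e-9):
    """One per-layer LARS update (sketch; hyperparameters are assumptions)."""
    w_norm = np.linalg.norm(w)
    g_norm = np.linalg.norm(grad)
    # Layer-wise trust ratio: ||w|| / (||grad|| + weight_decay * ||w||),
    # scaled by the trust coefficient. This is what makes the effective
    # learning rate adaptive per layer.
    local_lr = trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)
    velocity = momentum * velocity + lr * local_lr * (grad + weight_decay * w)
    return w - velocity, velocity

# Example: apply one step to a single layer's weights.
w = np.random.randn(256, 128).astype(np.float32)
grad = np.random.randn(256, 128).astype(np.float32)
velocity = np.zeros_like(w)
w, velocity = lars_step(w, grad, velocity)
```

Because the local learning rate shrinks when gradients are large relative to the weights, large-batch training remains stable where plain SGD would lose test accuracy, which is the effect the paper measures inside SystemML.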

Authors (3)
  1. Kanchan Chowdhury (3 papers)
  2. Ankita Sharma (10 papers)
  3. Arun Deepak Chandrasekar (1 paper)
Citations (2)
