
Continuous Learning in a Hierarchical Multiscale Neural Network (1805.05758v1)

Published 15 May 2018 in cs.CL

Abstract: We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical multi-scale language model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network, while longer time-scale dependencies are encoded in the dynamics of the lower-level network by having a meta-learner update the weights of the lower-level network in an online meta-learning fashion. We use elastic weight consolidation as a higher-level mechanism to prevent catastrophic forgetting in our continuous learning framework.
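The abstract describes a two-level scheme: a lower-level RNN handles short-range structure, a meta-learner rewrites that RNN's weights online to carry long-range information, and an elastic-weight-consolidation (EWC) penalty keeps the online updates from erasing consolidated knowledge. Below is a minimal sketch of that loop, not the authors' implementation: it assumes PyTorch, toy sizes, a coordinate-wise learned optimizer as one possible parameterization of the meta-learner, and a unit placeholder for the diagonal Fisher; names such as LowerRNN, MetaLearner, and ewc_penalty are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

HID, VOCAB = 32, 50  # assumed toy sizes, for illustration only

class LowerRNN(nn.Module):
    """Lower level: plain tanh-RNN language model over token ids."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID)
        self.w_ih = nn.Parameter(torch.randn(HID, HID) * 0.1)
        self.w_hh = nn.Parameter(torch.randn(HID, HID) * 0.1)
        self.out = nn.Linear(HID, VOCAB)

    def step(self, tok, h):
        x = self.emb(tok)
        h = torch.tanh(x @ self.w_ih + h @ self.w_hh)
        return self.out(h), h

class MetaLearner(nn.Module):
    """Higher level: maps each lower-level gradient coordinate to an
    online weight update (an assumed learned-optimizer form of
    'the meta-learner updates the lower-level weights')."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))

    def forward(self, grad_flat):
        return self.net(grad_flat.unsqueeze(-1)).squeeze(-1)

def ewc_penalty(params, anchors, fishers, lam=1.0):
    """EWC: quadratic pull toward consolidated weights, scaled by a
    diagonal Fisher estimate (a placeholder of ones below)."""
    return lam * sum((f * (p - a).pow(2)).sum()
                     for p, a, f in zip(params, anchors, fishers))

lower, meta = LowerRNN(), MetaLearner()
tokens = torch.randint(0, VOCAB, (20,))          # stand-in token stream
h = torch.zeros(1, HID)
anchors = [p.detach().clone() for p in lower.parameters()]
fishers = [torch.ones_like(p) for p in lower.parameters()]  # placeholder Fisher

for t in range(len(tokens) - 1):
    logits, h = lower.step(tokens[t:t + 1], h)
    loss = F.cross_entropy(logits, tokens[t + 1:t + 2])
    loss = loss + ewc_penalty(lower.parameters(), anchors, fishers, lam=0.1)
    grads = torch.autograd.grad(loss, list(lower.parameters()))
    with torch.no_grad():  # meta-learner writes the online weight update
        for p, g in zip(lower.parameters(), grads):
            p -= 1e-2 * meta(g.flatten()).view_as(p)
    h = h.detach()  # truncate backprop between online steps
```

In this sketch the meta-learner itself is frozen; in the paper's framework it would be trained by meta-learning so that its online weight rewrites encode the long-time-scale dependencies, while the EWC term anchors the lower-level weights to their consolidated values.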

Citations (6)
