HyperInterval: Hypernetwork approach to training weight interval regions in continual learning (2405.15444v3)

Published 24 May 2024 in cs.LG and cs.AI

Abstract: Recently, a new Continual Learning (CL) paradigm, Interval Continual Learning (InterContiNet), was introduced to control catastrophic forgetting by enforcing interval constraints on the neural network parameter space. Unfortunately, InterContiNet training is challenging due to the high dimensionality of the weight space, which makes intervals difficult to manage. To address this issue, we introduce HyperInterval (source code available at https://github.com/gmum/HyperInterval), a technique that employs interval arithmetic within the embedding space and uses a hypernetwork to map these intervals to the target network's parameter space. We train interval embeddings for consecutive tasks and a hypernetwork that transforms these embeddings into the weights of the target network. The embedding for a given task is trained jointly with the hypernetwork while preserving the target network's responses to the previous task embeddings. Interval arithmetic thus operates in a more manageable, lower-dimensional embedding space rather than directly on intervals in the high-dimensional weight space, allowing faster and more efficient training. Furthermore, HyperInterval maintains the guarantee of not forgetting. At the end of training, we can choose one universal embedding to produce a single network dedicated to all tasks; in this framework, the hypernetwork is used only during training, and ultimately we can rely on a single set of weights. HyperInterval obtains significantly better results than InterContiNet and achieves state-of-the-art results on several benchmarks.

Authors (6)
  1. Patryk Krukowski (4 papers)
  2. Anna Bielawska (2 papers)
  3. Kamil Książek (9 papers)
  4. Paweł Wawrzyński (15 papers)
  5. Paweł Batorski (5 papers)
  6. Przemysław Spurek (74 papers)
