arxiv:2306.01690

GateON: an unsupervised method for large scale continual learning

Published on Jun 2, 2023

Abstract

The objective of continual learning (CL) is to learn tasks sequentially without retraining on earlier tasks. However, when subjected to CL, traditional neural networks exhibit catastrophic forgetting and limited generalization. To overcome these problems, we introduce a novel method called 'Gate and Obstruct Network' (GateON). GateON combines learnable gating of activity with online estimation of parameter relevance to safeguard crucial knowledge from being overwritten. Our method generates partially overlapping pathways between tasks, permitting forward and backward transfer during sequential learning. GateON addresses network saturation after parameter fixation through a re-activation mechanism for fixed neurons, enabling large-scale continual learning. GateON is implemented on a wide range of networks (fully connected, CNN, Transformer), has low computational complexity, effectively learns up to 100 MNIST tasks, and achieves top-tier results with pre-trained BERT on CL-based NLP tasks.
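
To make the abstract's ingredients concrete, here is a minimal PyTorch sketch of a single gated layer combining task-conditioned gating, an online relevance estimate, parameter fixation, and re-activation. This is not the authors' implementation: the class name, the gradient-magnitude relevance proxy, and the thresholds are illustrative assumptions based only on the abstract.

```python
import torch
import torch.nn as nn


class GatedLinear(nn.Module):
    """Illustrative sketch in the spirit of GateON (not the paper's code).

    Assumptions: one learnable gate vector per task, relevance estimated
    online from gradient magnitudes, and simple thresholds for fixing and
    re-activating weights.
    """

    def __init__(self, in_features, out_features, num_tasks,
                 fix_threshold=0.5, max_fixed_frac=0.9):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Learnable gate per task; a sigmoid keeps gate values in (0, 1).
        self.gates = nn.Parameter(torch.zeros(num_tasks, out_features))
        # Online per-weight relevance estimate and fixation mask (buffers, not trained).
        self.register_buffer("relevance", torch.zeros_like(self.linear.weight))
        self.register_buffer("fixed", torch.zeros_like(self.linear.weight, dtype=torch.bool))
        self.fix_threshold = fix_threshold
        self.max_fixed_frac = max_fixed_frac

    def forward(self, x, task_id):
        # Task-conditioned gating of the layer's activity.
        return torch.sigmoid(self.gates[task_id]) * self.linear(x)

    @torch.no_grad()
    def update_relevance(self, momentum=0.99):
        # Running average of gradient magnitude as a proxy for parameter relevance.
        if self.linear.weight.grad is not None:
            g = self.linear.weight.grad.abs()
            self.relevance.mul_(momentum).add_(g, alpha=1 - momentum)

    @torch.no_grad()
    def obstruct(self):
        # Fix weights whose relevance exceeds the threshold.
        self.fixed |= self.relevance > self.fix_threshold
        # Re-activation: if too many weights are fixed, release the least
        # relevant ones so the layer does not saturate.
        if self.fixed.float().mean().item() > self.max_fixed_frac:
            rel_fixed = torch.where(
                self.fixed, self.relevance,
                torch.full_like(self.relevance, float("inf")))
            k = max(1, int(0.1 * self.fixed.numel()))
            idx = torch.topk(rel_fixed.view(-1), k, largest=False).indices
            self.fixed.view(-1)[idx] = False
        # Obstruct updates to fixed weights by zeroing their gradients.
        if self.linear.weight.grad is not None:
            self.linear.weight.grad[self.fixed] = 0.0
```

In a training loop, one would call update_relevance() and obstruct() after loss.backward() and before optimizer.step(), so that weights deemed crucial for earlier tasks stop receiving updates while the remaining, partially overlapping pathway keeps adapting to the current task.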

