Publications
Publications by category in reverse chronological order, generated by jekyll-scholar.
2024
- AIR: Analytic Imbalance Rectifier for Continual Learning. Aug 2024
Continual learning enables AI models to learn new data sequentially without retraining in real-world scenarios. Most existing methods assume the training data are balanced and aim to reduce the catastrophic forgetting problem, in which models tend to forget previously learned data. However, data imbalance and the mixture of new and old data in real-world scenarios lead models to ignore categories with fewer training samples. To solve this problem, we propose an analytic imbalance rectifier algorithm (AIR), a novel online exemplar-free continual learning method with an analytic (i.e., closed-form) solution for data-imbalanced class-incremental learning (CIL) and generalized CIL scenarios in real-world continual learning. AIR introduces an analytic re-weighting module (ARM) that calculates a per-class re-weighting factor for the loss function, balancing each category's contribution to the overall loss and addressing the problem of imbalanced training data. AIR uses least squares to derive a non-discriminatory optimal classifier together with its iterative update rule for continual learning. Experimental results on multiple datasets show that AIR significantly outperforms existing methods in long-tailed and generalized CIL scenarios.
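To make the ARM idea concrete, here is a minimal sketch of per-class loss re-weighting combined with a closed-form (weighted ridge) classifier, in the spirit of the abstract. The inverse-frequency factors and the function names (`arm_factors`, `weighted_analytic_classifier`) are our illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def arm_factors(labels, num_classes):
    """Illustrative per-class re-weighting: inverse class frequency,
    scaled so the factors average roughly to 1 (an assumption, not the
    paper's exact ARM formula)."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts[counts == 0] = 1.0  # guard against unseen classes
    return counts.sum() / (num_classes * counts)

def weighted_analytic_classifier(X, Y, labels, num_classes, gamma=1e-3):
    """Closed-form weighted ridge regression: each sample's squared error is
    scaled by its class factor, giving W = (X^T S X + gamma I)^{-1} X^T S Y,
    where S is the diagonal matrix of per-sample weights."""
    s = arm_factors(labels, num_classes)[labels]  # per-sample weights
    Xs = X * s[:, None]                           # S X, with S diagonal
    return np.linalg.solve(X.T @ Xs + gamma * np.eye(X.shape[1]), Xs.T @ Y)
```

Here `X` holds extracted features and `Y` one-hot labels; rare classes receive larger factors, so they contribute comparably to the overall loss.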
- Analytic Exemplar-Free Online Continual Learning with Large Models for Imbalanced Autonomous Driving Tasks. IEEE Transactions on Vehicular Technology, Aug 2024
In autonomous driving, even a meticulously trained model can encounter failures when facing unfamiliar scenarios. One of these scenarios can be formulated as an online continual learning (OCL) problem: data arrive in an online fashion, and models are updated according to these streaming data. Two major OCL challenges are catastrophic forgetting and data imbalance. To address these challenges, we propose an Analytic Exemplar-Free Online Continual Learning algorithm (AEF-OCL). The AEF-OCL leverages analytic continual learning principles and employs ridge regression as a classifier for features extracted by a large backbone network. It solves the OCL problem by recursively calculating the analytical solution, ensuring equivalence between continual learning and its joint-learning counterpart, and works without saving any used samples (i.e., it is exemplar-free). Additionally, we introduce a Pseudo-Features Generator (PFG) module that recursively estimates the mean and variance of the real features for each class. It over-samples offset pseudo-features from the same normal distribution as the real features, thereby addressing the data imbalance issue. Experimental results demonstrate that despite being an exemplar-free strategy, our method outperforms various methods on the autonomous driving SODA10M dataset. The source code is available at https://github.com/ZHUANGHP/Analytic-continual-learning.
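The two ingredients named in the abstract, a recursively updated ridge-regression classifier and a pseudo-feature generator, can be sketched as follows. This is a simplified reading under our own assumptions (a Woodbury-style recursion and Welford running statistics); the paper's exact recursions may differ.

```python
import numpy as np

class RecursiveRidgeClassifier:
    """Exemplar-free analytic classifier sketch: ridge regression on frozen
    backbone features, updated recursively so no past samples are stored."""

    def __init__(self, feat_dim, num_classes, gamma=1.0):
        self.R = np.eye(feat_dim) / gamma        # tracks (X^T X + gamma I)^{-1}
        self.W = np.zeros((feat_dim, num_classes))

    def update(self, X, Y):
        """Woodbury-style recursive least-squares step for a batch (X, Y);
        the result matches ridge regression on all data seen so far."""
        K = np.linalg.solve(np.eye(len(X)) + X @ self.R @ X.T, X @ self.R)
        self.R -= self.R @ X.T @ K
        self.W += self.R @ X.T @ (Y - X @ self.W)

class PseudoFeatureGenerator:
    """PFG-like sketch: recursively track per-class feature mean/variance
    (Welford's algorithm) and sample pseudo-features from a normal fit."""

    def __init__(self, feat_dim, num_classes):
        self.n = np.zeros(num_classes)
        self.mean = np.zeros((num_classes, feat_dim))
        self.m2 = np.zeros((num_classes, feat_dim))  # sums of squared deviations

    def observe(self, X, labels):
        for x, c in zip(X, labels):
            self.n[c] += 1
            delta = x - self.mean[c]
            self.mean[c] += delta / self.n[c]
            self.m2[c] += delta * (x - self.mean[c])

    def sample(self, c, count, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        var = self.m2[c] / max(self.n[c] - 1.0, 1.0)
        return rng.normal(self.mean[c], np.sqrt(var), size=(count, self.mean.shape[1]))
```

Under-represented classes can then be over-sampled via `sample` before each classifier update, mirroring the rebalancing the abstract describes.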
- REAL: Representation Enhanced Analytic Learning for Exemplar-free Class-incremental Learning. Run He, Huiping Zhuang, Di Fang, Yizhu Chen, Kai Tong, and Cen Chen. Mar 2024
Exemplar-free class-incremental learning (EFCIL) aims to mitigate catastrophic forgetting in class-incremental learning (CIL) without access to historical data. Compared with its exemplar-based counterpart, which stores representative historical samples, EFCIL suffers more from forgetting under the exemplar-free constraint. Recently, a new EFCIL branch named analytic continual learning (ACL) introduced a simple yet effective gradient-free paradigm to address forgetting; however, it suffers from ineffective representations. In this paper, inspired by the recently developed ACL, we propose representation enhanced analytic learning (REAL) for EFCIL. REAL constructs a dual-stream base pretraining (DS-BPT) and a representation enhancing distillation (RED) process to strengthen the representation of the feature extractor. The DS-BPT pretrains the model in two streams: self-supervised contrastive learning for general base-knowledge acquisition, and supervised learning for extracting the supervised feature distribution of the base knowledge. The RED process then merges the supervised knowledge into the backbone holding general knowledge and facilitates a subsequent ACL step that converts CIL into a recursive least-squares problem. Our method provides competitive performance and alleviates the ineffective-representation issue on unseen data found in existing ACL. Empirical results on various datasets, including CIFAR-100, ImageNet-100 and ImageNet-1k, demonstrate that REAL improves all existing ACL variants to outperform the state of the art in EFCIL, and achieves comparable or even superior performance to exemplar-based methods.
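As one way to picture the RED step, the following sketch distills features from a supervised-pretrained teacher into the contrastively pretrained student backbone. The plain MSE objective and the module names are our assumptions; the paper's RED loss may differ.

```python
import torch
import torch.nn.functional as F

def red_distill_step(student, teacher, optimizer, images):
    """One illustrative distillation step: merge supervised knowledge
    (teacher features) into the self-supervised student backbone."""
    with torch.no_grad():
        target = teacher(images)     # supervised-stream features
    pred = student(images)           # SSL-stream features being enhanced
    loss = F.mse_loss(pred, target)  # assumed objective, not the paper's exact RED loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

After distillation, the enhanced backbone is frozen and feeds an analytic classifier updated in closed form, as in the recursive least-squares sketch above.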
- GACL: Exemplar-Free Generalized Analytic Continual Learning. In Advances in Neural Information Processing Systems, Dec 2024
Class-incremental learning (CIL) trains a network on sequential tasks with separated categories in each task but suffers from catastrophic forgetting, where models quickly lose previously learned knowledge when acquiring new tasks. Generalized CIL (GCIL) aims to address the CIL problem in a more realistic scenario, where incoming data have mixed categories and unknown sample-size distributions. Existing attempts at GCIL either perform poorly or invade data privacy by saving exemplars. In this paper, we propose a new exemplar-free GCIL technique named generalized analytic continual learning (GACL). GACL adopts analytic learning (a gradient-free training technique) and delivers an analytical (i.e., closed-form) solution to the GCIL scenario. This solution is derived by decomposing the incoming data into exposed and unexposed classes, thereby attaining a weight-invariant property, a rare yet valuable property supporting an equivalence between incremental learning and its joint-training counterpart. Such an equivalence is crucial in GCIL settings, since data distributions among different tasks no longer pose challenges when adopting GACL. Theoretically, this equivalence is validated through matrix analysis tools. Empirically, we conduct extensive experiments in which, compared with existing GCIL methods, our GACL exhibits consistently leading performance across various datasets and GCIL settings.
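A minimal sketch of how a closed-form learner can absorb batches with freely mixed, previously unseen classes: new classes are exposed by appending zero-initialized classifier columns, and the same recursive least-squares core as above keeps the weights identical to joint training on all data seen so far. The decomposition below is a simplified reading of the abstract, not the paper's derivation.

```python
import numpy as np

class GeneralizedAnalyticLearner:
    """Sketch of a closed-form learner for GCIL-style batches whose labels
    mix old and new classes in unknown proportions."""

    def __init__(self, feat_dim, gamma=1.0):
        self.R = np.eye(feat_dim) / gamma   # tracks (X^T X + gamma I)^{-1}
        self.W = np.zeros((feat_dim, 0))    # one column per exposed class
        self.classes = {}                   # class id -> column index

    def update(self, X, labels):
        # Expose new classes by appending zero-initialized columns.
        for c in labels:
            if c not in self.classes:
                self.classes[c] = len(self.classes)
                self.W = np.hstack([self.W, np.zeros((self.W.shape[0], 1))])
        # One-hot targets over all exposed classes (old and new mixed freely).
        Y = np.zeros((len(X), len(self.classes)))
        Y[np.arange(len(X)), [self.classes[c] for c in labels]] = 1.0
        # Recursive least-squares step: the weights stay equivalent to joint
        # training regardless of how classes are distributed across batches.
        K = np.linalg.solve(np.eye(len(X)) + X @ self.R @ X.T, X @ self.R)
        self.R -= self.R @ X.T @ K
        self.W += self.R @ X.T @ (Y - X @ self.W)
```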
- Analytic Federated Learning. May 2024
In this paper, we introduce analytic federated learning (AFL), a new training paradigm that brings analytical (i.e., closed-form) solutions to the federated learning (FL) community. Our AFL draws inspiration from analytic learning, a gradient-free technique that trains neural networks with analytical solutions in one epoch. In the local client training stage, AFL performs one-epoch training, eliminating the need for multi-epoch updates. In the aggregation stage, we derive an absolute aggregation (AA) law. This AA law allows a single-round aggregation, removing the need for multiple aggregation rounds. More importantly, AFL exhibits a weight-invariant property: regardless of how the full dataset is distributed among clients, the aggregated result remains identical. This could spawn various potentials, such as data-heterogeneity invariance, client-number invariance, absolute convergence, and being hyperparameter-free (our AFL is the first hyperparameter-free method in FL history). We conduct experiments across various FL settings, including extremely non-IID ones and scenarios with a large number of clients (e.g., ≥ 1000). In all these settings, our AFL consistently performs competitively while existing FL techniques encounter various obstacles.
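The single-round flavor of the AA law can be illustrated with linear sufficient statistics: each client ships its Gram matrix and cross-correlation once, and the server solves once. Because summation is order- and partition-independent, the result does not depend on how the data were split among clients, which is the weight-invariant behavior the abstract highlights. The ridge term `gamma` below is our simplification for a self-contained example; the paper derives a hyperparameter-free law.

```python
import numpy as np

def client_stats(X, Y):
    """Each client computes sufficient statistics locally in one pass;
    no raw data and no multi-epoch gradient updates leave the client."""
    return X.T @ X, X.T @ Y

def absolute_aggregate(stats, gamma=1.0):
    """Illustrative single-round aggregation: summing the clients' statistics
    yields the same closed-form weights as training on the pooled data,
    no matter how the data were partitioned among clients."""
    A = sum(s[0] for s in stats)  # sum of Gram matrices X_k^T X_k
    B = sum(s[1] for s in stats)  # sum of cross-correlations X_k^T Y_k
    return np.linalg.solve(A + gamma * np.eye(A.shape[0]), B)
```

For example, splitting one dataset into 2 or 1000 client shards and calling `absolute_aggregate` over the resulting `client_stats` produces identical weights, a toy analogue of the invariance claims above.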