School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, Guangdong, China
HU Chaoshun (born 1997), male; research interests: pattern recognition, computer vision; E-mail: huchsh3@mail2.sysu.edu.cn
LAI Jianhuang (born 1964), male; research interests: pattern recognition, computer vision; E-mail: stsljh@mail.sysu.edu.cn
Print publication date: 2024-11-25
Online publication date: 2024-07-31
Received: 2024-04-29
Accepted: 2024-05-17
HU Chaoshun, YE Biaohua, XIE Xiaohua, et al. Inducing Neural Collapse in class-incremental learning[J]. Acta Scientiarum Naturalium Universitatis Sunyatseni, 2024, 63(06): 224-235. DOI: 10.13471/j.cnki.acta.snus.ZR20240136.
In class-incremental learning, the imbalance between new and old classes gives rise to Minority Collapse, which degrades recognition of the old classes. Existing methods typically adjust the geometric relationships among classes in the deep feature space on an empirical basis to avoid Minority Collapse and therefore lack theoretical guidance. Neural Collapse theoretically reveals the optimal inter-class geometric structure, the Equiangular Tight Frame (ETF). Inspired by this, this paper proposes a method called Continuous Construction of Neural Collapse (CCNC) to address Minority Collapse. The method encourages the formation of an ETF structure through a compactness loss and an equiangular loss. However, the imbalanced data distribution leads to inaccurate estimation of the global centroid and makes it difficult to impose constraints among old classes, preventing the two losses from taking full effect. To address these two issues, this paper further introduces a classifier vector supplementation module and a hard example sampling module, respectively. Experimental results show that the proposed method effectively induces Neural Collapse and outperforms current state-of-the-art methods on both the CIFAR100 and ImageNet datasets.
Keywords: class-incremental learning; Neural Collapse; Minority Collapse; dynamically expanding architecture
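For context on the terminology used in the abstract, the Equiangular Tight Frame (ETF) can be stated formally. The definition below is taken from the general Neural Collapse literature rather than from this paper's own derivation, and the symbols $K$, $d$, $m_k$, $M$, and $P$ are introduced here purely for illustration. A simplex ETF for $K$ classes is a set of equal-norm vectors $m_1, \dots, m_K \in \mathbb{R}^d$ (class means or classifier vectors) that are pairwise equiangular at the maximal possible separation:

$$\cos\angle(m_k, m_{k'}) = \frac{\langle m_k, m_{k'} \rangle}{\lVert m_k \rVert \, \lVert m_{k'} \rVert} = -\frac{1}{K-1}, \qquad \forall\, k \neq k'.$$

Equivalently, stacking the normalized vectors as the columns of $M \in \mathbb{R}^{d \times K}$ gives

$$M = \sqrt{\frac{K}{K-1}}\, P\left(I_K - \frac{1}{K}\mathbf{1}_K\mathbf{1}_K^{\top}\right),$$

where $P \in \mathbb{R}^{d \times K}$ is any partial orthogonal matrix with $P^{\top}P = I_K$. Minority Collapse is the opposite geometry: the vectors of under-represented (here, old) classes drift toward one another and their pairwise angles shrink toward zero, which is the degeneration that the compactness loss and equiangular loss described above are intended to prevent.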