Style transfer based on cross-layer correlation perceptual loss

ZHUANG Xuanquan; LI Caixia; LI Peixing

doi:10.13471/j.cnki.acta.snus.2019.10.11.2019A079

Research article | Updated: 2023-11-01

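- Style transfer based on cross-layer correlation perceptual loss
- Acta Scientiarum Naturalium Universitatis Sunyatseni, 2020, Vol. 59, No. 6, pp. 126-135
- Affiliations:
  
  1. School of Mathematics, Sun Yat-sen University, Guangzhou 510275, Guangdong
  2. Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou 510275, Guangdong
- About the authors:
  
  ZHUANG Xuanquan (born 1995), male; research interests: deep learning and image processing; E-mail: andrezhuang@tencent.com
  LI Peixing (born 1971), male; research interests: machine learning and data mining; E-mail: lnslpx@mail.sysu.edu.cn
- Funding:
  
  Guangdong Basic and Applied Basic Research Foundation (2020B1515310007); Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University (2020B1212060032)
- DOI: 10.13471/j.cnki.acta.snus.2019.10.11.2019A079
  CLC number: TP183
- Print publication date: 2020-11-25
  
  Received: 2019-10-11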

ZHUANG Xuanquan, LI Caixia, LI Peixing. Style transfer based on cross-layer correlation perceptual loss[J]. Acta Scientiarum Naturalium Universitatis Sunyatseni, 2020, 59(06): 126-135. DOI: 10.13471/j.cnki.acta.snus.2019.10.11.2019A079.

Abstract

Deep learning has made artistic stylization of photos practical enough to ship in real products, and the shift of the loss function from per-pixel loss to perceptual loss based on the Gram matrix was the most critical step in that progress. The Gram matrix extracts artistic style features well, but it only captures correlations among features at the same semantic level, so it cannot be regarded as a sufficient representation of style. Since the Gram matrix was introduced, most follow-up research has focused on designing new model structures to speed up transfer rather than on analyzing and improving the Gram matrix itself. This paper proposes using a cross-layer correlation matrix as a replacement for, or a supplement to, the Gram matrix when computing the style loss function. Experiments show that, for outputs of similar quality, the cross-layer correlation matrix method reduces calculation time by 20% compared with the Gram matrix method.
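To make the contrast between the two statistics concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not define the cross-layer correlation matrix, so everything specific here is an assumption: the function names, the normalization by the number of spatial positions, the requirement that the two layers share the same spatial size, and the use of mean squared error as the style loss. Random arrays stand in for convolutional feature maps.

```python
import numpy as np


def flatten_features(feat):
    """Reshape a (C, H, W) feature map into a (C, H*W) matrix."""
    c, h, w = feat.shape
    return feat.reshape(c, h * w)


def gram_matrix(feat):
    """Same-layer statistic: channel-by-channel correlations, shape (C, C)."""
    f = flatten_features(feat)
    return f @ f.T / f.shape[1]


def cross_layer_matrix(feat_a, feat_b):
    """Assumed cross-layer statistic: correlations between the channels of
    two different layers, shape (C_a, C_b).

    Assumption: both feature maps have the same spatial size (H, W).
    """
    fa = flatten_features(feat_a)
    fb = flatten_features(feat_b)
    if fa.shape[1] != fb.shape[1]:
        raise ValueError("feature maps must share the same spatial size")
    return fa @ fb.T / fa.shape[1]


def style_loss(stat_generated, stat_style):
    """Mean squared error between two style statistics (an assumed choice)."""
    return float(np.mean((stat_generated - stat_style) ** 2))


# Hypothetical stand-ins for features from two layers of a CNN.
rng = np.random.default_rng(0)
style_a = rng.standard_normal((4, 8, 8))  # layer a of the style image
style_b = rng.standard_normal((6, 8, 8))  # layer b of the style image
gen_a = rng.standard_normal((4, 8, 8))    # layer a of the generated image
gen_b = rng.standard_normal((6, 8, 8))    # layer b of the generated image

gram_loss = style_loss(gram_matrix(gen_a), gram_matrix(style_a))
cross_loss = style_loss(
    cross_layer_matrix(gen_a, gen_b),
    cross_layer_matrix(style_a, style_b),
)
```

Under these assumptions, passing the same feature map twice to `cross_layer_matrix` reproduces the Gram matrix, so the Gram matrix can be read as the special case in which both layers coincide.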

Keywords

style transfer; Gram matrix; convolutional neural network; style loss function; perceptual loss; deep learning

