
A Super-Resolution Poster Generation Model Based on Generative Adversarial Networks

Li Jinxuan (利锦轩), Deng Xiuqin (邓秀勤)
School of Mathematics and Statistics, Guangdong University of Technology

Abstract


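As computer technology advances, image generation techniques are developing rapidly. This paper proposes a poster generation method that takes Chinese text as input and replaces the second stage of the StackGAN model with a super-resolution model; the resulting model is called GAN_SR3. A CLIP model maps the Chinese text to a high-dimensional encoding, which improves how well the generated image matches the text. The encoding and random noise are fed into the first stage of StackGAN to generate a low-resolution image, which is then passed to the improved super-resolution model SR3 to obtain a high-resolution poster. In comparisons with other models on self-built datasets, GAN_SR3 achieves a higher Inception Score and a lower FID on both the movie poster and web poster datasets.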
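The FID mentioned above is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images, d² = ‖μ_r − μ_g‖² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^{1/2}), so lower values mean the two feature distributions are closer. The following is a minimal sketch of that formula only, not the evaluation code used in the paper. It assumes the features are already extracted (in practice they come from an Inception network; random arrays stand in for them here), and the function name `frechet_distance` is hypothetical.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each argument is an (n_samples, n_features) array. In a real FID
    computation these would be Inception features of images; here they
    are arbitrary arrays (an assumption made for illustration).
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    # Tr((cov_r cov_g)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # covariance matrices up to numerical error.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


# Stand-in features: one reference set, one slightly shifted set, one far set.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(2000, 8))
close = rng.normal(0.1, 1.0, size=(2000, 8))
far = rng.normal(1.0, 1.0, size=(2000, 8))

fid_self = frechet_distance(real, real)
fid_close = frechet_distance(real, close)
fid_far = frechet_distance(real, far)
print(fid_self, fid_close, fid_far)
```

The distance of a set to itself is zero up to rounding, and the set whose mean is shifted further from the reference gets the larger score, which is the sense in which a "lower FID" indicates generated images closer to the real ones.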

Keywords


Text-to-image generation; generative adversarial networks; deep learning; image generation







DOI: http://dx.doi.org/10.12361/2661-3727-05-06-148173
