
A Super-Resolution Poster Generation Model Based on Generative Adversarial Networks

Li Jinxuan (利锦轩), Deng Xiuqin (邓秀勤)
School of Mathematics and Statistics, Guangdong University of Technology

Abstract

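As computer technology continues to advance, image generation techniques are developing rapidly.
This paper proposes a poster generation method that takes Chinese text as input and reworks the
second stage of the StackGAN model into a super-resolution model (GAN_SR3). We introduce the CLIP
model to map Chinese text to a high-dimensional encoding, which improves how well the generated
image matches the text. The encoding and random noise are fed into the first stage of StackGAN to
generate a low-resolution image, which is then passed to the improved super-resolution model SR3
to obtain a high-resolution poster. In comparisons with other models on self-built datasets, the
experimental results show that GAN_SR3 achieves a higher Inception Score and a lower FID on both
the movie-poster and web-poster datasets.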
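The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes that class probabilities and feature vectors have already been extracted by some Inception-style network, which is not shown here. The function names `inception_score` and `frechet_distance_diag` are hypothetical, and the second function uses a diagonal-covariance simplification, whereas the standard FID uses full covariance matrices and a matrix square root.

```python
import math


def inception_score(probs, eps=1e-12):
    """Inception Score: exp of the mean KL(p(y|x) || p(y)).

    probs is a list of rows, one per image, each a list of class
    probabilities that sums to 1.
    """
    n, k = len(probs), len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    mean_kl = sum(
        p * (math.log(p + eps) - math.log(marginal[j] + eps))
        for row in probs
        for j, p in enumerate(row)
    ) / n
    return math.exp(mean_kl)


def frechet_distance_diag(feats_a, feats_b):
    """Frechet distance between per-dimension Gaussians (assumption:
    diagonal covariance, so cross-dimension covariances are ignored).

    feats_a and feats_b are lists of equal-length feature vectors.
    """
    def stats(feats):
        n, d = len(feats), len(feats[0])
        mean = [sum(f[j] for f in feats) / n for j in range(d)]
        var = [sum((f[j] - mean[j]) ** 2 for f in feats) / (n - 1)
               for j in range(d)]
        return mean, var

    mean_a, var_a = stats(feats_a)
    mean_b, var_b = stats(feats_b)
    # Squared distance between the means.
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    # With diagonal covariances, Tr(Sa + Sb - 2 (Sa Sb)^(1/2)) reduces to
    # the sum of squared differences of the standard deviations.
    cov_term = sum((math.sqrt(va) - math.sqrt(vb)) ** 2
                   for va, vb in zip(var_a, var_b))
    return mean_term + cov_term
```

A higher Inception Score indicates predictions that are confident per image and varied across images, while a smaller Fréchet distance indicates that generated features lie closer to the real ones, which is the sense in which the abstract reports "higher" and "lower" values.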

Keywords

text-to-image generation; generative adversarial networks; deep learning; image generation


DOI: http://dx.doi.org/10.12361/2661-3727-05-06-148173
