Face Photo and Sketch Synthesis Using CycleGAN: The Role of Residual Blocks and Spectral Normalization

Nuntipat Phisutthangkoon
Patikorn Anchuen
Thiansiri Luangwilai
Phummipat Daungklang

Abstract

This study investigates the problem of face photo and sketch synthesis using CycleGAN, with a particular focus on architectural choices that affect model performance. A major challenge in this field is the limited availability of paired face photo-sketch datasets, which makes it difficult to develop and evaluate effective models. To address this limitation, this research demonstrates how to generate face sketches and photos from small datasets, offering a practical solution for expanding existing data collections in future applications, particularly in law enforcement and forensic sketch analysis. Four CycleGAN configurations were designed by varying the number of residual blocks in the generator and applying spectral normalization to the discriminator. Experiments were conducted on the CUHK Face Sketch Database, with quantitative evaluations using SSIM and PSNR, as well as qualitative assessments via human visual inspection. The results show that increasing generator depth and incorporating spectral normalization in the discriminator both improve image quality and training stability, although these benefits come at the cost of increased training time. Notably, the configuration with ResNet9 and spectral normalization consistently achieved the highest scores, with SSIM and PSNR values of 0.702 and 18.53 for sketch-to-photo translation, and 0.702 and 17.35 for photo-to-sketch translation, and produced the most visually convincing results in both tasks. These findings highlight the importance of architectural design choices for effective face photo-sketch translation and provide guidance for future work in related applications.
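The two ingredients the abstract credits for improved stability and quality are spectral normalization in the discriminator and standard image-quality metrics (SSIM/PSNR). As a minimal illustration, the sketch below shows the core of spectral normalization, estimating the largest singular value of a weight matrix via power iteration so the matrix can be rescaled to unit spectral norm, together with a plain PSNR computation. The function names `spectral_norm` and `psnr` are ours for illustration, not from the paper; practical implementations (e.g. `torch.nn.utils.spectral_norm`) run a single power-iteration step per training update and keep the `u` vector as a persistent buffer.

```python
import math

def spectral_norm(w, n_iters=50):
    """Estimate the largest singular value (spectral norm) of a 2-D
    weight matrix, given as a list of rows, via power iteration."""
    rows, cols = len(w), len(w[0])
    u = [1.0] * rows
    for _ in range(n_iters):
        # v = normalize(W^T u)
        v = [sum(w[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        nv = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / nv for x in v]
        # u = normalize(W v)
        u = [sum(w[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        nu = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / nu for x in u]
    # sigma = u^T W v
    return sum(u[i] * w[i][j] * v[j] for i in range(rows) for j in range(cols))

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-size images,
    passed as flat lists of pixel intensities."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    return 10.0 * math.log10(max_val ** 2 / mse)

# Dividing a weight matrix by its estimated sigma caps its spectral
# norm at 1, which is the Lipschitz constraint the discriminator needs.
w = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(w)
w_sn = [[x / sigma for x in row] for row in w]
```

SSIM is omitted here because it requires local windowed statistics (means, variances, covariance) and is better taken from a tested library; the SSIM/PSNR scores reported in the abstract should be reproduced with such an implementation rather than a hand-rolled one.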

Article Details

How to Cite
Phisutthangkoon, N., Anchuen, P., Luangwilai, T., & Daungklang, P. (2026). Face Photo and Sketch Synthesis Using CycleGAN: The Role of Residual Blocks and Spectral Normalization. INTERNATIONAL SCIENTIFIC JOURNAL OF ENGINEERING AND TECHNOLOGY (ISJET), 10(1), 84–95. Retrieved from https://ph02.tci-thaijo.org/index.php/isjet/article/view/260920
Section
Research Article

References

T. Valeeprakhon, K. Orkphol, and P. Chaihuadjaroen, “Deep convolutional neural networks based on VGG-16 transfer learning for abnormalities peeled shrimp classification,” Int. Sci. J. Eng. Technol., vol. 6, no. 2, pp. 13-23, Dec. 2022.

N. Hongcharoen, P. Sanguansat, and S. Marukatat, “Fluke eggs detection and classification using deep convolutional neural network,” Int. Sci. J. Eng. Technol., vol. 6, no. 2, pp. 40-53, Dec. 2022.

J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 2017, pp. 2223-2232, https://doi.org/10.1109/ICCV.2017.244.

T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” in Proc. Int. Conf. Learn. Representations (ICLR), 2018, pp. 1-4.

M. Turk and A. Pentland, “Eigenfaces for recognition,” J. Cogn. Neurosci., vol. 3, no. 1, pp. 71-86, 1991.

X. Wang and X. Tang, “Face photo–sketch synthesis and recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 11, pp. 1955-1967, Nov. 2009.

A. Irfan, R. Mudassar, S. Muhammad, K. Seifedine, and R. Seungmin, “Union is strength: Improving face sketch synthesis by fusing outcomes of fully convolutional networks and random sampling locality constraint,” Alexandria Eng. J., vol. 61, no. 12, pp. 10727-10741, Apr. 2022.

M. Anwesha, S. Alistair, B. Marija, and J. Hossein, “Rhi3DGen: Analyzing rhinophyma using 3D face models and synthetic data,” Intell.-Based Med., vol. 8, Art. no. 100124, Nov. 2023.

P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 1125-1134.

X. Gao, N. Wang, D. Tao, and W. Liu, “Face sketch-photo synthesis and retrieval using sparse representation,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 8, pp. 1213-1226, Aug. 2012.

L. Wang, V. Sindagi, and V. Patel, “High-quality facial photo-sketch synthesis using multi-adversarial networks,” in Proc. IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), 2018, pp. 83-90.

Y. Fu, W. Deng, J. Du, and J. Han, “Identity-aware CycleGAN for face photo-sketch synthesis and recognition,” Pattern Recognit., vol. 102, Art. no. 107249, Jun. 2020, https://doi.org/10.1016/j.patcog.2020.107249

B. Cao, N. Wang, J. Li, Q. Hu, and X. Gao, “Face photosketch synthesis via full-scale identity supervision,” Pattern Recognit., vol. 124, Art. no. 108446, Apr. 2022, https://doi.org/10.1016/j.patcog.2021.108446

C. Chen, W. Liu, X. Tan, and K.-Y. K. Wong, “Semi-supervised Cycle-GAN for face photo-sketch translation in the wild,” Comput. Vis. Image Understand., vol. 235, Art. no. 103775, Oct. 2023, https://doi.org/10.1016/j.cviu.2023.103775

Y. Peng, C. Zhang, H. Xu, T. Fukumoto, and K. Maeno, “DiffFaceSketch: High-fidelity face image synthesis with sketch-guided latent diffusion model,” arXiv, 2023, [Online]. Available: https://doi.org/10.48550/arXiv.2302.06908 [Accessed: Aug. 19, 2025].

T. Wang, Z. Hu, and Y. Guo, “An efficient lightweight network for image denoising using progressive residual and convolutional attention feature fusion,” Sci. Rep., vol. 14, Art. no. 9544, Apr. 2024, https://doi.org/10.1038/s41598-024-60139-x

Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600-612, Apr. 2004, https://doi.org/10.1109/TIP.2003.819861