Comparative Study of Deep Transfer Learning Models and Data Augmentation Techniques for Real and AI-generated Portraits Classification

Authors

  • Pailin Manaohwan, Faculty of Science at Sriracha, Kasetsart University Sriracha Campus
  • Sorachai Phetkhunthot, Faculty of Science at Sriracha, Kasetsart University Sriracha Campus
  • Julalak Kaewwangsakoon, Faculty of Science at Sriracha, Kasetsart University Sriracha Campus
  • Pongsan Prakitsri, Assistant Professor, Faculty of Science at Sriracha, Kasetsart University Sriracha Campus

Keywords:

deep learning, convolutional neural networks, transfer learning, fine-tuning

Abstract

Distinguishing a real human portrait from one generated by artificial intelligence (AI) is becoming increasingly difficult. As AI technology advances, AI-generated portraits increasingly resemble portraits of real people, which poses a significant challenge for classification systems. Portraits contain a large amount of information about color, texture, lighting, and subtle details. Deep learning models, with their layered architecture, can effectively learn patterns and relationships within this data. This paper examines how transfer learning with convolutional neural networks (CNNs) and data augmentation affects accuracy in classifying real and AI-generated portraits. We apply three pre-trained models (MobileNetV2, ResNet50, and EfficientNetV2S) to a dataset of 3,000 images (1,500 per class). Performance is evaluated with and without image augmentation to assess their combined effect. Our findings show that EfficientNetV2S without data augmentation achieved the highest accuracy, 94.67%, with an F1-score of 94.47%.
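For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and F1-score are commonly computed for a two-class problem such as real versus AI-generated portraits. The label encoding (1 for AI-generated, 0 for real), the helper names, and the counts in the demo are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryCounts:
    """Confusion-matrix counts for a two-class problem (hypothetical helper)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    fn: int  # positives predicted as negative
    tn: int  # negatives predicted as negative


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> BinaryCounts:
    """Tally the confusion matrix; `positive` is assumed to mean AI-generated."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return BinaryCounts(tp, fp, fn, tn)


def accuracy(c: BinaryCounts) -> float:
    """Fraction of all samples that were classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def f1_score(c: BinaryCounts) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


if __name__ == "__main__":
    # Illustrative counts only; NOT the paper's test-set results.
    demo = BinaryCounts(tp=90, fp=10, fn=5, tn=95)
    print(f"accuracy={accuracy(demo):.4f} f1={f1_score(demo):.4f}")
```

Because F1 ignores true negatives, it can differ slightly from accuracy even on a balanced dataset, which is consistent with the abstract reporting the two figures separately.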

Published

2024-12-18

How to Cite

[1]
P. Manaohwan, S. Phetkhunthot, J. Kaewwangsakoon, and P. Prakitsri, “Comparative Study of Deep Transfer Learning Models and Data Augmentation Techniques for Real and AI-generated Portraits Classification”, TJOR, vol. 12, no. 2, pp. 28–40, Dec. 2024.

Section

Research Paper (3 reviewers, in accordance with the criteria for academic rank applications)