Scalable Deep Neural Network Training: Overcoming Memory Constraints with Performance Preservation

Kaligotla Ravi Kumar
C. Sivakumar

Abstract

This paper introduces a method for training Deep Neural Networks (DNNs) with batch sizes that exceed device memory, without compromising model performance. Conventional methods run into memory limits at large batch sizes, often resulting in reduced efficiency and degraded training outcomes. The proposed solution, Small-Batch Processing (SBP), combines a batch streaming mechanism with a loss normalization strategy based on gradient accumulation. SBP processes smaller micro-batches sequentially, enabling large-batch training on a single device without additional memory or additional GPUs. Experimental evaluations confirm that SBP matches the performance of traditional approaches while enabling batch sizes that were previously infeasible due to memory constraints. The technique makes large-batch DNN training practical in resource-limited environments, improving both scalability and training efficiency for deep learning applications.
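The abstract does not spell out how SBP normalizes the loss, so the following is only a minimal sketch of the general idea behind gradient accumulation with loss normalization, not the authors' implementation. It assumes a toy one-variable linear model with a mean-squared-error loss, and every name in it (`loss_and_grad`, `streamed_loss_and_grad`, `micro_batch_size`) is hypothetical. Each micro-batch's loss and gradient are scaled by that micro-batch's share of the full batch, so the accumulated values should agree with a single full-batch pass up to floating-point rounding.

```python
import math
import random


def loss_and_grad(w, b, xs, ys):
    """Mean squared error of y ~ w*x + b over one batch, plus its gradient."""
    n = len(xs)
    loss = grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        residual = w * x + b - y
        loss += residual * residual / n
        grad_w += 2.0 * residual * x / n
        grad_b += 2.0 * residual / n
    return loss, grad_w, grad_b


def streamed_loss_and_grad(w, b, xs, ys, micro_batch_size):
    """Process the batch as sequential micro-batches and accumulate results.

    Assumed normalization: each micro-batch contributes in proportion to its
    size, so an uneven final micro-batch is weighted correctly.
    """
    total = len(xs)
    loss = grad_w = grad_b = 0.0
    for start in range(0, total, micro_batch_size):
        xb = xs[start:start + micro_batch_size]
        yb = ys[start:start + micro_batch_size]
        mb_loss, mb_gw, mb_gb = loss_and_grad(w, b, xb, yb)
        scale = len(xb) / total  # loss normalization factor
        loss += scale * mb_loss
        grad_w += scale * mb_gw
        grad_b += scale * mb_gb
    return loss, grad_w, grad_b


# Synthetic data for illustration only (not from the paper).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
ys = [3.0 * x + 0.5 + rng.gauss(0.0, 0.1) for x in xs]
w, b = 0.2, -0.1

full = loss_and_grad(w, b, xs, ys)
streamed = streamed_loss_and_grad(w, b, xs, ys, micro_batch_size=10)
matches = all(
    math.isclose(f, s, rel_tol=1e-9, abs_tol=1e-12)
    for f, s in zip(full, streamed)
)
```

In a real framework the memory saving comes from holding only one micro-batch's activations at a time while the gradient buffers persist across micro-batches; this pure-Python sketch illustrates only the arithmetic of the normalization, not the memory behavior.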

Section
Research Articles
