Ellipsoidal Coverage Function (ECF) – A Modified Mahalanobis Radial Basis Function with Geometrical Coverage Learning (GCL) Algorithm

Tanat Piumsuwan
Prompong Sugunnasil

Abstract

This research presents the Ellipsoidal Coverage Function (ECF), a neuron design paired with a Geometrical Coverage Learning (GCL) algorithm for classification. The work is motivated by inefficiencies in nonlinear Deep Neural Networks (DNNs): higher-order neuron functions, although less popular than deeper stacks of linear neurons, have been claimed to improve robustness in noisy environments and to remove the need for a deeper network. The ECF is a higher-order neuron design based on the Mahalanobis distance Radial Basis Function (MRBF), but its parameter count scales linearly rather than quadratically with the input dimension. Under a non-rotating constraint, an ECF neuron can therefore approximate the volume coverage of an MRBF in the feature space while remaining better suited to integration into a neural network for backpropagation (BP) optimization. Integrating the GCL algorithm into ECF-based classification architectures improves learning efficiency. Experiments on computer vision tasks in a transfer learning setting suggest that GCL ensures the ECF covers all the data and corrects the network's trainable parameters faster. Across several datasets, the ECF with GCL also performs competitively against other nonlinear neuron designs, including the ECF without GCL. Together, these findings point to a nonlinear model that improves both performance and efficiency.
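To make the parameter-scaling claim concrete, the sketch below contrasts a full Mahalanobis RBF neuron with an axis-aligned ellipsoidal neuron of the kind the abstract describes. This is a minimal illustration only, assuming a Gaussian-style activation and PyTorch; the class names are hypothetical, and the paper's exact ECF formulation and GCL algorithm are not reproduced here.

```python
import torch

# Minimal sketch (not the authors' code): a full Mahalanobis RBF neuron
# versus an axis-aligned ellipsoidal neuron. Class names are illustrative.

class MahalanobisRBF(torch.nn.Module):
    """phi(x) = exp(-(x - mu)^T A (x - mu)) with a full d x d matrix A,
    so the parameter count grows quadratically in the input dimension d."""
    def __init__(self, dim: int):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.zeros(dim))        # d parameters
        self.A = torch.nn.Parameter(torch.eye(dim))           # d*d parameters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        diff = x - self.mu
        quad = torch.einsum("bi,ij,bj->b", diff, self.A, diff)
        return torch.exp(-quad)

class AxisAlignedEllipsoid(torch.nn.Module):
    """Non-rotating constraint: one inverse scale per axis. The ellipsoid
    cannot rotate, but the parameter count grows linearly in d."""
    def __init__(self, dim: int):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.zeros(dim))        # d parameters
        self.inv_scale = torch.nn.Parameter(torch.ones(dim))  # d parameters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        quad = (((x - self.mu) * self.inv_scale) ** 2).sum(dim=-1)
        return torch.exp(-quad)

x = torch.randn(8, 512)
print(sum(p.numel() for p in MahalanobisRBF(512).parameters()))        # 262656
print(sum(p.numel() for p in AxisAlignedEllipsoid(512).parameters()))  # 1024
```

For a 512-dimensional feature vector, the full MRBF neuron carries roughly 263 k parameters while the axis-aligned version carries about 1 k; the trade-off is that the axis-aligned ellipsoid cannot represent rotated covariance directions, which is the constraint under which the abstract says the ECF approximates an MRBF's volume coverage.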

Article Details

Article Type
Research Article

References

LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521(7553), 436–444. https://doi.org/10.1038/nature14539

Strazzeri, F.; Sánchez-García, R. J. Possibility Results for Graph Clustering: A Novel Consistency Axiom. Pattern Recognition 2022, 128, 108687. https://doi.org/10.1016/j.patcog.2022.108687

McCulloch, W. S.; Pitts, W. A Logical Calculus of the Ideas Immanent in Nervous Activity. The bulletin of mathematical biophysics 1943, 5(4), 115–133. https://doi.org/10.1007/BF02478259

Nesterov, Y. A Method for Unconstrained Convex Minimization Problem with the Rate of Convergence O(1/k²). In Dokl. Akad. Nauk SSSR; 1983; Vol. 269, p 543.

Mammone, R. J.; Zeevi, Y. Y. Neural Networks: Theory and Applications; Academic Press Professional, Inc., 1992.

Qian, N. On the Momentum Term in Gradient Descent Learning Algorithms. Neural Networks 1999, 12(1), 145–151. https://doi.org/10.1016/S0893-6080(98)00116-6

Ruder, S. An Overview of Gradient Descent Optimization Algorithms, 2017. https://arxiv.org/abs/1609.04747.

Duchi, J.; Hazan, E.; Singer, Y. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. Journal of Machine Learning Research 2011, 12(61), 2121–2159.

Zeiler, M. D. ADADELTA: An Adaptive Learning Rate Method, 2012. https://arxiv.org/abs/1212.5701.

Kingma, D. P.; Ba, J. Adam: A Method for Stochastic Optimization, 2014. https://arxiv.org/abs/1412.6980

Dozat, T. Incorporating Nesterov Momentum into Adam. In ICLR 2016 Workshop Track; 2016.

Krizhevsky, A.; Sutskever, I.; Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems; Pereira, F., Burges, C. J., Bottou, L., Weinberger, K. Q., Eds.; Curran Associates, Inc., 2012; Vol. 25.

Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2015. https://arxiv.org/abs/1409.1556.

Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper With Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2015; pp 1–9.

He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016; pp 770–778.

Howard, A. G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017. https://arxiv.org/abs/1704.04861.

Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K. Q. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017; pp 4700–4708.

Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2018; pp 6848–6856.

Chen, Y.; Li, J.; Xiao, H.; Jin, X.; Yan, S.; Feng, J. Dual Path Networks. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc., 2017; Vol. 30.

Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning; Chaudhuri, K., Salakhutdinov, R., Eds.; Proceedings of Machine Learning Research; PMLR, 2019; Vol. 97, pp 6105–6114.

Xie, Q.; Luong, M.-T.; Hovy, E.; Le, Q. V. Self-Training With Noisy Student Improves ImageNet Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020; pp 10687–10698.

Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing Properties of Neural Networks, 2014. https://arxiv.org/abs/1312.6199.

Goodfellow, I. J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples, 2015. https://arxiv.org/abs/1412.6572.

Nguyen, A.; Yosinski, J.; Clune, J. Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2015; pp 427–436.

Fawzi, A.; Moosavi-Dezfooli, S.-M.; Frossard, P. Robustness of Classifiers: From Adversarial to Random Noise. In Advances in Neural Information Processing Systems; Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R., Eds.; Curran Associates, Inc., 2016; Vol. 29.

Moosavi-Dezfooli, S.-M.; Fawzi, A.; Frossard, P. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016; pp 2574–2582.

Carlini, N.; Wagner, D. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security; AISec ’17; Association for Computing Machinery: New York, NY, USA, 2017; pp 3–14. https://doi.org/10.1145/3128572.3140444

Zadeh, P. H.; Hosseini, R.; Sra, S. Deep-RBF Networks Revisited: Robust Classification with Rejection, 2018. https://arxiv.org/abs/1812.03190

Broomhead, D. S.; Lowe, D. Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks; RSRE memorandum; Royal Signals and Radar Establishment, 1988.

Moradi, M. J.; Roshani, M. M.; Shabani, A.; Kioumarsi, M. Prediction of the Load-Bearing Behavior of SPSW with Rectangular Opening by RBF Network. Applied Sciences 2020, 10(3), 1185. https://doi.org/10.3390/app10031185

Zhang, D.; Zhang, N.; Ye, N.; Fang, J.; Han, X. Hybrid Learning Algorithm of Radial Basis Function Networks for Reliability Analysis. IEEE Transactions on Reliability 2021, 70(3), 887–900. https://doi.org/10.1109/TR.2020.3001232

Meng, X.; Zhang, Y.; Qiao, J. An Adaptive Task-Oriented RBF Network for Key Water Quality Parameters Prediction in Wastewater Treatment Process. Neural Computing and Applications 2021, 33(17), 11401–11414. https://doi.org/10.1007/s00521-020-05659-z

Lin, M.; Zeng, X.; Wu, J. State of Health Estimation of Lithium-Ion Battery Based on an Adaptive Tunable Hybrid Radial Basis Function Network. Journal of Power Sources 2021, 504, 230063. https://doi.org/10.1016/j.jpowsour.2021.230063

Karamichailidou, D.; Alexandridis, A.; Anagnostopoulos, G.; Syriopoulos, G.; Sekkas, O. Modeling Biogas Production from Anaerobic Wastewater Treatment Plants Using Radial Basis Function Networks and Differential Evolution. Computers & Chemical Engineering 2022, 157, 107629. https://doi.org/10.1016/j.compchemeng.2021.107629

Liu, Z.; Dang, Z.; Yu, J. Stock Price Prediction Model Based on RBF-SVM Algorithm. In 2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC); IEEE, 2020; pp 124–127. https://doi.org/10.1109/ICCEIC51584.2020.00032

Sermpinis, G.; Karathanasopoulos, A.; Rosillo, R.; de la Fuente, D. Neural Networks in Financial Trading. Annals of Operations Research 2021, 297(1), 293–308. https://doi.org/10.1007/s10479-019-03144-y

Li, X.; Sun, Y. Application of RBF Neural Network Optimal Segmentation Algorithm in Credit Rating. Neural Computing and Applications 2021, 33(14), 8227–8235. https://doi.org/10.1007/s00521-020-04958-9

Kumar, R.; Srivastava, S.; Dass, A.; Srivastava, S. A Novel Approach to Predict Stock Market Price Using Radial Basis Function Network. International Journal of Information Technology 2021, 13(6), 2277–2285. https://doi.org/10.1007/s41870-019-00382-y

Sohrabi, P.; Jodeiri Shokri, B.; Dehghani, H. Predicting Coal Price Using Time Series Methods and Combination of Radial Basis Function (RBF) Neural Network with Time Series. Mineral Economics 2023, 36(2), 207–216. https://doi.org/10.1007/s13563-021-00286-z

Amirian, M.; Schwenker, F. Radial Basis Function Networks for Convolutional Neural Networks to Learn Similarity Distance Metric and Improve Interpretability. IEEE Access 2020, 8, 123087–123097. https://doi.org/10.1109/ACCESS.2020.3007337

Jiang, Q.; Zhu, L.; Shu, C.; Sekar, V. An Efficient Multilayer RBF Neural Network and Its Application to Regression Problems. Neural Computing and Applications 2022, 34(6), 4133–4150. https://doi.org/10.1007/s00521-021-06373-0

Papadimitrakis, M.; Alexandridis, A. Active Vehicle Suspension Control Using Road Preview Model Predictive Control and Radial Basis Function Networks. Applied Soft Computing 2022, 120, 108646. https://doi.org/10.1016/j.asoc.2022.108646

Cevikalp, H.; Larlus, D.; Jurie, F. A Supervised Clustering Algorithm for the Initialization of RBF Neural Network Classifiers. In 2007 IEEE 15th Signal Processing and Communications Applications; IEEE, 2007; pp 1–4. https://doi.org/10.1109/SIU.2007.4298803

Wang, D.; Zeng, X.-J.; Keane, J. A. A Clustering Algorithm for Radial Basis Function Neural Network Initialization. Neurocomputing 2012, 77(1), 144–155. https://doi.org/10.1016/j.neucom.2011.08.023

Czarnowski, I.; Jędrzejowicz, P. Kernel-Based Fuzzy C-Means Clustering Algorithm for RBF Network Initialization. In Intelligent Decision Technologies 2016; Czarnowski, I., Caballero, A. M., Howlett, R. J., Jain, L. C., Eds.; Springer International Publishing: Cham, 2016; pp 337–347.

Beheim, L.; Zitouni, A.; Belloir, F. New RBF Neural Network Classifier with Optimized Hidden Neurons Number. 2004.

Ibrikci, T.; Brandt, M. E.; Wang, G.; Acikkar, M. Mahalanobis Distance with Radial Basis Function Network on Protein Secondary Structures. In Proceedings of the Second Joint 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society [Engineering in Medicine and Biology]; 2002; Vol. 3, pp 2184–2185. https://doi.org/10.1109/IEMBS.2002.1053230

Mahalanobis, P. C. On the Generalized Distance in Statistics. Sankhyā: The Indian Journal of Statistics, Series A (2008-) 2018, 80, S1–S7.

Lee, S.; Kil, R. M. A Gaussian Potential Function Network with Hierarchically Self-Organizing Learning. Neural Networks 1991, 4(2), 207–224. https://doi.org/10.1016/0893-6080(91)90005-P

Park, J.; Sandberg, I. W. Universal Approximation Using Radial-Basis-Function Networks. Neural Computation 1991, 3(2), 246–257. https://doi.org/10.1162/neco.1991.3.2.246

Valls, J. M.; Aler, R.; Fernández, O. Using a Mahalanobis-Like Distance to Train Radial Basis Neural Networks. In Computational Intelligence and Bioinspired Systems; Cabestany, J., Prieto, A., Sandoval, F., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2005; pp 257–263.

Li, Z.; Fox, J. M. Integrating Mahalanobis Typicalities with a Neural Network for Rubber Distribution Mapping. Remote Sensing Letters 2011, 2(2), 157–166. https://doi.org/10.1080/01431161.2010.505589

Ning, X.; Tian, W.; Yu, Z.; Li, W.; Bai, X.; Wang, Y. HCFNN: High-Order Coverage Function Neural Network for Image Classification. Pattern Recognition 2022, 131, 108873. https://doi.org/10.1016/j.patcog.2022.108873

Ning, X.; Tian, W.; He, F.; Bai, X.; Sun, L.; Li, W. Hyper-Sausage Coverage Function Neuron Model and Learning Algorithm for Image Classification. Pattern Recognition 2023, 136, 109216. https://doi.org/10.1016/j.patcog.2022.109216

Xu, C.; Yu, F.; Xu, Z.; Liu, C.; Xiong, J.; Chen, X. QuadraNet: Improving High-Order Neural Interaction Efficiency with Hardware-Aware Quadratic Neural Networks. In 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC); IEEE, 2024; pp 19–25. https://doi.org/10.1109/ASP-DAC58780.2024.10473936

Zhang, L.; Zhang, B. A Geometrical Representation of McCulloch-Pitts Neural Model and Its Applications. IEEE Transactions on Neural Networks 1999, 10(4), 925–929. https://doi.org/10.1109/72.774263

Glorot, X.; Bengio, Y. Understanding the Difficulty of Training Deep Feedforward Neural Networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics; Teh, Y. W., Titterington, M., Eds.; Proceedings of Machine Learning Research; PMLR: Chia Laguna Resort, Sardinia, Italy, 2010; Vol. 9, pp 249–256.

Kumar, S. K. On Weight Initialization in Deep Neural Networks, 2017. https://arxiv.org/abs/1704.08863

Narkhede, M. V.; Bartakke, P. P.; Sutaone, M. S. A Review on Weight Initialization Strategies for Neural Networks. Artificial Intelligence Review 2022, 55(1), 291–322. https://doi.org/10.1007/s10462-021-10033-z

Wightman, R. PyTorch Image Models. GitHub repository, 2019. https://doi.org/10.5281/zenodo.4414861

Wightman, R.; Touvron, H.; Jégou, H. ResNet Strikes Back: An Improved Training Procedure in Timm, 2021. https://arxiv.org/abs/2110.00476

Luo, L.; Xiong, Y.; Liu, Y.; Sun, X. Adaptive Gradient Methods with Dynamic Bound of Learning Rate, 2019. https://arxiv.org/abs/1902.09843

Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; Technical Report; University of Toronto, 2009.

Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE 1998, 86(11), 2278–2324. https://doi.org/10.1109/5.726791

Huang, Y.; Qiu, C.; Yuan, K. Surface Defect Saliency of Magnetic Tile. The Visual Computer 2020, 36(1), 85–96. https://doi.org/10.1007/s00371-018-1588-5

Kylberg, G. Kylberg Texture Dataset v. 1.0; External report (Blue series); 35; Centre for Image Analysis, Swedish University of Agricultural Sciences and Uppsala University, 2014; p 4. www.cb.uu.se/~gustaf/texture.

Maguire, M.; Dorafshan, S.; Thomas, R. J. SDNET2018: A Concrete Crack Image Dataset for Machine Learning Applications. 2018. https://doi.org/10.15142/T3TD19