A Mixture-of-experts Approach to Production Capacity Planning for Diverse Demand Patterns via Deep Reinforcement Learning
Keywords:
Production capacity planning, Deep Reinforcement Learning, Ensemble Learning, Mixture of experts, Proximal policy optimization, Mixed Integer Linear Programming
Abstract
Ensemble learning is gaining traction in Reinforcement Learning (RL) because it can improve the performance and robustness of RL models. This paper addresses production capacity planning under fluctuating demand by proposing a Mixture of Experts Deep Reinforcement Learning (MoE-DRL) model. The model pairs Proximal Policy Optimization (PPO), a policy-gradient reinforcement learning algorithm, with ensemble learning, a technique that combines multiple models: several expert PPO-DRL agents are combined through a gating model (MoE PPO-DRL). The gating model learns to select the expert agent best suited to producing a production plan for the demand pattern at hand. The proposed model was trained and tested against the results of a Mixed Integer Linear Programming (MILP) model and of the individual expert PPO agents. The MoE PPO-DRL model achieved a total average profit 25.9% higher than the average of all single-agent expert models. It also achieved an 11.02% optimality gap, less than half the 22.93% average gap of the single-agent expert models.
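The gating step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear gate, the weight values, the demand vector, and the function names are all hypothetical, and the optimality-gap formula is the commonly used relative difference to the MILP profit, assumed here because the abstract does not define it.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_expert(
    demand: Sequence[float], gate_weights: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Score each expert with a (hypothetical) linear gate and pick the best.

    gate_weights holds one weight vector per expert; in a trained model
    these would be learned, here they are illustrative constants.
    """
    scores = [sum(w * d for w, d in zip(row, demand)) for row in gate_weights]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


def optimality_gap(milp_profit: float, agent_profit: float) -> float:
    """Assumed definition: percentage shortfall relative to the MILP profit."""
    return 100.0 * (milp_profit - agent_profit) / milp_profit


# Illustrative demand pattern over three periods and two experts.
demand = [100.0, 120.0, 80.0]
gate_weights = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0]]
expert_index, gate_probs = select_expert(demand, gate_weights)
gap = optimality_gap(milp_profit=100.0, agent_profit=80.0)
```

In this toy setup the gate scores are 1.0 and 1.2, so the second expert (index 1) is selected. In the model described in the abstract, the chosen expert would be a trained PPO agent that outputs the production plan.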
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.