AI System Design for Robotic Hand to Play the Piano


Wuttichon Aukkhosuwan
Wannarat Suntiamorntut

Abstract

Robotics and Artificial Intelligence (AI) have been introduced as key drivers of Industry 4.0. Many industries, such as manufacturing, agriculture, logistics, and supply chain, have adopted robotics and AI to enhance productivity and reduce costs. Applying AI to creative work is very challenging, especially in music. This paper presents a system that enables a robotic arm to play piano notes with minimal errors. We drew on Optical Music Recognition (OMR), Automatic Music Transcription (AMT), and Music Source Separation (MSS), and addressed the robot arm's cycle-time limitations in building this system. The robotic arm used for testing was built from LEGO. The system performs four functions. The first plays piano notes from sheet music using Sheet Vision and Tesseract-OCR; note-reading accuracy is 21.36%, and note-reading accuracy including note duration is 13.46%. The second plays the piano part separated from a music file using Spleeter and Onsets and Frames; piano note accuracy is 58.32%, note onset-time error is ±0.17, and note duration error is ±0.65. The third plays piano notes transcribed from real-time piano sound using Onsets and Frames in real-time mode. The fourth plays piano notes derived from brain waves by matching the brain-wave frequency to the frequency of a piano note. The design and experimental results are explained in this paper.
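The fourth function maps a measured frequency (here, a brain-wave frequency) to the nearest piano note. A minimal sketch of such a frequency-to-key mapping under twelve-tone equal temperament is shown below; the helper names are illustrative and not taken from the authors' implementation:

```python
import math

# Twelve-tone equal temperament: piano key n (1..88) has frequency
# f(n) = 440 * 2^((n - 49) / 12), with key 49 = A4 = 440 Hz.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def freq_to_key(freq_hz: float) -> int:
    """Return the piano key number (1-88) nearest to freq_hz."""
    n = round(12 * math.log2(freq_hz / 440.0)) + 49
    return min(max(n, 1), 88)  # clamp to the 88-key range

def key_to_name(n: int) -> str:
    """Return a note name such as 'A4' or 'C#5' for key n."""
    name = NOTE_NAMES[(n - 1) % 12]
    octave = (n + 8) // 12  # key 1 (A0) -> octave 0, key 49 (A4) -> octave 4
    return f"{name}{octave}"
```

Raw EEG frequencies lie well below the piano range, so a practical system would first scale or transpose them into the 88-key band before applying a mapping like this.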

Article Details

Section
Research Articles

References

Weinberg, G.; Raman, A.; Mallikarjuna, T. Interactive jamming with Shimon: A social robotic musician. In: 2009 4th ACM/IEEE International Conference on Human-Robot Interaction (HRI). 2009, 233-234. https://doi.org/10.1145/1514095.1514152.

Lin, JC.; Huang, HH.; Li, YF.; Tai, JC.; Liu, LW. Electronic piano playing robot. In: 2010 International Symposium on Computer, Communication, Control and Automation (3CA). 2010, 2, 353-356. https://doi.org/10.1109/3CA.2010.5533457.

Zhang, D.; Jianhe, L.; Beizhi, L.; Lau, D.; Cameron, C. Design and analysis of a piano playing robot. In: 2009 International Conference on Information and Automation. 2009, 757-761. https://doi.org/10.1109/ICINFA.2009.5205022.

Zhang, A.; Malhotra, M.; Matsuoka, Y. Musical piano performance by the ACT Hand. In: 2011 IEEE International Conference on Robotics and Automation. 2011, 3536-3541. https://doi.org/10.1109/ICRA.2011.5980342.

Li, YF.; Chuang, LL. Controller design for music playing robot — Applied to the anthropomorphic piano robot. In: 2013 IEEE 10th International Conference on Power Electronics and Drive Systems (PEDS). 2013, 968-973. https://doi.org/10.1109/PEDS.2013.6527158.

Fahn, CS.; Lu, KJ. Humanoid recognizing piano scores techniques. In: 2014 International Conference on Information Science, Electronics and Electrical Engineering. 2014, 3, 1397-1402. https://doi.org/10.1109/InfoSEEE.2014.6946149.

Luo, Y.; Mesgarani, N. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation. IEEE/ACM Trans Audio, Speech, Lang Process. 2019, 27(8), 1256-1266. https://doi.org/10.1109/TASLP.2019.2915167.

Defossez, A.; Usunier, N.; Bottou, L.; Bach, F. Music Source Separation in the Waveform Domain. Published online 2019.

Hennequin, R.; Khlif, A.; Voituret, F.; Moussallam, M. Spleeter: a fast and efficient music source separation tool with pre-trained models. Journal of Open Source Software. 2020, 5, 2154. https://doi.org/10.21105/joss.02154.

Hawthorne, C.; Elsen, E.; Song, J.; Roberts, A.; Raffel, C.; Engel, J.; Oore, S.; Eck, D. Onsets and Frames: Dual-Objective Piano Transcription. In: Proceedings of the 19th ISMIR Conference, Paris, France, September 23-27, 2018, 50-57.

Trabelsi, C.; Bilaniuk, O.; Zhang, Y.; et al. Deep Complex Networks. Published online 2017. https://doi.org/10.48550/ARXIV.1705.09792.

Hawthorne, C.; Simon, I.; Swavely, R.; Manilow, E.; Engel, J. Sequence-to-Sequence Piano Transcription with Transformers. Published online 2021. https://doi.org/10.48550/ARXIV.2107.09142.

Raffel, C. Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching. Columbia University, 2016.

Sharma, S.; Mittal, VK. Window selection for accurate music source separation using REPET. In: 2016 3rd International Conference on Signal Processing and Integrated Networks (SPIN). 2016, 270-274. https://doi.org/10.1109/SPIN.2016.7566702.