RIADA: A Machine-Learning Based Infrastructure for Recognising the Emotions of Spotify Songs.
DOI: https://doi.org/10.9781/ijimai.2022.04.002

Keywords: Affective Annotation, Cloud Computing, Emotion Recognition, Machine Learning, Music, Spotify

Abstract
Music emotions can help improve the personalization of the services and content offered by music streaming providers. In recent years, many research works based on machine learning techniques have addressed the problem of recognising music emotions. Nevertheless, the results obtained have only been applied to small music repositories and do not consider what users feel when they listen to the songs. These issues prevent the existing proposals from being integrated into the personalization mechanisms of online music providers. In this paper, we present the RIADA infrastructure, composed of a set of systems able to emotionally annotate the catalog of songs offered by Spotify based on users' perception. RIADA works with the Spotify playlist miner and data services to build emotion recognition models that can address the open challenges mentioned above. Machine learning algorithms, music information retrieval techniques, architectures for application parallelization, and cloud computing have been combined into a complex engineering result able to integrate music emotions into Spotify-based applications.
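As a minimal illustration of the kind of Spotify data such an infrastructure builds on, the sketch below fetches a track's Audio Feature Object from the documented Spotify Web API endpoint and maps its valence and energy scores onto the four quadrants of a valence-arousal emotion plane. The endpoint and field names are documented by Spotify; the quadrant mapping, the function names, and the placeholder token are hypothetical and are not RIADA's actual annotation models, which rely on trained machine learning classifiers rather than a fixed rule.

```python
import requests

API_URL = "https://api.spotify.com/v1/audio-features"


def fetch_audio_features(track_id: str, access_token: str) -> dict:
    """Fetch the documented Audio Feature Object for a single track."""
    resp = requests.get(
        f"{API_URL}/{track_id}",
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


def quadrant(valence: float, energy: float) -> str:
    """Hypothetical rule mapping Spotify's valence/energy scores (0-1)
    to the four quadrants of a valence-arousal plane."""
    if valence >= 0.5:
        return "happy/excited" if energy >= 0.5 else "calm/content"
    return "angry/tense" if energy >= 0.5 else "sad/depressed"


if __name__ == "__main__":
    # "YOUR_TOKEN" is a placeholder for a valid OAuth access token.
    features = fetch_audio_features("11dFghVXANMlKmJXsNCbNl", "YOUR_TOKEN")
    print(quadrant(features["valence"], features["energy"]))
```

A real emotion recognition pipeline would replace the threshold rule with a model trained on perceptual annotations, but the valence/energy pair already hints at why these Spotify features are a natural input for dimensional emotion models.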