ARCHIVE du patrimoine immatériel de NAVARRE

  • Year of publication:
    2022
  • Authors:
    -   Fan, Tao
    -   Wang, Hao
  • Journal:
    Journal of Information Science
  • Date of publication:
    2022
  • ISSN:
    0165-5515
  • Keywords:
    Analysis Models; Arts Computing; Audio Acoustics; Audio Features; Digital Humanities; Intangible Cultural Heritage; Modal Analysis; Multimodal; Multimodal Sentiment Analysis; Music; Neural Networks; Sentiment Analysis; Spectrum Features; Spectrograms; Spectrum Analysis
  • Abstract:
    Intangible cultural heritage (ICH) songs convey folk lives and stories from different communities and nations through touching melodies and lyrics that are rich in sentiment. Current research on the sentiment analysis of songs is mainly based on lyrics, audio, or lyric-audio combinations. Recent studies have shown that deep spectrum features, extracted from the spectrogram generated from the audio, perform well in several speech-based tasks. However, studies combining spectrum features in multimodal sentiment analysis of songs are lacking. Hence, we propose to combine the audio, lyric, and spectrogram to conduct multimodal sentiment analysis for ICH songs in a tri-modal fusion way. In addition, the correlations and interactions between different modalities have not been fully considered. Here, we propose a multimodal song sentiment analysis model (MSSAM) that includes a strengthened audio features-guided attention (SAFGA) mechanism, which can learn intra- and inter-modal information effectively. First, we obtain strengthened audio features through the fusion of acoustic and spectrum features. Then, the strengthened audio features are used to guide the distribution of attention weights over the words of the lyric with the help of SAFGA, making the model focus on important sentiment-bearing words related to the sentiment of the strengthened audio features and capturing modal interactions and complementary information. We take two items inscribed on world-level ICH lists, Jingju (京剧) and Kunqu (昆曲), as examples and build sentiment analysis datasets for them. We compare the proposed model with other state-of-the-art baselines on the Jingju and Kunqu datasets. Experimental results demonstrate the superiority of our proposed model.
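
The abstract describes two steps: fusing acoustic and spectrum (spectrogram-derived) features into strengthened audio features, and using those features to guide attention over the words of the lyric. The following PyTorch sketch illustrates one plausible reading of such an audio-guided attention layer; the class name, dimensions, concatenation-based fusion, and additive attention form are illustrative assumptions, not the architecture reported in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioGuidedAttention(nn.Module):
    # Hypothetical sketch of a SAFGA-style layer: acoustic and spectrum
    # features are fused into a "strengthened" audio vector, which then
    # scores the words of the lyric encoding.
    def __init__(self, d_acoustic, d_spectrum, d_word, d_hidden):
        super().__init__()
        # Fusion of acoustic + spectrum features (concatenation assumed).
        self.fuse = nn.Linear(d_acoustic + d_spectrum, d_hidden)
        # Projections into a shared space for additive attention.
        self.audio_proj = nn.Linear(d_hidden, d_hidden)
        self.word_proj = nn.Linear(d_word, d_hidden)
        self.score = nn.Linear(d_hidden, 1)

    def forward(self, acoustic, spectrum, words):
        # acoustic: (B, d_acoustic); spectrum: (B, d_spectrum)
        # words:    (B, T, d_word) lyric word embeddings
        audio = torch.tanh(self.fuse(torch.cat([acoustic, spectrum], dim=-1)))
        q = self.audio_proj(audio).unsqueeze(1)              # (B, 1, H)
        k = self.word_proj(words)                            # (B, T, H)
        scores = self.score(torch.tanh(q + k)).squeeze(-1)   # (B, T)
        weights = F.softmax(scores, dim=-1)                  # per-word attention
        # Weighted sum of word embeddings, guided by the audio sentiment cues.
        lyric = torch.bmm(weights.unsqueeze(1), words).squeeze(1)  # (B, d_word)
        return lyric, audio, weights

A downstream classifier would combine the attended lyric vector with the strengthened audio vector to predict the sentiment label; that final fusion step is likewise an assumption here, not taken from the paper.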