With the development of the Internet and related technology, online music platforms and music streaming services are flourishing. Information overload caused by the abundance of digital music has become a common problem for many users. Social tags have been discussed as a helpful resource for music recommendation. However, label sparsity and the cold-start problem, both commonly observed with social tags, limit their effectiveness in supporting a recommendation system. A music autotagging system therefore becomes an alternative solution for supplementing the shortage of tags. Most prior studies on automatic labeling used only audio data for their analysis. However, some studies have suggested that lyrics give a music classification system additional information and improve its overall accuracy. In addition to lyrics, audio data remain an important resource for finding music features. This paper therefore proposes a music autotagging system that relies on both audio and lyrics to address the above problems. Owing to the development of deep learning algorithms in recent years, many scholars have effectively used neural networks to extract audio and textual features. Some of them have also considered the structure of lyrics when extracting features, which consequently improves classification. For lyric feature extraction, this study employs two types of deep learning models: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The feature extraction architecture is mainly motivated and characterized by the structure of lyrics. In addition, a multitask learning method is adopted to learn correlations between tags. The experiments support that a multitask learning classifier combining audio and lyric information performs better than the single-task classification methods of previous studies that use only audio data.
All Science Journal Classification (ASJC) codes
- Media Technology
- Hardware and Architecture
- Computer Networks and Communications