TY - GEN
T1 - Gradient Boost Tree Network based on Extensive Feature Analysis for Popularity Prediction of Social Posts
AU - Hsu, Chih Chung
AU - Lee, Chia Ming
AU - Hou, Xiu Yu
AU - Tsai, Chi Han
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/10/26
Y1 - 2023/10/26
N2 - Social media popularity (SMP) prediction is a complex task, affected by various features such as text, images, and spatial-temporal information. One major challenge in SMP is integrating features from multiple modalities without overemphasizing user-specific details while efficiently capturing relevant user information. This study introduces a robust multi-modality feature mining framework for predicting SMP scores by incorporating additional identity-related features sourced from the official SMP dataset when a user's path alias is accessible. Our preliminary analyses suggest that these supplemental features significantly enrich the user-related context, contributing to a substantial improvement in performance and indicating that non-identity features are relatively unimportant. This implies that we should focus more on discovering identity-related features than on other metadata. To further validate our findings, we perform comprehensive experiments investigating the relationship between those identity-related features and scores. Finally, LightGBM and TabNet are employed within our framework to effectively capture intricate semantic relationships among different modality features and user-specific data. Our experimental results confirm that these identity-related features, especially external ones, significantly improve the prediction performance of SMP tasks.
AB - Social media popularity (SMP) prediction is a complex task, affected by various features such as text, images, and spatial-temporal information. One major challenge in SMP is integrating features from multiple modalities without overemphasizing user-specific details while efficiently capturing relevant user information. This study introduces a robust multi-modality feature mining framework for predicting SMP scores by incorporating additional identity-related features sourced from the official SMP dataset when a user's path alias is accessible. Our preliminary analyses suggest that these supplemental features significantly enrich the user-related context, contributing to a substantial improvement in performance and indicating that non-identity features are relatively unimportant. This implies that we should focus more on discovering identity-related features than on other metadata. To further validate our findings, we perform comprehensive experiments investigating the relationship between those identity-related features and scores. Finally, LightGBM and TabNet are employed within our framework to effectively capture intricate semantic relationships among different modality features and user-specific data. Our experimental results confirm that these identity-related features, especially external ones, significantly improve the prediction performance of SMP tasks.
UR - http://www.scopus.com/inward/record.url?scp=85179557800&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85179557800&partnerID=8YFLogxK
U2 - 10.1145/3581783.3612843
DO - 10.1145/3581783.3612843
M3 - Conference contribution
AN - SCOPUS:85179557800
T3 - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
SP - 9451
EP - 9455
BT - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 31st ACM International Conference on Multimedia, MM 2023
Y2 - 29 October 2023 through 3 November 2023
ER -