TY - GEN
T1 - HCV
T2 - 31st International Conference on Multimedia Modeling, MMM 2025
AU - Chen, Liang Chia
AU - Chu, Wei Ta
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Visual object tracking is one of the most fundamental research topics in computer vision. Recent mainstream trackers prioritize accuracy, at the cost of long computation times and substantial computational resources. To address this challenge, we propose a lightweight single-object tracking model named HCV. Its feature fusion module provides an interface between a CNN and a Vision Transformer (ViT), so that the model can use the diverse hierarchical features of the CNN while acquiring global features through attention mechanisms. This approach maintains good tracking accuracy while reducing computational overhead and parameter count. We evaluate HCV on the UAV123, LaSOT, GOT-10k, and TrackingNet datasets and demonstrate the efficiency of this lightweight tracking model.
AB - Visual object tracking is one of the most fundamental research topics in computer vision. Recent mainstream trackers prioritize accuracy, at the cost of long computation times and substantial computational resources. To address this challenge, we propose a lightweight single-object tracking model named HCV. Its feature fusion module provides an interface between a CNN and a Vision Transformer (ViT), so that the model can use the diverse hierarchical features of the CNN while acquiring global features through attention mechanisms. This approach maintains good tracking accuracy while reducing computational overhead and parameter count. We evaluate HCV on the UAV123, LaSOT, GOT-10k, and TrackingNet datasets and demonstrate the efficiency of this lightweight tracking model.
UR - http://www.scopus.com/inward/record.url?scp=85215795717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85215795717&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-2061-6_4
DO - 10.1007/978-981-96-2061-6_4
M3 - Conference contribution
AN - SCOPUS:85215795717
SN - 9789819620609
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 45
EP - 59
BT - MultiMedia Modeling - 31st International Conference on Multimedia Modeling, MMM 2025, Proceedings
A2 - Ide, Ichiro
A2 - Kompatsiaris, Ioannis
A2 - Xu, Changsheng
A2 - Yanai, Keiji
A2 - Chu, Wei-Ta
A2 - Nitta, Naoko
A2 - Riegler, Michael
A2 - Yamasaki, Toshihiko
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 8 January 2025 through 10 January 2025
ER -