HCV: Lightweight Hybrid CNN-Vision Transformer for Visual Object Tracking

Liang Chia Chen, Wei Ta Chu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Visual object tracking is one of the most fundamental research problems in computer vision. Recent mainstream trackers prioritize accuracy, which leads to long computation times and substantial computational resource requirements. To address this challenge, we propose a lightweight single object tracking model named HCV. Through the proposed feature fusion module, we establish an interface between a CNN and a Vision Transformer (ViT), enabling the use of diverse hierarchical features from the CNN while simultaneously acquiring global features through attention mechanisms. This approach maintains good tracking accuracy while reducing computational overhead and the number of parameters. We evaluate this idea on the UAV123, LaSOT, GOT-10k, and TrackingNet datasets, and demonstrate the efficiency of this lightweight tracking model.

Original language: English
Title of host publication: MultiMedia Modeling - 31st International Conference on Multimedia Modeling, MMM 2025, Proceedings
Editors: Ichiro Ide, Ioannis Kompatsiaris, Changsheng Xu, Keiji Yanai, Wei-Ta Chu, Naoko Nitta, Michael Riegler, Toshihiko Yamasaki
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 45-59
Number of pages: 15
ISBN (Print): 9789819620609
Publication status: Published - 2025
Event: 31st International Conference on Multimedia Modeling, MMM 2025 - Nara, Japan
Duration: 2025 Jan 8 to 2025 Jan 10

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15521 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 31st International Conference on Multimedia Modeling, MMM 2025
Country/Territory: Japan
City: Nara
Period: 25-01-08 to 25-01-10

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
