TY - GEN
T1 - Swift Concurrent Semantic Segmentation and Object Detection on Edge Devices
AU - Hsu, Chih Chung
AU - Jiang, Yun Zhong
AU - Huang, Wei Hao
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - We propose a real-time network optimized for joint semantic segmentation and object detection on edge devices. Our architecture builds on the latest YOLO series network and incorporates lightweight segmentation sub-networks for multi-task learning. Specifically, we leverage layers two to four of the YOLO network, which contain substantial semantic information at varying resolutions, to segment objects of diverse sizes. We introduce the Parallel Aggregation Pyramid Pooling Module (PAPPM) to efficiently generate buffered semantic segmentation feature maps by utilizing single-point addition and residual learning. This approach reduces computational complexity and memory usage without compromising accuracy. We also propose a novel Progressively Iterative Learning (PIL) approach to learn the weights for the backbone, neck, and multi-task heads, respectively, without catastrophic forgetting. Our approach achieves state-of-the-art performance on benchmark datasets, demonstrating the effectiveness of our proposed techniques.
UR - http://www.scopus.com/inward/record.url?scp=85172289345&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85172289345&partnerID=8YFLogxK
U2 - 10.1109/ICMEW59549.2023.00013
DO - 10.1109/ICMEW59549.2023.00013
M3 - Conference contribution
AN - SCOPUS:85172289345
T3 - Proceedings - 2023 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2023
SP - 40
EP - 45
BT - Proceedings - 2023 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2023
Y2 - 10 July 2023 through 14 July 2023
ER -