TY - GEN
T1 - Exploring Feature Fusion from A Contrastive Multi-Modality Learner for Liver Cancer Diagnosis
AU - Chiang, Yang Fan
AU - Li, Pei Xuan
AU - Wu, Ding You
AU - Hsieh, Hsun Ping
AU - Ko, Ching Chung
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s).
PY - 2023/12/6
Y1 - 2023/12/6
N2 - Self-supervised contrastive learning has achieved promising results in computer vision and has recently received attention in the medical domain. In practice, medical data is hard to collect and even harder to annotate, but leveraging multi-modality medical images to compensate for small datasets has proved helpful. In this work, we focus on mining multi-modality Magnetic Resonance (MR) images to learn multi-modality contrastive representations. We first present multi-modality data augmentation (MDA) to adapt contrastive learning to multi-modality learning. Then, the proposed cross-modality group convolution (CGC) is used to process multi-modality features in the downstream fine-tuning task. Specifically, in the pre-training stage, considering that each MRI modality exhibits different behaviors over the same anatomic structure, yet without designing a handcrafted pretext task, we select two augmented MR images from the same patient as a positive pair and directly maximize the similarity between positive pairs using Simple Siamese networks. To further exploit the multi-modality representation, we combine 3D and 2D group convolution with a channel shuffle operation to efficiently incorporate image features from different modalities. We evaluate the proposed methods on liver MR images collected from a well-known hospital in Taiwan. Experiments show that our framework significantly improves over previous methods.
AB - Self-supervised contrastive learning has achieved promising results in computer vision and has recently received attention in the medical domain. In practice, medical data is hard to collect and even harder to annotate, but leveraging multi-modality medical images to compensate for small datasets has proved helpful. In this work, we focus on mining multi-modality Magnetic Resonance (MR) images to learn multi-modality contrastive representations. We first present multi-modality data augmentation (MDA) to adapt contrastive learning to multi-modality learning. Then, the proposed cross-modality group convolution (CGC) is used to process multi-modality features in the downstream fine-tuning task. Specifically, in the pre-training stage, considering that each MRI modality exhibits different behaviors over the same anatomic structure, yet without designing a handcrafted pretext task, we select two augmented MR images from the same patient as a positive pair and directly maximize the similarity between positive pairs using Simple Siamese networks. To further exploit the multi-modality representation, we combine 3D and 2D group convolution with a channel shuffle operation to efficiently incorporate image features from different modalities. We evaluate the proposed methods on liver MR images collected from a well-known hospital in Taiwan. Experiments show that our framework significantly improves over previous methods.
UR - http://www.scopus.com/inward/record.url?scp=85182940468&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85182940468&partnerID=8YFLogxK
U2 - 10.1145/3595916.3626383
DO - 10.1145/3595916.3626383
M3 - Conference contribution
AN - SCOPUS:85182940468
T3 - Proceedings of the 5th ACM International Conference on Multimedia in Asia, MMAsia 2023
BT - Proceedings of the 5th ACM International Conference on Multimedia in Asia, MMAsia 2023
PB - Association for Computing Machinery, Inc
T2 - 5th ACM International Conference on Multimedia in Asia, MMAsia 2023
Y2 - 6 December 2023 through 8 December 2023
ER -