Contrastive Learning for Unsupervised Sentence Embedding with False Negative Calibration

Chi Min Chiu, Ying Jia Lin, Hung Yu Kao

Research output: Conference contribution

Abstract

Contrastive learning, a transformative approach to unsupervised sentence embedding, works by increasing the similarity between positive samples and suppressing it among negative ones. However, a subtle issue in contrastive learning is the occurrence of false negatives: semantically similar samples that are treated as negatives, which degrades the semantics of the sentence embeddings. To address this, we propose a framework called FNC (False Negative Calibration) that alleviates the influence of false negatives. Our approach uses two strategies: false negative elimination and false negative reuse. Specifically, during training, our method eliminates false negatives by clustering and comparing semantic similarity. We then reuse the eliminated false negatives to construct new positive pairs that boost contrastive learning performance. Experiments on seven semantic textual similarity tasks demonstrate that our approach is more effective than competitive baselines.
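The elimination step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the flagging rule (same cluster and cosine similarity above a threshold), the function names, the precomputed `cluster_ids` input, and the default `temperature` and `threshold` values are all assumptions made for illustration. The sketch covers only elimination, where flagged pairs are dropped from the denominator of an InfoNCE-style loss; the reuse step is not shown.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def flag_false_negatives(sim, cluster_ids, threshold=0.9):
    """Flag in-batch negatives that look like false negatives.

    Assumption (not taken from the paper): pair (i, j) is flagged when both
    sentences share a cluster id AND their cosine similarity exceeds
    `threshold`. The diagonal holds the true positives and is never flagged.
    """
    same_cluster = cluster_ids[:, None] == cluster_ids[None, :]
    flagged = same_cluster & (sim > threshold)
    np.fill_diagonal(flagged, False)
    return flagged


def masked_info_nce(z1, z2, cluster_ids, temperature=0.05, threshold=0.9):
    """InfoNCE-style loss with flagged pairs removed from the denominator.

    z1, z2: (N, d) embeddings of two views of the same N sentences.
    Returns the mean loss and the boolean (N, N) mask of eliminated pairs.
    """
    sim = cosine_similarity_matrix(z1, z2)
    flagged = flag_false_negatives(sim, cluster_ids, threshold)
    logits = sim / temperature
    # Eliminated pairs get -inf, so they contribute exp(-inf) = 0.
    logits = np.where(flagged, -np.inf, logits)
    # Numerically stable log-sum-exp over each row; the diagonal is always
    # finite, so every row maximum is finite.
    row_max = logits.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))
    log_prob_positive = np.diag(logits) - log_denominator
    return -log_prob_positive.mean(), flagged
```

Calling `masked_info_nce(z1, z2, cluster_ids)` on a batch that contains near-duplicate sentences should give a lower loss than the unmasked variant, because the near-duplicates no longer compete with the true positive in the denominator. In practice the cluster ids would come from whatever clustering step is used during training, which this sketch leaves open.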

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024, Proceedings
Editors: De-Nian Yang, Xing Xie, Vincent S. Tseng, Jian Pei, Jen-Wei Huang, Jerry Chun-Wei Lin
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 290-301
Number of pages: 12
ISBN (Print): 9789819722617
Publication status: Published - 2024
Event: 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024 - Taipei, Taiwan
Duration: 7 May 2024 to 10 May 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14647 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024
Country/Territory: Taiwan
City: Taipei
Period: 7 May 2024 to 10 May 2024

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
