Contrastive Learning for Unsupervised Sentence Embedding with False Negative Calibration

  • Chi Min Chiu
  • Ying Jia Lin
  • Hung Yu Kao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Contrastive learning, a transformative approach to unsupervised sentence embedding, works by increasing similarity between positive samples and suppressing it among negative ones. However, an often-overlooked issue in contrastive learning is the occurrence of false negatives: semantically similar samples that are treated as negatives, which harms the semantics of the sentence embeddings. To address this, we propose a framework called FNC (False Negative Calibration) to alleviate the influence of false negatives. Our approach uses two strategies: false negative elimination and false negative reuse. Specifically, during training, our method eliminates false negatives by clustering and comparing semantic similarity. Next, we reuse the eliminated false negatives to construct new positive pairs to boost contrastive learning performance. Our experiments on seven semantic textual similarity tasks demonstrate that our approach is more effective than competitive baselines.
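
The abstract does not say which clustering algorithm, similarity threshold, or loss formulation FNC uses, so the following is only a minimal NumPy sketch of the general idea of eliminating and reusing false negatives in an in-batch contrastive loss, not the authors' implementation. The function name `calibrated_info_nce`, the parameters `sim_threshold` and `temperature`, and the assumption that cluster assignments arrive as a precomputed `cluster_ids` array are all hypothetical choices made for illustration.

```python
import numpy as np

def calibrated_info_nce(z1, z2, cluster_ids, sim_threshold=0.9,
                        temperature=0.05, reuse=True):
    """Illustrative in-batch contrastive loss with false-negative handling.

    z1, z2      : (n, d) arrays, two views of the same n sentences.
    cluster_ids : (n,) integer array of (assumed, precomputed) cluster labels.
    Returns (loss, false_neg_mask).
    """
    # Cosine similarity between every sentence in view 1 and view 2.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T
    n = sim.shape[0]
    eye = np.eye(n, dtype=bool)

    # Assumed rule: an off-diagonal pair is a false negative when both
    # sentences share a cluster AND their similarity exceeds the threshold.
    same_cluster = cluster_ids[:, None] == cluster_ids[None, :]
    false_neg = same_cluster & (sim > sim_threshold) & ~eye

    logits = sim / temperature
    if reuse:
        # Reuse: detected false negatives become extra positives.
        positives = eye | false_neg
    else:
        # Elimination only: drop false negatives from the denominator.
        positives = eye
        logits = np.where(false_neg, -np.inf, logits)

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average log-probability over each row's positives.
    per_row = np.where(positives, log_prob, 0.0).sum(axis=1) / positives.sum(axis=1)
    return -per_row.mean(), false_neg
```

With `reuse=False` the sketch only removes the detected pairs from the set of negatives; with `reuse=True` it additionally pulls them together as positives, in the style of a multi-positive contrastive loss. Setting `sim_threshold` above 1 disables detection and recovers a plain in-batch InfoNCE loss.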

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024, Proceedings
Editors: De-Nian Yang, Xing Xie, Vincent S. Tseng, Jian Pei, Jen-Wei Huang, Jerry Chun-Wei Lin
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 290-301
Number of pages: 12
ISBN (Print): 9789819722617
Publication status: Published - 2024
Event: 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024 - Taipei, Taiwan
Duration: 2024 May 7 to 2024 May 10

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14647 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024
Country/Territory: Taiwan
City: Taipei
Period: 24-05-07 to 24-05-10

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
