Pseudo Triplet Networks for Classification Tasks with Cross-Source Feature Incompleteness

Cayon Liow, Cheng Te Li, Chun Pai Yang, Shou De Lin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Cross-source feature incompleteness, a scenario where certain features are available in one data source but missing in another, is a common and significant challenge in machine learning. It typically arises when the training data and testing data are collected from different sources with distinct feature sets. Addressing this challenge can greatly improve the utility of valuable datasets that might otherwise be considered incomplete, and can enhance model performance. This paper introduces the Pseudo Triplet Network (PTN) to address cross-source feature incompleteness. PTN fuses two Siamese network architectures: Triplet Networks and Pseudo Networks. By segregating data into instance, positive, and negative subsets, PTN facilitates effective contrastive learning through a hybrid loss function design. The model was evaluated on six benchmark datasets from the UCI Repository, in comparison with five other methods for managing missing data, under a range of feature overlap and missing data scenarios. PTN consistently exhibited superior performance, remaining resilient at high missing ratios and stable across the data scenarios tested.
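
For readers unfamiliar with the triplet setup mentioned in the abstract, the following is a minimal sketch of the standard triplet margin loss over (instance, positive, negative) embeddings. It is not the authors' hybrid loss, which the abstract does not specify; the function name `triplet_margin_loss`, the Euclidean distance, and the `margin` default are illustrative assumptions.

```python
import math

def triplet_margin_loss(anchors, positives, negatives, margin=1.0):
    """Generic triplet margin loss (an illustration, not PTN's hybrid loss).

    Each argument is a list of equal-length embedding vectors. For every
    triplet, the loss is max(d(a, p) - d(a, n) + margin, 0), which is zero
    once the negative is at least `margin` farther away than the positive.
    """
    losses = []
    for a, p, n in zip(anchors, positives, negatives):
        d_pos = math.dist(a, p)  # distance from instance to same-class sample
        d_neg = math.dist(a, n)  # distance from instance to other-class sample
        losses.append(max(d_pos - d_neg + margin, 0.0))
    return sum(losses) / len(losses)

# Hypothetical 2-D embeddings, chosen only to exercise both branches.
anchors = [[0.0, 0.0], [1.0, 1.0]]
positives = [[0.0, 1.0], [1.0, 1.0]]
negatives = [[3.0, 4.0], [1.0, 1.5]]
loss = triplet_margin_loss(anchors, positives, negatives)
```

In the first triplet the negative is already well separated, so it contributes nothing; in the second the negative lies within the margin and is penalized.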

Original language: English
Title of host publication: CIKM 2023 - Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 4079-4083
Number of pages: 5
ISBN (Electronic): 9798400701245
Publication status: Published - 2023 Oct 21
Event: 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023 - Birmingham, United Kingdom
Duration: 2023 Oct 21 to 2023 Oct 25

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023
Country/Territory: United Kingdom
City: Birmingham
Period: 23-10-21 to 23-10-25

All Science Journal Classification (ASJC) codes

  • General Business, Management and Accounting
  • General Decision Sciences
