Multi-label image recognition by using semantics consistency, object correlation, and multiple samples

Wei Ta Chu, Si Heng Huang

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

An image can be annotated from a local perspective, based on the objects visually present in it, or from a global perspective, based on the implicit emotions or meanings derived from it. We propose three ideas that have received relatively little attention. First, the semantics of an image remain the same even if the image undergoes geometric transformations. Second, object correlation is important in image labelling, and we propose using a standard recurrent neural network that takes object sequences in random order. Third, we observe that an entity can be represented by multiple image samples, and that these samples can be considered jointly to improve recognition performance. These three ideas are implemented in a network that jointly considers global and local information. Through comprehensive evaluation, we verify that a simple network built on these ideas is effective and achieves performance competitive with the state of the art.
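The three ideas in the abstract can be illustrated with a small, library-free sketch. It is not the authors' network. The abstract does not give the loss, the sequence model, or the fusion rule, so the mean-squared consistency loss, the plain shuffle, and the score averaging below are all assumptions, and every function name (`hflip`, `consistency_loss`, `shuffled_object_sequence`, `fuse_samples`, `toy_model`) is hypothetical.

```python
import random


def hflip(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [row[::-1] for row in image]


def consistency_loss(scores_a, scores_b):
    """Mean squared difference between two label-score vectors.

    Assumption: one simple way to penalise predictions that change
    when the input image is geometrically transformed.
    """
    return sum((a - b) ** 2 for a, b in zip(scores_a, scores_b)) / len(scores_a)


def shuffled_object_sequence(objects, rng):
    """Return the detected objects in a random order (input is not modified).

    Assumption: a sequence like this would be fed to a recurrent model.
    """
    sequence = list(objects)
    rng.shuffle(sequence)
    return sequence


def fuse_samples(score_vectors):
    """Average per-label scores over several samples of the same entity.

    Assumption: averaging stands in for whatever joint treatment
    of multiple samples the paper actually uses.
    """
    count = len(score_vectors)
    return [sum(column) / count for column in zip(*score_vectors)]


def toy_model(image):
    """Hypothetical stand-in for a network: two flip-invariant label scores."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


if __name__ == "__main__":
    image = [[0.0, 1.0], [0.5, 0.25]]
    # Loss between predictions for the image and its horizontal flip.
    print(consistency_loss(toy_model(image), toy_model(hflip(image))))
    # The same objects, presented in a random order.
    print(shuffled_object_sequence(["person", "dog", "ball"], random.Random(0)))
    # Scores from two samples of one entity, fused by averaging.
    print(fuse_samples([[0.25, 0.5], [0.75, 1.0]]))
```

Because `toy_model` depends only on the set of pixel values, its scores do not change under a flip and the consistency loss is zero. A real network would not be invariant by construction, which is why a term like this would be added to the training objective.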

Original language: English
Article number: 103067
Journal: Journal of Visual Communication and Image Representation
Volume: 77
Publication status: Published - May 2021

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Media Technology
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
