Language-driven Diversified Image Retargeting

Rui Wang, Nisha Huang, Fan Tang, Weiming Dong, Tong Yee Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Content-aware image resizing can automatically retarget an image to different aspect ratios while preserving visually salient content. However, it is difficult for users to interact with the retargeting process and control the results. In this paper, we propose a language-driven diversified image retargeting (LDIR) method that lets users control the retargeting process by providing additional textual descriptions. Taking the original image and user-provided text as inputs, LDIR retargets the image to the desired resolution while preserving the content indicated by the text. Following a self-play reinforcement learning pipeline, we propose a multimodal reward function that considers both visual quality and language guidance. Preliminary experiments show that LDIR can achieve diversified image retargeting guided by text.

Original language: English
Title of host publication: Proceedings - SIGGRAPH Asia 2022 Posters
Editors: Stephen N. Spencer
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450394628
Publication status: Published - 2022 Dec 26
Event: SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022 - Daegu, Korea, Republic of
Duration: 2022 Dec 6 to 2022 Dec 9

Publication series

Name: Proceedings - SIGGRAPH Asia 2022 Posters

Conference

Conference: SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022
Country/Territory: Korea, Republic of
City: Daegu
Period: 22-12-06 to 22-12-09

All Science Journal Classification (ASJC) codes

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
