TY - GEN
T1 - Language-driven Diversified Image Retargeting
AU - Wang, Rui
AU - Huang, Nisha
AU - Tang, Fan
AU - Dong, Weiming
AU - Lee, Tong Yee
N1 - Funding Information:
This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 62102162 and 61832016, by the Ministry of Science and Technology, Taiwan (No. 111-2221-E-006-112-MY3), and by the Open Projects Program of NLPR.
Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/12/26
Y1 - 2022/12/26
AB - Content-aware image resizing can automatically retarget an image to different aspect ratios while preserving visually salient content. However, it is difficult for users to interact with the retargeting process and control the results. In this paper, we propose a language-driven diversified image retargeting (LDIR) method that allows users to control the retargeting process by providing additional textual descriptions. Taking the original image and user-provided text as inputs, LDIR retargets the image to the desired resolution while preserving the content indicated by the text. Following a self-play reinforcement learning pipeline, we propose a multimodal reward function that considers both visual quality and language guidance. Preliminary experiments show that LDIR can achieve diversified image retargeting guided by text.
UR - http://www.scopus.com/inward/record.url?scp=85145575593&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145575593&partnerID=8YFLogxK
U2 - 10.1145/3550082.3564169
DO - 10.1145/3550082.3564169
M3 - Conference contribution
AN - SCOPUS:85145575593
T3 - Proceedings - SIGGRAPH Asia 2022 Posters
BT - Proceedings - SIGGRAPH Asia 2022 Posters
A2 - Spencer, Stephen N.
PB - Association for Computing Machinery, Inc
T2 - SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022
Y2 - 6 December 2022 through 9 December 2022
ER -