Language-driven Diversified Image Retargeting

Rui Wang, Nisha Huang, Fan Tang, Weiming Dong, Tong Yee Lee

Research output: Conference contribution


Content-aware image resizing can automatically retarget an image to different aspect ratios while preserving visually salient content. However, it is difficult for users to interact with the retargeting process and control the results. In this paper, we propose a language-driven diversified image retargeting (LDIR) method that allows users to control the retargeting process by providing additional textual descriptions. Taking the original image and user-provided text as inputs, LDIR retargets the image to the desired resolution while preserving the content indicated by the text. Following a self-play reinforcement learning pipeline, we propose a multimodal reward function that considers both visual quality and language guidance. Preliminary experiments show that LDIR can achieve diversified image retargeting guided by text.

Title of host publication: Proceedings - SIGGRAPH Asia 2022 Posters
Editor: Stephen N. Spencer
Publisher: Association for Computing Machinery, Inc
Publication status: Published - Dec 26, 2022
Event: SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022 - Daegu, Korea, Republic of
Duration: Dec 6, 2022 to Dec 9, 2022


Name: Proceedings - SIGGRAPH Asia 2022 Posters


Conference: SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022
Country/Territory: Korea, Republic of

All Science Journal Classification (ASJC) codes

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition

