Responsible Visual Editing
Minheng Ni, Yeli Shen, Lei Zhang*, Wangmeng Zuo*
Abstract
"With the recent advancements in visual synthesis, there is a growing risk of encountering synthesized images with detrimental effects, such as hate, discrimination, and privacy violations. Unfortunately, it remains unexplored on how to avoid synthesizing harmful images and convert them into responsible ones. In this paper, we present responsible visual editing, which edits risky concepts within an image to more responsible ones with minimal content changes. However, the concepts that need to be edited are often abstract, making them hard to be located and edited. To tackle these challenges, we propose a Cognitive Editor (CoEditor) by harnessing the large multimodal models through a two-stage cognitive process: (1) a perceptual cognitive process to locate what to be edited and (2) a behavioral cognitive process to strategize how to edit. To mitigate the negative implications of harmful images on research, we build a transparent and public dataset, namely AltBear, which expresses harmful information using teddy bears instead of humans. Experiments demonstrate that CoEditor can effectively comprehend abstract concepts in complex scenes, significantly surpassing the baseline models for responsible visual editing. Moreover, we find that the AltBear dataset corresponds well to the harmful content found in real images, providing a safe and effective benchmark for future research. Our source code and dataset can be found at https://github.com/kodenii/ Responsible-Visual-Editing."
Related Material
[pdf]
[supplementary material]
[DOI]