Enhancing Tampered Text Detection through Frequency Feature Fusion and Decomposition
Zhongxi Chen, Shen Chen, Taiping Yao*, Ke Sun, Shouhong Ding, Xianming Lin*, Liujuan Cao, Rongrong Ji
Abstract
Document image tampering poses a grave risk to the veracity of information, with potential consequences ranging from the spread of misinformation to financial and identity fraud. Current detection methods use frequency information to uncover tampering that is invisible to the naked eye. However, these methods often fail to integrate this information effectively, which compromises their RGB detection capabilities and causes them to miss the high-frequency details needed to detect subtle tampering. To address these gaps, we introduce a Feature Fusion and Decomposition Network (FFDN) that combines a Visual Enhancement Module (VEM) with a Wavelet-like Frequency Enhancement (WFE). Specifically, the VEM uses zero-initialized convolutions to make tampering traces visible while preserving the integrity of the original RGB features. Meanwhile, the WFE decomposes the features to explicitly retain high-frequency details that are often overlooked during downsampling, focusing on small but critical tampering clues. Experiments on the DocTamper dataset show that FFDN significantly outperforms existing state-of-the-art methods in detecting tampering.
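The two ideas named in the abstract can be illustrated with a small toy example. The sketch below is not the FFDN implementation: it assumes a scalar weight as a stand-in for a zero-initialized convolution, and a single-level Haar transform as a stand-in for the wavelet-like decomposition. All function and variable names (`zero_init_fuse`, `haar_decompose`, `haar_reconstruct`, `image`) are hypothetical and chosen only for illustration.

```python
# Toy sketch (assumption): pure-Python stand-ins for two ideas in the abstract.
# This is NOT the authors' FFDN code; names and shapes are hypothetical.

def zero_init_fuse(rgb_feat, freq_feat, weight=0.0):
    """Residual fusion rgb + weight * freq on 2D grids.

    A scalar `weight` stands in for a zero-initialized convolution: while it
    is still 0.0, the fused output equals the RGB features.
    """
    return [[r + weight * f for r, f in zip(rgb_row, freq_row)]
            for rgb_row, freq_row in zip(rgb_feat, freq_feat)]


def haar_decompose(x):
    """One level of a 2D Haar-style transform on an even-sized grid.

    Returns (ll, col_diff, row_diff, diag), each half the input size.
    `ll` is the 2x2 block average, i.e. what plain average-pool
    downsampling would keep; the other three bands hold the detail.
    """
    ll, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, len(x), 2):
        r_ll, r_cd, r_rd, r_dg = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 4)   # low-frequency average
            r_cd.append((a - b + c - d) / 4)   # difference across columns
            r_rd.append((a + b - c - d) / 4)   # difference across rows
            r_dg.append((a - b - c + d) / 4)   # diagonal difference
        ll.append(r_ll)
        col_diff.append(r_cd)
        row_diff.append(r_rd)
        diag.append(r_dg)
    return ll, col_diff, row_diff, diag


def haar_reconstruct(ll, col_diff, row_diff, diag):
    """Invert haar_decompose exactly from its four bands."""
    out = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, h, v, g = ll[i][j], col_diff[i][j], row_diff[i][j], diag[i][j]
            top += [s + h + v + g, s - h + v - g]
            bottom += [s + h - v - g, s - h - v + g]
        out += [top, bottom]
    return out


# A flat 4x4 "document" patch with a subtle edit in the top-left 2x2 block:
# two neighbouring pixels move by +2 and -2, so the block mean is unchanged.
image = [[10.0] * 4 for _ in range(4)]
image[0][0], image[0][1] = 12.0, 8.0

ll, col_diff, row_diff, diag = haar_decompose(image)

# The averaged (downsampled) band is perfectly flat, so the edit is hidden.
edit_hidden_in_ll = all(v == 10.0 for row in ll for v in row)
# The detail band still carries the edit.
edit_visible_in_detail = col_diff[0][0] != 0.0

# With the fusion weight still at zero, the RGB features pass through intact.
fused_at_init = zero_init_fuse(image, haar_reconstruct(ll, col_diff, row_diff, diag))
```

In this toy case the averaged band alone cannot reveal the edit, while the detail bands can, which is the motivation for keeping high-frequency components explicitly instead of discarding them during downsampling. The zero-weight fusion likewise shows why a branch initialized to zero leaves the original features untouched at the start of training.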