SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution

Mingjun Zheng, Long Sun, Jiangxin Dong, Jinshan Pan*

Abstract


Transformer-based restoration methods achieve significant performance because the self-attention (SA) of the Transformer can explore non-local information for better high-resolution image reconstruction. However, the key dot-product SA requires substantial computational resources, which limits its application on low-power devices. Moreover, the low-pass nature of the SA mechanism limits its capacity to capture local details, leading to overly smooth reconstruction results. To address these issues, we propose a self-modulation feature aggregation (SMFA) module that collaboratively exploits both local and non-local feature interactions for more accurate reconstruction. Specifically, the SMFA module employs an efficient approximation of self-attention (EASA) branch to model non-local information and a local detail estimation (LDE) branch to capture local details. In addition, we introduce a partial convolution-based feed-forward network (PCFN) to refine the representative features derived from the SMFA. Extensive experiments show that the proposed SMFANet family achieves a better trade-off between reconstruction performance and computational efficiency on public benchmark datasets. In particular, compared with ×4 SwinIR-light, SMFANet+ achieves 0.14 dB higher average performance on five public test sets and runs about 10× faster, with only about 43% of the model complexity (e.g., FLOPs). Our source code and pre-trained models are available at https://github.com/Zheng-MJ/SMFANet.
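The abstract names three components: an EASA branch for non-local context, an LDE branch for local details, and a PCFN for feature refinement. As a rough illustration only, the following is a minimal PyTorch-style sketch of that two-branch-plus-FFN layout. Every concrete choice here (kernel sizes, the variance-based self-modulation in EASA, the channel-split ratio in PCFN, the residual wiring) is our assumption, not the authors' implementation; the real code is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EASA(nn.Module):
    """Efficient approximation of self-attention (assumed form):
    gathers coarse non-local context on a downsampled copy and
    self-modulates it with a per-channel variance statistic."""
    def __init__(self, dim: int):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        # cheap non-local context from a 2x-downsampled copy (assumption)
        y = F.adaptive_avg_pool2d(x, (max(h // 2, 1), max(w // 2, 1)))
        y = F.interpolate(self.dw(y), size=(h, w), mode="nearest")
        # self-modulation: rescale by spatial variance per channel (assumption)
        var = x.var(dim=(-2, -1), keepdim=True)
        return self.proj(y * var)


class LDE(nn.Module):
    """Local detail estimation branch: depth-wise conv + point-wise mix (assumed)."""
    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1, groups=dim),
            nn.GELU(),
            nn.Conv2d(dim, dim, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


class PCFN(nn.Module):
    """Partial convolution-based FFN: convolve only a fraction of the
    channels (split ratio is an assumption), then mix with a 1x1 conv."""
    def __init__(self, dim: int, split: float = 0.25):
        super().__init__()
        self.part = max(int(dim * split), 1)
        self.conv = nn.Conv2d(self.part, self.part, 3, padding=1)
        self.mix = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = torch.split(x, [self.part, x.shape[1] - self.part], dim=1)
        return self.mix(torch.cat([self.conv(a), b], dim=1))


class SMFABlock(nn.Module):
    """One block: EASA and LDE outputs aggregated, then refined by PCFN,
    each stage wrapped in a residual connection (wiring is assumed)."""
    def __init__(self, dim: int):
        super().__init__()
        self.easa = EASA(dim)
        self.lde = LDE(dim)
        self.pcfn = PCFN(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.easa(x) + self.lde(x)  # aggregate non-local + local paths
        return x + self.pcfn(x)             # refine with the partial-conv FFN
```

Note the efficiency argument this sketch embodies: EASA works on a pooled copy of the feature map (quadratic dot-product attention is replaced by pooling, a depth-wise conv, and a broadcast multiply), while PCFN convolves only a subset of channels, so both cost far less than standard SA and a full FFN.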
