Yan, X. and Shan, T. and Qin, H. and Akhtar, N. and Liu, Y. and Rahmani, H. and Mian, A. (2026) Prompt-guided selective frequency network for real-world scene text image super-resolution. Pattern Recognition, 175: 112974. ISSN 0031-3203
Abstract
Real-world scene text image super-resolution is challenging due to complex writing strokes, random text distribution, and diverse scene degradations. Existing text super-resolution methods focus on pure text images or fixed-size single-line text, which limits their practical utility. To address this, we propose a Prompt-Guided Selective Frequency super-resolution Network (PGSFNet). Our unique bicephalous neural model comprises a super-resolution branch and a prompt guidance branch. The latter specifically helps in leveraging text content-aware information priors. To that end, we propose a Text Information Enhancement module. To exploit selective frequency information present in the image, PGSFNet employs our proposed Adaptive Frequency Modulator fused with multi-attention structures. Considering the criticality of text edges in our task, we also propose a tailored text edge perception loss. Extensive experiments on standard open real-world scene text image datasets demonstrate the remarkable performance of our method, achieving up to an 8.75% PSNR gain for ×2 and a 2.28% SSIM gain for ×4 super-resolution on the Real-CE dataset. Our code will be made public at https://github.com/holastq/PGSFNet.
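The abstract does not reproduce the definition of the text edge perception loss; as one illustrative reading, an edge-focused loss can compare gradient maps of the super-resolved and ground-truth images. The following minimal sketch (Sobel edge maps compared with an L1 penalty; the class name EdgePerceptionLoss and every detail are assumptions for illustration, not the authors' implementation) shows the general idea:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EdgePerceptionLoss(nn.Module):
        """Illustrative edge-perception loss: L1 distance between Sobel edge
        maps of the super-resolved output and the high-resolution target."""
        def __init__(self):
            super().__init__()
            kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
            ky = kx.t()
            # Register Sobel kernels as buffers so they move with the module's device.
            self.register_buffer("kx", kx.view(1, 1, 3, 3))
            self.register_buffer("ky", ky.view(1, 1, 3, 3))

        def edges(self, img):
            # Reduce to a single luminance channel, then apply Sobel filters.
            gray = img.mean(dim=1, keepdim=True)
            gx = F.conv2d(gray, self.kx, padding=1)
            gy = F.conv2d(gray, self.ky, padding=1)
            return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

        def forward(self, sr, hr):
            return F.l1_loss(self.edges(sr), self.edges(hr))

    if __name__ == "__main__":
        loss_fn = EdgePerceptionLoss()
        sr = torch.rand(2, 3, 64, 64)   # super-resolved output
        hr = torch.rand(2, 3, 64, 64)   # high-resolution ground truth
        print(loss_fn(sr, hr).item())

In practice such a term would be added, with a weighting factor, to the standard pixel-wise reconstruction loss; the actual formulation used by PGSFNet is given in the full paper.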