Song, X., Jin, X., Qi, J. and Liu, J. (2026) Dual alignment: Partial negative and soft-label alignment for text-to-image person retrieval. Information Fusion, 127: 103644. ISSN 1566-2535
TIReID_InformationFusion_1_.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
Text-to-image person retrieval is the task of retrieving the correctly matched images given a textual description of the person of interest. The main challenge lies in the inherent modal difference between texts and images. Most existing works narrow the modality gap by aligning the feature representations of text and image in a latent embedding space. However, these methods usually rely on hard labels and mine insufficient or incorrect hard negatives to achieve cross-modal alignment, producing incorrect hard negative pairs and thus suboptimal performance. To tackle these problems, we propose a dual alignment framework, Partial negative and Soft-label Alignment (PASA), which includes a partial negative alignment (PA) strategy and a soft-label alignment (SA) strategy. Specifically, PA pushes hard negatives farther away in the triplet loss by treating a certain number of negatives within each mini-batch as hard negatives, preventing them from distracting the positive text–image pairs. Building on PA, SA further aligns the similarity distributions over these hard negatives via soft labels, as well as aligning inter-modal and intra-modal distributions. Extensive experiments on three public datasets, CUHK-PEDES, ICFG-PEDES and RSTPReid, demonstrate that our proposed PASA method consistently improves the performance of text-to-image person retrieval and achieves new state-of-the-art results on all three datasets.
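To make the two strategies concrete, the following is a minimal, hypothetical numpy sketch of the idea described in the abstract: a triplet-style loss computed only over the k hardest in-batch negatives (PA), and a symmetric soft-label alignment between the text-to-image and image-to-text similarity distributions (SA). The function name, the hyper-parameters `k`, `margin`, and `tau`, and the use of symmetric KL divergence are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def pasa_losses(sim, k=4, margin=0.2, tau=0.1):
    """Illustrative sketch (not the authors' implementation).
    sim: (B, B) text-to-image similarity matrix for a mini-batch,
    where sim[i, i] is the matched (positive) pair."""
    pos = np.diag(sim)                       # positive-pair similarities
    neg = sim.copy()
    np.fill_diagonal(neg, -np.inf)           # mask out the positives
    # PA: keep only the k hardest (highest-similarity) negatives per query
    hard = np.sort(neg, axis=1)[:, -k:]      # (B, k)
    # Triplet-style hinge pushing hard negatives below the positive by a margin
    pa_loss = np.maximum(0.0, margin + hard - pos[:, None]).mean()

    # SA: soft-label alignment between the two directional distributions
    def softmax(x):
        e = np.exp((x - x.max(axis=1, keepdims=True)) / tau)
        return e / e.sum(axis=1, keepdims=True)

    p_t2i = softmax(np.where(np.isfinite(neg), neg, -1e9))
    p_i2t = softmax(np.where(np.isfinite(neg.T), neg.T, -1e9))
    # Symmetric KL divergence between the two soft similarity distributions
    kl = lambda p, q: (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=1)
    sa_loss = 0.5 * (kl(p_t2i, p_i2t) + kl(p_i2t, p_t2i)).mean()
    return pa_loss, sa_loss
```

Restricting the hinge to the top-k negatives is one plausible reading of "considering a certain amount of negatives within each mini-batch as hard negatives"; the paper's actual selection rule and alignment objective may differ.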