FewVS: A Vision-Semantics Integration Framework for Few-Shot Image Classification

Li, Zhuoling and Wang, Yong and Li, Kaitong (2024) FewVS: A Vision-Semantics Integration Framework for Few-Shot Image Classification. In: MM '24: Proceedings of the 32nd ACM International Conference on Multimedia. ACM, New York, pp. 1341-1350. ISBN 9798400706868

Full text not available from this repository.

Abstract

Some recent methods address few-shot image classification by extracting semantic information from class names and devising mechanisms for aligning vision and semantics to integrate information from both modalities. However, class names provide only limited information, which is insufficient to capture the visual details in images. As a result, such vision-semantics alignment is inherently biased, leading to suboptimal integration outcomes. In this paper, we avoid such biased vision-semantics alignment by introducing CLIP, a natural bridge between vision and semantics, and enforcing unbiased vision-vision alignment as a proxy task. Specifically, we align features encoded from the few-shot encoder and CLIP's vision encoder on the same image. This alignment is accomplished through a linear projection layer, with a training objective formulated using optimal transport-based assignment prediction. Thanks to the inherent alignment between CLIP's vision and text encoders, the few-shot encoder is indirectly aligned to CLIP's text encoder, which serves as the foundation for better vision-semantics integration. In addition, to further improve vision-semantics integration at the testing stage, we mine potential fine-grained semantic attributes of class names from large language models. Correspondingly, an online optimization module is designed to adaptively integrate the semantic attributes and visual information extracted from images. Extensive results on four datasets demonstrate that our method outperforms state-of-the-art methods. The code is available at https://github.com/zhuolingli/FewVS.
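The vision-vision alignment step in the abstract can be illustrated with a minimal sketch. It assumes a SwAV-style swapped prediction objective over a shared set of prototypes, with Sinkhorn normalization as the optimal transport step. The abstract does not give these details, so the prototype scores, the function names, and the values of `eps` and `temp` are all hypothetical and are not taken from the authors' code. Here `scores_a` stands for similarities between projected few-shot features and the prototypes, and `scores_b` for the same similarities computed from CLIP vision features of the same images.

```python
import math


def sinkhorn(scores, eps=0.05, n_iters=3):
    """Turn a B x K score matrix into soft assignments with balanced prototypes."""
    B, K = len(scores), len(scores[0])
    m = max(max(row) for row in scores)  # subtract the max for numerical stability
    Q = [[math.exp((s - m) / eps) for s in row] for row in scores]
    for _ in range(n_iters):
        # Each prototype (column) should carry total mass 1/K.
        col = [sum(Q[i][k] for i in range(B)) for k in range(K)]
        Q = [[Q[i][k] / (col[k] * K) for k in range(K)] for i in range(B)]
        # Each sample (row) should carry total mass 1/B.
        row = [sum(r) for r in Q]
        Q = [[q / (row[i] * B) for q in Q[i]] for i in range(B)]
    # Rescale so that every row is a probability distribution.
    return [[q * B for q in r] for r in Q]


def softmax(xs, temp=0.1):
    m = max(xs)
    e = [math.exp((x - m) / temp) for x in xs]
    z = sum(e)
    return [v / z for v in e]


def swapped_assignment_loss(scores_a, scores_b, temp=0.1):
    """Cross-entropy where each view predicts the other view's assignment."""
    q_a, q_b = sinkhorn(scores_a), sinkhorn(scores_b)
    B = len(scores_a)
    loss = 0.0
    for i in range(B):
        p_a = softmax(scores_a[i], temp)
        p_b = softmax(scores_b[i], temp)
        # Assumption: targets q_* are treated as constants (no gradient flows).
        loss -= sum(q * math.log(p + 1e-12) for q, p in zip(q_b[i], p_a))
        loss -= sum(q * math.log(p + 1e-12) for q, p in zip(q_a[i], p_b))
    return loss / (2 * B)


# Hypothetical prototype scores for two images and two prototypes.
few_shot_scores = [[0.9, 0.1], [0.2, 0.8]]
clip_scores_agree = [[0.9, 0.1], [0.2, 0.8]]
clip_scores_disagree = [[0.1, 0.9], [0.8, 0.2]]

loss_agree = swapped_assignment_loss(few_shot_scores, clip_scores_agree)
loss_disagree = swapped_assignment_loss(few_shot_scores, clip_scores_disagree)
```

In this toy setting the loss is lower when both encoders assign the same image to the same prototype, which is the behavior an alignment objective of this kind is meant to reward. In a real training loop only the projection layer and the few-shot encoder would presumably be updated, with CLIP kept frozen; that, too, is an assumption rather than something stated in the abstract.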

Item Type: Contribution in Book/Report/Proceedings
ID Code: 227295
Deposited By:
Deposited On: 01 Apr 2025 14:15
Refereed?: Yes
Published?: Published
Last Modified: 01 Apr 2025 14:15