Pose-Guided Multi-Cue Explicit Query Construction for Disambiguating Human-Object Interactions

Zou, Minghao and Liu, Shangkun and Zeng, Qingtian and Zhang, Xue and Yuan, Guiyuan and Hao, Xiaoshuai and Liu, Jun and Zhou, Wei (2026) Pose-Guided Multi-Cue Explicit Query Construction for Disambiguating Human-Object Interactions. IEEE Transactions on Circuits and Systems for Video Technology. ISSN 1051-8215

Text (paper)
paper.pdf - Accepted Version
Available under License Creative Commons Attribution.


Abstract

Human-Object Interaction (HOI) detection remains challenging due to the semantic ambiguity of interaction categories and the limited discriminability of their feature representations. Existing approaches often improve recognition by employing sophisticated models or auxiliary textual annotations. Although these solutions yield gains in some settings, they incur additional computational or annotation costs and struggle to capture intrinsic interaction regularities. To address these issues, we propose Pose-Guided Multi-Cue Explicit Query Construction (PM-EQC), a unified Transformer-based framework that builds upon collaborative modeling of appearance, spatial, and pose cues for discriminative interaction reasoning. At its core, the Collaborative Multi-Cue Query Constructor (CM-CQC) jointly models dependencies among visual cues to generate explicit query embeddings. CM-CQC further incorporates a hierarchical pose contextualization mechanism: global body configurations adaptively guide attention to local critical joints, yielding fine-grained pose embeddings and more precise interaction disambiguation. Owing to its modular design, PM-EQC integrates seamlessly with diverse backbones and benefits from their advances. Extensive experiments on the PhysLab, HICO-DET, and V-COCO datasets demonstrate that PM-EQC achieves state-of-the-art performance. The code is publicly available at https://github.com/ZMHSDUST/PM-EQC.
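
The abstract does not specify how CM-CQC is implemented, so the following is only a minimal illustrative sketch of the general idea it describes: a global pose summary attends over per-joint features, and the resulting pose embedding is fused with appearance and spatial cues into a single explicit query vector. Every name here (pose_guided_query, w_q, w_k, w_out), the mean-pooled global pose, the single-head dot-product attention, the concatenate-and-project fusion, and the choice of 17 joints and 8 feature dimensions are assumptions made for illustration. None of it is taken from the paper or its repository.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def pose_guided_query(appearance, spatial, joints, w_q, w_k, w_out):
    """Build one query vector for a human-object pair (hypothetical sketch).

    appearance: (d,)   appearance feature of the pair
    spatial:    (d,)   encoding of the box layout
    joints:     (J, d) per-joint pose features
    w_q, w_k:   (d, d) projection matrices (assumed, randomly set below)
    w_out:      (3d, d) fusion projection (assumed)
    """
    # Assumption: the global body configuration is the mean of joint features.
    global_pose = joints.mean(axis=0)
    q = global_pose @ w_q                      # (d,)  global query
    k = joints @ w_k                           # (J, d) per-joint keys
    scores = k @ q / np.sqrt(q.shape[0])       # (J,)  scaled dot products
    weights = softmax(scores)                  # attention over joints
    pose_embedding = weights @ joints          # (d,)  joint-weighted pose cue
    # Assumption: cues are fused by concatenation and a linear projection.
    fused = np.concatenate([appearance, spatial, pose_embedding])
    query = fused @ w_out                      # (d,)  explicit query embedding
    return query, weights


# Toy usage with random inputs (shapes only; values carry no meaning).
rng = np.random.default_rng(0)
d, J = 8, 17
query, weights = pose_guided_query(
    appearance=rng.normal(size=d),
    spatial=rng.normal(size=d),
    joints=rng.normal(size=(J, d)),
    w_q=rng.normal(size=(d, d)),
    w_k=rng.normal(size=(d, d)),
    w_out=rng.normal(size=(3 * d, d)),
)
print(query.shape, weights.shape)
```

In this sketch the attention weights form a distribution over joints, so the joints that align best with the global pose summary contribute most to the pose embedding. A real model would learn the projections and would likely use multi-head attention inside a Transformer decoder. The authors' repository is the authoritative source for the actual design.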

Item Type:
Journal Article
Journal or Publication Title:
IEEE Transactions on Circuits and Systems for Video Technology
Uncontrolled Keywords:
/dk/atira/pure/subjectarea/asjc/2200/2214
Subjects:
Media Technology; Electrical and Electronic Engineering
ID Code:
235991
Deposited By:
Deposited On:
11 Mar 2026 15:40
Refereed?:
Yes
Published?:
Published
Last Modified:
17 Mar 2026 00:11