Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks

Zhang, Wei Emma and Sheng, Quan Z. and Tang, Zhejun and Ruan, Wenjie (2018) Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks. In: SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. Association for Computing Machinery, Inc, USA, pp. 1153-1156. ISBN 9781450356572

Full text not available from this repository.


Many studies address automatic duplicate detection in Community Question Answering (CQA) systems and frame the task as a supervised learning problem over question pairs. However, these methods rely on handcrafted features, which makes it difficult to distinguish related questions from duplicate ones because the two are often textually similar. To tackle this issue, we propose a neural network architecture that extracts "deep" features to identify whether a question pair is duplicate or related. In particular, we construct question correlation matrices, which capture the word-wise similarities between questions. These matrices are the input to our proposed convolutional neural network (CNN), in which the convolution operation moves across both dimensions of the matrices. Empirical studies on a range of real-world CQA datasets confirm the effectiveness of the proposed correlation matrices and the CNN: our method outperforms state-of-the-art methods in classification performance.
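
As an illustration of the idea in the abstract, the following is a minimal sketch of a word-wise correlation matrix for a question pair and of a single filter sliding across both of its dimensions. It is not the authors' implementation: the abstract does not specify the similarity measure, the embeddings, or the network layout, so cosine similarity, the random toy embeddings in `vocab`, the 3x3 averaging filter, and the helper names `correlation_matrix` and `conv2d_valid` are all assumptions made for this example.

```python
import numpy as np


def correlation_matrix(q1_vecs, q2_vecs, eps=1e-8):
    """Cosine similarity between every word of q1 and every word of q2.

    Assumption: cosine similarity stands in for the paper's word-wise
    similarity, which the abstract does not define.
    """
    a = q1_vecs / (np.linalg.norm(q1_vecs, axis=1, keepdims=True) + eps)
    b = q2_vecs / (np.linalg.norm(q2_vecs, axis=1, keepdims=True) + eps)
    return a @ b.T  # shape: (len(q1), len(q2))


def conv2d_valid(matrix, kernel):
    """Slide one filter over both dimensions (stride 1, no padding)."""
    kh, kw = kernel.shape
    h, w = matrix.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(matrix[i:i + kh, j:j + kw] * kernel)
    return out


# Hypothetical toy embeddings: random vectors, NOT trained word vectors.
rng = np.random.default_rng(0)
words = "how do i sort order a list in python".split()
vocab = {w: rng.normal(size=16) for w in words}

q1 = "how do i sort a list in python".split()
q2 = "how do i order a list in python".split()

M = correlation_matrix(
    np.stack([vocab[w] for w in q1]),
    np.stack([vocab[w] for w in q2]),
)

# A fixed averaging filter; in a trained CNN the filter weights are learned.
kernel = np.full((3, 3), 1.0 / 9.0)
feature_map = conv2d_valid(M, kernel)
```

In this sketch `M[i, j]` holds the similarity between word `i` of the first question and word `j` of the second, so identical words produce entries close to 1. A real classifier would stack several learned filters, pooling, and a final layer that outputs the duplicate or related label.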

Item Type: Contribution in Book/Report/Proceedings
Deposited On: 22 Jun 2019 00:59
Last Modified: 27 Apr 2022 07:35