Luo, Bingfeng and Feng, Yansong and Wang, Zheng and Zhao, Dongyan (2016) Improving first order temporal fact extraction with unreliable data. In: Natural Language Understanding and Intelligent Applications: 5th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2016, and 24th International Conference on Computer Processing of Oriental Languages, ICCPOL 2016, Kunming, China. Lecture Notes in Computer Science. Springer, Cham, pp. 251-262. ISBN 9783319504957
Accepted version, available under a Creative Commons Attribution-NonCommercial license.
Abstract
In this paper, we address the task of extracting first order temporal facts from free text. This task is a subtask of relation extraction that aims to extract relations between entities and time expressions. Currently, the field of relation extraction focuses mainly on relations between entities. However, we observe that the multi-granular nature of time expressions lets us divide a dataset constructed by distant supervision into reliable and less reliable subsets, which helps improve extraction of relations between entities and time. We accordingly contribute the first dataset for the first order temporal fact extraction task, built using distant supervision. To fully utilize both the reliable and the less reliable data, we propose three methods: curriculum learning to rearrange the training procedure, label dropout to make the model more conservative about less reliable data, and instance attention to help the model distinguish important instances from unimportant ones. Experiments show that these methods help the model outperform both a model trained purely on the reliable subset and a model trained on the dataset with all subsets mixed together.
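As a rough illustration of the ideas named in the abstract, the sketch below splits distantly supervised instances by the granularity of their time expression, arranges the subsets into training stages, and applies a simple form of label dropout. Everything here is an assumption made for illustration and is not taken from the paper: the ISO-like time format, the choice to treat day-level expressions as the reliable subset, the stage order, the reading of label dropout as zeroing a one-hot label with probability `p`, and all function and field names.

```python
import random
import re


def granularity(time_expr):
    """Classify an ISO-like time string as 'day', 'month', or 'year' (assumed format)."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", time_expr):
        return "day"
    if re.fullmatch(r"\d{4}-\d{2}", time_expr):
        return "month"
    if re.fullmatch(r"\d{4}", time_expr):
        return "year"
    raise ValueError(f"unrecognized time expression: {time_expr!r}")


def split_by_reliability(instances):
    """Assumption: day-level time expressions form the reliable subset."""
    reliable, less_reliable = [], []
    for inst in instances:
        if granularity(inst["time"]) == "day":
            reliable.append(inst)
        else:
            less_reliable.append(inst)
    return reliable, less_reliable


def curriculum_stages(reliable, less_reliable):
    """Return training stages in order; this particular order is an assumption."""
    return [("less_reliable", less_reliable), ("reliable", reliable)]


def label_dropout(label, num_classes, p, rng):
    """One possible reading of label dropout: zero the one-hot target with probability p."""
    target = [0.0] * num_classes
    if rng.random() >= p:
        target[label] = 1.0
    return target


# Hypothetical toy data: entity, time expression, and relation label index.
instances = [
    {"entity": "A", "time": "1961-08-04", "label": 1},
    {"entity": "B", "time": "1961", "label": 1},
    {"entity": "C", "time": "2008-11", "label": 2},
]
reliable, less_reliable = split_by_reliability(instances)
stages = curriculum_stages(reliable, less_reliable)
rng = random.Random(0)
soft_targets = [label_dropout(i["label"], 3, 0.5, rng) for i in less_reliable]
```

In a sketch like this, label dropout would be applied only to the less reliable subset, so that noisy positive labels sometimes contribute no positive signal, while the reliable subset keeps its labels unchanged.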