Standard
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims. / Augenstein, Isabelle; Lioma, Christina; Wang, Dongsheng; Chaves Lima, Lucas; Hansen, Casper; Hansen, Christian; Simonsen, Jakob Grue.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, 2019. p. 4684-4697.
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Harvard
Augenstein, I, Lioma, C, Wang, D, Chaves Lima, L, Hansen, C, Hansen, C & Simonsen, JG 2019,
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims. in
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, pp. 4684-4697, 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 03/11/2019.
https://doi.org/10.18653/v1/D19-1475
APA
Augenstein, I., Lioma, C., Wang, D., Chaves Lima, L., Hansen, C., Hansen, C., & Simonsen, J. G. (2019).
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims. In
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 4684-4697). Association for Computational Linguistics.
https://doi.org/10.18653/v1/D19-1475
Vancouver
Augenstein I, Lioma C, Wang D, Chaves Lima L, Hansen C, Hansen C et al.
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics. 2019. p. 4684-4697
https://doi.org/10.18653/v1/D19-1475
Author
Augenstein, Isabelle; Lioma, Christina; Wang, Dongsheng; Chaves Lima, Lucas; Hansen, Casper; Hansen, Christian; Simonsen, Jakob Grue. / MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, 2019. pp. 4684-4697
Bibtex
@inproceedings{0472b943e7d84bee802ae916a3389b26,
title = "MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims",
abstract = "We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 26 fact checking websites in English, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for joint ranking of evidence pages and predicting veracity that outperforms all baselines. Significant performance increases are achieved by encoding evidence, and by modelling metadata. Our best-performing model achieves a Macro F1 of 49.2%, showing that this is a challenging testbed for claim veracity prediction.",
author = "Isabelle Augenstein and Christina Lioma and Dongsheng Wang and {Chaves Lima}, Lucas and Casper Hansen and Christian Hansen and Simonsen, {Jakob Grue}",
year = "2019",
doi = "10.18653/v1/D19-1475",
language = "English",
pages = "4684--4697",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
publisher = "Association for Computational Linguistics",
note = "2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) ; Conference date: 03-11-2019 Through 07-11-2019",
}
RIS
TY - GEN
T1 - MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims
T2 - 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
AU - Augenstein, Isabelle
AU - Lioma, Christina
AU - Wang, Dongsheng
AU - Chaves Lima, Lucas
AU - Hansen, Casper
AU - Hansen, Christian
AU - Simonsen, Jakob Grue
PY - 2019
Y1 - 2019
N2 - We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 26 fact checking websites in English, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for joint ranking of evidence pages and predicting veracity that outperforms all baselines. Significant performance increases are achieved by encoding evidence, and by modelling metadata. Our best-performing model achieves a Macro F1 of 49.2%, showing that this is a challenging testbed for claim veracity prediction.
AB - We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 26 fact checking websites in English, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for joint ranking of evidence pages and predicting veracity that outperforms all baselines. Significant performance increases are achieved by encoding evidence, and by modelling metadata. Our best-performing model achieves a Macro F1 of 49.2%, showing that this is a challenging testbed for claim veracity prediction.
U2 - 10.18653/v1/D19-1475
DO - 10.18653/v1/D19-1475
M3 - Article in proceedings
SP - 4684
EP - 4697
BT - Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
PB - Association for Computational Linguistics
Y2 - 3 November 2019 through 7 November 2019
ER -