
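Abstract
Identifying the veracity of a news article is an interesting problem, and automating that process is a challenging task. Detecting whether a news article is fake remains an open question, because it depends on many factors that current state-of-the-art models fail to incorporate. In this paper, we explore a subtask of fake news identification: stance detection. Given a news article, the task is to determine how the body relates to its claim (the headline). We present a novel approach that combines neural, statistical, and external features to provide an efficient solution to this problem. We compute neural embeddings from a deep recurrent model, extract statistical features from a weighted n-gram bag-of-words model, and build hand-crafted external features using feature engineering heuristics. Finally, a deep neural layer combines all of these features to classify each headline-body news pair as agree, disagree, discuss, or unrelated. We compare the proposed approach with current state-of-the-art models on the Fake News Challenge dataset. Through extensive experiments, we find that the proposed model outperforms all state-of-the-art techniques, including the submissions to the Fake News Challenge.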
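The feature combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`ngrams`, `cosine`, `word_overlap`, `combine_features`) are hypothetical, raw n-gram counts stand in for the paper's weighted n-gram model, a Jaccard vocabulary overlap stands in for the hand-crafted external features, and the neural embedding is assumed to be supplied by some recurrent encoder that is not shown. The final deep neural layer that maps the combined vector to the four stance labels is also omitted.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return a Counter of lowercase word n-grams for n = 1..n_max."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def word_overlap(headline, body):
    """Illustrative external feature: Jaccard overlap of the two vocabularies."""
    h, b = set(headline.lower().split()), set(body.lower().split())
    union = h | b
    return len(h & b) / len(union) if union else 0.0


def combine_features(headline, body, neural_embedding):
    """Concatenate neural, statistical, and external features into one vector.

    `neural_embedding` is assumed to come from a separate recurrent encoder.
    """
    statistical = [cosine(ngrams(headline), ngrams(body))]
    external = [word_overlap(headline, body)]
    return list(neural_embedding) + statistical + external
```

In a full pipeline, the vector returned by `combine_features` would be the input to a classifier over agree, disagree, discuss, and unrelated.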
Code Repositories
vineet2104/StanceDetection-CS626
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Per-class Accuracy (Agree) | Per-class Accuracy (Disagree) | Per-class Accuracy (Discuss) | Per-class Accuracy (Unrelated) | Weighted Accuracy |
|---|---|---|---|---|---|---|
| fake-news-detection-on-fnc-1 | Baseline based on skip-thought embeddings (Bhatt et al., 2017) | 31.80 | 0.00 | 81.20 | 91.18 | 76.18 |
| fake-news-detection-on-fnc-1 | Neural baseline based on bi-directional LSTMs (Bhatt et al., 2017) | 38.04 | 4.59 | 58.132 | 78.27 | 63.11 |
| fake-news-detection-on-fnc-1 | Baseline based on word2vec + hand-crafted features (Bhatt et al., 2017) | 50.70 | 9.61 | 53.38 | 96.05 | 72.78 |
| fake-news-detection-on-fnc-1 | Bhatt et al. | 43.82 | 6.31 | 85.68 | 98.04 | 83.08 |
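
The Weighted Accuracy figures above are not a plain average of the per-class accuracies. Assuming they follow the Fake News Challenge (FNC-1) scoring scheme, the sketch below shows how such a score is computed: a correct related-versus-unrelated decision earns 0.25 points, a correct stance on a related pair earns a further 0.75, and the total is reported relative to the maximum achievable score. The function names `fnc1_score` and `weighted_accuracy` are hypothetical, and the labels in the usage example are made up for illustration.

```python
RELATED = {"agree", "disagree", "discuss"}


def fnc1_score(gold, predicted):
    """Sum FNC-1 style points over parallel lists of gold and predicted labels."""
    score = 0.0
    for g, p in zip(gold, predicted):
        if g == p:
            score += 0.25  # exact label match
            if g != "unrelated":
                score += 0.50  # extra credit for a correct stance on a related pair
        if g in RELATED and p in RELATED:
            score += 0.25  # related pair correctly identified as related
    return score


def weighted_accuracy(gold, predicted):
    """Score as a percentage of the maximum achievable score on `gold`."""
    best = fnc1_score(gold, gold)
    return 100.0 * fnc1_score(gold, predicted) / best if best else 0.0


# Illustrative labels only, not data from the benchmark.
gold = ["agree", "unrelated", "discuss", "disagree"]
pred = ["agree", "unrelated", "agree", "unrelated"]
print(round(weighted_accuracy(gold, pred), 2))
```

Under this scheme a model can score reasonably well overall while almost never predicting a rare class such as disagree, which is consistent with the low Disagree column in the table.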