Yung-Sung Chuang, Rumen Dangovski, Hongyin Luo, Yang Zhang, Shiyu Chang, Marin Soljačić, Shang-Wen Li, Wen-tau Yih, Yoon Kim, James Glass

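Abstract
We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings. DiffCSE learns sentence embeddings that are sensitive to the difference between an original sentence and an edited sentence, where the edited sentence is obtained by stochastically masking the original sentence and then sampling from a masked language model. We show that DiffCSE is an instance of equivariant contrastive learning (Dangovski et al., 2021), which generalizes contrastive learning and learns representations that are insensitive to certain types of augmentations and sensitive to other "harmful" types of augmentations. Our experiments show that DiffCSE achieves state-of-the-art results among unsupervised sentence representation learning methods, outperforming unsupervised SimCSE by 2.3 absolute points on semantic textual similarity tasks.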
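The edit step described in the abstract (stochastic masking followed by sampling replacements) can be illustrated with a minimal sketch. This is not the authors' implementation: `fill_masks` is a hypothetical stand-in that samples uniformly from a toy vocabulary where the paper uses a masked language model, and the names `mask_tokens`, `edit_sentence`, `mask_ratio`, and the 0.3 default are assumptions chosen for illustration. The `diff` list marks the positions where the edited sentence departs from the original, which is the kind of difference the embeddings are meant to be sensitive to.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_ratio, rng):
    """Replace each token with MASK independently with probability mask_ratio."""
    return [MASK if rng.random() < mask_ratio else t for t in tokens]

def fill_masks(masked, vocab, rng):
    """Hypothetical stand-in for a masked language model.

    A real system would sample from the model's predicted distribution at each
    masked position; here we sample uniformly from a toy vocabulary.
    """
    return [rng.choice(vocab) if t == MASK else t for t in masked]

def edit_sentence(tokens, vocab, mask_ratio=0.3, seed=0):
    """Return (masked, edited, diff) for one sentence.

    diff[i] is 1 where the edited token differs from the original and 0
    otherwise. A sampled token can coincide with the original, in which case
    the position counts as unchanged.
    """
    rng = random.Random(seed)
    masked = mask_tokens(tokens, mask_ratio, rng)
    edited = fill_masks(masked, vocab, rng)
    diff = [int(a != b) for a, b in zip(tokens, edited)]
    return masked, edited, diff

if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()
    vocab = ["dog", "ran", "under", "a", "rug"]  # toy vocabulary (assumption)
    masked, edited, diff = edit_sentence(tokens, vocab, mask_ratio=0.3, seed=0)
    print(masked)
    print(edited)
    print(diff)
```

Keeping masking and filling as separate functions mirrors the two stages named in the abstract, so the uniform sampler could be swapped for a real masked language model without touching the rest of the sketch.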
Code repository
voidism/diffcse (official, jax)
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| semantic-textual-similarity-on-sts12 | DiffCSE-RoBERTa-base | Spearman Correlation: 0.7005 |
| semantic-textual-similarity-on-sts12 | DiffCSE-BERT-base | Spearman Correlation: 0.7228 |
| semantic-textual-similarity-on-sts13 | DiffCSE-BERT-base | Spearman Correlation: 0.8443 |
| semantic-textual-similarity-on-sts13 | DiffCSE-RoBERTa-base | Spearman Correlation: 0.8343 |
| semantic-textual-similarity-on-sts14 | DiffCSE-BERT-base | Spearman Correlation: 0.7647 |
| semantic-textual-similarity-on-sts14 | DiffCSE-RoBERTa-base | Spearman Correlation: 0.7549 |
| semantic-textual-similarity-on-sts15 | DiffCSE-BERT-base | Spearman Correlation: 0.8390 |
| semantic-textual-similarity-on-sts15 | DiffCSE-RoBERTa-base | Spearman Correlation: 0.8281 |
| semantic-textual-similarity-on-sts16 | DiffCSE-RoBERTa-base | Spearman Correlation: 0.8212 |
| semantic-textual-similarity-on-sts16 | DiffCSE-BERT-base | Spearman Correlation: 0.8054 |
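The metric reported in the table is Spearman correlation, which for semantic textual similarity is conventionally computed between a model's similarity scores and human similarity ratings. The sketch below shows one way to compute it in plain Python, as the Pearson correlation of average ranks. The function names and the toy numbers in the usage example are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

if __name__ == "__main__":
    # Toy data (assumption): model cosine similarities vs. human ratings.
    model_scores = [0.91, 0.15, 0.62, 0.40]
    human_scores = [4.8, 0.5, 2.9, 3.5]
    print(round(spearman(model_scores, human_scores), 4))
```

Because only ranks matter, the metric rewards a model for ordering sentence pairs the way human raters do, regardless of the scale of its raw similarity scores.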