DeBERTa Paper Walkthrough

Nov 24, 2021 · Paper Reading · 27 min read (about 4,032 characters)

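This post covers "DeBERTa: Decoding-enhanced BERT with Disentangled Attention", an ICLR 2021 paper from a Microsoft team. The paper introduces a new pre-trained language model, DeBERTa, which improves on BERT-style pre-trained models in two ways: disentangled attention and enhanced decoding during pre-training. It was also the first model to surpass the human baseline on SuperGLUE.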

