BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · 2018
Google AI Language
BERT pre-trains a deep bidirectional Transformer encoder with a masked language modeling objective (plus next sentence prediction), defining the pretrain-then-finetune recipe that dominated NLP until decoder-only LLMs took over.
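The masked language modeling objective mentioned above corrupts 15% of input tokens before asking the model to recover them; of the selected positions, 80% become [MASK], 10% become a random token, and 10% are left unchanged. Below is a minimal plain-Python sketch of that corruption rule, not the paper's implementation; the token ids and the [MASK] id are hypothetical placeholders.

```python
# Minimal sketch of BERT-style masked language modeling (MLM) input corruption.
# The token ids and MASK_ID below are hypothetical placeholders, not the
# paper's actual WordPiece vocabulary entries.
import random

MASK_ID = 103        # hypothetical [MASK] token id
VOCAB_SIZE = 30522   # BERT-Base WordPiece vocabulary size reported in the paper
IGNORE = -100        # label value for positions excluded from the MLM loss

def mask_tokens(token_ids, mask_prob=0.15, seed=None):
    """Apply the paper's 15% selection and 80/10/10 replacement rule."""
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [IGNORE] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() >= mask_prob:
            continue                              # 85% of positions are untouched
        labels[i] = tok                           # model must predict the original token
        r = rng.random()
        if r < 0.8:
            inputs[i] = MASK_ID                   # 80%: replace with [MASK]
        elif r < 0.9:
            inputs[i] = rng.randrange(VOCAB_SIZE) # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return inputs, labels

# Example: corrupt a toy sequence and inspect which positions carry MLM labels.
ids = [7592, 2088, 2003, 1037, 2204, 2154]
corrupted, targets = mask_tokens(ids, seed=0)
print(corrupted, targets)
```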
Metadata
Type
paper
Credibility
Primary source
Language
en
Publication date
October 11, 2018
Organization
Google AI Language
Authors
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova