Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin · 2017
Google Brain / Google Research / University of Toronto
The Transformer paper. Introduces an architecture built entirely on multi-head self-attention, dispensing with the recurrence and convolutions previously used for sequence transduction. The architectural basis for virtually all modern LLMs.
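As a rough illustration of the mechanism, here is a minimal NumPy sketch of the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, plus a multi-head wrapper. Function names and shapes are illustrative assumptions, and the learned projections W_Q, W_K, W_V, W_O from the paper are omitted; this is a sketch, not the authors' reference implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (paper, Eq. 1)
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)   # (..., seq_q, seq_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the key axis
    return weights @ V                               # (..., seq_q, d_v)

def multi_head_self_attention(x, num_heads):
    # Split d_model into num_heads subspaces, attend in each, concatenate.
    # The learned projections W_Q, W_K, W_V, W_O are omitted for brevity.
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    heads = x.reshape(seq_len, num_heads, d_model // num_heads).swapaxes(0, 1)
    attended = scaled_dot_product_attention(heads, heads, heads)  # per-head
    return attended.swapaxes(0, 1).reshape(seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))                 # 6 tokens, d_model = 16
out = multi_head_self_attention(x, num_heads=4)
print(out.shape)                             # (6, 16)
```

The 1/√d_k scaling keeps dot products from growing with the head dimension, which would otherwise push the softmax into regions with vanishing gradients (the paper's stated motivation for scaling).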