Attention Is All You Need
Vaswani et al., 2017
Self-attention lets each token weigh every other token in the sequence when building its own context-aware representation.
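To make that concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the variant the paper uses. The projection matrices Wq, Wk, Wv and the toy sizes are illustrative assumptions, not the paper's trained weights.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q = X @ Wq  # queries
    K = X @ Wk  # keys
    V = X @ Wv  # values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how strongly each token attends to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V  # context-aware representation of each token

# Toy example: 4 tokens with 8-dimensional embeddings (sizes chosen for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```

Each row of the returned matrix mixes information from the whole sequence, weighted by the softmax scores, which is what gives every token its contextual meaning.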
“The inline explanations alone save me hours every week. I wish I had this during my PhD.”
ML Engineer