Self-Attention
NLP & Text
Relating positions in a sequence to each other
What is Self-Attention?
Each position in a sequence attends to every position, including itself, to compute its own representation. Self-attention is the core mechanism of transformer architectures.
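To make the definition concrete, here is a minimal single-head sketch of scaled dot-product self-attention in NumPy. The function name `self_attention` and the projection matrices `w_q`, `w_k`, `w_v` are illustrative choices, not a reference implementation.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.

    x:             (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices
    Returns        (seq_len, d_k) context vectors.
    """
    q = x @ w_q                      # queries, one per position
    k = x @ w_k                      # keys
    v = x @ w_v                      # values
    d_k = q.shape[-1]

    # Every position scores every position, including itself.
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len)

    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    # Each output is a weighted mix of all value vectors.
    return weights @ v

# Toy usage: 4 tokens, model dimension 8, head dimension 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```

Because the attention weights for a position are computed against the whole sequence, distant tokens influence the output just as directly as adjacent ones, which is what makes the mechanism effective for long-range dependencies.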
Real-World Examples
- Transformers
- BERT
- GPT
- Understanding long-range dependencies
When to Use This
Self-attention is the fundamental mechanism in modern NLP architectures; reach for it whenever a model needs to relate tokens across a sequence regardless of how far apart they are.
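In practice you rarely write the mechanism by hand; a framework layer is used instead. The sketch below assumes PyTorch's `nn.MultiheadAttention`, where passing the same tensor as query, key, and value yields self-attention; the batch, sequence, and embedding sizes are arbitrary for illustration.

```python
import torch
import torch.nn as nn

# Illustrative shapes: a batch of 2 sequences, 10 tokens each, embedding size 64.
x = torch.randn(2, 10, 64)

# Passing the same tensor as query, key, and value makes this self-attention.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=8, batch_first=True)
out, weights = attn(x, x, x)

print(out.shape)      # torch.Size([2, 10, 64]) - one context vector per token
print(weights.shape)  # torch.Size([2, 10, 10]) - attention from each token to every token
```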
Related Terms
Learn more about concepts related to Self-Attention
Attention Mechanism
Focusing on the most relevant parts of the input
Transformer
Attention-based architecture for sequence processing
BERT (Bidirectional Encoder Representations from Transformers)
Language model that understands context from both directions
GPT (Generative Pre-trained Transformer)
Large language model for text generation