Transformer
Neural Networks
Attention-based architecture for sequence processing
What is a Transformer?
A modern architecture that uses self-attention mechanisms to process entire sequences in parallel, rather than token by token. It is the foundation for GPT, BERT, and most modern NLP systems.
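The core of the architecture is scaled dot-product self-attention: every token is projected into a query, key, and value, and each output is a weighted mix of all values, so the whole sequence is processed in one matrix operation. A minimal NumPy sketch (single head; the weight matrices here are random placeholders, not trained parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # project each token into query, key, and value vectors
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # pairwise similarity of every token with every token, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # output = attention-weighted mix of value vectors
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))          # 4 token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one output vector per input token
```

Because every token attends to every other in a single matrix multiply, the computation parallelizes across the sequence, which is what distinguishes Transformers from recurrent models.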
Real-World Examples
- ChatGPT
- Google Translate
- BERT
- Text summarization
When to Use This
State-of-the-art for NLP tasks, and increasingly popular for computer vision as well (e.g., Vision Transformers)
Related Terms
Learn more about concepts related to Transformer
Attention Mechanism
Focusing on relevant parts of input
BERT (Bidirectional Encoder Representations from Transformers)
Language model that understands context from both directions
GPT (Generative Pre-trained Transformer)
Large language model for text generation
Natural Language Processing (NLP)
Teaching computers to understand and work with human language