
Knowledge Distillation

Neural Networks

Training a small model (the student) to mimic a large model (the teacher)

What is Knowledge Distillation?

A student model is trained to match the teacher's softened output distributions (softmax probabilities produced at an elevated temperature), so it achieves comparable performance at a much smaller size, making it suitable for edge deployment or faster inference.
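A minimal sketch of a distillation training step, assuming PyTorch. The `teacher`, `student`, `optimizer`, and data tensors are hypothetical placeholders; the temperature `T` and mixing weight `alpha` are common but arbitrary choices.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend the soft-target KL loss (temperature T) with hard-label cross-entropy."""
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        soft_targets,
        reduction="batchmean",
    ) * (T * T)  # rescale so soft-loss gradients match the hard-loss magnitude
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

def train_step(student, teacher, optimizer, inputs, labels):
    # The frozen teacher supplies softened targets; only the student is updated.
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(inputs)
    student_logits = student(inputs)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Raising the temperature spreads probability mass over incorrect classes, exposing the teacher's "dark knowledge" about class similarities that one-hot labels alone would hide.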

Real-World Examples

  • Compressing BERT to DistilBERT
