massOfai

Throughput

Performance

The number of requests a system processes per unit of time.

What is Throughput?

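Throughput is a key performance metric for model serving and other systems. It is usually measured in requests per second (requests/sec) for serving workloads or samples per second (samples/sec) for data-processing and training workloads.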

Real-World Examples

  • An inference API that completes 200 requests/sec
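As a rough illustration of the definition, throughput can be computed by dividing the number of completed requests by the elapsed wall-clock time. The sketch below is a minimal Python example under stated assumptions: the function names `throughput` and `measure_throughput` and the `handler` callable are hypothetical, and requests are issued sequentially from a single thread, which will understate what a concurrent server can sustain.

```python
import time
from typing import Callable


def throughput(completed: int, elapsed_seconds: float) -> float:
    """Return completed units of work per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed / elapsed_seconds


def measure_throughput(handler: Callable[[], object], num_requests: int) -> float:
    """Call `handler` num_requests times in sequence and return requests/sec.

    `handler` is a hypothetical stand-in for one request to the system
    under test (for example, a single inference call).
    """
    start = time.perf_counter()
    for _ in range(num_requests):
        handler()
    elapsed = time.perf_counter() - start
    return throughput(num_requests, elapsed)


# 1,000 requests completed in 5 seconds is 200 requests/sec.
print(throughput(1000, 5.0))  # prints 200.0
```

The same arithmetic applies to samples/sec: replace the request count with the number of samples processed in the measured interval.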
