AI Glossary
Understand AI terminology
Artificial Intelligence (AI)
Machines performing tasks that typically require human intelligence
Machine Learning (ML)
Algorithms that learn patterns from data rather than following explicitly programmed rules
Deep Learning
ML using multi-layered neural networks
Supervised Learning
Training with labeled data (input + correct output)
Unsupervised Learning
Finding patterns in unlabeled data
Reinforcement Learning
Learning through trial and error with rewards/penalties
Classification
Categorizing data into predefined classes
Regression
Predicting continuous numerical values
Clustering
Grouping similar data points together
Neural Network
Brain-inspired computing model with interconnected nodes
Overfitting
Model fits training data too closely, including its noise, and performs poorly on new data
Underfitting
Model is too simple to capture patterns in data
Training
Process of teaching an ML model using data
Validation
Evaluating model performance on held-out data during training, typically to tune hyperparameters
Testing
Final evaluation on completely unseen data
Feature
Individual measurable property used for predictions
Label
The correct answer in supervised learning
Bayesian Inference
Probabilistic approach updating beliefs with evidence
Causal Inference
Methods to identify cause-effect relationships
AutoML
Automated model selection and tuning
Hyperparameter Optimization
Searching for the best hyperparameters
Gradient Boosting
Ensemble method building models sequentially to reduce errors
XGBoost
Efficient, scalable gradient boosting implementation
LightGBM
High-performance gradient boosting by Microsoft
CatBoost
Gradient boosting with categorical features handling
Probability
Measure of how likely events are to occur
Statistics
Collecting, analyzing, interpreting data
P-value
Probability of observing data at least as extreme as the actual data, assuming the null hypothesis is true
Confidence Interval
Range computed from sample data that contains the true parameter value at a stated confidence level (e.g., 95%)
Markov Chain
Sequence of events where next state depends only on current state
Hidden Markov Model (HMM)
Statistical model with hidden states and observations
Time Series
Data collected sequentially over time
ARIMA
AutoRegressive Integrated Moving Average, a classical statistical model for time series forecasting
Prophet (forecasting)
Open-source forecasting tool by Meta
Feature Store
Centralized storage for ML features
Feature Engineering
Creating informative input variables from raw data
Data Ingestion
Collecting and importing data for processing
Data Quality
Measurement of data accuracy, completeness, and reliability
Schema Registry
Centralized schema management for data streams
Data Lineage
Tracking origin and transformations of data
Data Catalog
Inventory of data assets with metadata
Data Contracts
Agreements on data shape and quality between producers and consumers
Apache Kafka
Distributed streaming platform
Stream Processing
Processing data as it arrives in real-time
Batch Processing
Processing large chunks of data periodically
Model Registry
Centralized store for models and metadata
Model Serving
Making models available for inference
Canary Deployment
Rolling out changes to a small subset of users first
Blue-Green Deployment
Maintain two production environments and switch traffic
Shadow Mode
Run new model in parallel without affecting users
Model Drift
Model performance degradation over time as data changes
Data Drift
Distributional changes in input data over time
Concept Drift
Underlying relationship between features and labels changes
Model Monitoring
Observing model health and performance in production
Observability
Ability to infer system state from logs, metrics, and traces
Logging
Recording events for debugging and audit
Distributed Tracing
Tracking requests across service boundaries
SLI / SLO / SLA
Service-level indicators, objectives, and agreements
Reproducibility
Ability to reproduce results reliably
Model Versioning
Tracking model artifacts and metadata over time
Checkpointing
Saving model state periodically during training
Knowledge Distillation
Training small model (student) from large model (teacher)
Model Pruning
Removing redundant weights or neurons to shrink models
Quantization
Reducing numerical precision to shrink models and speed inference
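As a rough illustration of the Quantization entry, the sketch below maps floating-point weights to 8-bit integer codes using a single shared scale (symmetric quantization). The function names and sample weights are hypothetical, and real toolchains typically use more elaborate schemes (per-channel scales, zero points, calibration).

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]  # hypothetical model weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

Each weight is now stored in one byte instead of four, at the cost of a small rounding error bounded by half the scale.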
Mixed Precision
Using lower precision where safe to speed training/inference
ONNX
Open Neural Network Exchange format
TensorRT
NVIDIA runtime for high-performance inference
Edge Deployment
Running models on-device or near users
TinyML
Machine learning on microcontrollers
Federated Learning
Training across decentralized devices without centralizing data
Differential Privacy
Mathematical guarantee that a computation's output reveals little about any single individual's data
Adversarial Example
Inputs intentionally perturbed to fool models
Robustness
Model resilience to perturbations and distributional shifts
Explainability
Understanding why a model made a decision
SHAP
Shapley additive explanations for feature attribution
LIME
Local Interpretable Model-agnostic Explanations
Fairness
Ensuring models do not unfairly discriminate
Bias in Data
Systematic errors in data that lead to unfair models
Model Compliance
Meeting regulatory and legal requirements for models
Model Governance
Policies and processes for responsible ML use
Privacy
Protecting individuals’ personal data and rights
GDPR
EU data protection regulation
Encryption
Securing data by transforming it into a form that is unreadable without the key
TLS (Transport Layer Security)
Protocol securing internet communications
Role-Based Access Control (RBAC)
Access control model assigning permissions by role
Secrets Management
Securely storing and accessing credentials and keys
Audit Logging
Recording actions for accountability and forensics
Capacity Planning
Estimating resources needed to meet demand
Benchmarking
Measuring system or model performance under controlled tests
Profiling
Analyzing resource use and hotspots in code or models
Throughput
Number of requests processed per unit time
Latency
Time taken to complete a single request
p95 / p99 Latency
Latency below which 95% or 99% of requests complete
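One possible way to compute p95 and p99 from a list of request latencies is the nearest-rank method sketched below. The `percentile` helper and the sample latencies are hypothetical; monitoring systems often use interpolation or histogram-based estimates instead.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))  # hypothetical latencies: 1 ms .. 100 ms
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p95, p99)
```

Tail percentiles like these expose slow requests that an average would hide.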
GPU
Graphics Processing Unit used for parallel compute
TPU
Tensor Processing Unit by Google for ML workloads
CUDA
Parallel computing platform and API for NVIDIA GPUs
FPGA
Field-Programmable Gate Array for reconfigurable hardware acceleration
ASIC
Application-Specific Integrated Circuit
Cold Start
Delay when initializing services or models not recently used
Warm Start
Starting from previous state to accelerate startup
Serverless
Run code without managing servers
Sparse Representation
Using mostly-zero vectors to represent data efficiently
Embedding Index
Storage and retrieval structure for vector embeddings
Annoy
Approximate nearest neighbors library
FAISS
Facebook AI Similarity Search library
HNSW (Hierarchical Navigable Small World)
Graph-based ANN algorithm for vector search
Recommendation System
Systems that suggest relevant items to users
Collaborative Filtering
Recommends items based on other users’ preferences
Matrix Factorization
Decompose user-item matrix into latent factors
Graph Neural Network (GNN)
Neural networks operating on graph-structured data
Knowledge Graph
Graph storing entities and relationships for semantic queries
RDF
Resource Description Framework for representing graph data
SPARQL
Query language for RDF data
Synthetic Data
Artificially generated data for training or testing
Data Labeling
Annotating data with ground-truth labels
Annotation Tooling
Software for labeling datasets efficiently
Active Learning
Selectively labeling the most informative samples
Semi-Supervised Learning
Learning from small labeled and large unlabeled datasets
Self-Supervised Learning
Learning useful representations by creating proxy tasks from raw data
Contrastive Learning
Representation learning by contrasting positive and negative pairs
Representation Learning
Learning features or embeddings that capture data structure
Embedding Dimension
Size of vector representation for entities or tokens
Tokenizer
Converts text to tokens the model understands
Byte Pair Encoding (BPE)
Subword segmentation algorithm
WordPiece
Tokenizer algorithm used by BERT
Vocabulary (Vocab)
Set of tokens the model recognizes
Temperature (Sampling)
Controls randomness in language model sampling
Top-k Sampling
Sampling from the k most probable tokens
Top-p (Nucleus) Sampling
Sampling from smallest set of tokens whose cumulative probability ≥ p
Beam Search
Heuristic search keeping the highest-scoring candidate sequences (beams) at each decoding step
Greedy Decoding
Selecting highest-probability token at each step
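The sampling entries above (temperature, top-k, top-p, greedy decoding) can be sketched together in a few lines. This is a minimal illustration over a hypothetical 4-token vocabulary; the function names are assumptions, and production decoders operate on tensors rather than Python lists.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def greedy(probs):
    """Pick the single highest-probability token."""
    return max(range(len(probs)), key=lambda i: probs[i])


def sample(dist, rng):
    """Draw one token id from a {token_id: probability} mapping."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
probs = softmax_with_temperature(logits, temperature=0.7)
rng = random.Random(0)
token = sample(top_p_filter(probs, p=0.9), rng)
print(greedy(probs), token)
```

Greedy decoding is deterministic, while the filtered sampling variants trade some likelihood for diversity.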
Reward Modeling
Learning a reward function from human feedback
RLHF
Reinforcement Learning from Human Feedback
Model Alignment
Ensuring model goals match human intent
AI Safety
Designing AI systems to avoid harmful behavior
Red Teaming
Adversarial testing to find model failures
Jitter (Time Series/Networking)
Variation in time between events or packet delays
Throughput per Dollar
Measuring cost-efficiency of compute resources
Cost Optimization
Reducing cloud and compute spend while maintaining performance
Spot Instances
Discounted cloud instances that can be reclaimed
Multi-Tenancy
Serving multiple customers on shared infrastructure
Isolation
Separating workloads to prevent interference and leakage
GPU Memory Optimization
Techniques to reduce GPU memory usage
Model Sharding
Splitting model across devices for parallelism
Distributed Training
Training across multiple machines or GPUs
Distributed Data Parallel (DDP)
PyTorch mechanism for synchronous multi-GPU training
Horovod
Distributed training library from Uber
Checkpointing Strategy
How often and what to save during training
Experiment Tracking
Recording experiments, parameters, and metrics systematically
Weights & Biases (W&B)
Platform for experiment tracking and model monitoring
Ablation Study
Systematically removing components to measure their impact
Sanity Checks
Quick validations to ensure models behave as expected
Unit Tests
Tests for small units of code
Integration Tests
Tests that validate interactions between components
End-to-End (E2E) Tests
Full-system tests simulating real user flows
Canary Testing
Deploy changes to small group for early validation
Rollback
Reverting to a previous stable version
SLO Breach
When a service misses its performance objective
Incident Response
Process for managing service disruptions
Postmortem
Analysis after an incident to prevent recurrence
Runbook
Step-by-step instructions for operational tasks
Observability Pipeline
Infrastructure for collecting and processing telemetry
OpenTelemetry
Standard for observability data (logs/metrics/traces)
Prometheus
Monitoring system and time-series database
Grafana
Visualization and dashboarding tool for metrics
Loki
Log aggregation system by Grafana Labs
Incident Management
Coordinated process to respond to and resolve incidents
Model
Trained algorithm that makes predictions
Accuracy
Percentage of correct predictions
Precision
Of predicted positives, how many are actually positive
Recall (Sensitivity)
Of actual positives, how many did we find
Convolutional Neural Network (CNN)
Neural network specialized for processing grid-like data (images)
Recurrent Neural Network (RNN)
Neural network for sequential data with memory
Long Short-Term Memory (LSTM)
Advanced RNN that remembers long-term dependencies
Transformer
Attention-based architecture for sequence processing
GPT (Generative Pre-trained Transformer)
Large language model for text generation
BERT (Bidirectional Encoder Representations from Transformers)
Language model that understands context from both directions
Generative Adversarial Network (GAN)
Two networks competing to generate realistic data
Autoencoder
Neural network that learns compressed representations
Attention Mechanism
Focusing on relevant parts of input
Activation Function
Non-linear function determining neuron output
Backpropagation
Algorithm that computes loss gradients layer by layer via the chain rule, used to train neural networks
Dropout
Regularization technique randomly ignoring neurons
Batch Normalization
Normalizing layer inputs across each mini-batch during training
Natural Language Processing (NLP)
Teaching computers to understand and work with human language
Tokenization
Breaking text into smaller units (tokens)
Word Embedding
Representing words as numerical vectors
Sentiment Analysis
Determining emotional tone of text
Named Entity Recognition (NER)
Identifying and classifying named entities in text
Large Language Model (LLM)
Massive neural network trained on vast text data
Retrieval-Augmented Generation (RAG)
LLM enhanced with external knowledge retrieval
Prompt Engineering
Crafting inputs to get desired LLM outputs
Computer Vision
Teaching computers to understand images and videos
Object Detection
Finding and locating objects in images
Image Segmentation
Classifying every pixel in an image
Face Recognition
Identifying or verifying people from faces
Optical Character Recognition (OCR)
Converting images of text to actual text
Image Classification
Categorizing entire images
Database
Organized collection of data stored electronically
SQL (Structured Query Language)
Language for managing relational databases
NoSQL
Non-relational databases for flexible data
API (Application Programming Interface)
Interface allowing software components to communicate
REST API
Web API following REST architectural principles
Cloud Computing
On-demand computing resources over the internet
Amazon Web Services (AWS)
Amazon's cloud computing platform
Docker
Platform for containerizing applications
Kubernetes (K8s)
Container orchestration platform
ETL (Extract, Transform, Load)
Process of extracting data from sources, transforming it, and loading it into a target system
Vector Database
Database optimized for similarity search
Framework
Pre-built structure for building applications
Library
Collection of reusable code/functions
Git
Version control system for tracking code changes
CI/CD
Automating code integration and deployment
Jupyter Notebook
Interactive environment for code and documentation
Minimum Viable Product (MVP)
Simplest version with core features
Proof of Concept (POC)
Demonstration that an idea is feasible
Scalability
Ability to handle growth
Technical Debt
Cost of shortcuts taken during development
F1 Score
Harmonic mean of precision and recall
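Precision, recall, and their harmonic mean (F1) can be computed directly from true-positive, false-positive, and false-negative counts. The sketch below assumes binary labels; the function name and sample labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 1]  # hypothetical ground-truth labels
y_pred = [1, 0, 1, 1, 0, 1]  # hypothetical model predictions
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 0.75.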
Confusion Matrix
Table of predicted versus actual classes, showing true/false positives and negatives
Cross-Validation
Validating on multiple data splits
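A common form of cross-validation is k-fold: the data is split into k parts, and each part serves once as the validation set while the rest is used for training. The sketch below generates index splits only; the function name is hypothetical, and it assumes the data has already been shuffled.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print(len(train), validation)
```

Averaging the validation metric over all folds gives a less noisy estimate than a single split.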
Hyperparameter
Configuration setting for learning algorithm
Learning Rate
Step size for parameter updates during training
Gradient Descent
Optimization algorithm that minimizes loss by repeatedly stepping in the direction opposite the gradient
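The interaction between gradient descent and the learning rate can be seen on a one-dimensional toy problem. The sketch below minimizes the hypothetical loss (x - 3)^2, whose gradient is 2(x - 3); the function name and default values are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Toy loss: (x - 3)^2, with gradient 2 * (x - 3) and minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

With this loss, a learning rate above 1.0 would make each step overshoot by more than it corrects, so the iterates would diverge instead of converging.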
Loss Function
Measures how wrong predictions are
Batch Size
Number of samples processed before updating weights
Epoch
One complete pass through training data
Transfer Learning
Reusing pre-trained model for new task
Fine-Tuning
Continuing to train a pre-trained model on task-specific data
Data Augmentation
Creating variations of training data
Regularization
Techniques to prevent overfitting
Ensemble Learning
Combining multiple models for better predictions
Random Forest
Ensemble of decision trees
Decision Tree
Tree-like model of decisions
Support Vector Machine (SVM)
Classifier that finds the maximum-margin decision boundary between classes
K-Means Clustering
Partitions data into K clusters
Principal Component Analysis (PCA)
Dimensionality reduction technique that projects data onto the directions of greatest variance
Bias-Variance Tradeoff
Balance between underfitting and overfitting
Perceptron
Simplest neural network unit
ReLU (Rectified Linear Unit)
Activation function: max(0, x)
Sigmoid Function
S-shaped activation: 1/(1+e^(-x))
Softmax Function
Converts logits to a probability distribution that sums to 1
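The three functions defined above (ReLU, sigmoid, softmax) are short enough to write out. This is a plain-Python sketch for scalars and lists; deep learning frameworks provide vectorized equivalents.

```python
import math


def relu(x):
    """max(0, x)"""
    return max(0.0, x)


def sigmoid(x):
    """1 / (1 + e^(-x)), written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(logits):
    """Exponentiate and normalize; subtracting the max keeps exp() stable."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), relu(3.0))
print(sigmoid(0.0))
print(softmax([1.0, 2.0, 3.0]))
```

Softmax preserves the ordering of its inputs, so the largest logit always receives the largest probability.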
Pooling Layer
Downsampling layer in CNNs
Convolution
Sliding window operation in CNNs
ResNet (Residual Network)
CNN with skip connections
U-Net
CNN for image segmentation
Variational Autoencoder (VAE)
Generative model learning latent representations
YOLO (You Only Look Once)
Real-time object detection architecture
Gated Recurrent Unit (GRU)
Simplified alternative to LSTM
Sequence-to-Sequence (Seq2Seq)
Encoder-decoder for sequence transformation
Word2Vec
Algorithm for learning word embeddings
Part-of-Speech Tagging
Labeling words with grammatical roles
Lemmatization
Reducing words to dictionary form
Stemming
Crude chopping to word root
TF-IDF
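Term frequency-inverse document frequency: weights a term by how often it appears in a document and how rare it is across the corpus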
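A minimal sketch of one common TF-IDF variant follows, using tf = count / document length and idf = log(N / document frequency). The function name and the tiny corpus are hypothetical, and libraries often apply smoothing or normalization that this sketch omits.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]
scores = tf_idf(docs)
print(scores[0])
```

In this corpus "the" appears in every document, so its weight is zero, while "sat" (unique to one document) scores higher than "cat" (shared by two).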
Self-Attention
Relating positions in sequence to each other
Zero-Shot Learning
Performing tasks without specific training examples
Few-Shot Learning
Learning from few examples
Text Generation
Creating new text using language models
Machine Translation
Automatically translating between languages
MongoDB
Popular document-oriented NoSQL database
Redis
In-memory key-value store
PostgreSQL
Advanced open-source relational database
Amazon S3
Cloud object storage service
Amazon EC2
Virtual servers in AWS cloud
AWS Lambda
Serverless compute service
GraphQL
Query language for APIs
WebSocket
Persistent two-way communication protocol
Microservices
Architecture of small independent services
Load Balancer
Distributes traffic across servers
Caching
Storing frequently accessed data for fast retrieval
Content Delivery Network (CDN)
Distributed network serving content
GitHub
Platform for hosting Git repositories
Visual Studio Code
Popular code editor
Pytest
Python testing framework
TensorFlow
Google's ML framework
PyTorch
Open-source ML framework originally developed by Meta (formerly Facebook)
Hugging Face
Platform for ML models and datasets
MLflow
ML lifecycle management platform
Apache Airflow
Workflow orchestration platform
Agile
Iterative software development methodology
DevOps
Combining development and operations
MLOps
DevOps for machine learning
A/B Testing
Comparing two versions to see which performs better
Key Performance Indicator (KPI)
Measurable value showing progress toward goals
Return on Investment (ROI)
Ratio of net profit to the cost of the investment
Service Level Agreement (SLA)
Commitment to service availability/performance