Topics
Interactive deep-dives into ML algorithms. Each topic has a visual story, annotated code, quizzes, and an internals explorer.

01
Word2Vec
Beginner · 35m
Word Embeddings & Skip-gram
- Understand one-hot encoding and why dense vectors are better
- Build a skip-gram model from scratch in PyTorch
- Visualize how word embeddings self-organize into semantic clusters
- Perform vector arithmetic like king − man + woman ≈ queen
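The analogy in the last bullet can be sketched with a few toy vectors. These are hand-picked 3-d embeddings (illustrative only, not trained skip-gram weights), chosen so gender and royalty lie along different axes; the point is that `king − man + woman` lands nearest to `queen` under cosine similarity.

```python
import numpy as np

# Toy embeddings (hypothetical, hand-chosen; real Word2Vec vectors are
# learned and typically 100-300 dimensional).
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.1, 0.8, 0.2]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "queen": np.array([0.9, 0.1, 0.8]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# king - man + woman: subtract the "male" direction, add the "female" one.
target = emb["king"] - emb["man"] + emb["woman"]

# Nearest word in the toy vocabulary by cosine similarity.
best = max(emb, key=lambda w: cosine(emb[w], target))  # -> "queen"
```

With trained embeddings the query words themselves are usually excluded from the nearest-neighbor search; the toy vocabulary here is small enough that `queen` wins outright.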

02
GloVe
Beginner · 50m
Global Vectors for Word Representation
Needs: Word2Vec
- Build and interpret a word-word co-occurrence matrix with distance weighting
- Understand why probability ratios P(k|i)/P(k|j) encode meaning better than raw probabilities
- Follow the 5-step derivation from ratios to the log-bilinear model
- Explain the weighted least squares objective and why each design choice matters
- Implement GloVe training with manual gradients and AdaGrad
- Understand WHY vector arithmetic (king − man + woman ≈ queen) works mechanically in GloVe's log-bilinear framework
- Connect GloVe to PMI, SVD, LSA, and the Levy-Goldberg result; know when GloVe outperforms alternatives and when it doesn't
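The weighted least-squares objective mentioned above can be sketched per co-occurrence pair: each term is f(X_ij) · (wᵢ·w̃ⱼ + bᵢ + b̃ⱼ − log X_ij)², where f caps the influence of very frequent pairs. A minimal sketch, assuming the standard x_max = 100 and α = 0.75 from the GloVe paper (function names are illustrative):

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    # f(x): down-weights rare pairs, saturates at 1 for frequent ones.
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_term(w_i, w_tilde_j, b_i, b_tilde_j, x_ij):
    # One summand of the loss: f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2
    err = w_i @ w_tilde_j + b_i + b_tilde_j - np.log(x_ij)
    return glove_weight(x_ij) * err ** 2
```

The full loss sums this term over all nonzero entries of the co-occurrence matrix; training drives each dot product plus biases toward log X_ij, which is what makes the log-bilinear analogy arithmetic work.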

03
RNN
Beginner · 50m
Recurrent Neural Networks
Needs: Word2Vec
- Understand why sequences need memory and how hidden state provides it
- Build a vanilla RNN from five raw parameter tensors (no nn.RNN)
- Implement Backpropagation Through Time (BPTT) manually, line by line
- Visualize vanishing gradients and understand why eigenvalues matter
- Watch category emergence from character-level prediction (Elman 1990)
- Generate text character by character with temperature-controlled sampling
- Understand the architectural ceiling of vanilla RNNs and why gating mechanisms were needed
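The "five raw parameter tensors" above can be sketched in a single forward step. This is a minimal numpy version (sizes and names are illustrative; the topic itself builds it in PyTorch): the hidden state h carries memory across steps via W_hh, and tanh keeps it bounded.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 8, 16  # hypothetical vocab size and hidden size

# The five raw parameter tensors of a vanilla RNN (no nn.RNN).
W_xh = rng.normal(0, 0.1, (H, V))  # input  -> hidden
W_hh = rng.normal(0, 0.1, (H, H))  # hidden -> hidden (the recurrence)
b_h  = np.zeros(H)
W_hy = rng.normal(0, 0.1, (V, H))  # hidden -> output logits
b_y  = np.zeros(V)

def rnn_step(x_onehot, h_prev):
    # New hidden state mixes the current input with the previous state.
    h = np.tanh(W_xh @ x_onehot + W_hh @ h_prev + b_h)
    logits = W_hy @ h + b_y  # next-character scores
    return h, logits

# Unroll over a toy character sequence (indices into the vocabulary).
h = np.zeros(H)
for t in [0, 3, 1]:
    x = np.eye(V)[t]
    h, logits = rnn_step(x, h)
```

Repeated multiplication by W_hh during BPTT is exactly where the vanishing-gradient issue enters: gradients scale with powers of its eigenvalues, which is the eigenvalue connection the bullets refer to.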

04
Coming Soon
LSTM & GRU
Intermediate · 40m
Gated Memory Units

05
Coming Soon
Seq2Seq
Intermediate · 45m
Sequence-to-Sequence Models

06
Coming Soon
Attention
Intermediate · 55m
The Attention Mechanism

07
Coming Soon
Tokenization
Beginner · 35m
How Models Read Text

08
Coming Soon
Transformer
Intermediate · 65m
Attention Is All You Need

09
Coming Soon
BERT
Intermediate · 45m
Encoder Models & Masked LM

10
Coming Soon
GPT
Intermediate · 50m
Decoder Models & Language Modeling

11
Coming Soon
Enc-Dec
Intermediate · 40m
Encoder-Decoder Transformers

12
Coming Soon
KV Cache
Advanced · 40m
KV Caching & Inference

13
Coming Soon
Flash Attention
Advanced · 40m
IO-Aware Attention

14
Coming Soon
Scaling Laws
Advanced · 40m
Scaling Laws & Training

15
Coming Soon
RLHF
Advanced · 45m
Alignment & Human Feedback

16
Coming Soon
MoE
Advanced · 40m
Mixture of Experts

17
Coming Soon
SSM
Advanced · 50m
State Space Models
More topics ahead
The curriculum is actively growing.