QuantML

Topics

Interactive deep-dives into ML algorithms. Each topic has a visual story, annotated code, quizzes, and an internals explorer.

01 · Word2Vec

Beginner · 35m

Word Embeddings & Skip-gram

  • Understand one-hot encoding and why dense vectors are better
  • Build a skip-gram model from scratch in PyTorch (a minimal sketch follows this list)
  • Visualize how word embeddings self-organize into semantic clusters
  • Perform vector arithmetic like king − man + woman ≈ queen
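
To make the second bullet concrete, here is a minimal skip-gram sketch in PyTorch. The vocabulary size, embedding dimension, full-softmax loss, and dummy ids are illustrative assumptions, not the topic's actual notebook code.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM = 5000, 100   # hypothetical sizes

class SkipGram(nn.Module):
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, embed_dim)           # center-word vectors
        self.out_proj = nn.Linear(embed_dim, vocab_size, bias=False)  # scores over context words

    def forward(self, center_ids):
        return self.out_proj(self.in_embed(center_ids))               # (batch, vocab) logits

model = SkipGram(VOCAB_SIZE, EMBED_DIM)
loss_fn = nn.CrossEntropyLoss()   # full softmax; negative sampling is the usual large-vocab shortcut
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

center, context = torch.tensor([42]), torch.tensor([17])   # one dummy (center, context) pair
loss = loss_fn(model(center), context)
loss.backward()
optim.step()
```

After training, the analogy in the last bullet reduces to a nearest-neighbour search around in_embed.weight[king] − in_embed.weight[man] + in_embed.weight[woman].
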
02 · GloVe

Beginner · 50m

Global Vectors for Word Representation

Needs: Word2Vec
  • Build and interpret a word-word co-occurrence matrix with distance weighting
  • Understand why probability ratios P(k|i)/P(k|j) encode meaning better than raw probabilities
  • Follow the 5-step derivation from ratios to the log-bilinear model
  • Explain the weighted least squares objective and why each design choice matters
  • Implement GloVe training with manual gradients and AdaGrad (see the sketch after this list)
  • Understand WHY vector arithmetic (king − man + woman ≈ queen) works mechanically in GloVe's log-bilinear framework
  • Connect GloVe to PMI, SVD, LSA, and the Levy-Goldberg result; know when GloVe outperforms alternatives and when it doesn't
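
To ground the objective bullet, here is a minimal NumPy sketch of the weighted least-squares loss for a single co-occurrence count, with a manual gradient and an AdaGrad-style update. The weighting constants (x_max = 100, alpha = 0.75) are the GloVe paper's defaults; the vector size, learning rate, and dummy count are illustrative assumptions.

```python
import numpy as np

def weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting f(X_ij): damps rare pairs, caps very frequent ones (paper defaults)."""
    return (x / x_max) ** alpha if x < x_max else 1.0

rng = np.random.default_rng(0)
dim = 50                                            # hypothetical vector size
w_i, w_j = rng.normal(scale=0.1, size=dim), rng.normal(scale=0.1, size=dim)
b_i = b_j = 0.0
x_ij = 23.0                                         # dummy co-occurrence count

# Weighted least-squares loss for one pair: f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2
diff = w_i @ w_j + b_i + b_j - np.log(x_ij)
loss = weight(x_ij) * diff ** 2

# Manual gradient for the word vector (chain rule on the squared term) ...
grad_w_i = 2.0 * weight(x_ij) * diff * w_j

# ... and an AdaGrad-style update: per-coordinate step sizes from a running sum of squared gradients.
grad_sq = np.full(dim, 1e-8)
grad_sq += grad_w_i ** 2
w_i -= 0.05 * grad_w_i / np.sqrt(grad_sq)
print(loss)
```

A full implementation loops this update over every non-zero X_ij, with symmetric updates for the context vector and the two biases.
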
03 · RNN

Beginner · 50m

Recurrent Neural Networks

Needs: Word2Vec
  • Understand why sequences need memory and how hidden state provides it
  • Build a vanilla RNN from five raw parameter tensors (no nn.RNN); see the sketch after this list
  • Implement Backpropagation Through Time (BPTT) manually, line by line
  • Visualize vanishing gradients and understand why eigenvalues matter
  • Watch category emergence from character-level prediction (Elman 1990)
  • Generate text character by character with temperature-controlled sampling
  • Understand the architectural ceiling of vanilla RNNs and why gating mechanisms were needed
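
To preview the second bullet, here is a sketch of a vanilla RNN cell written from five raw parameter tensors, plus temperature-controlled sampling. The sizes, initialization scale, and random one-hot inputs are illustrative assumptions, not the topic's exact code.

```python
import torch

# Hypothetical sizes: a character-level vocabulary of 30 symbols, hidden state of 64.
INPUT_DIM, HIDDEN_DIM, OUTPUT_DIM = 30, 64, 30

# The five raw parameter tensors (no nn.RNN).
W_xh = torch.randn(INPUT_DIM, HIDDEN_DIM) * 0.01    # input -> hidden
W_hh = torch.randn(HIDDEN_DIM, HIDDEN_DIM) * 0.01   # hidden -> hidden (the recurrence)
b_h  = torch.zeros(HIDDEN_DIM)
W_hy = torch.randn(HIDDEN_DIM, OUTPUT_DIM) * 0.01   # hidden -> output
b_y  = torch.zeros(OUTPUT_DIM)

def rnn_forward(xs, h):
    """Unroll over a sequence of one-hot rows; return per-step logits and the final hidden state."""
    logits = []
    for x in xs:                                     # xs: (seq_len, INPUT_DIM)
        h = torch.tanh(x @ W_xh + h @ W_hh + b_h)    # the hidden state is the model's memory
        logits.append(h @ W_hy + b_y)
    return torch.stack(logits), h

def sample_next(logits, temperature=1.0):
    """Temperature-controlled sampling: low T sharpens the distribution, high T flattens it."""
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

xs = torch.eye(INPUT_DIM)[torch.randint(0, INPUT_DIM, (5,))]   # five random one-hot "characters"
logits, h_final = rnn_forward(xs, torch.zeros(HIDDEN_DIM))
print(sample_next(logits[-1], temperature=0.8))                # index of the sampled next character
```

BPTT backpropagates through every tanh step of this loop; the repeated multiplication by W_hh is why its eigenvalues decide whether gradients vanish or explode.
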
04 · LSTM & GRU (Coming Soon)

Intermediate · 40m

Gated Memory Units

05 · Seq2Seq (Coming Soon)

Intermediate · 45m

Sequence-to-Sequence Models

06 · Attention (Coming Soon)

Intermediate · 55m

The Attention Mechanism

07 · Tokenization (Coming Soon)

Beginner · 35m

How Models Read Text

08 · Transformer (Coming Soon)

Intermediate · 65m

Attention Is All You Need

09 · BERT (Coming Soon)

Intermediate · 45m

Encoder Models & Masked LM

10 · GPT (Coming Soon)

Intermediate · 50m

Decoder Models & Language Modeling

11 · Enc-Dec (Coming Soon)

Intermediate · 40m

Encoder-Decoder Transformers

12 · KV Cache (Coming Soon)

Advanced · 40m

KV Caching & Inference

13 · Flash Attention (Coming Soon)

Advanced · 40m

IO-Aware Attention

14 · Scaling Laws (Coming Soon)

Advanced · 40m

Scaling Laws & Training

15 · RLHF (Coming Soon)

Advanced · 45m

Alignment & Human Feedback

16 · MoE (Coming Soon)

Advanced · 40m

Mixture of Experts

17 · SSM (Coming Soon)

Advanced · 50m

State Space Models

More topics ahead

The curriculum is actively growing.