Lectures
Course Overview (Jan 8)
Content
- Course logistics
- What is natural language processing?
- What are the features of natural language?
- What do we want to do with NLP?
- What makes it hard?
Slides
Reading Material
Text Classification (Jan 10)
Content
- Defining features
- Building a rule-based classifier
- Training a logistic regression based classifier
- Evaluating classification
Slides
Reading Material
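The rule-based vs. learned-classifier contrast in this lecture can be illustrated with a tiny sketch: logistic regression over bag-of-words features, trained by per-example gradient descent. The dataset, feature names, and hyperparameters below are made up for illustration; this is not course-provided code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(examples, labels, lr=0.5, epochs=200):
    """examples: list of {feature: value} dicts; labels: 0/1."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in zip(examples, labels):
            z = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
            p = sigmoid(z)
            err = p - y  # gradient of the cross-entropy loss w.r.t. z
            bias -= lr * err
            for f, v in feats.items():
                weights[f] = weights.get(f, 0.0) - lr * err * v
    return weights, bias

def predict(weights, bias, feats):
    z = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return 1 if sigmoid(z) >= 0.5 else 0

# Toy sentiment-flavored training set (hypothetical)
train_x = [{"good": 1}, {"great": 1}, {"bad": 1}, {"awful": 1}]
train_y = [1, 1, 0, 0]
w, b = train_logreg(train_x, train_y)
```

After training, words seen with positive labels get positive weights, so `predict(w, b, {"good": 1})` returns 1.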
Neural Network Basics (Jan 15)
Content
- Cross Entropy Loss
- Gradient Descent
- Components of a feedforward neural network
Slides
Reading Material
Neural Nets: [Deep Averaging Networks]
Neural Nets: [Deep Learning with PyTorch: A 60 Minute Blitz]
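Two of the lecture topics can be sketched in a few lines: cross-entropy loss on a predicted distribution, and plain gradient descent on a one-dimensional function. All values below are hypothetical examples, not course material.

```python
import math

def cross_entropy(probs, gold_index):
    """Negative log-probability of the gold class."""
    return -math.log(probs[gold_index])

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Cross-entropy is low when the model puts mass on the gold class...
loss_confident = cross_entropy([0.9, 0.05, 0.05], 0)
loss_unsure = cross_entropy([0.34, 0.33, 0.33], 0)

# ...and gradient descent on f(x) = (x - 3)^2 converges to the minimum at 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```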
Word Vectors (Jan 17)
Content
- Deep Averaging Network for Text Classification
- Lexical Semantics
- Distributional Semantics
- Evaluating Word Vectors
Slides
Reading Material
Neural Nets: [Deep Averaging Networks]
Neural Nets: [Deep Learning with PyTorch: A 60 Minute Blitz]
Word Vectors: [Eisenstein 3.3.4, 14.5-14.6]
Word Vectors: [Goldberg 5]
Word Vectors: [Mikolov+13 word2vec]
Word Vectors: [Pennington+14 GloVe]
Word Vectors: [Grave+17 fastText]
Word Vectors: [Bolukbasi+16 Gender]
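The standard way to evaluate how "close" two word vectors are, as discussed under evaluating word vectors, is cosine similarity. The tiny 3-dimensional "embeddings" below are invented for the example; real vectors (word2vec, GloVe, fastText) have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy embeddings
emb = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.75, 0.20],
    "apple": [0.10, 0.20, 0.90],
}

sim_king_queen = cosine(emb["king"], emb["queen"])
sim_king_apple = cosine(emb["king"], emb["apple"])
```

With distributionally trained vectors, semantically related words ("king"/"queen") score higher than unrelated pairs.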
Language Modeling (Jan 22)
Content
- What is a language model?
- How to evaluate a language model
- How to build a language model: n-gram LMs, a simple feedforward neural LM
Slides
Reading Material
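The lecture's two threads, building an n-gram LM and evaluating it, fit in a short sketch: a bigram model with add-one smoothing, scored by perplexity. The two-sentence corpus is a toy example, not course data.

```python
import math
from collections import Counter

# Toy corpus with explicit sentence-boundary markers
corpus = ["<s> the cat sat </s>", "<s> the dog sat </s>"]
tokens = [s.split() for s in corpus]
vocab = {w for sent in tokens for w in sent}
unigrams = Counter(w for sent in tokens for w in sent)
bigrams = Counter((a, b) for sent in tokens for a, b in zip(sent, sent[1:]))

def prob(prev, word):
    # add-one (Laplace) smoothed bigram probability
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def perplexity(sentence):
    words = sentence.split()
    logp = sum(math.log(prob(a, b)) for a, b in zip(words, words[1:]))
    return math.exp(-logp / (len(words) - 1))

ppl_seen = perplexity("<s> the cat sat </s>")   # in-domain word order
ppl_odd = perplexity("<s> sat dog the </s>")    # scrambled word order
```

Lower perplexity means the model finds the sequence less surprising, so the scrambled sentence scores worse.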
Language Modeling (Jan 24)
Content
- Feedforward Language Model
- Recurrent Neural LM, Attention
- Building blocks of a transformer
Slides
Reading Material
[Luong15]
Transformers (Jan 29)
Content
- Self attention
- Transformer Encoder
- Transformer Decoder (Cross Attention, Masked Self Attention)
- Impact of transformers
Slides
Reading Material
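The self-attention building block from this lecture can be written out in plain Python for readability (real implementations use tensor libraries and batched matrix multiplies). The tiny 2-dimensional Q/K/V matrices stand in for learned projections and are illustrative only.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention; Q, K, V are lists of d-dim vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        # weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(Q, K, V)
```

Each output position is a convex combination of the value vectors, weighted most heavily toward the positions whose keys match its query.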
Tokenization (Jan 31)
Content
- Word and character tokenization
- Byte pair encoding / WordPiece
- Unigram tokenizer
Slides
Reading Material
[J&M 2.5]
[“Let’s build the GPT Tokenizer” by Andrej Karpathy (practical tour of BPE with a focus on LLMs)]
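The core of BPE training, covered in this lecture, is a loop that repeatedly merges the most frequent adjacent symbol pair. A compact sketch on a classic toy word list (the counts and merge budget are illustrative):

```python
from collections import Counter

def bpe_train(word_counts, num_merges):
    """word_counts: {word: frequency}; returns the learned merge list."""
    vocab = Counter({tuple(w): c for w, c in word_counts.items()})
    merges = []
    for _ in range(num_merges):
        # count every adjacent symbol pair, weighted by word frequency
        pairs = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # apply the merge to every word in the vocabulary
        new_vocab = Counter()
        for symbols, count in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += count
        vocab = new_vocab
    return merges

merges = bpe_train({"low": 5, "lower": 2, "lowest": 2}, num_merges=2)
```

Here the first two merges build the shared stem: first `l`+`o`, then `lo`+`w`.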
Tokenization Contd. / Masked LMs (February 7)
Content
- Unigram tokenizer
- Pretraining / finetuning paradigm
- Masked LMs - BERT, RoBERTa, ELECTRA
Slides
Reading Material
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]
[RoBERTa]
[ELECTRA]
Pretraining II (February 12)
Pretraining III (February 14)
Instruction Following (February 19)
Content
- Instruction Tuning (T0, FLAN)
- Evaluating Instruction Tuned LMs
- Basics of RLHF
Slides
Reading Material
Preference Optimization (February 21)
Content
- Reward Modeling
- Basics of RLHF
- Direct Preference Optimization
Slides
Reading Material
[Illustrating Reinforcement Learning from Human Feedback (RLHF)]
[DPO]
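The DPO objective from the reading can be sketched for a single preference pair: the loss is -log σ(β·[(log p(chosen) - log p_ref(chosen)) - (log p(rejected) - log p_ref(rejected))]). The log-probabilities below are made-up numbers chosen to show the two regimes.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one pair; _w is the chosen response, _l the rejected one."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# Policy prefers the chosen response more than the reference does -> low loss
good = dpo_loss(logp_w=-2.0, logp_l=-8.0, ref_logp_w=-5.0, ref_logp_l=-5.0)
# Policy prefers the rejected response -> high loss
bad = dpo_loss(logp_w=-8.0, logp_l=-2.0, ref_logp_w=-5.0, ref_logp_l=-5.0)
```

Minimizing this loss pushes probability mass toward the chosen response relative to the reference model, without training an explicit reward model.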
Parameter Efficient Finetuning (February 26)
Evaluation (February 28)
Content
- What is Benchmarking
- Open- and closed-ended evaluation
- LLM Evaluation Challenges
Slides
Reading Material
Sequence Tagging (March 5)
Parsing (March 7)
Content
- Constituency Parsing
- CKY Algorithm
- Dependency Parsing (Intro)
- Semantic Parsing (Intro)
Slides
Reading Material
TBA
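The CKY algorithm from this lecture can be sketched as a recognizer for a grammar in Chomsky normal form. The mini-grammar, lexicon, and sentence below are hypothetical examples.

```python
# Binary rules, indexed by the right-hand side: (B, C) -> parents A with A -> B C
grammar = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
lexicon = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

def cky_recognize(words, start="S"):
    """Return True iff the sentence is derivable from `start`."""
    n = len(words)
    # chart[i][j] = nonterminals that can span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexicon.get(w, set()))
    for span in range(2, n + 1):          # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # try every split point
                for B in chart[i][k]:
                    for C in chart[k][j]:
                        chart[i][j] |= grammar.get((B, C), set())
    return start in chart[0][n]

ok = cky_recognize("the dog saw the cat".split())
```

The chart fills in O(n^3 · |grammar|) time, which is the standard CKY complexity.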
Spring Break (March 12 & 14)
No Class
Interpretability (March 19)
Content
- Global vs Local Explanation
- Post hoc explanations (LIME, Gradient-based)
- Probing
Slides
Reading Material
TBA
Efficiency (March 21)
Content
- Speculative Decoding, Flash Attention
- Quantization, Pruning, Distillation
Slides
Reading Material
TBA
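Of the efficiency techniques listed, quantization is the easiest to sketch: map floats to low-bit integers with a scale factor, then dequantize. Real schemes (per-channel scales, zero-points, GPTQ-style calibration) are more involved; the weights below are toy values.

```python
def quantize(xs, bits=8):
    """Symmetric round-to-nearest quantization to signed `bits`-bit ints."""
    qmax = 2 ** (bits - 1) - 1            # 127 for int8
    scale = max(abs(x) for x in xs) / qmax
    q = [round(x / scale) for x in xs]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.51, -0.24, 0.03, -0.99]
q, s = quantize(weights)
recon = dequantize(q, s)
# rounding error is bounded by half the quantization step
max_err = max(abs(a - b) for a, b in zip(weights, recon))
```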