Lectures
Course Overview / Introduction (Jan 14)
Content
- Course logistics
- What is natural language processing?
- What are the features of natural language?
- What do we want to do with NLP?
- What makes it hard?
Slides
Reading Material
Text Classification (Jan 16)
Content
- Defining features
- Building a rule-based classifier
- Training a logistic-regression-based classifier (see the sketch below)
- Evaluating classification
Slides
Reading Material
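A minimal sketch of the classifier built in this lecture, assuming scikit-learn's CountVectorizer and LogisticRegression; the toy reviews and labels are invented for illustration.
```python
# Bag-of-words features + logistic regression for text classification.
# Toy data; real use needs a proper train/test split and tuning.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great movie, loved it", "terrible plot, boring",
         "what a fantastic film", "awful acting"]
labels = [1, 0, 1, 0]                      # 1 = positive, 0 = negative

vectorizer = CountVectorizer()             # define features: word counts
X = vectorizer.fit_transform(texts)

clf = LogisticRegression().fit(X, labels)  # train on the features
print(clf.predict(vectorizer.transform(["fantastic, loved it"])))
```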
Neural Network Basics (Jan 21)
Content
- Cross Entropy Loss
- Gradient Descent
- Components of a feedforward neural network (see the training sketch below)
Slides
Reading Material
Neural Nets: [Deep Averaging Networks]
Neural Nets: [Deep Learning with PyTorch: A 60 Minute Blitz]
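A compact PyTorch sketch tying the bullets above together: a small feedforward network trained with cross-entropy loss by gradient descent. Sizes, learning rate, and the random data are placeholders.
```python
# A feedforward network trained with cross-entropy loss by gradient descent.
import torch
import torch.nn as nn

model = nn.Sequential(           # components of a feedforward network:
    nn.Linear(10, 32),           #   linear layer
    nn.ReLU(),                   #   nonlinearity
    nn.Linear(32, 3),            #   output layer giving 3-class logits
)
loss_fn = nn.CrossEntropyLoss()  # softmax + negative log-likelihood
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 10)           # a batch of 8 feature vectors
y = torch.randint(0, 3, (8,))    # gold class labels

for step in range(100):          # gradient descent loop
    opt.zero_grad()
    loss = loss_fn(model(x), y)  # cross-entropy on the batch
    loss.backward()              # backprop computes gradients
    opt.step()                   # update parameters downhill
```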
Word Vectors (Jan 23)
Content
- Deep Averaging Network for Text Classification (see the sketch below)
- Lexical Semantics
- Distributional Semantics
- Evaluating Word Vectors
Slides
Reading Material
Neural Nets: [Deep Averaging Networks]
Neural Nets: [Deep Learning with PyTorch: A 60 Minute Blitz]
Word Vectors: [Eisenstein 3.3.4, 14.5-14.6]
Word Vectors: [Goldberg 5]
Word Vectors: [Mikolov+13 word2vec]
Word Vectors: [Pennington+14 GloVe]
Word Vectors: [Grave+17 fastText]
Word Vectors: [Bolukbasi+16 Gender]
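A minimal version of the Deep Averaging Network from the lecture and the reading above: average a sentence's word vectors, then classify the average with a feedforward net. Vocabulary size and dimensions are placeholders.
```python
# Deep Averaging Network: average word embeddings, then feedforward layers.
import torch
import torch.nn as nn

class DAN(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.ff = nn.Sequential(
            nn.Linear(emb_dim, 100), nn.ReLU(),
            nn.Linear(100, n_classes),
        )

    def forward(self, token_ids):              # token_ids: (batch, seq_len)
        avg = self.emb(token_ids).mean(dim=1)  # average the word vectors
        return self.ff(avg)                    # class logits

logits = DAN()(torch.randint(0, 10000, (4, 12)))  # 4 sentences, 12 tokens each
```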
Word Vectors / Language Modeling I (Jan 28)
Content
- Distributional Semantics
- Evaluating Word Vectors
- What is a language model?
- How to evaluate a language model (see the perplexity sketch below)
Slides
Reading Material
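On evaluating a language model: the standard intrinsic metric is perplexity, the exponentiated average negative log-likelihood per token. A toy computation; the probabilities are invented, and in practice they come from the model on held-out text.
```python
# Perplexity: exp of the average negative log-likelihood per token.
import math

# p(w_i | w_<i) assigned by some LM to each token of a test sentence;
# these numbers are made up for illustration
token_probs = [0.2, 0.05, 0.1, 0.4, 0.3]

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
print(f"perplexity = {math.exp(avg_nll):.2f}")   # lower is better
```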
Language Modeling (Jan 30)
Content
- Feedforward Language Model (see the sketch below)
- Recurrent Neural LM, Attention
- Building blocks of a transformer
Slides
Reading Material
[Luong15]
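A sketch of the fixed-window feedforward language model from this lecture: embed the previous few tokens, concatenate the embeddings, and predict a distribution over the next token. All sizes are illustrative.
```python
# Fixed-window feedforward LM: previous `window` tokens predict the next one.
import torch
import torch.nn as nn

class FFLM(nn.Module):
    def __init__(self, vocab=5000, emb=64, window=3, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.net = nn.Sequential(
            nn.Linear(window * emb, hidden), nn.Tanh(),
            nn.Linear(hidden, vocab),         # logits over the next token
        )

    def forward(self, context):               # context: (batch, window)
        e = self.emb(context).flatten(1)      # concatenate window embeddings
        return self.net(e)

next_token_logits = FFLM()(torch.randint(0, 5000, (2, 3)))
```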
Transformers (Feb 4)
Content
- Self-attention (see the sketch below)
- Transformer Encoder
- Transformer Decoder (Cross-Attention, Masked Self-Attention)
- Impact of transformers
Slides
Reading Material
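The core operation of this lecture in a few lines: single-head scaled dot-product self-attention, softmax(Q K^T / sqrt(d)) V, without learned projections, multiple heads, or masking, purely to show the shapes.
```python
# Single-head scaled dot-product self-attention.
import math
import torch

def self_attention(x):                    # x: (batch, seq_len, d)
    d = x.size(-1)
    q = k = v = x                         # real layers learn Q/K/V projections
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (batch, seq, seq)
    weights = scores.softmax(dim=-1)      # each position attends to all others
    return weights @ v                    # weighted sum of value vectors

out = self_attention(torch.randn(2, 5, 16))
```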
Transformer LMs Continued (Feb 6)
Content
- Transformer Decoder (Cross-Attention, Masked Self-Attention)
- Training and inference from a decoder-only autoregressive LM (see the decoding sketch below)
- Impact of transformers
Slides
Reading Material
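A sketch of inference from a decoder-only autoregressive LM: feed the growing prefix back in and pick the next token greedily. Here `lm` is a hypothetical stand-in for any module mapping token ids to next-token logits.
```python
# Greedy decoding from an autoregressive LM (`lm` is hypothetical).
import torch

def greedy_generate(lm, prefix_ids, max_new_tokens=20, eos_id=None):
    ids = list(prefix_ids)
    for _ in range(max_new_tokens):
        logits = lm(torch.tensor([ids]))        # (1, len(ids), vocab)
        next_id = int(logits[0, -1].argmax())   # most probable next token
        ids.append(next_id)                     # grow the prefix and repeat
        if next_id == eos_id:
            break
    return ids
```
Training uses the same model with teacher forcing: the targets are the inputs shifted one position, and masked self-attention keeps each position from attending to later ones.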
Tokenization (Feb 11)
Content
- Word and character tokenization
- Byte pair encoding / WordPiece (see the BPE sketch below)
- Unigram tokenizer
- Masked LMs (if time permits)
Slides
Reading Material
[J&M 2.5]
[“Let’s build the GPT Tokenizer” by Andrej Karpathy (practical tour of BPE with a focus on LLMs)]
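A bare-bones version of the BPE training loop covered above: repeatedly count adjacent symbol pairs and merge the most frequent one. Real tokenizers add byte-level handling, special tokens, and efficiency tricks.
```python
# Minimal BPE training: merge the most frequent adjacent symbol pair.
from collections import Counter

def train_bpe(words, num_merges):
    corpus = Counter(tuple(w) for w in words)   # words as symbol tuples
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in corpus.items():       # count adjacent pairs
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)        # most frequent pair
        merges.append(best)
        new_corpus = Counter()
        for word, freq in corpus.items():       # apply the merge everywhere
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1]); i += 2
                else:
                    out.append(word[i]); i += 1
            new_corpus[tuple(out)] += freq
        corpus = new_corpus
    return merges

print(train_bpe(["low", "lower", "lowest", "newer", "wider"], 5))
```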
Tokenization Contd. / Pretraining I (Feb 13)
Content
- Unigram tokenizer
- Pretraining / finetuning paradigm
- Masked LMs: BERT, RoBERTa, NeoBERT (see the masking sketch below)
Slides
Reading Material
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]
[RoBERTa]
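The data side of the masked-LM objective from the BERT/RoBERTa readings, as a sketch: hide roughly 15% of tokens and train the model to recover them. The [MASK] id and token ids are placeholders, and BERT's 80/10/10 mask/random/keep refinement is omitted for brevity.
```python
# Masked-LM data preparation: hide ~15% of tokens, predict the originals.
import random

MASK_ID = 103                        # placeholder id for [MASK]

def mask_tokens(token_ids, mask_prob=0.15):
    inputs, labels = [], []
    for tok in token_ids:
        if random.random() < mask_prob:
            inputs.append(MASK_ID)   # model sees [MASK] ...
            labels.append(tok)       # ... and must predict the original
        else:
            inputs.append(tok)
            labels.append(-100)      # conventional "ignore" label
    return inputs, labels

print(mask_tokens([7, 42, 19, 8, 77, 5, 23]))
```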
Pretraining II (Feb 18)
Pretraining II (Feb 20)
Content
- Scaling
- Prompting
- In-context learning (see the prompt sketch below)
- Chain-of-thought (CoT) prompting
Slides
Reading Material
[J&M 7.3]
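Prompting, in-context learning, and CoT require no gradient updates; they work by constructing the right input string. A toy few-shot prompt, with the template and examples invented for illustration:
```python
# In-context learning: labeled demonstrations prepended to the query,
# so the pretrained LM completes the pattern without any finetuning.
demos = [
    ("The movie was wonderful.", "positive"),
    ("A total waste of time.", "negative"),
]
query = "Surprisingly touching and well acted."

prompt = "".join(f"Review: {t}\nSentiment: {l}\n\n" for t, l in demos)
prompt += f"Review: {query}\nSentiment:"
print(prompt)   # send this string to an LM and read off the completion
```
A chain-of-thought variant would insert worked reasoning steps before each label so the model imitates the reasoning as well as the answer.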