Lectures
Course Overview (Jan 8)
Content
- Course logistics
- What is natural language processing?
- What are the features of natural language?
- What do we want to do with NLP?
- What makes it hard?
Slides
Reading Material
Text Classification (Aug 29)
Content
- Defining features
- Building a rule-based classifier
- Training a logistic-regression-based classifier (see the sketch below)
- Evaluating classification
Slides
Reading Material
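To make the classification pipeline above concrete, here is a minimal sketch of a bag-of-words logistic regression classifier using scikit-learn; the toy texts, labels, and unigram features are illustrative assumptions, not the course's assignment setup.

```python
# Illustrative sketch (not the course's reference implementation):
# a bag-of-words logistic regression text classifier with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy data; in practice this would be a labeled sentiment/topic dataset.
train_texts = ["great movie", "terrible plot", "loved the acting", "boring and slow"]
train_labels = [1, 0, 1, 0]
test_texts = ["great acting", "slow and boring"]
test_labels = [1, 0]

# Defining features: simple unigram counts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Training the classifier and evaluating with accuracy.
clf = LogisticRegression()
clf.fit(X_train, train_labels)
print(accuracy_score(test_labels, clf.predict(X_test)))
```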
Neural Network Basics (Sept 3)
Content
- Cross Entropy Loss
- Gradient Descent
- Components of a feedforward neural network (see the training sketch below)
Slides
Reading Material
Neural Nets: [Deep Averaging Networks]
Neural Nets: [Deep Learning with Pytorch: A 60 minute blitz]
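As a companion to the topics above, a minimal PyTorch sketch of a small feedforward network trained with cross-entropy loss and gradient descent; the dimensions, data, and hyperparameters are toy assumptions.

```python
# Illustrative sketch of the lecture's building blocks: a small feedforward
# network trained with cross-entropy loss and (stochastic) gradient descent.
# Shapes and hyperparameters are assumptions for the toy example.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),   # input features -> hidden layer
    nn.ReLU(),           # nonlinearity
    nn.Linear(32, 3),    # hidden layer -> 3 output classes (logits)
)
loss_fn = nn.CrossEntropyLoss()              # softmax + negative log-likelihood
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 10)                      # a toy batch of 16 feature vectors
y = torch.randint(0, 3, (16,))               # toy gold labels

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)              # forward pass + loss
    loss.backward()                          # backprop computes gradients
    optimizer.step()                         # gradient descent update
```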
Word Vectors (Sept 5)
Content
- Deep Averaging Network for Text Classification
- Lexical Semantics
- Distributional Semantics
- Evaluating Word Vectors (see the cosine-similarity sketch below)
Slides
Reading Material
Neural Nets: [Deep Averaging Networks]
Neural Nets: [Deep Learning with Pytorch: A 60 minute blitz]
Word Vectors: [Eisenstein 3.3.4, 14.5-14.6]
Word Vectors: [Goldberg 5]
Word Vectors: [Mikolov+13 word2vec]
Word Vectors: [Pennington+14 GloVe]
Word Vectors: [Grave+17 fastText]
Word Vectors: [Bolukbasi+16 Gender]
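A minimal sketch of the similarity computation behind word-vector evaluation; the vectors below are made-up toy values, not real word2vec or GloVe embeddings.

```python
# Illustrative sketch: comparing word vectors with cosine similarity, the core
# operation behind similarity-based evaluation of embeddings. The vectors here
# are made-up toy values, not trained word2vec/GloVe vectors.
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

vectors = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

print(cosine(vectors["king"], vectors["queen"]))  # high: related words
print(cosine(vectors["king"], vectors["apple"]))  # low: unrelated words

# Word-analogy evaluation uses the same machinery: argmax over
# cosine(v, king - man + woman), excluding the query words themselves.
```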
Language Modeling (Sept 10)
Content
- What is a language model
- How to evaluate a language model
- How to build a language model: n-gram language models and a simple feedforward neural LM (see the sketch below)
Slides
Reading Material
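A minimal sketch of a bigram language model with add-one smoothing and a perplexity evaluation; the corpus and smoothing choice are toy assumptions, not the lecture's exact formulation.

```python
# Illustrative sketch: a bigram language model with add-one (Laplace) smoothing
# and perplexity evaluation. The corpus and smoothing choice are toy assumptions.
import math
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = set(corpus)

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev, word):
    # P(word | prev) with add-one smoothing over the toy vocabulary.
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def perplexity(tokens):
    # Average negative log-likelihood per predicted token, exponentiated.
    log_prob = sum(math.log(bigram_prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))

print(perplexity("the cat sat on the rug .".split()))
```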
Language Modeling (Sept 12)
Content
- Feedforward Language Model
- Recurrent Neural LM, Attention (see the RNN LM sketch below)
- Building blocks of a transformer
Slides
Reading Material
[Luong15]
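A minimal PyTorch sketch of a recurrent (LSTM) language model forward pass and loss, as a companion to the recurrent LM topic above; vocabulary size, dimensions, and data are toy assumptions.

```python
# Illustrative sketch: forward pass and loss of a small recurrent (LSTM)
# language model. Vocabulary size, dimensions, and the batch are toy assumptions.
import torch
import torch.nn as nn

class RNNLM(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> logits over the next token at each position
        hidden_states, _ = self.rnn(self.embed(token_ids))
        return self.out(hidden_states)

model = RNNLM()
tokens = torch.randint(0, 1000, (2, 8))          # toy batch of token ids
logits = model(tokens[:, :-1])                   # predict each next token
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 1000), tokens[:, 1:].reshape(-1))
```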
Transformers (Sept 17)
Content
- Self-attention (see the sketch below)
- Transformer Encoder
- Transformer Decoder (Cross Attention, Masked Self Attention)
- Impact of transformers
Slides
Reading Material
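A minimal sketch of single-head scaled dot-product self-attention, the core operation of the transformer encoder discussed above; the dimensions are toy assumptions, and real transformers add multiple heads, residual connections, and layer normalization (plus a causal mask for the decoder's masked self-attention).

```python
# Illustrative sketch: single-head scaled dot-product self-attention.
# Dimensions are toy assumptions; multi-head attention, residual connections,
# layer norm, and causal masking are omitted.
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model); project tokens to queries, keys, and values
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / math.sqrt(k.size(-1))     # (seq_len, seq_len) similarities
    weights = F.softmax(scores, dim=-1)          # attention distribution per position
    return weights @ v                           # weighted sum of value vectors

d_model, d_head = 16, 8
x = torch.randn(5, d_model)                      # 5 toy token representations
w_q, w_k, w_v = (torch.randn(d_model, d_head) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # torch.Size([5, 8])
```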
Tokenization (Sept 19)
Content
- Word and character tokenization
- Byte pair encoding / WordPiece (see the BPE sketch below)
- Unigram tokenizer
Slides
Reading Material
[J&M 2.5]
[“Let’s build the GPT Tokenizer” by Andrej Karpathy (practical tour of BPE with a focus on LLMs)]
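A minimal sketch of the byte pair encoding training loop on the classic toy word-frequency dictionary; real BPE/WordPiece tokenizers operate over bytes or characters on a large corpus, and the data and number of merges here are assumptions.

```python
# Illustrative sketch: a few BPE merge steps on a toy word-frequency dictionary.
# Words are space-separated symbol sequences with an end-of-word marker.
from collections import Counter

vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

def get_pair_counts(vocab):
    # Count adjacent symbol pairs, weighted by word frequency.
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    # Naive string replace; fine for this toy data, not a production tokenizer.
    merged = " ".join(pair)
    return {word.replace(merged, "".join(pair)): freq for word, freq in vocab.items()}

for _ in range(5):                               # learn 5 merges
    best = get_pair_counts(vocab).most_common(1)[0][0]
    vocab = merge_pair(best, vocab)
    print("merged:", best)
```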
Tokenization Contd. / Pretraining I (Sept 24)
Content
- Unigram tokenizer
- Pretraining / finetuning paradigm
- Masked LMs: BERT, RoBERTa, ELECTRA (see the masking sketch below)
Slides
Reading Material
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]
[RoBERTa]
[ELECTRA]
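A minimal sketch of BERT-style masked-LM input corruption; the 15% masking rate and 80/10/10 split follow the BERT paper, while the token ids, mask id, and vocabulary size are toy assumptions.

```python
# Illustrative sketch: BERT-style masked-language-model input corruption.
# The 15% rate and 80/10/10 split follow the BERT paper; the tokenizer,
# tensors, and ids here are toy assumptions.
import torch

vocab_size, mask_id = 1000, 103                       # assumed toy-vocabulary ids
token_ids = torch.randint(5, vocab_size, (2, 12))     # pretend these are real tokens
labels = token_ids.clone()

# Choose ~15% of positions to predict; all other positions are ignored by the loss.
is_masked = torch.rand(token_ids.shape) < 0.15
labels[~is_masked] = -100                             # -100 = CrossEntropyLoss ignore_index

# Of the chosen positions: 80% -> [MASK], 10% -> random token, 10% -> unchanged.
rand = torch.rand(token_ids.shape)
token_ids[is_masked & (rand < 0.8)] = mask_id
random_tokens = torch.randint(5, vocab_size, token_ids.shape)
replace = is_masked & (rand >= 0.8) & (rand < 0.9)
token_ids[replace] = random_tokens[replace]
# The model is then trained with cross-entropy on the masked positions only.
```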