Lectures

Course Overview (Jan 8)

Content

  • Course logistics
  • What is natural language processing?
  • What are the features of natural language?
  • What do we want to do with NLP?
  • What makes it hard?

Slides

Course Overview

Reading Material

Text Classification (Aug 29)

Content

  • Defining features
  • Building a rule-based classifier
  • Training a logistic regression based classifier
  • Evaluating classification
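As a minimal sketch of the logistic-regression classifier this lecture covers — the vocabulary, toy data, and hyperparameters below are illustrative, not from the course:

```python
import math

# Hypothetical bag-of-words vocabulary; features are per-word counts.
VOCAB = ["good", "great", "bad", "awful"]

def featurize(text):
    words = text.lower().split()
    return [words.count(w) for w in VOCAB]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, labels, lr=0.5, epochs=200):
    # Stochastic gradient descent on the binary cross-entropy loss.
    w = [0.0] * len(VOCAB)
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

texts = ["good great good", "great good", "bad awful", "awful bad bad"]
labels = [1, 1, 0, 0]
w, b = train([featurize(t) for t in texts], labels)

def predict(text):
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, featurize(text))) + b)
    return 1 if p >= 0.5 else 0
```

A rule-based classifier would instead hard-code the sign of each word; training lets the data set the weights.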

Slides

Text Classification

Reading Material

Neural Network Basics (Sept 3)

Content

  • Cross Entropy Loss
  • Gradient Descent
  • Components of a feedforward neural network
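The two core ingredients above can be sketched in a few lines — the function being minimized and the step size are illustrative choices, not from the lecture:

```python
import math

def cross_entropy(p, y):
    # Binary cross-entropy loss for predicted probability p and gold label y.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def gradient_descent(grad, x0, lr=0.1, steps=100):
    # Repeatedly step opposite the gradient; lr and steps are illustrative.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose derivative is 2(x - 3); the minimum is x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

In a feedforward network the same update is applied to every weight, with gradients computed by backpropagation.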

Slides

Neural Network Basics

Reading Material

Word Vectors (Sept 5)

Content

  • Deep Averaging Network for Text Classification
  • Lexical Semantics
  • Distributional Semantics
  • Evaluating Word Vectors
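A common way to probe distributional representations is cosine similarity between vectors. The tiny 3-d vectors below are invented for illustration; real embeddings (word2vec, GloVe) typically have 100-300 dimensions:

```python
import math

def cosine(u, v):
    # Cosine similarity: dot(u, v) / (||u|| * ||v||)
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: math.sqrt(sum(a * a for a in w))
    return dot / (norm(u) * norm(v))

# Hypothetical vectors chosen so that related words point in similar directions.
vec = {
    "king":  [0.9, 0.1, 0.4],
    "queen": [0.85, 0.2, 0.45],
    "apple": [0.1, 0.9, 0.2],
}
```

Word-similarity benchmarks evaluate vectors by correlating such scores with human judgments.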

Slides

Word Vectors

Reading Material

Language Modeling (Sept 10)

Content

  • What is a language model?
  • How to evaluate a language model
  • How to build a language model: N-gram LMs and a simple feedforward neural LM
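The N-gram approach and its standard evaluation (perplexity) fit in a short sketch — the toy corpus is invented, and a real model would also need smoothing for unseen bigrams:

```python
import math
from collections import Counter

# Illustrative toy corpus, already whitespace-tokenized.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_bigram(prev, w):
    # Maximum-likelihood estimate P(w | prev) = count(prev, w) / count(prev).
    return bigrams[(prev, w)] / unigrams[prev]

def perplexity(seq):
    # Intrinsic evaluation: exp of the average negative log-likelihood.
    logp = sum(math.log(p_bigram(a, b)) for a, b in zip(seq, seq[1:]))
    return math.exp(-logp / (len(seq) - 1))
```

A feedforward neural LM replaces the count table with a learned function of the previous words' embeddings.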

Slides

Language Modeling

Reading Material

[Eisenstein 6.1-6.2, 6.4]

Language Modeling (Sept 12)

Content

  • Feedforward Language Model
  • Recurrent Neural LM, Attention
  • Building blocks of a transformer
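The recurrent LM's core computation is a single hidden-state update applied token by token. A hedged sketch of one simple (Elman-style) RNN step, with illustrative dimensions and random weights standing in for trained ones:

```python
import numpy as np

def rnn_step(h_prev, x, W_hh, W_xh, b):
    # h_t = tanh(W_hh @ h_{t-1} + W_xh @ x_t + b)
    return np.tanh(W_hh @ h_prev + W_xh @ x + b)

rng = np.random.default_rng(0)
d_h, d_x = 4, 3                       # illustrative sizes
W_hh = rng.normal(size=(d_h, d_h))
W_xh = rng.normal(size=(d_h, d_x))
b = np.zeros(d_h)

h = np.zeros(d_h)                     # initial hidden state
for x in rng.normal(size=(5, d_x)):   # run over 5 input embeddings
    h = rnn_step(h, x, W_hh, W_xh, b)
```

Attention was introduced to let the model look back at all hidden states instead of compressing everything into the final `h`.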

Slides

Neural LM

Reading Material

[J&M Chapter 8, 9]

[Eisenstein 6.3]

[Luong15]

[Illustrated Transformer]

Transformers (Sept 17)

Content

  • Self attention
  • Transformer Encoder
  • Transformer Decoder (Cross Attention, Masked Self Attention)
  • Impact of transformers
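The self-attention equation from "Attention is all you need", Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, can be written directly in NumPy. The shapes and random weights below are illustrative:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    # Scaled dot-product self-attention over a sequence of token vectors X.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))   # 4 tokens, model dimension 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
```

The decoder's masked self-attention is the same computation with future positions' scores set to -inf before the softmax; cross-attention takes Q from the decoder and K, V from the encoder.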

Slides

Transformers

Reading Material

[Illustrated Transformer]

[J&M Chapter 9]

[Attention is all you need]

Tokenization (Sept 19)

Tokenization Contd. / Pretraining I (Sept 24)

Content

  • Unigram tokenizer
  • Pretraining / finetuning paradigm
  • Masked LMs - BERT, RoBERTa, ELECTRA
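The masked-LM objective behind BERT can be sketched as a data-corruption step — the model is trained to recover the original tokens at masked positions. This simplification masks every selected token; full BERT also leaves 10% of selected tokens unchanged and replaces 10% with random ones:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    # BERT-style masking (simplified): targets[i] holds the original token
    # at masked positions and None elsewhere (not predicted).
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)
    return masked, targets
```

RoBERTa keeps this objective but changes the training recipe; ELECTRA replaces it with detecting substituted tokens.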

Slides

Masked LMs

Reading Material

[Illustrated BERT]

[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]

[RoBERTa]

[ELECTRA]

Pretraining II (Sept 26)

Content

  • T5 / BART / UL2 / GPT2
  • Decoding strategies
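Two of the decoding strategies covered can be sketched over a toy next-token distribution — the probabilities below are invented for illustration:

```python
import random

def greedy(probs):
    # Greedy decoding: always pick the highest-probability token.
    return max(probs, key=probs.get)

def top_k_sample(probs, k=2, seed=0):
    # Top-k sampling: renormalize over the k most likely tokens, then sample.
    rng = random.Random(seed)
    top = sorted(probs, key=probs.get, reverse=True)[:k]
    r = rng.random() * sum(probs[t] for t in top)
    for t in top:
        r -= probs[t]
        if r <= 0:
            return t
    return top[-1]

# Hypothetical next-token distribution.
probs = {"the": 0.5, "a": 0.3, "dog": 0.2}
```

Greedy decoding is deterministic; sampling-based strategies trade some likelihood for diversity.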

Slides

Pretraining II

Reading Material

[What happened to BERT/T5]

[Decoding strategies]

Pretraining III (Oct 1)

Content

  • Scaling
  • Prompting
  • In-context learning
  • Chain-of-thought (CoT) prompting
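In-context learning amounts to prompt construction: the model's weights stay frozen, and labeled demonstrations are simply prepended to the test input. A hedged sketch with an invented sentiment template:

```python
def few_shot_prompt(demos, query):
    # Build a few-shot prompt from (input, label) demonstrations.
    # The "Review"/"Sentiment" template below is illustrative.
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = few_shot_prompt(
    [("Loved every minute.", "positive"), ("A total bore.", "negative")],
    "Surprisingly fun.",
)
```

Chain-of-thought prompting uses the same mechanism but puts worked reasoning steps, not just labels, in the demonstrations.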

Slides

Pretraining III

Reading Material

Instruction Following (Oct 3)

Content

  • Instruction Tuning (T0, FLAN)
  • Evaluating Instruction Tuned LMs
  • Basics of RLHF
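Instruction tuning fine-tunes an LM on (instruction, response) pairs. A hedged sketch of the data format used by collections like T0 and FLAN — the field names and the example itself are illustrative, not from any actual dataset:

```python
# Hypothetical instruction-tuning example; real datasets contain many
# tasks verbalized into natural-language instructions.
example = {
    "instruction": "Classify the sentiment of this review: 'Great film!'",
    "response": "positive",
}

def to_training_pair(ex):
    # The model is fine-tuned to map the instruction text to the response text.
    return ex["instruction"], ex["response"]
```

RLHF goes one step further: instead of imitating fixed responses, the model is optimized against a reward model trained on human preference comparisons.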

Slides

Instruction Following

Reading Material