Lectures

Course Overview / Introduction (Jan 14)

Content

  • Course logistics
  • What is natural language processing?
  • What are the features of natural language?
  • What do we want to do with NLP?
  • What makes it hard?

Slides

Course Overview

Reading Material

Text Classification (Jan 16)

Content

  • Defining features
  • Building a rule-based classifier
  • Training a logistic regression based classifier
  • Evaluating classification
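
To make the logistic-regression approach above concrete, here is a minimal scikit-learn sketch; the toy reviews, labels, and bag-of-words features are illustrative assumptions, not course code.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    # Toy training data (made up): 1 = positive, 0 = negative.
    train_texts = ["great movie", "terrible plot", "loved the acting", "boring and slow"]
    train_labels = [1, 0, 1, 0]

    vectorizer = CountVectorizer()                     # bag-of-words features
    X_train = vectorizer.fit_transform(train_texts)
    clf = LogisticRegression().fit(X_train, train_labels)

    # Evaluate on (equally toy) held-out examples.
    test_texts = ["slow and boring", "great acting"]
    preds = clf.predict(vectorizer.transform(test_texts))
    print(preds, accuracy_score([0, 1], preds))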

Slides

Text Classification

Reading Material

Neural Network Basics (Jan 21)

Content

  • Cross Entropy Loss
  • Gradient Descent
  • Components of a feedforward neural network
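
A minimal numpy sketch tying these three pieces together: a one-hidden-layer feedforward network trained with cross-entropy loss and plain gradient descent. The toy data and dimensions are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))           # 4 examples, 3 input features
    y = np.array([0, 1, 2, 1])            # gold classes

    # Feedforward network: one hidden layer with ReLU, then a softmax output layer.
    W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
    W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)

    lr = 0.1
    for step in range(100):
        h = np.maximum(0, x @ W1 + b1)                     # hidden layer
        logits = h @ W2 + b2
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)          # softmax
        loss = -np.log(probs[np.arange(4), y]).mean()      # cross-entropy loss

        # Backpropagation by hand, then one gradient-descent update.
        dlogits = probs.copy()
        dlogits[np.arange(4), y] -= 1
        dlogits /= 4
        dW2, db2 = h.T @ dlogits, dlogits.sum(axis=0)
        dh = dlogits @ W2.T
        dh[h <= 0] = 0
        dW1, db1 = x.T @ dh, dh.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print("final loss:", loss)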

Slides

Neural Network Basics

Reading Material

Word Vectors (Jan 23)

Content

  • Deep Averaging Network for Text Classification
  • Lexical Semantics
  • Distributional Semantics
  • Evaluating Word Vectors
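
A small sketch of how word vectors are compared once trained, using cosine similarity; the vectors below are random placeholders standing in for embeddings such as word2vec or GloVe.

    import numpy as np

    rng = np.random.default_rng(0)
    # Placeholder vectors; in practice these come from trained embeddings.
    vocab = ["king", "queen", "man", "woman", "banana"]
    vectors = {w: rng.normal(size=50) for w in vocab}

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Similarity-based evaluation: rank the vocabulary by similarity to a query word.
    query = "king"
    ranked = sorted((w for w in vocab if w != query),
                    key=lambda w: cosine(vectors[query], vectors[w]),
                    reverse=True)
    print(ranked)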

Slides

Word Vectors

Reading Material

Word Vectors / Language Modeling I (Jan 28)

Content

  • Distributional Semantics
  • Evaluating Word Vectors
  • What is a language model?
  • How to evaluate a language model
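
As a concrete example of evaluation, perplexity is the exponential of the average negative log-probability the model assigns per token; the probabilities below are made up.

    import math

    # Per-token probabilities a language model assigns to a held-out sentence
    # (made-up numbers for illustration).
    token_probs = [0.2, 0.05, 0.1, 0.4, 0.25]

    # Perplexity = exp of the average negative log-probability per token.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    perplexity = math.exp(avg_nll)
    print(round(perplexity, 2))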

Slides

Language Modeling

Reading Material

[Eisenstein 6.1-6.2, 6.4]

Language Modeling (Jan 30)

Content

  • Feedforward Language Model
  • Recurrent Neural LM, Attention
  • Building blocks of a transformer
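
A minimal numpy sketch of a fixed-window feedforward language model's forward pass; the vocabulary size and dimensions are toy assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB, EMB, HIDDEN, CONTEXT = 20, 8, 16, 3   # toy sizes

    # Parameters of a fixed-window feedforward LM.
    E = rng.normal(size=(VOCAB, EMB))            # embedding table
    W1 = rng.normal(size=(CONTEXT * EMB, HIDDEN))
    W2 = rng.normal(size=(HIDDEN, VOCAB))

    def next_word_distribution(context_ids):
        # Concatenate the context embeddings, apply one hidden layer, softmax over vocab.
        x = E[context_ids].reshape(-1)
        h = np.tanh(x @ W1)
        logits = h @ W2
        p = np.exp(logits - logits.max())
        return p / p.sum()

    print(next_word_distribution([4, 7, 2]).shape)   # (20,) -- one probability per word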

Slides

Neural LM

Reading Material

[J&M Chapter 8, 9]

[Eisenstein 6.3]

[Luong15]

[Illustrated Transformer]

Transformers (Feb 4)

Content

  • Self attention
  • Transformer Encoder
  • Transformer Decoder (Cross Attention, Masked Self Attention)
  • Impact of transformers
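
A minimal numpy sketch of masked (causal) self-attention, where each position may only attend to itself and earlier positions; the projection matrices and shapes are toy values.

    import numpy as np

    def causal_self_attention(X, Wq, Wk, Wv):
        # Masked self-attention: position i may only attend to positions <= i.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores[mask] = -1e9                      # block attention to future positions
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))                  # 5 tokens, model dimension 8
    W = [rng.normal(size=(8, 8)) for _ in range(3)]
    print(causal_self_attention(X, *W).shape)    # (5, 8)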

Slides

Transformers

Reading Material

[Illustrated Transformer]

[J&M Chapter 9]

[Attention is all you need]

Transformer LMs continued

Content

  • Transformer Decoder (Cross Attention, Masked Self Attention)
  • Training and inference from a decoder-only autoregressive LM
  • Impact of transformers
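
A minimal sketch of autoregressive inference from a decoder-only LM: generate one token at a time and feed the growing prefix back in. The function next_token_logits below is a random stand-in for a trained model.

    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB = 10
    EOS = 0

    def next_token_logits(prefix):
        # Stand-in for a trained decoder-only transformer; returns random logits.
        return rng.normal(size=VOCAB)

    def greedy_generate(prefix, max_len=20):
        # Autoregressive inference: repeatedly append the argmax token.
        tokens = list(prefix)
        while len(tokens) < max_len:
            next_tok = int(np.argmax(next_token_logits(tokens)))
            tokens.append(next_tok)
            if next_tok == EOS:
                break
        return tokens

    print(greedy_generate([3, 7]))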

Slides

Tokenization

Reading Material

[Illustrated Transformer]

[J&M Chapter 9]

[Attention is all you need]

Tokenization (Feb 11)

Tokenization Contd. / Pretraining I (Feb 13)

Content

  • Unigram tokenizer
  • Pretraining / finetuning paradigm
  • Masked LMs - BERT, RoBERTa, NeoBERT
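
A minimal sketch of the masked-LM objective behind BERT-style models: hide a random fraction of tokens (15% in the BERT paper) and ask the model to recover them. The 80/10/10 replacement details are omitted here for brevity.

    import random

    random.seed(0)
    MASK = "[MASK]"

    def mask_tokens(tokens, mask_prob=0.15):
        # Returns the corrupted input and the positions the model must predict.
        corrupted, targets = [], {}
        for i, tok in enumerate(tokens):
            if random.random() < mask_prob:
                corrupted.append(MASK)
                targets[i] = tok
            else:
                corrupted.append(tok)
        return corrupted, targets

    tokens = "the quick brown fox jumps over the lazy dog".split()
    print(mask_tokens(tokens))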

Slides

Masked LMs

Reading Material

[Illustrated BERT]

[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]

[RoBERTa]

Pretraining II (Feb 18)

Content

  • T5 / BART / UL2 / GPT2
  • Decoding strategies
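
A minimal sketch of two common decoding strategies, temperature scaling and top-k sampling, applied to one step's next-token logits; the logits are made up.

    import numpy as np

    rng = np.random.default_rng(0)
    logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])   # made-up next-token logits

    def sample(logits, temperature=1.0, top_k=None):
        # Temperature rescales the logits; top-k keeps only the k most likely tokens.
        scaled = logits / temperature
        if top_k is not None:
            cutoff = np.sort(scaled)[-top_k]
            scaled = np.where(scaled >= cutoff, scaled, -np.inf)
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    print(sample(logits, temperature=0.7, top_k=3))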

Slides

Pretraining II

Reading Material

[What happened to BERT/T5]

[Decoding strategies]

Pretraining II Contd. (Feb 20)

Content

  • Scaling
  • Prompting
  • In-context learning
  • Chain-of-thought (CoT) prompting
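
A minimal sketch of how a few-shot (in-context learning) prompt is assembled as plain text before being sent to a language model; the task and examples are invented for illustration.

    # Few-shot prompting: the "training data" lives in the prompt itself.
    examples = [
        ("I loved this film.", "positive"),
        ("The plot made no sense.", "negative"),
    ]
    query = "The acting was wonderful."

    prompt = "Classify the sentiment of each review.\n\n"
    for text, label in examples:
        prompt += f"Review: {text}\nSentiment: {label}\n\n"
    prompt += f"Review: {query}\nSentiment:"

    print(prompt)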

Slides

Pretraining II

Reading Material

Section 7.3 in Jurafsky & Martin