Course Description

CS 11-747
Language Technologies Institute, School of Computer Science
Carnegie Mellon University
Tuesday/Thursday 1:30-2:50PM, Scaife Hall 125


Graham Neubig
  Office hours: Monday 2:00-3:00PM (GHC5409)
Antonios Anastasopoulos
  Office hours: Monday 4:00-5:00PM (GHC5721)
TAs:
  Chunting Zhou (Wednesday 4-5PM, GHC5705)
  Daniel Clothiaux (Friday 9-10AM, GHC5505)
  Danish (Tuesday 4-5PM, GHC6407)
  Jean-Baptiste Lamare (Wednesday 2-3PM, GHC5417)
  Junxian He (Wednesday 10-11AM, GHC6603)
  Vaibhav (Friday 4:30-5:30PM, GHC5417)
Questions and Discussion: Ideally in class or through Piazza so we can share information with the class, but email and office hours are also OK.

Course Description

Neural networks provide powerful new tools for modeling language, and have been used both to improve the state of the art in a number of tasks and to tackle new problems that were not easy in the past. This class will start with a brief overview of neural networks, then spend the majority of the class demonstrating how to apply neural networks to natural language problems. Each section will introduce a particular problem or phenomenon in natural language, describe why it is difficult to model, and demonstrate several models that were designed to tackle this problem. In the process, the class will cover different techniques that are useful in creating neural network models, including handling variably sized and structured sentences, efficient handling of large data, semi-supervised and unsupervised learning, structured prediction, and multilingual modeling.

Pre-requisites: 11-711 "Algorithms for NLP" or equivalent background is required. If you have not taken 11-711, I expect that you have enough NLP background to be able to complete its assignments (e.g. on n-gram language modeling, CKY parsing, and word alignment).

Class format: As the class aims to provide the practical implementation skills necessary to create cutting-edge neural network models for NLP, the classes and assignments will be implementation-focused. In general, classes will take the following format:

  • Reading: Before each class, you will be given a reading assignment to complete before coming to class that day.
  • Quiz: At the beginning of class, there will be a short quiz that tests your knowledge of the reading assignment. (These quizzes should be easy if the reading assignment has been completed and understood.)
  • Summary/Elaboration/Questions: The instructor will summarize the important points of the reading material and elaborate on details that were not included in the reading, while fielding any questions. Finally, new material on cutting-edge methods, or a deep look into one salient method, will be covered.
  • Code Walk: In some classes the TAs (or instructor) will walk through some demonstration code (usually in DyNet) that implements a simple version of the main concepts presented in the reading material.

Grading: The assignments will be given a grade of A+ (100), A (96), A- (92), B+ (88), B (85), B- (82), or below. The final grades will be determined based on the weighted average of the quizzes, assignments, and project. Cutoffs for final grades will be approximately 97+ A+, 94+ A, 90+ A-, 87+ B+, 84+ B, 80+ B-, etc., although I reserve some flexibility to change these thresholds slightly.

  • Quizzes: Worth 20% of the grade. Your lowest 2 quiz grades will be dropped. If you are sick or traveling on business (e.g. to a conference, for a job interview, or delayed in return due to visa issues), send a doctor's note or evidence of the reason for being away to the TA mailing list within a week of the absence, and you will be excused. We expect excused quizzes to be relatively rare, and if you'll be away for more than, e.g. 2 classes over the semester, please consult in advance.
  • Assignments: There will be 4 assignments, worth 10%, 10%, 20%, and 40% of the grade, respectively.
The details of the assignments are elaborated on the assignments page.
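
The weighted average described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the official grading script: the function names are made up for this example, quiz scores are assumed to be on the same 0-100 scale as the assignments, excused quizzes are not modeled, and the cutoffs are the approximate ones listed above (which the instructor may adjust).

```python
# Weights from the syllabus: quizzes 20%, assignments 10% / 10% / 20% / 40%.
QUIZ_WEIGHT = 0.20
ASSIGNMENT_WEIGHTS = [0.10, 0.10, 0.20, 0.40]

# Approximate cutoffs from the syllabus, highest first.
# Cutoffs below B- are not listed there, so they are not guessed here.
CUTOFFS = [(97, "A+"), (94, "A"), (90, "A-"), (87, "B+"), (84, "B"), (80, "B-")]


def quiz_average(quiz_scores, num_dropped=2):
    """Mean quiz score after dropping the lowest `num_dropped` scores."""
    kept = sorted(quiz_scores)[num_dropped:]
    if not kept:
        raise ValueError("need more quizzes than the number dropped")
    return sum(kept) / len(kept)


def final_score(quiz_scores, assignment_scores):
    """Weighted average of the quiz mean and the four assignment scores."""
    if len(assignment_scores) != len(ASSIGNMENT_WEIGHTS):
        raise ValueError("expected one score per assignment")
    total = QUIZ_WEIGHT * quiz_average(quiz_scores)
    total += sum(w * s for w, s in zip(ASSIGNMENT_WEIGHTS, assignment_scores))
    return total


def letter_grade(score):
    """Map a 0-100 score to a letter using the approximate cutoffs."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "below B-"


# Hypothetical student: the two lowest quizzes (0 and 50) are dropped,
# leaving a quiz mean of 90; assignments were graded A, A-, B+, A.
quizzes = [100, 90, 80, 0, 50]
assignments = [96, 92, 88, 96]
print(letter_grade(final_score(quizzes, assignments)))  # A-
```

In this example the quizzes contribute 18 points and the assignments 74.8, for a total of about 92.8, which falls in the A- range. Note how heavily the last assignment counts: it alone carries twice the weight of all the quizzes combined.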