Language Technologies Institute, School of Computer Science
Carnegie Mellon University
Tuesday/Thursday 4:30-5:50PM, Doherty Hall 1212
Instructor: Graham Neubig (firstname.lastname@example.org)
Office hours: Monday 4:00-5:00PM (GHC5409)
Qizhe Xie (Monday 10:00-11:00AM, GHC5414)
Varun Gangal (Tuesday 1:00-2:00PM, GHC5705)
Zihang Dai (Wednesday 10:00-11:00AM, GHC5507)
Junjie Hu (Wednesday 1:00-2:00PM, GHC5503)
Hiroaki Hayashi (Thursday 10:00-11:00AM, GHC5705)
Paul Michel (Friday 11:00AM-12:00PM, GHC5417)
Questions and Discussion: Ideally in class or through Piazza so we can share information with the whole class, but email and office hours are also OK.
Neural networks provide powerful new tools for modeling language, and have been used both to improve the state-of-the-art in a number of tasks and to tackle new problems that were not easy to address in the past. This class will start with a brief overview of neural networks, then spend the majority of the semester demonstrating how to apply neural networks to natural language problems. Each section will introduce a particular problem or phenomenon in natural language, describe why it is difficult to model, and demonstrate several models that were designed to tackle this problem. In the process of doing so, the class will cover different techniques that are useful in creating neural network models, including handling variably sized and structured sentences, efficient handling of large data, semi-supervised and unsupervised learning, structured prediction, and multilingual modeling.
Pre-requisites: 11-711 "Algorithms for NLP" or equivalent background is required. If you have not taken 11-711, I expect that you have enough NLP background to be able to complete its assignments (e.g. on n-gram language modeling, CKY parsing, and word alignment).
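To give a concrete sense of the expected background, the n-gram language modeling mentioned above can be sketched at its simplest as a maximum-likelihood bigram model. The corpus and function names below are made up for illustration and are not course material:

```python
from collections import defaultdict

def train_bigram_lm(sentences):
    """Count bigrams over tokenized sentences and return MLE probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]  # add sentence-boundary markers
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    # Normalize counts into conditional probabilities P(word | prev)
    probs = {}
    for prev, nexts in counts.items():
        total = sum(nexts.values())
        probs[prev] = {w: c / total for w, c in nexts.items()}
    return probs

# Toy corpus (hypothetical example data)
corpus = [["the", "dog", "barks"], ["the", "cat", "meows"]]
lm = train_bigram_lm(corpus)
print(lm["<s>"]["the"])  # → 1.0, since both sentences start with "the"
```

Students comfortable writing and extending this kind of model (and its parsing and alignment counterparts) should be well prepared for the assignments.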
Class format: The class aims to provide the practical skills necessary to implement cutting-edge neural network models for NLP, so the classes and assignments will be implementation-focused. In general, classes will take the following format:
- Reading: Before each class, you will be given a reading assignment to complete before coming to class that day.
- Quiz: At the beginning of class, there will be a short quiz that tests your knowledge of the reading assignment. (These quizzes should be easy if the reading assignment has been completed and understood.)
- Summary/Elaboration/Questions: The instructor will summarize the important points of the reading material, elaborate on details that were not included in the reading, and field any questions.
- Code Walk: The TAs (or instructor) will walk through some demonstration code in DyNet that implements a simple version of the main concepts presented in the reading material.
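As a rough illustration of the kind of simple model a code walk might cover, here is the forward pass of a continuous bag-of-words classifier written in plain NumPy. This is a sketch for intuition only, not the actual demonstration code, which uses DyNet; all dimensions and names here are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (arbitrary choices for illustration)
vocab_size, embed_dim, num_classes = 10, 4, 3

# Randomly initialized parameters, as a toolkit would create before training
E = rng.normal(size=(vocab_size, embed_dim))   # word embedding table
W = rng.normal(size=(num_classes, embed_dim))  # output weight matrix
b = np.zeros(num_classes)                      # output bias

def predict(word_ids):
    """Average the embeddings of the input words, then score each class."""
    h = E[word_ids].mean(axis=0)        # continuous bag-of-words encoding
    scores = W @ h + b                  # linear scoring layer
    exp = np.exp(scores - scores.max()) # numerically stable softmax
    return exp / exp.sum()

probs = predict([1, 5, 7])
print(probs)  # a probability distribution over the 3 classes
```

A toolkit like DyNet adds automatic differentiation and training on top of computations like this, which is what the in-class demonstrations walk through.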
Grading: The assignments will be given a grade of A+ (100), A (96), A- (92), B+ (88), B (85), B- (82), or below. The final grades will be determined based on the weighted average of the quizzes, assignments, and project. Cutoffs for final grades will be approximately 97+ A+, 94+ A, 90+ A-, 87+ B+, 84+ B, 80+ B-, etc., although I reserve some flexibility to change these thresholds slightly.
- Quizzes: Worth 20% of the grade. Your lowest 2 quiz grades will be dropped. If you are sick or traveling on business (e.g. to a conference, for a job interview, or delayed in return due to visa issues), send a doctor's note or evidence of the reason for being away to the TA list within a week of the absence, and you will be excused. We expect excused quizzes to be relatively rare; if you'll be away for more than, e.g., 2 classes over the semester, please consult with the instructor in advance.
- Questionnaire/Environment Check: Near the beginning of the course, there will be a questionnaire to fill out, as well as a check to make sure that you are actually able to run a neural network toolkit. This will be worth 5% of the grade.
- Checkpoints: There will be 2 "checkpoint" assignments, each worth 20% of the grade.
- Project: The final course project will be worth 35%.
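Putting the weights above together, the final weighted average works out as in this small sketch. The weights are taken from the syllabus; the component scores are hypothetical:

```python
# Grade component weights from the syllabus (sum to 100%)
WEIGHTS = {
    "quizzes": 0.20,
    "questionnaire": 0.05,
    "checkpoint1": 0.20,
    "checkpoint2": 0.20,
    "project": 0.35,
}

def final_grade(scores):
    """Weighted average of component scores (each on a 0-100 scale)."""
    assert set(scores) == set(WEIGHTS)
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# Hypothetical component scores for illustration
example = {"quizzes": 90, "questionnaire": 100, "checkpoint1": 92,
           "checkpoint2": 88, "project": 95}
print(final_grade(example))  # ≈ 92.25, an A- under the cutoffs above
```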