CS 11-747: Neural Networks for NLP

Assignments

The aim of the assignments and project is to build the skills needed to build cutting-edge systems or do cutting-edge research, culminating in a final project that demonstrates these abilities.

Read all the instructions on this page carefully
You are responsible for reading these instructions and following them carefully. If you do not, you may be marked down as a result.

Assignment Policies

Working in Teams: There are 4 assignments in the class. Assignment 1 must be done individually, while Assignments 2-4 must be done in teams of 2-3 (individual submissions will not be accepted for Assignments 2-4). If you are having trouble finding a group, the instructor and TAs will help you find one after the initial survey.

Submission Information: To submit your assignment, send the following to the TA mailing list:

your names and andrew IDs
a report: This should be up to 5 pages for Assignments 1-3, and up to 9 pages for Assignment 4. References are not included in the page count, and it is OK to submit appendices that include supplementary information such as hyperparameter settings or additional output examples, although there is no guarantee that the TAs will read them. Submissions that exceed the page count will be penalized one half grade for each page over (e.g. A to A- or A- to B+). Submit the PDF of the report as an attachment to your email. The moment this PDF is delivered to the TA mailing list is the time your assignment is treated as being turned in.
a link to a github repository containing your code: Your github repository must be viewable to the TAs and instructor by the submission deadline. If your repository is private make it accessible to us (github IDs neubig, antonisa, jblamare, jxhe, dbc148, MysteryVaibhav, violet-zct, danishpruthi). If your repository is not visible to us, your assignment will not be considered complete, so if you are worried please submit well in advance of the deadline so we can confirm the submission is visible.

Late Day Policy: In case unforeseen circumstances prevent you from turning in your assignment on time, a total of 5 late days will be allowed across the first three assignments (late days may not be applied to the final project, Assignment 4). Other than these late days, we will not make exceptions or extend deadlines, so please be frugal with your late days and use them only if necessary. Note that the second assignment is harder than the first one, so it'd be a good idea to save your late days for the second assignment if possible. Assignments that are late beyond the allowed late days will be graded down one half-grade per day late.

Plagiarism/Code Reuse Policy: All assignments are expected to be conducted under the CMU policy for academic integrity. All rules here apply and violations will be subject to penalty including zero credit on the assignment, failing the course, or other disciplinary measures. In particular, in your implementation:

Pseudo-code provided by the TAs or instructor may be used freely without restriction.
You may not just re-use an existing implementation written by someone else. The implementation should basically be your own.
Fragments of code found online can be used (assuming the license so permits), but if they are significant, please cite these in your report. Failure to do so will be treated as being in violation of the assignment rules.
Code written by other students in the class not in your assignment group cannot be used.

If you are doing a similar project for a graded class at CMU (including independent studies), you must declare so on your report, and note which parts of the project are for 11-747, and which parts are for the other class. Consult with the TA mailing list if you are unsure.

Consulting w/ Instructors/TAs: For assignments and projects, you are free to consult with the instructors and TAs as much as you want, any time you want. That is what we're here for, and in no way is this considered cheating. In fact, if you don't have much previous experience with neural networks, it may be necessary to consult liberally with the instructors and TAs to learn how to do the implementation and finish the assignments. So please do so.

Because this is a project-based course I assume that many of the students taking the course will be interested in turning their assignments or project into research papers. In this case, if you have received useful advice from the instructor or TAs that made the project significantly better, consider inviting them to be co-authors on the paper. Of course, you do not need to do so just because the paper is a result of the class, only if you feel that their advice or help made a contribution.

Details of Each Assignment

Assignment 1: Text Classifier and Initial Interest Survey (Due Survey 2/11, Implementation 2/15)

The first assignment is to be done individually and will have two components: (1) a preliminary survey, and (2) an implementation of a baseline text classifier.

Initial Interest Survey: There will be an initial questionnaire regarding what task you are interested in tackling for the final project. If you already have a group then you can specify who will be in your group, but otherwise just write that you are still looking and we'll help introduce others in the class.

Implementation Component: You will be asked to implement a text classifier from scratch. Specifically, we have provided training/development/test data in the format "LABEL ||| SENTENCE" directly below, where the labels for the test data are all "UNK". You will need to download the data, train a model, and calculate the accuracy on the development (validation) set. In addition, please predict the labels for both the development and test sets and turn them in along with your report, in one-label-per-line format.

CS 11-747 Assignment 1 Topic Classification data
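As a rough illustration of the data handling described above, the following Python sketch parses "LABEL ||| SENTENCE" lines, computes accuracy, and formats predictions one label per line. It is not an official starter kit: the function names and the toy labels and sentences are hypothetical, and the exact whitespace around "|||" is an assumption (the sketch strips it on both sides). The classifier itself is deliberately left out.

```python
from typing import Iterable, List, Tuple


def parse_line(line: str) -> Tuple[str, str]:
    """Split one 'LABEL ||| SENTENCE' line into (label, sentence).

    Assumption: the first '|||' separates the label from the sentence;
    surrounding whitespace is stripped from both parts.
    """
    label, sentence = line.split("|||", 1)
    return label.strip(), sentence.strip()


def read_examples(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse every non-empty line of a data file."""
    return [parse_line(line) for line in lines if line.strip()]


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def format_predictions(labels: Iterable[str]) -> str:
    """Render predictions in one-label-per-line format."""
    return "".join(label + "\n" for label in labels)


if __name__ == "__main__":
    # Toy development data; these labels and sentences are made up.
    dev_lines = [
        "sports ||| the team won the final match\n",
        "science ||| the probe reached the outer planets\n",
    ]
    dev = read_examples(dev_lines)
    gold = [label for label, _ in dev]
    # Stand-in for a real model: predict the same label for every sentence.
    predicted = ["sports" for _ in dev]
    print("dev accuracy:", accuracy(gold, predicted))
    print(format_predictions(predicted), end="")
```

In practice the lines would come from the downloaded files (for example, by iterating over an opened file), and the constant prediction would be replaced by the output of your own classifier. Since the test labels are all "UNK", accuracy is only meaningful on the development set.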

Here, implementing "from scratch" means that you are not allowed to simply run existing code, nor copy large chunks from an existing implementation of the particular model of interest. This also includes models for pre-training contextualized word representations such as ELMo or BERT. If you want to use pre-training of contextualized word representations, you will have to re-implement it yourself. You are, however, allowed to use pre-trained word embeddings (of any kind you like).

The grading rubric is as follows:

A+: Exceptional or surprising. Goes far beyond most other submissions.
A: A complete implementation that achieves competitive results on the provided datasets, and a clear report of what was done.
A-: A complete implementation but with slightly lower result numbers or a less clear report.
B+: An implementation and reasonable evaluation numbers exist, but the numbers clearly do not match the baseline, or the results are clearly lacking in detail.
B or below: Results or report are lacking.

Assignment 2: Project Proposal and Literature Survey (Due 2/25)

Assignment 2 is to be done as a group and will involve a proposal of a project topic and a literature survey regarding this topic. In the survey, explain the task that you would like to tackle in concrete terms, and also cover all of the relevant recent research on the topic. Note that you are still allowed to change topics if you think of something different after assignment 2, but you will need to confirm with the instructor first.

The grading rubric for the survey component is as follows:

A+: Exceptional or surprising. Goes far beyond most other submissions.
A: A complete survey that covers all the major relevant papers in the field and has a critical analysis of their strengths, weaknesses, or applicability.
A-: The survey has a good analysis but is missing a few pieces of relevant related work, or is quite complete but is lacking on critical analysis.
B+: The survey is either quite lacking in coverage or analysis, or is decent but not complete in both aspects.
B or B-: The survey is lacking in both coverage and analysis, but does make an attempt to cover some related research.
C+ or below: Clear lack of effort or incompleteness.

Assignment 3: State-of-the-art Reimplementation (Due 3/25)

Assignment 3 will involve reproducing the evaluation numbers of a state-of-the-art baseline model for the task of interest with code that you have implemented from scratch. In other words, you must get the same numbers as the previous paper on the same dataset.

In your report, also perform an analysis of what remaining errors this model makes (ideally with concrete examples of failure cases), and describe how you plan to create a new model for the final project that will address these error cases. If you are interested in tackling a task that does not have a neural baseline in the final project, you may also describe how you adapted the existing model to this new task and perform your error analysis on the new task (although you must still report results on the task that the state-of-the-art model was originally designed for).

The grading rubric for this assignment is as follows:

A+: Exceptional or surprising. Goes far beyond most other submissions.
A: A complete re-implementation that meets or exceeds the state of the art. An analysis of the results, and forward-looking plans for further development.
A-: Similarly, a complete re-implementation with competitive result numbers, but less analysis or fewer forward-looking plans for development than assignments awarded an A.
B+: An implementation and evaluation numbers exist, but they do not match previous work in the field. Or the analysis or forward-looking plans may be seriously lacking.
B or B-: Two or more of the above three elements are lacking.
C+ or below: Clear lack of effort or incompleteness.

Assignment 4: Final Project (Due 5/7)

The final project is expected to be a novel research contribution that either (1) introduces new techniques for one of the existing tasks in the assignment, with a significant amount of technical sophistication and utilizing one of the more advanced techniques introduced in the class, or (2) tackles a new NLP task with a neural network model that is motivated by the unique problems posed by the application domain. The grading rubric is as follows:

A+: Exceptional or surprising. Goes far beyond most other submissions.
A: A respectable research contribution that is novel and effective, and could be submitted largely as-is as a paper to an academic conference.
A-: A respectable research contribution that has some small incomplete parts, but is largely complete and promising.
B+: An idea that is novel, but the results may not be there yet, or the analysis is short.
B or B-: Results, analysis, or novelty are lacking.
C+ or below: Clear lack of effort or incompleteness.

Example Tasks

Below is a list of suggested NLP tasks that you may use for your assignments and projects. It is completely fine, and highly encouraged, to tackle other tasks, but you must confirm with the instructor/TAs in the initial questionnaire (or if you decide to change after the questionnaire, please send us an email).
