Graham Neubig's Research

I am currently a doctoral candidate at the Kyoto University Graduate School of Informatics, affiliated with the Media Archiving Research Laboratory. Some of my research interests include:

Academic/Career History

Awards

Software I've Developed

Kylm ("KYoto Language Modeling toolkit"): A language modeling toolkit written in Java. It can currently train n-gram models with a variety of smoothing techniques. Planned features include detailed comparisons between different types of language models and simple modeling of unknown words using sub-word structure (characters).
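For readers unfamiliar with smoothed n-gram models, the following is a minimal Python sketch of a bigram model with add-k (additive) smoothing. It is not Kylm's code or API: the function name `train_bigram`, the parameter `k`, and the toy corpus are all hypothetical, and add-k was chosen only because it is the shortest smoothing method to show, not because it is known to be one of the techniques Kylm implements.

```python
from collections import Counter

def train_bigram(sentences, k=1.0):
    """Train a bigram model and return prob(prev, word) with add-k smoothing.

    Illustrative sketch only; names and defaults are assumptions.
    """
    context_counts = Counter()  # how often each token appears as a context
    bigram_counts = Counter()   # how often each (prev, word) pair appears
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    vocab_size = len(vocab)

    def prob(prev, word):
        # Add k to every count so unseen bigrams get non-zero probability.
        numerator = bigram_counts[(prev, word)] + k
        denominator = context_counts[prev] + k * vocab_size
        return numerator / denominator

    return prob

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
prob = train_bigram(corpus)
seen = prob("the", "cat")    # bigram observed once in the corpus
unseen = prob("the", "sat")  # bigram never observed
```

In this toy corpus the unseen bigram still receives a small probability, which is the point of smoothing. More refined methods discount observed counts and back off to lower-order models instead of adding a constant.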

Kyfd ("KYoto Fst Decoder"): A beam-search decoder, written in C++, for FST (finite-state transducer) models. It can track separate component weights for log-linear tuning, use hierarchical failure transitions, and handle lattice input.
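As a rough illustration of beam-search decoding over a transducer, here is a minimal Python sketch. It is not Kyfd's implementation: the arc representation, the function name `beam_search`, and the toy transducer are hypothetical, and the sketch leaves out component weights, failure transitions, and lattice input entirely.

```python
import heapq

def beam_search(arcs, start, finals, inputs, beam=2):
    """Find a low-cost output sequence for `inputs` using beam search.

    arcs maps (state, input_symbol) to a list of
    (next_state, output_symbol, cost) tuples. Hypothetical format.
    """
    hyps = [(0.0, start, [])]  # each hypothesis: (total cost, state, outputs)
    for sym in inputs:
        expanded = []
        for cost, state, outputs in hyps:
            for next_state, out, arc_cost in arcs.get((state, sym), []):
                expanded.append((cost + arc_cost, next_state, outputs + [out]))
        # Prune: keep only the `beam` cheapest hypotheses at each step.
        hyps = heapq.nsmallest(beam, expanded, key=lambda h: h[0])
    complete = [h for h in hyps if h[1] in finals]
    return min(complete, key=lambda h: h[0]) if complete else None

# Toy transducer: two paths from state 0 to final state 3.
arcs = {
    (0, "a"): [(1, "x", 1.0), (2, "y", 0.5)],
    (1, "b"): [(3, "z", 0.2)],
    (2, "b"): [(3, "w", 1.0)],
}
wide = beam_search(arcs, 0, {3}, ["a", "b"], beam=2)
narrow = beam_search(arcs, 0, {3}, ["a", "b"], beam=1)
```

With a beam of 2 both paths survive the first step and the cheaper complete path (outputs `x`, `z`) is found. With a beam of 1 the locally cheaper arc is kept and the search settles on the more expensive path (`y`, `w`), which shows the speed versus accuracy trade-off that beam width controls.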

KyTea ("KYoto Text Analysis toolkit"): A toolkit for text analysis, including word (morpheme) segmentation and pronunciation estimation. Its models can be trained on partially annotated corpora, allowing for rapid domain adaptation.

dirichlet-topic.pl: A simple script that allows you to find representative words for a specific topic (using a model based on Dirichlet processes).

Research Papers

Conference Papers

I have also written (or co-authored) 8 papers in Japanese. See the Japanese page for details.

Other Links