Graham Neubig's Research

Graham Neubig

neubig at ar.media.kyoto-u.ac.jp

I am currently a doctoral candidate at the Kyoto University Graduate School of Informatics, affiliated with the Media Archiving Research Laboratory. Some of my research interests include:

Academic/Career History

Awards

Internships

Activities

Teaching

Software/Resources I've Developed

pialign: A phrasal aligner for statistical machine translation that produces compact, competitive phrase tables in a single step with no heuristics.

The Kyoto Free Translation Task: An evaluation task for machine translation with publicly available data. The task targets Wikipedia articles about Kyoto.

latticelm: A tool for non-parametric Bayesian unsupervised word segmentation and language model learning using lattices. Lattices allow for learning over noisy input such as phoneme recognition results from continuous speech.

KyTea ("KYoto Text Analysis toolkit"): A toolkit for text analysis including word (morpheme) segmentation and pronunciation estimation. It can be learned from partially annotated corpora, allowing for rapid domain adaptation.

Kylm ("KYoto Language Modeling toolkit"): A language modeling toolkit written in Java. It can currently train n-gram models with a variety of smoothing techniques.

Kyfd ("KYoto Fst Decoder"): A beam-search decoder for FST models written in C++. It features the ability to keep track of separate component weights for log-linear tuning, use hierarchical failure transitions, and handle lattice input.

More can be found on my software page.

Research Papers

Journal Papers

Conference Papers

I have also written (or co-authored) 19 papers in Japanese. See the Japanese page for details.

Other Links