| Date | Topic | Instructor | Assignments |
| Jan 11 | Overview of machine translation. The statistical approach to MT. [PDF, 1.5M] | Chiang | Required: |
| Part One: Word-based alignment and translation |
| Jan 13 | IBM Models 1–5. | Knight | Required: Background: |
| Jan 18 | IBM Models 1–5. | Knight | Required: |
| Jan 20 | IBM Models 1–5. | Knight | |
| Jan 25 | n-gram language models. Absolute discounting and Kneser-Ney smoothing. | Chiang | Required: |
| Jan 27 (add/drop period ends) | n-gram language models continued. Very large language models. | Chiang | Assignment 1 due. |
| Feb 1 | MT evaluation. BLEU. | Chiang | Koehn, ch. 8 |
| Part Two: Phrase-based translation and discriminative training |
| Feb 3 | Phrase-based MT. Why do we need phrases? Relationship to EBMT. Phrase extraction. Estimating phrase translation probabilities and the problem of overfitting. | Chiang | Koehn, ch. 5; Marcu and Wong, "A phrase-based, joint probability model for statistical machine translation." In Proc. EMNLP, 2002. [PDF] |
| Feb 8 | From the noisy channel to linear models. Phrase features. | Chiang | |
| Feb 10 | Phrase reordering models. | Chiang | |
| Feb 15 | Phrase-based decoding. | Huang | Koehn, ch. 6 |
| Feb 17 | Phrase-based decoding cont. k-best lists. | Huang | Assignment 2 due. Huang and Chiang, "Better k-best parsing." In Proc. IWPT, 2005. [PDF]; Koehn, "Pharaoh: a beam search decoder for phrase-based statistical machine translation models." In Proc. AMTA, 2004. [PDF] |
| Feb 22 | Maximum entropy. Minimum error-rate training. | Chiang | Koehn, ch. 9 |
| Feb 24 | Perceptron, max-margin methods. | Chiang | |
| Mar 1 | System combination. | Chiang | |
| Interlude: Subword translation |
| Mar 3 | Transliteration. Integrating traditional translation rules. | Knight | Koehn, ch. 10 |
| Mar 8 | Integrating morphology into translation. | Knight | |
| Mar 10 | Decoding with lattices for morphology and word segmentation. | Knight | Assignment 3 due. |
| Mar 15 | Spring break | | |
| Mar 17 | Spring break | | |
| Part Three: Syntax-based translation |
| Mar 22 | Hierarchical and syntax-based MT. Why do we need syntax? Synchronous context-free grammars and TSGs. | Chiang | Koehn, ch. 11; Chiang, "An introduction to synchronous grammars." |
| Mar 24 | Extracting synchronous CFGs and TSGs from parallel data. Estimating rule probabilities and the problem of overfitting. | Chiang | |
| Mar 29 | Extracting synchronous TSGs from tree-tree data and the problem of nonisomorphism. | Chiang | |
| Mar 31 | CKY decoding. | Huang | Chiang, "Hierarchical phrase-based translation." |
| Apr 5 | CKY with an n-gram language model. | Huang | Assignment 4 due. |
| Apr 7 | More CKY decoding: Binarization. k-best lists. Decoding with lattices. | Huang | Huang et al., "Binarization for Synchronous Context-Free Grammars"; Huang and Chiang, "Better k-best Parsing" |
| Apr 12 | Source-side tree decoding. Target-side left-to-right decoding. | Huang | Huang et al., "Statistical Syntax-Directed Translation"; Huang and Mi, "Efficient Incremental Decoding for Tree-to-String Translation" |
| Apr 14 | Syntax-based language models. | Knight | |
| Apr 19 | Beyond synchronous CFGs and TSGs. | Knight | Knight, "Capturing Practical Natural Language Transformations" |
| Apr 21 | Towards semantics-based translation. | Knight | |
| Apr 26 | Final project presentations | | |
| Apr 28 | Final project presentations | | |
Students are expected to submit only their own work for homework assignments. They may discuss the assignments with one another but may not collaborate with or copy from one another. University policies on academic integrity will be closely observed.
All assignments and the project will be due at the beginning of class on the due date. Late assignments will be accepted with a 7% penalty for each day after the due date, up to a week after the due date. No exceptions can be made except for a grave reason.
Statement for Students with Disabilities
Any student requesting academic accommodations based on a disability is required to register with Disability Services and Programs (DSP) each semester. A letter of verification for approved accommodations can be obtained from DSP. Please be sure the letter is delivered to me (or to the TA) as early in the semester as possible. DSP is located in STU 301 and is open 8:30 a.m.–5:00 p.m., Monday through Friday. The phone number for DSP is (213) 740-0776.
Statement on Academic Integrity
USC seeks to maintain an optimal learning environment. General principles of academic honesty include the concept of respect for the intellectual property of others, the expectation that individual work will be submitted unless otherwise allowed by an instructor, and the obligations both to protect one’s own academic work from misuse by others and to avoid using another’s work as one’s own. All students are expected to understand and abide by these principles. Scampus, the Student Guidebook, contains the Student Conduct Code in Section 11.00, while the recommended sanctions are located in Appendix A: http://www.usc.edu/dept/publications/SCAMPUS/gov/. Students will be referred to the Office of Student Judicial Affairs and Community Standards for further review, should there be any suspicion of academic dishonesty. The review process can be found at: http://www.usc.edu/student-affairs/SJACS/.