This course is part of the "Deep Learning for NLP" series. In this course, I will introduce cross-lingual benchmarks and models. These concepts form the basis for multi-lingual and cross-lingual processing using advanced deep learning models for natural language understanding and generation across languages.
I often hear from product teams: "My product is in en-US only. I want to quickly scale to global markets with cost-effective solutions," or "I have a new feature. How can I sim-ship to multiple markets?" This course is motivated by such needs, and its goal is to answer such questions.
The course consists of two main sections. In both sections, I will talk about cross-lingual models as well as benchmarks.
In the first section, I will talk about cross-lingual benchmark datasets like XNLI and XGLUE. I will also talk about early cross-lingual models like mBERT, XLM, Unicoder, XLM-R, and BERT with adapters. Most of these are encoder-based models. I will also cover basic approaches to cross-lingual modeling: translate-train, translate-test, multi-lingual translate-train-all, and zero-shot cross-lingual transfer.
In the second section, I will talk about cross-lingual benchmark datasets like XTREME and XTREME-R. I will also talk about cross-lingual models like XNLG, mBART, InfoXLM, FILTER, and mT5. Some of these, like InfoXLM and FILTER, are encoder-only models, while others, like XNLG, mBART, and mT5, can be used for encoder-decoder cross-lingual modeling.
For each model, we will discuss its pretraining losses, pretraining strategy, architecture, and the results obtained on pretraining as well as downstream tasks.