Building and Using Comparable Corpora for Multilingual Natural Language Processing [electronic resource] / by Serge Sharoff, Reinhard Rapp, Pierre Zweigenbaum.
By: Sharoff, Serge [author.].
Contributor(s): Rapp, Reinhard [author.] | Zweigenbaum, Pierre [author.] | SpringerLink (Online service).
Material type: Book
Series: Synthesis Lectures on Human Language Technologies
Publisher: Cham : Springer International Publishing : Imprint: Springer, 2023
Edition: 1st ed. 2023.
Description: VIII, 133 p. 31 illus., 14 illus. in color. online resource.
Content type: text
Media type: computer
Carrier type: online resource
ISBN: 9783031313844.
Subject(s): Natural language processing (Computer science) | Artificial intelligence | Application software | Computer science | Computational linguistics | Machine learning | Natural Language Processing (NLP) | Artificial Intelligence | Computer and Information Systems Applications | Computer Science | Computational Linguistics | Machine Learning
Additional physical formats: Printed edition
DDC classification: 006.35
Chapter 1 Introduction -- Chapter 2 Basic principles of cross-lingual models -- Chapter 3 Building comparable corpora -- Chapter 4 Extraction of parallel sentences -- Chapter 5 Induction of bilingual Dictionaries -- Chapter 6 Comparable and Parallel Corpora for Machine Translation -- Chapter 7 Other applications of comparable corpora -- Chapter 8 Conclusions and future research -- Index.
This book provides a comprehensive overview of methods for building comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic, followed by a comparison with parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, the authors explain how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.