Using Comparable Corpora for Under-Resourced Areas of Machine Translation
Author: Inguna Skadiņa
Publisher: Springer
ISBN: 3319990047
Category: Computers
Languages: en
Pages : 326
Book Description
This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.
Building and Using Comparable Corpora
Author: Serge Sharoff
Publisher: Springer Science & Business Media
ISBN: 3642201288
Category: Computers
Languages: en
Pages : 333
Book Description
The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.
Neural Machine Translation
Author: Philipp Koehn
Publisher: Cambridge University Press
ISBN: 1108497322
Category: Computers
Languages: en
Pages : 409
Book Description
Learn how to build machine translation systems with deep learning from the ground up, from basic concepts to cutting-edge research.
Parallel Corpora for Contrastive and Translation Studies
Author: Irene Doval
Publisher: John Benjamins Publishing Company
ISBN: 9027262845
Category: Language Arts & Disciplines
Languages: en
Pages : 313
Book Description
This volume assesses the state of the art of parallel corpus research as a whole, reporting on advances both in recent developments of parallel corpora – with some particular references to comparable corpora as well – and in ways of exploiting them for a variety of purposes. The first part of the book is devoted to new roles that parallel corpora can and should assume in translation studies and in contrastive linguistics, to the usefulness and usability of parallel corpora, and to advances in parallel corpus alignment, annotation and retrieval. There follows an up-to-date presentation of a number of parallel corpus projects currently being carried out in Europe, some of them multimodal, with certain chapters illustrating case studies developed on the basis of the corpora at hand. In most of these chapters, attention is paid to specific technical issues of corpus building. The third part of the book reflects on specific applications and on the creation of bilingual resources from parallel corpora. This volume will be welcomed by scholars, postgraduate and PhD students in the fields of contrastive linguistics, translation studies, lexicography, language teaching and learning, machine translation, and natural language processing.
Corpus Use in Cross-linguistic Research
Author: Marlén Izquierdo
Publisher: John Benjamins Publishing Company
ISBN: 9027249318
Category: Language Arts & Disciplines
Languages: en
Pages : 245
Book Description
Cross-linguistic research is a fruitful field of language inquiry that has benefited enormously from the use of corpora. As sources of linguistic data of various kinds and as tools for language processing, corpora have shaped the development of cross-linguistic research, enabling both language description and practical applications. This volume contains twelve studies that emphasize the usefulness and usability of parallel corpora in accurately exploring the structure and use of seven under-researched languages and language varieties. The first part emphasizes the role of corpus-based descriptive analyses at the lexicogrammatical and discursive levels, as a first step on the way towards concrete applications like translation or language teaching. The second part focuses on the role of parallel-corpus-based language processing techniques and applications that facilitate professional communication. This book will be of interest to scholars in contrastive linguistics, translation studies, discourse analysis, language teaching, and natural language processing.
Human Language Technologies
Author: Inguna Skadiņa
Publisher: IOS Press
ISBN: 1607506408
Category: Computers
Languages: en
Pages : 264
Book Description
This book contains papers from the Fourth International Conference on Human Language Technologies - the Baltic Perspective (Baltic HLT 2010), held in Riga in October 2010. This conference is the latest in a series which provides a forum for sharing recent advances in human language processing, and promotes cooperation between the computer science and linguistics communities of the Baltic countries and the rest of the world. Bringing together scientists, developers, providers and users, the conference is an opportunity to exchange information, discuss problems, find new synergies, and promote i.
Empirical Translation Studies
Author: Gert De Sutter
Publisher: Walter de Gruyter GmbH & Co KG
ISBN: 3110459582
Category: Language Arts & Disciplines
Languages: en
Pages : 324
Book Description
The present volume is devoted to the study of language use in translated texts as a function of various linguistic, contextual and cognitive factors. It contributes to the recent trend in empirical translation studies towards more methodological sophistication, including mixed methodology designs and multivariate statistical analyses, ultimately leading to a more accurate understanding of language use in translations.
Multilingual and Multimodal Information Access Evaluation
Author: Maristella Agosti
Publisher: Springer Science & Business Media
ISBN: 3642159974
Category: Computers
Languages: en
Pages : 158
Book Description
In its first ten years of activities (2000-2009), the Cross-Language Evaluation Forum (CLEF) played a leading role in stimulating investigation and research in a wide range of key areas in the information retrieval domain, such as cross-language question answering, image and geographic information retrieval, interactive retrieval, and many more. It also promoted the study and implementation of appropriate evaluation methodologies for these diverse types of tasks and media. As a result, CLEF has been extremely successful in building a wide, strong, and multidisciplinary research community, which covers and spans the different areas of expertise needed to deal with the spread of CLEF tracks and tasks. This constantly growing and almost completely voluntary community has dedicated an incredible amount of effort to making CLEF happen and is at the core of the CLEF achievements. CLEF 2010 represented a radical innovation of the “classic CLEF” format and an experiment aimed at understanding how “next generation” evaluation campaigns might be structured. We had to face the problem of how to innovate CLEF while still preserving its traditional core business, namely the benchmarking activities carried out in the various tracks and tasks. The consensus, after lively and community-wide discussions, was to make CLEF an independent four-day event, no longer organized in conjunction with the European Conference on Research and Advanced Technology for Digital Libraries (ECDL) where CLEF has been running as a two-and-a-half-day workshop. CLEF 2010 thus consisted of two main parts: a peer-reviewed conference – the first two days – and a series of laboratories and workshops – the second two days.
Corpus-Based Translation Studies
Author: Alet Kruger
Publisher: Bloomsbury Publishing
ISBN: 144118919X
Category: Language Arts & Disciplines
Languages: en
Pages : 321
Book Description
This is a collection of leading research within corpus-based translation studies (CTS). CTS is now recognized as a major paradigm that has transformed analysis within the discipline of translation studies. It can be defined as the use of corpus linguistic technologies to inform and elucidate the translation process, something that is increasingly accessible through advances in computer technology. The book pulls together a wide range of perspectives from respected authors in the field. All the chapters deal with the implementation of the basic concepts and methodologies, providing the reader with practical tools for their own research. The book addresses key issues in corpus analysis, including online corpora and corpus construction, and covers both translation and interpreting. The authors look at various languages and utilize a variety of approaches, qualitative and quantitative, reflecting the breadth of the field and providing many valuable examples of the methodology at work.
Computational Linguistics and Intelligent Text Processing
Author: Alexander Gelbukh
Publisher: Springer
ISBN: 3319181114
Category: Computers
Languages: en
Pages : 678
Book Description
The two volumes LNCS 9041 and 9042 constitute the proceedings of the 16th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2015, held in Cairo, Egypt, in April 2015. The total of 95 full papers presented was carefully reviewed and selected from 329 submissions. They were organized in topical sections on grammar formalisms and lexical resources; morphology and chunking; syntax and parsing; anaphora resolution and word sense disambiguation; semantics and dialogue; machine translation and multilingualism; sentiment analysis and emotion detection; opinion mining and social network analysis; natural language generation and text summarization; information retrieval, question answering, and information extraction; text classification; speech processing; and applications.