Bitext Alignment

Author: Jörg Tiedemann
Publisher: Springer Nature
ISBN: 3031021428
Category : Computers
Languages : en
Pages : 153

Book Description
This book provides an overview of various techniques for the alignment of bitexts. It describes general concepts and strategies that can be applied to map corresponding parts in parallel documents on various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The most predominant application is machine translation, in particular, statistical machine translation. However, there are various other threads that can be followed which may be supported by the rich linguistic knowledge implicitly stored in parallel resources. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning and translation studies to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora starting from the collection of translated documents up to sub-sentential alignments. In particular, it describes various approaches to document alignment, sentence alignment, word alignment and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques. Table of Contents: Introduction / Basic Concepts and Terminology / Building Parallel Corpora / Sentence Alignment / Word Alignment / Phrase and Tree Alignment / Concluding Remarks

Bitext Alignment

Author: Jörg Tiedemann
Publisher: Morgan & Claypool Publishers
ISBN: 1608455106
Category : Computers
Languages : en
Pages : 168

Book Description
This book provides an overview of various techniques for the alignment of bitexts. It describes general concepts and strategies that can be applied to map corresponding parts in parallel documents on various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The most predominant application is machine translation, in particular, statistical machine translation. However, there are various other threads that can be followed which may be supported by the rich linguistic knowledge implicitly stored in parallel resources. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning and translation studies to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora starting from the collection of translated documents up to sub-sentential alignments. In particular, it describes various approaches to document alignment, sentence alignment, word alignment and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques. Table of Contents: Introduction / Basic Concepts and Terminology / Building Parallel Corpora / Sentence Alignment / Word Alignment / Phrase and Tree Alignment / Concluding Remarks

Parallel Text Processing

Author: Jean Véronis
Publisher: Springer Science & Business Media
ISBN: 9401725357
Category : Language Arts & Disciplines
Languages : en
Pages : 417

Book Description
This book evolved from the ARCADE evaluation exercise that started in 1995. The project's goal is to evaluate alignment systems for parallel texts, i.e., texts accompanied by their translation. Thirteen teams from various places around the world have participated so far and for the first time, some ten to fifteen years after the first alignment techniques were designed, the community has been able to get a clear picture of the behaviour of alignment systems. Several chapters in this book describe the details of competing systems, and the last chapter is devoted to the description of the evaluation protocol and results. The remaining chapters were especially commissioned from researchers who have been major figures in the field in recent years, in an attempt to address a wide range of topics that describe the state of the art in parallel text processing and use. As I recalled in the introduction, the Rosetta stone won eternal fame as the prototype of parallel texts, but such texts are probably almost as old as the invention of writing. Nowadays, parallel texts are electronic, and they are becoming an increasingly important resource for building the natural language processing tools needed in the "multilingual information society" that is currently emerging at an incredible speed. Applications are numerous, and they are expanding every day: multilingual lexicography and terminology, machine and human translation, cross-language information retrieval, language learning, etc.

Handbook of Natural Language Processing

Author: Robert Dale
Publisher: CRC Press
ISBN: 9780824790004
Category : Business & Economics
Languages : en
Pages : 974

Book Description
This study explores the design and application of natural language text-based processing systems, based on generative linguistics, empirical corpus analysis, and artificial neural networks. It emphasizes the practical tools to accommodate the selected system.

Empirical Methods for Exploiting Parallel Texts

Author: I. Dan Melamed
Publisher: MIT Press
ISBN: 9780262133807
Category : Computers
Languages : en
Pages : 224

Book Description
Parallel texts (bitexts) are a goldmine of linguistic knowledge, because the translation of a text into another language can be viewed as a detailed annotation of what that text means. Knowledge about translational equivalence, which can be gleaned from bitexts, is of central importance for applications such as manual and machine translation, cross-language information retrieval, and corpus linguistics. The availability of bitexts has increased dramatically since the advent of the Web, making their study an exciting new area of research in natural language processing. This book lays out the theory and the practical techniques for discovering and applying translational equivalence at the lexical level. It is a start-to-finish guide to designing and evaluating many translingual applications.

Handbook of Natural Language Processing

Author: Nitin Indurkhya
Publisher: CRC Press
ISBN: 142008593X
Category : Business & Economics
Languages : en
Pages : 704

Book Description
The Handbook of Natural Language Processing, Second Edition presents practical tools and techniques for implementing natural language processing in computer systems. Along with removing outdated material, this edition updates every chapter and expands the content to include emerging areas, such as sentiment analysis. New to the Second Edition Greater

Advances in Multimodal Interfaces - ICMI 2000

Author: Tieniu Tan
Publisher: Springer
ISBN: 354040063X
Category : Computers
Languages : en
Pages : 692

Book Description
Multimodal Interfaces represents an emerging interdisciplinary research direction and has become one of the frontiers in Computer Science. Multimodal interfaces aim at efficient, convenient and natural interaction and communication between computers (in their broadest sense) and human users. They will ultimately enable users to interact with computers using their everyday skills. These proceedings include the papers accepted for presentation at the Third International Conference on Multimodal Interfaces (ICMI 2000) held in Beijing, China on 14-16 October 2000. The papers were selected from 172 contributions submitted worldwide. Each paper was allocated for review to three members of the Program Committee, which consisted of more than 40 leading researchers in the field. Final decisions of 38 oral papers and 48 poster papers were made based on the reviewers’ comments and the desire for a balance of topics. The decision to have a single track conference led to a competitive selection process and it is very likely that some good submissions are not included in this volume. The papers collected here cover a wide range of topics such as affective and perceptual computing, interfaces for wearable and mobile computing, gestures and sign languages, face and facial expression analysis, multilingual interfaces, virtual and augmented reality, speech and handwriting, multimodal integration and application systems. They represent some of the latest progress in multimodal interfaces research.

Discriminative Alignment Models For Statistical Machine Translation

Author: Nadi Tomeh
Languages : en

Book Description
Bitext alignment is the task of aligning a text in a source language and its translation in the target language. Aligning amounts to finding the translational correspondences between textual units at different levels of granularity. Many practical natural language processing applications rely on bitext alignments to access the rich linguistic knowledge present in a bitext. While the most predominant application for bitexts is statistical machine translation, they are also used in multilingual (and monolingual) lexicography, word sense disambiguation, terminology extraction, computer-aided language learning and translation studies, to name a few. Bitext alignment is an arduous task because meaning is not expressed seemingly across languages. It varies along linguistic properties and cultural backgrounds of different languages, and also depends on the translation strategy that has been used to produce the bitext. Current practices in bitext alignment model the alignment as a hidden variable in the translation process.
In order to reduce the complexity of the task, such approaches suppose that a word in the source sentence is aligned to one word at most in the target sentence. However, this over-simplistic assumption results in asymmetric, one-to-many alignments, whereas alignments are typically symmetric and many-to-many. To achieve symmetry, two one-to-many alignments in opposite translation directions are built and combined using a heuristic. In order to use these word alignments in phrase-based translation systems, which use phrases instead of words, a heuristic is used to extract phrase pairs that are consistent with the word alignment. In this dissertation we address both the problems of word alignment and phrase pair extraction. We improve the state of the art in several ways using discriminative learning techniques. We present a maximum entropy (MaxEnt) framework for word alignment. In this framework, links are predicted independently from one another using a MaxEnt classifier. The interaction between alignment decisions is approximated using stacking techniques, which allows us to account for a part of the structural dependencies without increasing the complexity. This formulation can be seen as an alignment combination method, in which the union of several input alignments is used to guide the output alignment. Additionally, input alignments are used to compute a rich set of feature functions. Our MaxEnt aligner obtains state-of-the-art results in terms of alignment quality as measured by the alignment error rate, and translation quality as measured by BLEU on large-scale Arabic-English NIST'09 systems. We also present a translation quality informed procedure for both extraction and evaluation of phrase pairs. We reformulate the problem in the supervised framework in which we decide for each phrase pair whether we keep it or not in the translation model. This offers a principled way to combine several features to make the procedure more robust to alignment difficulties.
We use a simple and effective method, based on oracle decoding, to annotate phrase pairs that are useful for translation. Using machine learning techniques based on positive examples only, these annotations can be used to learn phrase alignment decisions. Using this approach we obtain improvements in BLEU scores for recall-oriented translation models, which are suitable for small training corpora.
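
The baseline pipeline that the description above starts from (combining two directional alignments, then extracting phrase pairs consistent with the result) can be illustrated with a minimal Python sketch. It is not the dissertation's MaxEnt method or its exact heuristics. The function names, the toy alignment, and the `max_len` limit are assumptions made for illustration. The extraction step returns only the tightest target span for each source span and does not extend over unaligned boundary words.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]                       # (source index, target index)
Span = Tuple[int, int]                       # inclusive (start, end)


def symmetrize(src2tgt: Set[Link], tgt2src: Set[Link]) -> Tuple[Set[Link], Set[Link]]:
    """Return (intersection, union) of two directional alignments.

    Links in tgt2src are given as (target index, source index) and are
    flipped so that both sets use (source, target) order.
    """
    flipped = {(s, t) for (t, s) in tgt2src}
    return src2tgt & flipped, src2tgt | flipped


def extract_phrase_pairs(links: Set[Link], src_len: int, max_len: int = 3) -> List[Tuple[Span, Span]]:
    """List phrase pairs consistent with the alignment.

    A pair is consistent when no link connects a word inside one span
    to a word outside the other span.
    """
    pairs = []
    for s1 in range(src_len):
        for s2 in range(s1, min(s1 + max_len, src_len)):
            # Target positions linked to the source span [s1, s2].
            tgt = [t for (s, t) in links if s1 <= s <= s2]
            if not tgt:
                continue                     # skip fully unaligned source spans
            t1, t2 = min(tgt), max(tgt)
            if t2 - t1 + 1 > max_len:
                continue
            # Every link landing in the target span must start in the source span.
            if all(s1 <= s <= s2 for (s, t) in links if t1 <= t <= t2):
                pairs.append(((s1, s2), (t1, t2)))
    return pairs


# Toy example: three source words, three target words (hypothetical data).
src2tgt = {(0, 0), (1, 1), (2, 1)}           # (source, target)
tgt2src = {(0, 0), (1, 1), (2, 2)}           # (target, source)
intersection, union = symmetrize(src2tgt, tgt2src)
pairs = extract_phrase_pairs(union, src_len=3)
```

The intersection keeps only links that both directions agree on (high precision), while the union keeps every proposed link (high recall). Heuristics used in practice typically start from the intersection and grow toward the union.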

Intercultural Collaboration

Author: Toru Ishida
Publisher: Springer
ISBN: 3540740007
Category : Computers
Languages : en
Pages : 406

Book Description
This book presents 29 revised invited and selected lectures given by top researchers at the First International Workshop on Intercultural Collaboration, IWIC 2007, held in Kyoto, Japan. This state-of-the-art survey increases mutual understanding in our multicultural world by featuring collaboration support, social psychological analyses of intercultural interaction, and case studies from field workers.

Information Retrieval Technology

Author: Mohamed Vall Mohamed Salem
Publisher: Springer Science & Business Media
ISBN: 3642256309
Category : Computers
Languages : en
Pages : 639

Book Description
This book constitutes the refereed proceedings of the 7th Asia Information Retrieval Societies Conference, AIRS 2011, held in Dubai, United Arab Emirates, in December 2011. The 31 revised full papers and 25 revised poster papers presented were carefully reviewed and selected from 132 submissions. All current aspects of information retrieval, in theory and practice, are addressed; the papers are organized in topical sections on information retrieval models and theories; information retrieval applications and multimedia information retrieval; user study, information retrieval evaluation and interactive information retrieval; Web information retrieval, scalability and adversarial information retrieval; machine learning for information retrieval; natural language processing for information retrieval; and Arabic script text processing and retrieval.