Multiword Units in Machine Translation and Translation Technology
Author: Ruslan Mitkov
Publisher: John Benjamins Publishing Company
ISBN: 9027264201
Category : Language Arts & Disciplines
Languages : en
Pages : 271
Book Description
The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing but is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention but there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully. This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.
Recent Advances in Multiword Units in Machine Translation and Translation Technology
Author: Johanna Monti
Publisher: John Benjamins Publishing Company
ISBN: 9027246386
Category : Language Arts & Disciplines
Languages : en
Pages : 276
Book Description
The investigation of phraseology through corpus-based and computational approaches holds significant relevance for various professionals, including translators, interpreters, terminologists, lexicographers, language instructors, and learners. Computational Phraseology, and in particular the computational analysis of multiword expressions (also known as multiword units), has gained prominence in recent years and is essential for a number of Natural Language Processing and Translation Technology applications. The failure to detect these units automatically could result in incorrect and problematic automatic translations and could hinder the performance of applications such as text summarisation and web search. Against this background, the volume offers 13 articles carefully selected and organised into two parts: ‘Computational treatment of multiword units’ and ‘Corpus-based and linguistic studies in phraseology’. The contributions not only highlight the latest advancements in computational and corpus-based phraseology but also reiterate its vital role in all areas of language technologies, including basic and applied research.
Multiword Expressions Acquisition
Author: Carlos Ramisch
Publisher: Springer
ISBN: 3319092073
Category : Computers
Languages : en
Pages : 233
Book Description
This book is an excellent introduction to multiword expressions. It provides a unique, comprehensive and up-to-date overview of this exciting topic in computational linguistics. The first part describes the diversity and richness of multiword expressions, including many examples in several languages. These constructions are not only complex and arbitrary, but also much more frequent than one would guess, making them a real nightmare for natural language processing applications. The second part introduces a new generic framework for automatic acquisition of multiword expressions from texts. Furthermore, it describes the accompanying free software tool, the mwetoolkit, which comes in handy when looking for expressions in texts (regardless of the language). Evaluation is greatly emphasized, underlining the fact that results depend on parameters like corpus size, language, MWE type, etc. The last part contains solid experimental results and evaluates the mwetoolkit, demonstrating its usefulness for computer-assisted lexicography and machine translation. This is the first book to cover the whole pipeline of multiword expression acquisition in a single volume. It addresses the needs of students and researchers in computational and theoretical linguistics, cognitive sciences, artificial intelligence and computer science. Its good balance between computational and linguistic views makes it the perfect starting point for anyone interested in multiword expressions, language and text processing in general.
Lexical Collocation Analysis
Author: Pascual Cantos-Gómez
Publisher: Springer
ISBN: 3319925822
Category : Social Science
Languages : en
Pages : 145
Book Description
This book re-examines the notion of word associations, more precisely collocations. It attempts to come to a potentially more generally applicable definition of collocation and how to best extract, identify and measure collocations. The book highlights the role played by (i) automatic linguistic annotation (part-of-speech tagging, syntactic parsing, etc.), (ii) using semantic criteria to facilitate the identification of collocations, (iii) multi-word structures, instead of the widespread assumption of bipartite collocational structures, for capturing the intricacies of the phenomenon of syntagmatic attraction, (iv) considering collocation and valency as near neighbours in the lexis-grammar continuum and (v) the mathematical properties of statistical association measures in the automatic extraction of collocations from corpora. This book is an ideal guide to the use of statistics in collocation analysis and lexicography, as well as a practical text for the development of skills in the application of computational lexicography. Lexical Collocation Analysis: Advances and Applications begins with a proposal for integrating both collocational and valency phenomena within the overarching theoretical framework of construction grammar. Next, the book makes the case for integrating advances in syntactic parsing and in collocational analysis. Chapter 3 offers an innovative look at complementing corpus data and dictionaries in the identification of specific types of collocations consisting of restricted predicate-argument combinations. This strategy complements corpus collocational data with network analysis techniques applied to dictionary entries. Chapter 4 explains the potential of collocational graphs and networks both as a visualization tool and as an analytical technique. Chapter 5 introduces MERGE (Multi-word Expressions from the Recursive Grouping of Elements), a data-driven approach to the identification and extraction of multi-word expressions from corpora. Finally, the book concludes with an analysis and evaluation of factors influencing the performance of collocation extraction methods in parsed corpora.
Computational and Corpus-Based Phraseology
Author: Gloria Corpas Pastor
Publisher: Springer Nature
ISBN: 303115925X
Category : Computers
Languages : en
Pages : 252
Book Description
This book constitutes the refereed proceedings of the 4th International Conference on Computational and Corpus-Based Phraseology, Europhras 2022, held in Malaga, Spain, in September 2022. The 16 full papers presented in this book were carefully reviewed and selected from 59 submissions. The papers in this volume cover a number of topics including general corpus-based approaches to phraseology, phraseology in translation and cross-linguistic studies, phraseology in language teaching and learning, phraseology in specialized languages, phraseology in lexicography, cognitive approaches to phraseology, the computational treatment of multiword expressions, and the development, annotation, and exploitation of corpora for phraseological studies.
Computational and Corpus-Based Phraseology
Author: Ruslan Mitkov
Publisher: Springer
ISBN: 3319698052
Category : Computers
Languages : en
Pages : 464
Book Description
This book constitutes the refereed proceedings of the International Conference on Computational and Corpus-Based Phraseology, Europhras 2017, held in London, UK, in November 2017. The 31 full papers presented were carefully reviewed and selected from numerous submissions and are organized into the following thematic sessions: Phraseology in translation and contrastive studies, Lexicography and terminography, Exploitation of corpora in phraseological studies, Development of corpora for phraseological studies, Phraseology and language learning, Cognitive and cultural aspects of phraseology, Theoretical and descriptive approaches to phraseology, and Computational approaches to phraseology. The chapter 'Frequency Consolidation Among Word N-Grams' is available open access under a CC BY 4.0 license.
Formalising Natural Languages with Nooj 2014
Author: Mario Monteleone
Publisher: Cambridge Scholars Publishing
ISBN: 1443884642
Category : Computers
Languages : en
Pages : 260
Book Description
This volume is composed of 22 peer-reviewed contributions selected from among the 52 presentations submitted for the 2014 International NooJ Conference held at the University of Sassari, Italy. NooJ is a linguistic development environment that allows linguists to formalize a wide range of linguistic phenomena, and then test, adapt, share and accumulate each elementary description so as to build linguistic “modules”, that is, structured libraries of linguistic resources. NooJ is also used as a corpus processor that can launch sophisticated queries over large corpora of texts, in order to produce various results, including concordances, statistical analyses, information extraction, and automatic translation. NooJ is used in many research centers all over the world, and linguistic modules are available for more than 20 languages. NooJ is also used by a growing number of software companies to develop various Natural Language Processing applications. Johanna Monti is Associate Professor at the University of Sassari, Italy, where she teaches Translation Studies, Computational Linguistics, and Machine-Translation and Computer-Aided Translation. She has acted as a member of the scientific committees of various renowned international conferences on Natural Language Processing, and as external evaluator for the Italian Ministry for Education, Universities and Research (MIUR) and the Horizon 2020 programme.
Computational Phraseology
Author: Gloria Corpas Pastor
Publisher: John Benjamins Publishing Company
ISBN: 9027261393
Category : Language Arts & Disciplines
Languages : en
Pages : 341
Book Description
Whether you wish to deliver on a promise, take a walk down memory lane or even on the wild side, phraseological units (also often referred to as phrasemes or multiword expressions) are present in most communicative situations and in all the world’s languages. Phraseology, the study of phraseological units, has therefore become a rare unifying theme across linguistic theories. In recent years, an increasing number of studies have been concerned with the computational treatment of multiword expressions: these pertain, among others, to their automatic identification, extraction or translation, and to the role they play in various Natural Language Processing applications. Computational Phraseology is a comparatively new field where better understanding and more advances are urgently needed. This book aims to address this pressing need by bringing together contributions focusing on different perspectives of this promising interdisciplinary field.
Multiword expressions at length and in depth
Author: Stella Markantonatou
Publisher: Language Science Press
ISBN: 396110123X
Category : Computational linguistics
Languages : en
Pages : 408
Book Description
The annual workshop on multiword expressions has taken place since 2001 in conjunction with major computational linguistics conferences and attracts the attention of an ever-growing community working on a variety of languages, linguistic phenomena and related computational processing issues. MWE 2017 took place in Valencia, Spain, and represented a vibrant panorama of the current research landscape on the computational treatment of multiword expressions, featuring many high-quality submissions. Furthermore, MWE 2017 included the first shared task on multilingual identification of verbal multiword expressions. The shared task, with extended communal work, has developed important multilingual resources and mobilised several research groups in computational linguistics worldwide. This book contains extended versions of selected papers from the workshop. Authors worked hard to include detailed explanations, broader and deeper analyses, and new exciting results, which were thoroughly reviewed by an internationally renowned committee. We hope that this distinctly joint effort will provide a meaningful and useful snapshot of the multilingual state of the art in multiword expressions modelling and processing, and will be a point of reference for future work.
Computational Linguistics and Intelligent Text Processing
Author: Alexander Gelbukh
Publisher: Springer
ISBN: 3319754874
Category : Computers
Languages : en
Pages : 652
Book Description
The two-volume set LNCS 9623 + 9624 constitutes revised selected papers from the CICLing 2016 conference which took place in Konya, Turkey, in April 2016. The total of 89 papers presented in the two volumes was carefully reviewed and selected from 298 submissions. The book also contains 4 invited papers and a memorial paper on Adam Kilgarriff’s Legacy to Computational Linguistics. The papers are organized in the following topical sections: Part I: In memoriam of Adam Kilgarriff; general formalisms; embeddings, language modeling, and sequence labeling; lexical resources and terminology extraction; morphology and part-of-speech tagging; syntax and chunking; named entity recognition; word sense disambiguation and anaphora resolution; semantics, discourse, and dialog. Part II: machine translation and multilingualism; sentiment analysis, opinion mining, subjectivity, and social media; text classification and categorization; information extraction; and applications.