Author: Yuji Kawaguchi
Publisher: John Benjamins Publishing
ISBN: 9789027233189
Category : Language Arts & Disciplines
Languages : en
Pages : 464
Book Description
UBLI has conducted field surveys since 2002 and built spoken language corpora for French, Spanish, Italian (Salentino dialect), Russian, Malaysian, Turkish, Japanese, and Canadian multilinguals. This volume features new research presented at the UBLI second workshop on Corpus Linguistics Research Domain, held on September 14, 2006. The first part, consisting of eleven presentations from this workshop, covers a wide range of subjects within corpus-based research, such as dictionaries, linguistic atlases, dialects, translation, ancient texts, non-standard texts, sociolinguistics, second language acquisition, and natural language processing. The second part comprises ten additional contributions on both written and spoken corpora by the members and research assistants of UBLI.
Corpus-based Perspectives in Linguistics
Author: Yuji Kawaguchi
Publisher: John Benjamins Publishing
ISBN: 9789027233189
Category : Language Arts & Disciplines
Languages : en
Pages : 464
Book Description
UBLI has conducted field surveys since 2002 and built spoken language corpora for French, Spanish, Italian (Salentino dialect), Russian, Malaysian, Turkish, Japanese, and Canadian multilinguals. This volume features new research presented at the UBLI second workshop on Corpus Linguistics Research Domain, held on September 14, 2006. The first part, consisting of eleven presentations from this workshop, covers a wide range of subjects within corpus-based research, such as dictionaries, linguistic atlases, dialects, translation, ancient texts, non-standard texts, sociolinguistics, second language acquisition, and natural language processing. The second part comprises ten additional contributions on both written and spoken corpora by the members and research assistants of UBLI.
Corpus-based Computational Linguistics
Author: Clive Souter
Publisher: Rodopi
ISBN: 9789051834857
Category : Computers
Languages : en
Pages : 292
Book Description
Corpus-based and Computational Approaches to Discourse Anaphora
Author: Simon Botley
Publisher: John Benjamins Publishing
ISBN: 902722272X
Category : Language Arts & Disciplines
Languages : en
Pages : 264
Book Description
Discourse anaphora is a challenging linguistic phenomenon that has given rise to research in fields as diverse as linguistics, computational linguistics and cognitive science. Because of the diversity of approaches these fields bring to the anaphora problem, the editors of this volume argue that there needs to be a synthesis, or at least a principled attempt to draw the differing strands of anaphora research together. The selected papers in this volume all contribute to the aim of synthesis and were selected to represent the growing importance of corpus-based and computational approaches to anaphora description, and to developing natural language systems for resolving anaphora in natural language.
Corpus Linguistics and Statistics with R
Author: Guillaume Desagulier
Publisher: Springer
ISBN: 3319645722
Category : Computers
Languages : en
Pages : 359
Book Description
This textbook examines empirical linguistics from a theoretical linguist’s perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequencing data, and clustering methods. Case studies illustrate each chapter with accompanying data sets, R code, and exercises for use by readers. This book may be used in advanced undergraduate courses, graduate courses, and self-study.
The Computational Analysis of English
Author: Roger Garside
Publisher: Longman Publishing Group
ISBN:
Category : Computers
Languages : en
Pages : 216
Book Description
Natural Language Processing Using Very Large Corpora
Author: S. Armstrong
Publisher: Springer Science & Business Media
ISBN: 9401723907
Category : Language Arts & Disciplines
Languages : en
Pages : 314
Book Description
This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It is not meant as an introduction to this field; for readers who need one, several entry-level texts are available, including those of (Church and Mercer, 1993; Charniak, 1993; Jelinek, 1997). This book captures the essence of a series of highly successful workshops held in the last few years. The response in 1993 to the initial Workshop on Very Large Corpora (Columbus, Ohio) was so enthusiastic that we were encouraged to make it an annual event. The following year, we staged the Second Workshop on Very Large Corpora in Kyoto. As a way of managing these annual workshops, we then decided to register a special interest group called SIGDAT with the Association for Computational Linguistics. The demand for international forums on corpus-based NLP has been expanding so rapidly that in 1995 SIGDAT was led to organize not only the Third Workshop on Very Large Corpora (Cambridge, Mass.) but also a complementary workshop entitled From Texts to Tags (Dublin). Obviously, the success of these workshops was in some measure a reflection of the growing popularity of corpus-based methods in the NLP community. But first and foremost, it was due to the fact that the workshops attracted so many high-quality papers.
Advances in Corpus-based Contrastive Linguistics
Author: Karin Aijmer
Publisher: John Benjamins Publishing
ISBN: 9027272328
Category : Computers
Languages : en
Pages : 307
Book Description
Contrastive studies have experienced a dramatic revival in the last decades. By combining the methodological advantages of computer corpus linguistics and the possibility of contrasting texts in two or more languages, the structure and use of languages can be explored with greater accuracy, detail and empirical strength than before. The approach has also proved to have fruitful practical applications in a number of areas such as language teaching, lexicography, translation studies and computer-aided translation. This volume contains twelve studies comparing linguistic phenomena in English and seven other languages. The topics range from comparisons of specific lexical categories and word combinations to syntactic constructions and discourse phenomena such as cohesion and thematic structure. The studies highlight similarities and differences in the use, semantics and functions of the compared items, as well as the emergence of new meanings and language change. The emphasis varies from purely linguistic studies to those focusing on practical applications.
Corpus-Based Computational Linguistics
Author: Souter
Publisher: BRILL
ISBN: 9004653546
Category : Computers
Languages : en
Pages : 288
Book Description
Practical Corpus Linguistics
Author: Martin Weisser
Publisher: John Wiley & Sons
ISBN: 1118831888
Category : Language Arts & Disciplines
Languages : en
Pages : 306
Book Description
This is the first book of its kind to provide a practical and student-friendly guide to corpus linguistics that explains the nature of electronic data and how it can be collected and analyzed. Designed to equip readers with the technical skills necessary to analyze and interpret language data, both written and (orthographically) transcribed. Introduces a number of easy-to-use, yet powerful, free analysis resources consisting of standalone programs and web interfaces for use with Windows, Mac OS X, and Linux. Each section includes practical exercises, a list of sources and further reading, and illustrated step-by-step introductions to analysis tools. Requires only a basic knowledge of computer concepts in order to develop the specific linguistic analysis skills required for understanding/analyzing corpus data.
Natural Language Processing for Corpus Linguistics
Author: Jonathan Dunn
Publisher: Cambridge University Press
ISBN: 1009083740
Category : Language Arts & Disciplines
Languages : en
Pages : 149
Book Description
Corpus analysis can be expanded and scaled up by incorporating computational methods from natural language processing. This Element shows how text classification and text similarity models can extend our ability to undertake corpus linguistics across very large corpora. These computational methods are becoming increasingly important as corpora grow too large for more traditional types of linguistic analysis. We draw on five case studies to show how and why to use computational methods, ranging from usage-based grammar to authorship analysis to using social media for corpus-based sociolinguistics. Each section is accompanied by an interactive code notebook that shows how to implement the analysis in Python. A stand-alone Python package is also available to help readers use these methods with their own data. Because large-scale analysis introduces new ethical problems, this Element pairs each new methodology with a discussion of potential ethical implications.