Building a National Corpus

Author: Dawn Knight
Publisher: Springer Nature
ISBN: 3030818586
Category : Language Arts & Disciplines
Languages : en
Pages : 192

Book Description
This book aims to provide a micro-level, working model of a methodological approach and practical guidelines for building a corpus, informed by the work on the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes - the National Corpus of Contemporary Welsh). It focuses specifically on the development of detailed design frames for corpora across communicative modes (spoken, written and e-language), and the practical processes involved in the planning, collection, transcription, collation and (re)presentation of language data. The book is designed to be of significant value and relevance to those interested in critically engaging with corpus methodology. Although Welsh is the language under discussion, the processes and approaches discussed in the building of CorCenCC can be applied to a lesser or greater extent to other language contexts. This book provides a working model, and an account of how to build a corpus dataset, from which step-by-step guidelines for creating other linguistic corpora in any language can be easily extrapolated. It will be of value to students and scholars of minority languages and corpus linguistics.

Developing Linguistic Corpora

Author: Martin Wynne
Publisher: Oxbow Books Limited
ISBN:
Category : Language Arts & Disciplines
Languages : en
Pages : 100

Book Description
A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

Overcoming Challenges in Corpus Construction

Author: Robbie Love
Publisher: Routledge
ISBN: 0429771096
Category : Language Arts & Disciplines
Languages : en
Pages : 183

Book Description
This volume offers a critical examination of the construction of the Spoken British National Corpus 2014 (Spoken BNC2014) and points the way forward toward a more informed understanding of corpus linguistic methodology more broadly. The book begins by situating the creation of this second corpus, a compilation of new, publicly accessible Spoken British English from the 2010s, within the context of the first, created in 1994, talking through the need to balance backward compatibility and optimal practice for today’s users. Chapters subsequently use the Spoken BNC2014 as a focal point around which to discuss the various considerations taken into account in corpus construction, including design, data collection, transcription, and annotation. The volume concludes by reflecting on the successes and limitations of the project, as well as the broader utility of the corpus in linguistic research, both in current examples and future possibilities. This exciting new contribution to the literature on linguistic methodology is a valuable resource for students and researchers in corpus linguistics, applied linguistics, and English language teaching.

English Corpus Linguistics

Author: Charles F. Meyer
Publisher:
ISBN: 9780511044755
Category : Computational linguistics
Languages : en
Pages : 168

Book Description
English Corpus Linguistics is a step-by-step guide to creating and analyzing linguistic corpora. The author shows how to collect and computerize data for inclusion in a corpus; how to annotate the data; and how to conduct a linguistic analysis of it once it has been created.

Using Corpora in Discourse Analysis

Author: Paul Baker
Publisher: Bloomsbury Publishing
ISBN: 1350083771
Category : Language Arts & Disciplines
Languages : en
Pages : 281

Book Description
How can you carry out discourse analysis using corpus linguistics? What research questions should you ask? Which methods should you use and when? What is a collocational network or a key cluster? Introducing the major techniques, methods and tools for corpus-assisted analysis of discourse, this book answers these questions and more, showing readers how to best use corpora in their analyses of discourse. Using carefully tailored case studies, each chapter is devoted to a central technique, including frequency, concordancing and keywords, going step by step through the process of applying different analytical procedures. Introducing a wide range of different corpora, from holiday brochures to political debates, the book considers the key debates and latest advances in the field. Fully revised and updated, this new edition includes:
- A new chapter on how to conduct research projects in corpus-based discourse analysis
- Completely rewritten chapters on collocation and advanced techniques, using a corpus of jihadist propaganda texts and covering topics such as social media and visual analysis
- Coverage of major tools, including CQPweb, AntConc, Sketch Engine and #LancsBox
- Discussion of newer techniques including the derivation of lockwords and the comparison of multiple data sets for diachronic analysis
With exercises, discussion questions and suggested further readings in each chapter, this book is an excellent guide to using corpus linguistics techniques to carry out discourse analysis.

Statistics in Corpus Linguistics

Author: Vaclav Brezina
Publisher: Cambridge University Press
ISBN: 1107125707
Category : Foreign Language Study
Languages : en
Pages : 317

Book Description
A comprehensive and accessible introduction to statistics in corpus linguistics, covering multiple techniques of quantitative language analysis and data visualisation.

Corpus Linguistics and Linguistically Annotated Corpora

Author: Sandra Kuebler
Publisher: Bloomsbury Publishing
ISBN: 1441119809
Category : Language Arts & Disciplines
Languages : en
Pages : 321

Book Description
Linguistically annotated corpora are becoming a central part of the corpus linguistics field. One of their main strengths is the level of searchability they offer, but with the annotation come problems of the initial complexity of queries and query tools. This book gives a full, pedagogic account of this burgeoning field. Beginning with an overview of corpus linguistics, its prerequisites and goals, the book then introduces linguistically annotated corpora. It explores the different levels of linguistic annotation, including morphological, parts of speech, syntactic, semantic and discourse-level, as well as advantages and challenges for such annotations. It covers the main annotated corpora for English, the Penn Treebank, the International Corpus of English, and OntoNotes, as well as a wide range of corpora for other languages. In its third part, search strategies required for different types of data are explored. All chapters are accompanied by exercises and by sections on further reading.

The Cambridge Handbook of Learner Corpus Research

Author: Sylviane Granger
Publisher: Cambridge University Press
ISBN: 1316432149
Category : Language Arts & Disciplines
Languages : en
Pages : 1199

Book Description
The origins of learner corpus research go back to the late 1980s when large electronic collections of written or spoken data started to be collected from foreign/second language learners, with a view to advancing our understanding of the mechanisms of second language acquisition and developing tailor-made pedagogical tools. Engaging with the interdisciplinary nature of this fast-growing field, The Cambridge Handbook of Learner Corpus Research explores the diverse and extensive applications of learner corpora, with 27 chapters written by internationally renowned experts. This comprehensive work is a vital resource for students, teachers and researchers, offering fresh perspectives and a unique overview of the field. With representative studies in each chapter which provide an essential guide on how to conduct learner corpus research in a wide range of areas, this work is a cutting-edge account of learner corpus collection, annotation, methodology, theory, analysis and applications.

The BNC Handbook

Author: Guy Aston
Publisher:
ISBN:
Category : Language Arts & Disciplines
Languages : en
Pages : 284

Book Description
The authors explain how to use large language corpora in exploratory learning and in English language teaching and research. They focus on the largest corpus of spoken and written data compiled (the BNC) and on the search tool SARA.

Understanding Corpus Linguistics

Author: Danielle Barth
Publisher: Routledge
ISBN: 1000466752
Category : Language Arts & Disciplines
Languages : en
Pages : 276

Book Description
This textbook introduces the fundamental concepts and methods of corpus linguistics for students approaching this topic for the first time, putting specific emphasis on the enormous linguistic diversity represented by approximately 7,000 human languages and broadening the scope of current concerns in general corpus linguistics. Including a basic toolkit to help the reader investigate language in different usage contexts, this book:
- Shows the relevance of corpora to a range of linguistic areas from phonology to sociolinguistics and discourse
- Covers recent developments in the application of corpus linguistics to the study of understudied languages and linguistic typology
- Features exercises, short problems, and questions
- Includes examples from real studies in over 15 languages plus multilingual corpora
Providing the necessary corpus linguistics skills to critically evaluate and replicate studies, this book is essential reading for anyone studying corpus linguistics.