Corpus Linguistics and the Web

Author: Marianne Hundt
Publisher: Rodopi
ISBN: 9042021284
Category : Computers
Languages : en
Pages : 313

Book Description
Using the Web as Corpus is one of the recent challenges for corpus linguistics. This volume presents a current state-of-the-art discussion of the topic. The articles address practical problems such as suitable linguistic search tools for accessing the WWW and the question of register variation, or they probe into methods for culling data from the web. The book also offers a wide range of case studies covering morphology, syntax and lexis, as well as synchronic and diachronic variation in English. These case studies make use of the two approaches to the WWW in corpus linguistics: web-as-corpus and web-for-corpus-building. They demonstrate that web data can provide useful additional evidence for a broad range of research questions.

Web Corpus Construction

Author: Roland Schäfer
Publisher: Morgan & Claypool Publishers
ISBN: 1627053123
Category : Computers
Languages : en
Pages : 197

Book Description
The World Wide Web constitutes the largest existing source of texts written in a great variety of languages. A feasible and sound way of exploiting this data for linguistic research is to compile a static corpus for a given language. There are several advantages to this approach: (i) Working with such corpora obviates the problems encountered when using Internet search engines in quantitative linguistic research (such as non-transparent ranking algorithms). (ii) Creating a corpus from web data is virtually free. (iii) The size of corpora compiled from the WWW may exceed by several orders of magnitude the size of language resources offered elsewhere. (iv) The data is locally available to the user, and it can be linguistically post-processed and queried with the tools they prefer. This book addresses the main practical tasks in the creation of web corpora up to giga-token size. Among these tasks are the sampling process (i.e., web crawling) and the usual cleanups, including boilerplate removal and removal of duplicated content. Linguistic processing, and the problems it faces from the different kinds of noise in web corpora, are also covered. Finally, the authors show how web corpora can be evaluated and compared to other corpora (such as traditionally compiled corpora).

Web As Corpus

Author: Maristella Gatto
Publisher: A&C Black
ISBN: 1441134131
Category : Language Arts & Disciplines
Languages : en
Pages : 255

Book Description
Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the “web as corpus”. It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.

Developing Linguistic Corpora

Author: Martin Wynne
Publisher: Oxbow Books Limited
ISBN:
Category : Language Arts & Disciplines
Languages : en
Pages : 100

Book Description
A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

The Oxford Handbook of the History of English

Author: Terttu Nevalainen
Publisher: Oxford University Press
ISBN: 0190627883
Category : History
Languages : en
Pages : 983

Book Description
This ambitious handbook takes advantage of recent advances in the study of the history of English to rethink the understanding of the field.

Corpus Analysis

Author:
Publisher: BRILL
ISBN: 9004334416
Category : Language Arts & Disciplines
Languages : en
Pages : 294

Book Description
The papers published in this volume were originally presented at the Third North American Symposium on Corpus Linguistics and Language Teaching held on 23-25 March 2001 at the Park Plaza Hotel in Boston, Massachusetts. Each paper analyses some aspect of language use or structure in one or more of the many linguistic corpora now available. The number of different corpora investigated in the book is a real testament to the progress that has been made in recent years in developing new corpora, particularly spoken corpora, as over half of the papers deal either wholly or partially with the analysis of spoken data. This book will be of particular interest to undergraduate and graduate students and scholars interested in corpus, socio and applied linguistics, discourse analysis, pragmatics, and language teaching.

Text, Speech and Dialogue

Author: Petr Sojka
Publisher: Springer
ISBN: 3319108166
Category : Computers
Languages : en
Pages : 623

Book Description
This book constitutes the refereed proceedings of the 17th International Conference on Text, Speech and Dialogue, TSD 2014, held in Brno, Czech Republic, in September 2014. The 70 papers presented together with 3 invited papers were carefully reviewed and selected from 143 submissions. They focus on topics such as corpora and language resources; speech recognition; tagging, classification and parsing of text and speech; speech and spoken language generation; semantic processing of text and speech; integrating applications of text and speech processing; automatic dialogue systems; as well as multimodal techniques and modelling.

Corpus Linguistics

Author: Tony McEnery
Publisher: Cambridge University Press
ISBN: 1139502441
Category : Language Arts & Disciplines
Languages : en
Pages : 311

Book Description
Corpus linguistics is the study of language data on a large scale - the computer-aided analysis of very extensive collections of transcribed utterances or written texts. This textbook outlines the basic methods of corpus linguistics, explains how the discipline of corpus linguistics developed and surveys the major approaches to the use of corpus data. It uses a broad range of examples to show how corpus data has led to methodological and theoretical innovation in linguistics in general. Clear and detailed explanations lay out the key issues of method and theory in contemporary corpus linguistics. A structured and coherent narrative links the historical development of the field to current topics in 'mainstream' linguistics. Practical tasks and questions for discussion at the end of each chapter encourage students to test their understanding of what they have read, and an extensive glossary provides easy access to definitions of technical terms used in the text.

A Practical Handbook of Corpus Linguistics

Author: Magali Paquot
Publisher: Springer Nature
ISBN: 3030462161
Category : Philosophy
Languages : en
Pages : 686

Book Description
This handbook is a comprehensive practical resource on corpus linguistics. It features a range of basic and advanced approaches, methods and techniques in corpus linguistics, from corpus compilation principles to quantitative data analyses. The Handbook is organized in six Parts. Parts I to III feature chapters that discuss key issues and the know-how related to various topics around corpus design, methods and corpus types. Parts IV and V aim to offer a user-friendly introduction to the quantitative analysis of corpus data: for each statistical technique discussed, chapters provide a practical guide with R and come with supplementary online material. Part VI focuses on how to write a corpus linguistic paper and how to meta-analyze corpus linguistic research. The volume can serve as a course book as well as for individual study. It will be essential reading for students of corpus linguistics as well as experienced researchers who want to expand their knowledge of the field.

Corpus-based Language Studies

Author: Tony McEnery
Publisher: Taylor & Francis
ISBN: 9780415286237
Category : Foreign Language Study
Languages : en
Pages : 412

Book Description
Covering the major approaches to the use of corpus data, this work gathers together influential readings from leading names in the discipline, including Biber, Widdowson, Sinclair, Carter and McCarthy.