Statistically-Driven Computer Grammars of English

Author: Black
Publisher: BRILL
ISBN: 9004653538
Category: Language Arts & Disciplines
Languages: en
Pages: 262

Book Description
This book is about building computer programs that parse (analyze, or diagram) sentences of real-world English. The English we are concerned with might be a corpus of everyday, naturally-occurring prose, such as the entire text of this morning's newspaper. Most programs that now exist for this purpose are not very successful at finding the correct analysis for everyday sentences. In contrast, the programs described here make use of a more successful statistically-driven approach. Our book is, first, a record of a five-year research collaboration between IBM and Lancaster University. Large numbers of real-world sentences were fed into the memory of a program for grammatical analysis (including a detailed grammar of English) and processed by statistical methods. The idea is to single out the correct parse, among all those offered by the grammar, on the basis of probabilities. Second, this is a how-to book, showing how to build and implement a statistically-driven broad-coverage grammar of English. We even supply our own grammar, with the necessary statistical algorithms, and with the knowledge needed to prepare a very large set (or corpus) of sentences so that it can be used to guide the statistical processing of the grammar's rules.
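
The idea of singling out one parse on the basis of probabilities can be illustrated with a minimal sketch. It is not the book's own grammar or algorithm: it assumes a simple probabilistic context-free grammar in which a parse is scored by multiplying the probabilities of the rules it uses. The rule table, the probability values, and the names `RULE_PROB`, `parse_probability`, and `best_parse` are all hypothetical.

```python
from math import prod

# Hypothetical rule probabilities (assumed values, for illustration only).
# A real system would estimate them from a large parsed corpus.
RULE_PROB = {
    ("VP", ("V", "NP", "PP")): 0.3,
    ("VP", ("V", "NP")): 0.7,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("Det", "N")): 0.8,
    ("PP", ("P", "NP")): 1.0,
}


def parse_probability(rules_used):
    """Multiply the probabilities of every rule application in one parse."""
    return prod(RULE_PROB[rule] for rule in rules_used)


def best_parse(candidates):
    """Return the name of the candidate parse with the highest probability."""
    return max(candidates, key=lambda name: parse_probability(candidates[name]))


# Two candidate analyses of "saw the man with the telescope",
# each listed as the grammar rules it applies.
candidates = {
    "pp_attaches_to_verb": [
        ("VP", ("V", "NP", "PP")),
        ("NP", ("Det", "N")),
        ("PP", ("P", "NP")),
        ("NP", ("Det", "N")),
    ],
    "pp_attaches_to_noun": [
        ("VP", ("V", "NP")),
        ("NP", ("NP", "PP")),
        ("NP", ("Det", "N")),
        ("PP", ("P", "NP")),
        ("NP", ("Det", "N")),
    ],
}

if __name__ == "__main__":
    for name, rules in candidates.items():
        print(name, parse_probability(rules))
    print("selected:", best_parse(candidates))
```

Under these assumed values the verb-attachment reading scores higher and is selected. Different probabilities, or a richer statistical model than this one, could reverse the choice.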

English Language Corpora

Author:
Publisher: BRILL
ISBN: 9004653554
Category: Computers
Languages: en
Pages: 336

Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing

Author: Stefan Wermter
Publisher: Springer Science & Business Media
ISBN: 9783540609254
Category: Computers
Languages: en
Pages: 490

Book Description
This book is based on the workshop on New Approaches to Learning for Natural Language Processing, held in conjunction with the International Joint Conference on Artificial Intelligence, IJCAI'95, in Montreal, Canada in August 1995. Most of the 32 papers included in the book are revised selected workshop presentations; some papers were individually solicited from members of the workshop program committee to give the book an overall completeness. Also included, and written with the novice reader in mind, is a comprehensive introductory survey by the volume editors. The volume presents the state of the art in the most promising current approaches to learning for NLP and is thus compulsory reading for researchers in the field or for anyone applying the new techniques to challenging real-world NLP problems.

Grammar of Spoken and Written English

Author: Douglas Biber
Publisher: John Benjamins Publishing Company
ISBN: 9027260478
Category: Language Arts & Disciplines
Languages: en
Pages: 1258

Book Description
The completely redesigned Grammar of Spoken and Written English is a comprehensive corpus-based reference grammar. GSWE describes the structural characteristics of grammatical constructions in English, as do other reference grammars. But GSWE is unique in that it gives equal attention to describing the patterns of language use for each grammatical feature, based on empirical analyses of grammatical patterns in a 40-million-word corpus of spoken and written registers. Grammar-in-use is characterized by three inter-related kinds of information: frequency of grammatical features in spoken and written registers, frequencies of the most common lexico-grammatical patterns, and analysis of the discourse factors influencing choices among related grammatical features. GSWE includes over 350 tables and figures highlighting the results of corpus-based investigations. Throughout the book, authentic examples illustrate all research findings. The empirical descriptions document the lexico-grammatical features that are especially common in face-to-face conversation compared to those that are especially common in academic writing. Analyses of fiction and newspaper articles are included as further benchmarks of language use. GSWE contains over 6,000 authentic examples from these four registers, illustrating the range of lexico-grammatical features in real-world speech and writing. In addition, comparisons between British and American English reveal specific regional differences. Now completely redesigned and available in an electronic edition, the Grammar of Spoken and Written English remains a unique and indispensable reference work for researchers, language teachers, and students alike.

New Methods In Language Processing

Author: D. B. Jones
Publisher: Routledge
ISBN: 1134227388
Category: Language Arts & Disciplines
Languages: en
Pages: 385

Book Description
Studies in Computational Linguistics presents authoritative texts from an international team of leading computational linguists. The books range from the senior undergraduate textbook to the research level monograph and provide a showcase for a broad range of recent developments in the field. The series should be interesting reading for researchers and students alike involved at this interface of linguistics and computing.

Grammatical Inference: Learning Syntax from Sentences

Author: Laurent Miclet
Publisher: Springer Science & Business Media
ISBN: 9783540617785
Category: Computers
Languages: en
Pages: 340

Book Description
This book constitutes the refereed proceedings of the Third International Colloquium on Grammatical Inference, ICGI-96, held in Montpellier, France, in September 1996. The 25 revised full papers contained in the book together with two invited key papers by Magerman and Knuutila were carefully selected for presentation at the conference. The papers are organized in sections on algebraic methods and algorithms, natural language and pattern recognition, inference and stochastic models, incremental methods and inductive logic programming, and operational issues.

Quantitative Linguistik / Quantitative Linguistics

Author: Reinhard Köhler
Publisher: Walter de Gruyter
ISBN: 3110194147
Category: Language Arts & Disciplines
Languages: en
Pages: 1056

Book Description
Over the past two decades, statistical and other quantitative concepts, models and methods have been increasingly gaining importance and interest in all areas of linguistics and text analysis, as well as in a number of neighboring disciplines and areas of application. The term "quantitative linguistics" comprises all scientific and technical approaches which use such terms and methods in the analysis of or work with language(s), texts and other related subjects. The 71 articles in this handbook, written by internationally-recognized experts, offer a broad, up-to-date overview of the scientific-theoretical principles, the history, the diversity of the subject areas studied, the methods and models used, the results obtained thus far and their applications. The articles are divided up into thirteen chapters: the first chapter includes contributions on the basic principles and the history of the field, nine additional chapters are dedicated to individual descriptions of the levels of linguistic research (from phonology to pragmatics) as well as typological, diachronic and geolinguistic questions. The next two chapters include a description of important models, hypotheses and principles; selected areas of application; and references to neighboring disciplines. The last portion of the handbook is an informative contribution, with information about publication forums, bibliographies, major projects, Internet links, etc. This handbook is useful not only for researchers, teachers and students of all branches of linguistics and the philologies, but also for scientists in neighboring fields, whose theoretical and empirical research touches on linguistic questions (for instance, psychology and sociology), or for those who want to make use of the proven methods or results from quantitative linguistics in their own research.

Learner English on Computer

Author: Sylviane Granger
Publisher: Routledge
ISBN: 1317885589
Category: Language Arts & Disciplines
Languages: en
Pages: 219

Book Description
The first book of its kind, Learner English on Computer is intended to provide linguists, students of linguistics and modern languages, and ELT professionals with a highly accessible and comprehensive introduction to the new and rapidly-expanding field of corpus-based research into learner language. Edited by the founder and co-ordinator of the International Corpus of Learner English (ICLE), the book contains articles on all aspects of corpus compilation, design and analysis. The book is divided into three main sections. In Part 1, the first chapter provides the reader with an overview of the field, explaining links with corpus and applied linguistics, second language acquisition and ELT. The second chapter reviews the software tools which are currently available for analysing learner language and contains useful examples of how they can be used. Part 2 contains eight case studies in which computer learner corpora are analysed for various lexical, discourse and grammatical features. The articles contain a wide range of methodologies with broad general application. The chapters in Part 3 look at how Computer Learner Corpus (CLC) based studies can help improve pedagogical tools: EFL grammars, dictionaries, writing textbooks and electronic tools. Implications for classroom methodology are also discussed. The comprehensive scope of this volume should be invaluable to applied linguists and corpus linguists as well as to would-be learner corpus builders and analysts who wish to discover more about a new, exciting and fast-growing field of research.

Advances in Probabilistic and Other Parsing Technologies

Author: H. Bunt
Publisher: Springer Science & Business Media
ISBN: 9401594708
Category: Language Arts & Disciplines
Languages: en
Pages: 277

Book Description
Parsing technology is concerned with finding syntactic structure in language. In parsing we have to deal with incomplete and not necessarily accurate formal descriptions of natural languages. Robustness and efficiency are among the main issues in parsing. Corpora can be used to obtain frequency information about language use. This allows probabilistic parsing, an approach that aims to increase both robustness and efficiency. Approximation techniques, to be applied at the level of language description, parsing strategy, and syntactic representation, have the same objective. Approximation at the level of syntactic representation is also known as underspecification, a traditional technique to deal with syntactic ambiguity. In this book new parsing technologies are collected that aim at attacking the problems of robustness and efficiency by exactly these techniques: the design of probabilistic grammars and efficient probabilistic parsing algorithms, approximation techniques applied to grammars and parsers to increase parsing efficiency, and techniques for underspecification and the integration of semantic information in the syntactic analysis to deal with massive ambiguity. The book gives a state-of-the-art overview of current research and development in parsing technologies. In its chapters we see how probabilistic methods have entered the toolbox of computational linguistics in order to be applied in both parsing theory and parsing practice. The book is both a unique reference for researchers and an introduction to the field for interested graduate students.
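
As a rough illustration of how corpus frequency information can feed a probabilistic grammar, the sketch below uses relative-frequency estimation: each rule's probability is its count divided by the count of all rules with the same left-hand side. This is a common textbook technique and is assumed here, not taken from the book. The function name `estimate_rule_probabilities` and the tiny list of observed rules are hypothetical.

```python
from collections import Counter


def estimate_rule_probabilities(observed_rules):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(observed_rules)
    lhs_counts = Counter(lhs for lhs, _ in observed_rules)
    return {
        rule: count / lhs_counts[rule[0]]
        for rule, count in rule_counts.items()
    }


# Hypothetical rule applications, as they might be read off a small
# hand-parsed corpus (assumed data, for illustration only).
observed = [
    ("NP", ("Det", "N")),
    ("NP", ("Det", "N")),
    ("NP", ("Det", "N")),
    ("NP", ("NP", "PP")),
    ("VP", ("V", "NP")),
]

probs = estimate_rule_probabilities(observed)

if __name__ == "__main__":
    for (lhs, rhs), p in sorted(probs.items()):
        print(f"{lhs} -> {' '.join(rhs)}: {p}")
```

The estimated probabilities for each left-hand side sum to one, so they can be used directly to score competing parses.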

Industrial Parsing of Software Manuals

Author: Sutcliffe
Publisher: BRILL
ISBN: 9004653619
Category: Computers
Languages: en
Pages: 287

Book Description
The task of language engineering is to develop the technology for building computer systems which can perform useful linguistic tasks such as machine assisted translation, text retrieval, message classification and document summarisation. Such systems often require the use of a parser which can extract specific types of grammatical data from pre-defined classes of input text. There are many parsers already available for use in language engineering systems. However, many different linguistic formalisms and parsing algorithms are employed. Grammatical coverage varies, as does the nature of the syntactic information extracted. Direct comparison between systems is difficult because each is likely to have been evaluated using different test criteria. In this volume, eight different parsers are applied to the same task, that of analysing a set of sentences derived from software instruction manuals. Each parser is presented in a separate chapter. Evaluation of performance is carried out using a standard set of criteria with the results being presented in a set of tables which have the same format for each system. Three additional chapters provide further analysis of the results as well as discussing possible approaches to the standardisation of parse tree data. Five parse trees are provided for each system in an appendix, allowing further direct comparison between systems by the reader. The book will be of interest to students, researchers and practitioners in the areas of computational linguistics, computer science, information retrieval, language engineering, linguistics and machine assisted translation.