Word Embeddings: Reliability & Semantic Change
Author: J. Hellrich
Publisher: IOS Press
ISBN: 1614999953
Category: Computers
Languages: en
Pages : 190
Book Description
Word embeddings are a form of distributional semantics that is increasingly popular for investigating lexical semantic change. However, typical training algorithms are probabilistic, which limits their reliability and the reproducibility of studies. Johannes Hellrich investigated this problem both empirically and theoretically and found some variants of SVD-based algorithms to be unaffected by this randomness. Furthermore, he created the JeSemE website to make word-embedding-based diachronic research more accessible. It provides information on changes in word denotation and emotional connotation in five diachronic corpora. Finally, the author conducted two case studies on the applicability of these methods, investigating the historical understanding of electricity as well as words connected to Romanticism. These case studies showed the high potential of distributional semantics for further applications in the digital humanities.
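The contrast between probabilistic training and deterministic SVD-based variants can be made concrete with a short sketch. The following Python snippet is illustrative only and not taken from the book; the toy co-occurrence counts are invented. It builds a PPMI matrix and derives embeddings from its SVD, and rerunning it yields identical vectors, whereas stochastic gradient training such as skip-gram with negative sampling generally does not.

```python
# A minimal sketch (not the book's code) of why SVD-based embeddings are
# reproducible: the truncated SVD of a PPMI co-occurrence matrix is a
# deterministic computation, so two runs on the same input yield identical
# vectors, unlike SGD-trained models such as skip-gram with negative sampling.
import numpy as np

# Toy word-by-word co-occurrence counts for a hypothetical 4-word vocabulary.
counts = np.array([
    [0, 4, 1, 0],
    [4, 0, 2, 1],
    [1, 2, 0, 3],
    [0, 1, 3, 0],
], dtype=float)

def ppmi(c):
    """Positive pointwise mutual information of a co-occurrence matrix."""
    total = c.sum()
    p_xy = c / total
    p_x = c.sum(axis=1, keepdims=True) / total
    p_y = c.sum(axis=0, keepdims=True) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_xy / (p_x * p_y))
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)

def svd_embeddings(c, dim=2):
    """Deterministic rank-`dim` embedding from the SVD of the PPMI matrix."""
    u, s, _ = np.linalg.svd(ppmi(c))
    return u[:, :dim] * s[:dim]

print(np.allclose(svd_embeddings(counts), svd_embeddings(counts)))  # True
```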
Supervised Machine Learning for Text Analysis in R
Author: Emil Hvitfeldt
Publisher: CRC Press
ISBN: 1000461971
Category: Computers
Languages: en
Pages : 402
Book Description
Text data is important for many domains, from healthcare to marketing to the digital humanities, but specialized approaches are necessary to create features for machine learning from language. Supervised Machine Learning for Text Analysis in R explains how to preprocess text data for modeling, train models, and evaluate model performance using tools from the tidyverse and tidymodels ecosystem. Models like these can be used to make predictions for new observations, to understand what natural language features or characteristics contribute to differences in the output, and more. If you are already familiar with the basics of predictive modeling, use the comprehensive, detailed examples in this book to extend your skills to the domain of natural language processing. This book provides practical guidance and directly applicable knowledge for data scientists and analysts who want to integrate unstructured text data into their modeling pipelines. Learn how to use text data for both regression and classification tasks, and how to apply more straightforward algorithms like regularized regression or support vector machines as well as deep learning approaches. Natural language must be dramatically transformed to be ready for computation, so we explore typical text preprocessing and feature engineering steps like tokenization and word embeddings from the ground up. These steps influence model results in ways we can measure, both in terms of model metrics and other tangible consequences such as how fair or appropriate model results are.
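The book itself works in R with the tidyverse and tidymodels; purely as an illustration of the same pipeline idea (tokenize, build features, fit a regularized model, evaluate), here is a hedged Python sketch using scikit-learn with invented toy data.

```python
# Illustrative only: the preprocessing -> features -> regularized model ->
# evaluation pipeline described above, sketched in Python/scikit-learn
# (the book's own examples use R with tidymodels). The data below is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

texts = ["great service, would buy again", "terrible, arrived broken",
         "works exactly as described", "complete waste of money"]
labels = [1, 0, 1, 0]                         # 1 = positive, 0 = negative

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=0, stratify=labels)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),      # tokenization + n-gram tf-idf features
    LogisticRegression(penalty="l2", C=1.0),  # regularized classifier
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```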
Flexible Workflows
Author: L. Grumbach
Publisher: IOS Press
ISBN: 1643683977
Category: Computers
Languages: en
Pages : 340
Book Description
Traditional workflow management systems support the fulfillment of business tasks by providing guidance along a predefined workflow model. With the shift from mass production to customization, flexibility has become increasingly important in recent decades, but the various approaches to workflow flexibility either require extensive knowledge acquisition and modeling, or active intervention during execution. Pursuing flexibility by deviation compensates for these disadvantages by allowing alternative paths of execution at run time without requiring adaptation of the workflow model. This work, Flexible Workflows: A Constraint- and Case-Based Approach, proposes a novel approach to flexibility by deviation; the aim is to provide support during the execution of a workflow by suggesting items based on predefined strategies or experiential knowledge, even in the case of deviations. The concepts combine two familiar methods from the field of AI: constraint satisfaction problem solving and process-oriented case-based reasoning. The combined model increases the capacity for flexibility. The experimental evaluation of the approach consisted of a simulation involving several types of participants in the domain of deficiency management in construction. The book contains seven chapters covering foundations; domains and potentials; prerequisites; a constraint-based workflow engine; case-based deviation management; a prototype; and evaluation, together with an introduction, a conclusion, and three appendices. Demonstrating high utility values and the promise of wide applicability in practice, as well as the potential for transferring the approach to other domains, the book will be of interest to all those whose work involves workflow management systems.
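As a rough, purely hypothetical illustration of the constraint side of this idea, the sketch below treats the remaining part of a small deficiency-management workflow as a constraint satisfaction problem over task orderings and suggests next work items that still admit a consistent completion, tolerating constraints already violated by a past deviation. The task names and constraints are invented, and the brute-force search stands in for the book's far more elaborate engine.

```python
# Hypothetical sketch of flexibility by deviation with constraints: suggest next
# work items that still allow a consistent completion of the remaining tasks,
# while ordering constraints already violated by a past deviation are tolerated
# rather than enforced. Task names and constraints are invented for illustration.
from itertools import permutations

TASKS = {"record_deficiency", "notify_contractor", "inspect_fix", "close_case"}
# (a, b) means: task a should be executed before task b.
BEFORE = {
    ("record_deficiency", "notify_contractor"),
    ("notify_contractor", "inspect_fix"),
    ("inspect_fix", "close_case"),
}

def suggest_next(executed):
    """Tasks that can come next so that the not-yet-executed part stays consistent."""
    done = set(executed)
    remaining = TASKS - done
    # Only constraints whose later task is still pending can still be satisfied;
    # constraints already broken by a deviation are simply tolerated.
    open_constraints = [(a, b) for a, b in BEFORE if b in remaining]
    options = set()
    for tail in permutations(remaining):
        pos = {t: i for i, t in enumerate(tail)}
        if all(a in done or pos[a] < pos[b] for a, b in open_constraints):
            options.add(tail[0])
    return options

# Deviation: the contractor was notified before the deficiency was recorded.
print(suggest_next(["notify_contractor"]))
# e.g. {'record_deficiency', 'inspect_fix'} -- execution can still continue
```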
From Narratology to Computational Story Composition and Back
Author: L. Berov
Publisher: IOS Press
ISBN: 1643683837
Category: Computers
Languages: en
Pages : 362
Book Description
Although both deal with narratives, the two disciplines of Narrative Theory (NT) and Computational Story Composition (CSC) rarely exchange insights and ideas or engage in collaborative research. The former has its roots in the humanities and attempts to analyze literary texts to derive an understanding of the concept of narrative. The latter belongs to the domain of Artificial Intelligence and investigates the autonomous composition of fictional narratives in a way that could be deemed creative. The two disciplines employ different research methodologies at very different levels of abstraction, which makes simultaneous research difficult, even though a close exchange between them would undoubtedly be desirable, not least because of their complementary approaches to a shared object of study. This book, From Narratology to Computational Story Composition and Back, describes an exploratory study in generative modeling, a research methodology proposed to address the methodological differences between the two disciplines and allow for simultaneous NT and CSC research. It demonstrates how implementing narratological theories as computational, generative models can lead to insights for NT, and how grounding computational representations of narrative in NT can help CSC systems to take over creative responsibilities. It is the interplay of these two strands that underscores the feasibility and utility of generative modeling. The book is divided into six chapters: an introduction, followed by chapters on plot, fictional characters, plot quality estimation, and computational creativity, and wrapped up by a conclusion. The book will be of interest to all those working in the fields of narrative theory and computational creativity.
Computational approaches to semantic change
Author: Nina Tahmasebi
Publisher: Language Science Press
ISBN: 3961103127
Category: Language Arts & Disciplines
Languages: en
Pages : 396
Book Description
Semantic change (how the meanings of words change over time) has preoccupied scholars since well before modern linguistics emerged in the late 19th and early 20th century, ushering in a new methodological turn in the study of language change. Compared to changes in sound and grammar, semantic change remains the least understood. Since that methodological turn, the study of semantic change has progressed steadily, accumulating a vast store of knowledge over more than a century and encompassing many languages and language families. Historical linguists also realized the potential of computers as research tools early on, with papers presented at the very first international conferences on computational linguistics in the 1960s. Such computational studies still tended to be small-scale, method-oriented, and qualitative. Recent years, however, have witnessed a sea change in this regard. Big-data, empirical, quantitative investigations are now coming to the forefront, enabled by enormous advances in storage capacity and processing power. Diachronic corpora have grown beyond imagination, defying exploration by traditional manual qualitative methods, and language technology has become increasingly data-driven and semantics-oriented. These developments present a golden opportunity for the empirical study of semantic change over both long and short time spans. A major challenge at present is to integrate the hard-earned knowledge and expertise of traditional historical linguistics with the cutting-edge methodology explored primarily in computational linguistics. The present volume arose as a concrete response to this challenge. The 1st International Workshop on Computational Approaches to Historical Language Change (LChange'19), held at ACL 2019, brought together scholars from both fields. This volume offers a survey of this exciting new direction in the study of semantic change, a discussion of the many remaining challenges, and considerably updated and extended versions of a selection of the contributions to the LChange'19 workshop, addressing both more theoretical problems, such as the discovery of "laws of semantic change", and practical applications, such as information retrieval in longitudinal text archives.
Efficient Frequent Subtree Mining Beyond Forests
Author: P. Welke
Publisher: IOS Press
ISBN: 164368079X
Category: Computers
Languages: en
Pages : 190
Book Description
A common paradigm in distance-based learning is to embed the instance space into a feature space equipped with a metric and to define the dissimilarity between instances as the distance between their images in the feature space. Frequent connected subgraphs are sometimes used to define such feature spaces when the instances are graphs, but identifying the set of frequent connected subgraphs and subsequently computing embeddings for graph instances is computationally intractable. As a result, existing frequent subgraph mining algorithms either restrict the structural complexity of the instance graphs or require exponential delay between the output of subsequent patterns, meaning that distance-based learners lack an efficient way to operate on arbitrary graph data. This book presents a mining system that gives up the demand for completeness of the pattern set and instead guarantees polynomial delay between subsequent patterns. To complement this, it describes efficient methods for computing the embedding of arbitrary graphs into the Hamming space spanned by the pattern set. The result is a system that allows distance-based learning methods to be applied efficiently to arbitrary graph databases. In addition to an introduction and conclusion, the book is divided into chapters covering preliminaries; related work; probabilistic frequent subtrees; boosted probabilistic frequent subtrees; and fast computation, with two further chapters on Hamiltonian paths in cactus graphs and the Poisson binomial distribution.
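To make the Hamming-space embedding concrete, here is a hedged sketch (not the book's system): given a hand-picked set of tree patterns, each graph is mapped to a binary vector recording which patterns occur in it, and dissimilarity between graphs is the Hamming distance between these vectors. The book's contribution lies in mining such pattern sets with polynomial delay; here the patterns are simply assumed, and networkx's subgraph matcher stands in for the containment test.

```python
# Hedged sketch of embedding graphs into the Hamming space spanned by a pattern
# set: the patterns below are hand-picked (the book mines them with polynomial
# delay), and networkx's matcher stands in for the pattern-containment test.
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

def contains(graph, pattern):
    """True if `pattern` occurs as a (not necessarily induced) subgraph of `graph`."""
    return GraphMatcher(graph, pattern).subgraph_is_monomorphic()

def embed(graph, patterns):
    """Binary feature vector: which patterns occur in the graph."""
    return [int(contains(graph, p)) for p in patterns]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

patterns = [nx.path_graph(3), nx.star_graph(3)]   # a 3-node path and a 3-leaf star

g1 = nx.cycle_graph(5)   # contains the path pattern but not the star
g2 = nx.star_graph(4)    # contains both patterns

print(embed(g1, patterns), embed(g2, patterns))           # [1, 0] [1, 1]
print(hamming(embed(g1, patterns), embed(g2, patterns)))  # 1
```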
Knowledge Representation and Inductive Reasoning Using Conditional Logic and Sets of Ranking Functions
Author: S. Kutsch
Publisher: IOS Press
ISBN: 164368163X
Category: Computers
Languages: en
Pages : 186
Book Description
A core problem in Artificial Intelligence is the modeling of human reasoning. Approaches based on classical logic are too rigid for this task, as deductive inference yielding logically correct results is not appropriate in situations where conclusions must be drawn from the incomplete or uncertain knowledge present in virtually all real-world scenarios. Since there are no mathematically precise and generally accepted definitions of the notions of plausible or rational inference, the question of what a knowledge base consisting of uncertain rules entails has long been an issue in the area of knowledge representation and reasoning. Different nonmonotonic logics and various semantic frameworks and axiom systems have been developed to address this question. The main theme of this book, Knowledge Representation and Inductive Reasoning Using Conditional Logic and Sets of Ranking Functions, is inductive reasoning from conditional knowledge bases. Using ordinal conditional functions as ranking models for conditional knowledge bases, the author studies inferences induced by individual ranking models as well as by sets of ranking models. He elaborates in detail the interrelationships among the resulting inference relations and shows their formal properties with respect to established inference axioms. Based on the introduction of a novel classification scheme for conditionals, he also addresses the question of how to realize and implement the entailment relations obtained. In this work, “Steven Kutsch convincingly presents his ideas, provides illustrating examples for them, rigorously defines the introduced concepts, formally proves all technical results, and fully implements every newly introduced inference method in an advanced Java library (...). He significantly advances the state of the art in this field.” – Prof. Dr. Christoph Beierle of the FernUniversität in Hagen
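For readers unfamiliar with ordinal conditional functions, the following hedged Python sketch (the atoms and ranks are invented here, not taken from the book) shows the basic acceptance condition: a conditional (B|A) is accepted by a ranking function kappa iff the most plausible worlds satisfying A and B are strictly less implausible than the most plausible worlds satisfying A and not B.

```python
# Hedged sketch of the OCF acceptance condition: a ranking function assigns each
# world a degree of implausibility (0 = most plausible), and (B|A) is accepted
# iff kappa(A and B) < kappa(A and not B). Atoms and ranks are invented here.
from itertools import product

ATOMS = ["bird", "flies"]

def worlds():
    for values in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def kappa_world(w):
    """A hypothetical ranking: non-flying birds are considered unusual."""
    return 1 if w["bird"] and not w["flies"] else 0

def kappa(formula):
    """Rank of a formula: minimal rank of a satisfying world (None if unsatisfiable)."""
    ranks = [kappa_world(w) for w in worlds() if formula(w)]
    return min(ranks) if ranks else None

def accepts(antecedent, consequent):
    """Acceptance of the conditional (consequent | antecedent) under kappa."""
    ab = kappa(lambda w: antecedent(w) and consequent(w))
    a_not_b = kappa(lambda w: antecedent(w) and not consequent(w))
    return ab is not None and (a_not_b is None or ab < a_not_b)

# Is the conditional (flies | bird), i.e. "birds usually fly", accepted?
print(accepts(lambda w: w["bird"], lambda w: w["flies"]))   # True
```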
Shallow Discourse Parsing for German
Author: P. Bourgonje
Publisher: IOS Press
ISBN: 1643681931
Category: Computers
Languages: en
Pages : 188
Book Description
The last few decades have seen impressive improvements in several areas of Natural Language Processing. Nevertheless, getting a computer to make sense of the discourse structure of the utterances in a text remains challenging. Several different theories exist which aim to describe and analyze the coherent structure of a well-written text, but they vary in their applicability and feasibility for practical use. This book is about shallow discourse parsing, following the paradigm of the Penn Discourse TreeBank, a corpus of over one million words annotated for discourse relations. When it comes to discourse processing, any language other than English must be considered low-resource; this book focuses on discourse parsing for German. The limited availability of annotated data for German means that the potential of modern, deep-learning-based methods that rely on such data is also limited. The book explores to what extent machine-learning and more recent deep-learning-based methods can be combined with traditional linguistic feature engineering to improve performance on the discourse parsing task. The end-to-end shallow discourse parser for German developed for this book is open source and available online. Work has also been carried out on several connective lexicons in different languages: strategies are discussed for creating or further developing such lexicons for a given language, as are suggestions on how to further increase their usefulness for shallow discourse parsing. The book will be of interest to all whose work involves Natural Language Processing, particularly in languages other than English.
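As a hedged illustration of how a connective lexicon supports shallow discourse parsing, the snippet below looks up explicit connectives in a tiny, invented German lexicon and maps them to candidate PDTB-style senses; real lexicons and the parser described in the book cover far more entries and must also resolve ambiguous and non-connective uses.

```python
# Illustration only: a tiny, invented German connective lexicon mapping explicit
# connectives to candidate PDTB-style senses. Real lexicons are much larger and
# a full parser must also disambiguate senses and non-connective readings.
import re

CONNECTIVE_LEXICON = {
    "weil":    ["Contingency.Cause.Reason"],
    "aber":    ["Comparison.Contrast"],
    "nachdem": ["Temporal.Asynchronous.Precedence"],
    "obwohl":  ["Comparison.Concession"],
}

def find_explicit_connectives(sentence):
    """Return (token, candidate senses) pairs for known connectives in a sentence."""
    tokens = re.findall(r"\w+", sentence.lower(), flags=re.UNICODE)
    return [(tok, CONNECTIVE_LEXICON[tok]) for tok in tokens if tok in CONNECTIVE_LEXICON]

print(find_explicit_connectives(
    "Er blieb zu Hause, weil es regnete, aber er war nicht traurig."))
# [('weil', ['Contingency.Cause.Reason']), ('aber', ['Comparison.Contrast'])]
```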
Current Methods in Historical Semantics
Author: Kathryn Allan
Publisher: Walter de Gruyter
ISBN: 3110252902
Category: Language Arts & Disciplines
Languages: en
Pages : 357
Book Description
Innovative, data-driven methods provide more rigorous and systematic evidence for the description and explanation of diachronic semantic processes. The volume systematises, reviews, and promotes a range of empirical research techniques and theoretical perspectives that currently inform work across the discipline of historical semantics. In addition to emphasising the use of new technology, it explores the potential of current theoretical models (e.g. within variationist, sociolinguistic, or cognitive frameworks) along the way.
Embeddings in Natural Language Processing
Author: Mohammad Taher Pilehvar
Publisher: Springer Nature
ISBN: 3031021770
Category: Computers
Languages: en
Pages : 157
Book Description
Embeddings have undoubtedly been one of the most influential research areas in Natural Language Processing (NLP). Encoding information into a low-dimensional vector representation, which is easily integrable in modern machine learning models, has played a central role in the development of NLP. Embedding techniques initially focused on words, but the attention soon started to shift to other forms: from graph structures, such as knowledge bases, to other types of textual content, such as sentences and documents. This book provides a high-level synthesis of the main embedding techniques in NLP, in the broad sense. The book starts by explaining conventional word vector space models and word embeddings (e.g., Word2Vec and GloVe) and then moves to other types of embeddings, such as word sense, sentence and document, and graph embeddings. The book also provides an overview of recent developments in contextualized representations (e.g., ELMo and BERT) and explains their potential in NLP. Throughout the book, the reader can find both essential information for understanding a certain topic from scratch and a broad overview of the most successful techniques developed in the literature.
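As a small, hedged example of the kind of static word embedding the book starts from (assuming gensim 4.x is installed; the toy sentences are invented and far too small to produce meaningful vectors), one can train a Word2Vec model and compare words by cosine similarity:

```python
# Minimal sketch (assumes gensim >= 4): train a tiny Word2Vec model on toy
# sentences and compare words by cosine similarity. Embeddings of the kind
# discussed in the book are trained on corpora many orders of magnitude larger.
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "common", "pets"],
]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, seed=1, workers=1)

print(model.wv["cat"].shape)              # (50,): a dense vector for "cat"
print(model.wv.similarity("cat", "dog"))  # cosine similarity of two embeddings
```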