Multilingual Vocabularies in Automatic Speech Recognition

Multilingual Vocabularies in Automatic Speech Recognition PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 5

Get Book Here

Book Description
The paper describes a method for dealing with multilingual vocabularies in speech recognition tasks. We present an approach that combines acoustic descriptive precision and capability of generalization to multiple languages. The approach is based on the concept of classes of transitions between phones. The classes are defined by means of objective measures on acoustic similarities among sounds of different languages. This procedure stems from the definition of a general language-independent model. When a new language is to be added, the phonological structure of the language is mapped onto the set of classes belonging to the general model. Successively, if a limited amount of language-specific speech data becomes available for the new language, we identify those sounds which require the definition of additional classes. The experiments have been conducted in Italian, English and Spanish languages. The method can also be considered as a way of implementing cross-lingual porting of recognition models for a rapid prototyping of recognizers in a new target language, specifically in cases whereby the collection of large training databases would be economically infeasible.

Multilingual Vocabularies in Automatic Speech Recognition

Multilingual Vocabularies in Automatic Speech Recognition PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 5

Get Book Here

Book Description
The paper describes a method for dealing with multilingual vocabularies in speech recognition tasks. We present an approach that combines acoustic descriptive precision and capability of generalization to multiple languages. The approach is based on the concept of classes of transitions between phones. The classes are defined by means of objective measures on acoustic similarities among sounds of different languages. This procedure stems from the definition of a general language-independent model. When a new language is to be added, the phonological structure of the language is mapped onto the set of classes belonging to the general model. Successively, if a limited amount of language-specific speech data becomes available for the new language, we identify those sounds which require the definition of additional classes. The experiments have been conducted in Italian, English and Spanish languages. The method can also be considered as a way of implementing cross-lingual porting of recognition models for a rapid prototyping of recognizers in a new target language, specifically in cases whereby the collection of large training databases would be economically infeasible.

Multilingual Speech Processing

Multilingual Speech Processing PDF Author: Tanja Schultz
Publisher: Elsevier
ISBN: 0080457622
Category : Computers
Languages : en
Pages : 540

Get Book Here

Book Description
Tanja Schultz and Katrin Kirchhoff have compiled a comprehensive overview of speech processing from a multilingual perspective. By taking this all-inclusive approach to speech processing, the editors have included theories, algorithms, and techniques that are required to support spoken input and output in a large variety of languages. Multilingual Speech Processing presents a comprehensive introduction to research problems and solutions, both from a theoretical as well as a practical perspective, and highlights technology that incorporates the increasing necessity for multilingual applications in our global community. Current challenges of speech processing and the feasibility of sharing data and system components across different languages guide contributors in their discussions of trends, prognoses and open research issues. This includes automatic speech recognition and speech synthesis, but also speech-to-speech translation, dialog systems, automatic language identification, and handling non-native speech. The book is complemented by an overview of multilingual resources, important research trends, and actual speech processing systems that are being deployed in multilingual human-human and human-machine interfaces. Researchers and developers in industry and academia with different backgrounds but a common interest in multilingual speech processing will find an excellent overview of research problems and solutions detailed from theoretical and practical perspectives. State-of-the-art research with a global perspective by authors from the USA, Asia, Europe, and South Africa The only comprehensive introduction to multilingual speech processing currently available Detailed presentation of technological advances integral to security, financial, cellular and commercial applications

Exploiting Resources from Closely-related Languages for Automatic Speech Recognition in Low-resource Languages from Malaysia

Exploiting Resources from Closely-related Languages for Automatic Speech Recognition in Low-resource Languages from Malaysia PDF Author: Sarah Flora Samson Juan
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Get Book Here

Book Description
Languages in Malaysia are dying in an alarming rate. As of today, 15 languages are in danger while two languages are extinct. One of the methods to save languages is by documenting languages, but it is a tedious task when performed manually.Automatic Speech Recognition (ASR) system could be a tool to help speed up the process of documenting speeches from the native speakers. However, building ASR systems for a target language requires a large amount of training data as current state-of-the-art techniques are based on empirical approach. Hence, there are many challenges in building ASR for languages that have limited data available.The main aim of this thesis is to investigate the effects of using data from closely-related languages to build ASR for low-resource languages in Malaysia. Past studies have shown that cross-lingual and multilingual methods could improve performance of low-resource ASR. In this thesis, we try to answer several questions concerning these approaches: How do we know which language is beneficial for our low-resource language? How does the relationship between source and target languages influence speech recognition performance? Is pooling language data an optimal approach for multilingual strategy?Our case study is Iban, an under-resourced language spoken in Borneo island. We study the effects of using data from Malay, a local dominant language which is close to Iban, for developing Iban ASR under different resource constraints. We have proposed several approaches to adapt Malay data to obtain pronunciation and acoustic models for Iban speech.Building a pronunciation dictionary from scratch is time consuming, as one needs to properly define the sound units of each word in a vocabulary. We developed a semi-supervised approach to quickly build a pronunciation dictionary for Iban. It was based on bootstrapping techniques for improving Malay data to match Iban pronunciations.To increase the performance of low-resource acoustic models we explored two acoustic modelling techniques, the Subspace Gaussian Mixture Models (SGMM) and Deep Neural Networks (DNN). We performed cross-lingual strategies using both frameworks for adapting out-of-language data to Iban speech. Results show that using Malay data is beneficial for increasing the performance of Iban ASR. We also tested SGMM and DNN to improve low-resource non-native ASR. We proposed a fine merging strategy for obtaining an optimal multi-accent SGMM. In addition, we developed an accent-specific DNN using native speech data. After applying both methods, we obtained significant improvements in ASR accuracy. From our study, we observe that using SGMM and DNN for cross-lingual strategy is effective when training data is very limited.

Language Modeling for Automatic Speech Recognition of Inflective Languages

Language Modeling for Automatic Speech Recognition of Inflective Languages PDF Author: Gregor Donaj
Publisher: Springer
ISBN: 3319416073
Category : Technology & Engineering
Languages : en
Pages : 77

Get Book Here

Book Description
This book covers language modeling and automatic speech recognition for inflective languages (e.g. Slavic languages), which represent roughly half of the languages spoken in Europe. These languages do not perform as well as English in speech recognition systems and it is therefore harder to develop an application with sufficient quality for the end user. The authors describe the most important language features for the development of a speech recognition system. This is then presented through the analysis of errors in the system and the development of language models and their inclusion in speech recognition systems, which specifically address the errors that are relevant for targeted applications. The error analysis is done with regard to morphological characteristics of the word in the recognized sentences. The book is oriented towards speech recognition with large vocabularies and continuous and even spontaneous speech. Today such applications work with a rather small number of languages compared to the number of spoken languages.

Lexicon Development for Speech and Language Processing

Lexicon Development for Speech and Language Processing PDF Author: Frank Van Eynde
Publisher: Springer
ISBN: 9401094586
Category : Language Arts & Disciplines
Languages : en
Pages : 302

Get Book Here

Book Description
This work offers a survey of methods and techniques for structuring, acquiring and maintaining lexical resources for speech and language processing. The first chapter provides a broad survey of the field of computational lexicography, introducing most of the issues, terms and topics which are addressed in more detail in the rest of the book. The next two chapters focus on the structure and the content of man-made lexicons, concentrating respectively on (morpho- )syntactic and (morpho- )phonological information. Both chapters adopt a declarative constraint-based methodology and pay ample attention to the various ways in which lexical generalizations can be formalized and exploited to enhance the consistency and to reduce the redundancy of lexicons. A complementary perspective is offered in the next two chapters, which present techniques for automatically deriving lexical resources from text corpora. These chapters adopt an inductive data-oriented methodology and focus also on methods for tokenization, lemmatization and shallow parsing. The next three chapters focus on speech synthesis and speech recognition.

Cross-language Acoustic Adaptation for Automatic Speech Recognition

Cross-language Acoustic Adaptation for Automatic Speech Recognition PDF Author: Christoph Nieuwoudt
Publisher:
ISBN:
Category :
Languages : en
Pages :

Get Book Here

Book Description
Speech recognition systems have been developed for the major languages of the world, yet for the majority of languages there are currently no large vocabulary continuous speech recognition (LVCSR) systems. The development of an LVCSR system for a new language is very costly, mainly because a large speech database has to be compiled to robustly capture the acoustic characteristics of the new language. This thesis investigates techniques that enable the re-use of acoustic information from a source language, in which a large amount of data is available, in implementing a system for a new target language. The assumption is that too little data is available in the target language to train a robust speech recognition system on that data alone, and that use of acoustic information from a source language can improve the performance of a target language recognition system. Strategies for cross-language use of acoustic information are proposed, including training on pooled source and target language data, adaptation of source language models using target language data, adapting multilingual models using target language data and transforming source language data to augment target language data for model training. These strategies are allied with Bayesian and transformation-based techniques, usually used for speaker adaptation, as well as with discriminative learning techniques, to present a framework for cross-language re-use of acoustic information. Extensions to current adaptation techniques are proposed to improve the performance of these techniques specifically for cross-language adaptation. A new technique for transformation-based adaptation of variance parameters and a cost-based extension of the minimum classification error (MCE) approach are proposed. Experiments are performed for a large number of approaches from the proposed framework for cross-language re-use of acoustic information. Relatively large amounts of English speech data are used in conjunction with smaller amounts of Afrikaans speech data to improve the performance of an Afrikaans speech recogniser. Results indicate that a significant reduction in word error rate (between 26% and 50%, depending on the amount of Afrikaans data available) is possible when English acoustic data is used in addition to Afrikaans speech data from the same database (i.e both sets of data were recorded under the same c1̀2onditions and the same labelling process was used). For same-database experiments, best results are achieved for approaches that train models on pooled source and target language data and then perform further adaptation of the models using Bayesian or discriminative techniques on target language data only. Experiments are also performed to evaluate the use of English data from a different database than the Afrikaans data. Peak reductions in word error rate of between 16% and 35% are delivered, depending on the amount of Afrikaans data available. Best results are achieved for an approach that performs a simple transformation of source model parameters using target language data, and then performs Bayesian adaptation of the transformed model on target language data.

Spoken, Multilingual and Multimodal Dialogue Systems

Spoken, Multilingual and Multimodal Dialogue Systems PDF Author: Ramon Lopez Cozar Delgado
Publisher: John Wiley & Sons
ISBN: 047002156X
Category : Technology & Engineering
Languages : en
Pages : 272

Get Book Here

Book Description
Dialogue systems are a very appealing technology with an extraordinary future. Spoken, Multilingual and Multimodal Dialogues Systems: Development and Assessment addresses the great demand for information about the development of advanced dialogue systems combining speech with other modalities under a multilingual framework. It aims to give a systematic overview of dialogue systems and recent advances in the practical application of spoken dialogue systems. Spoken Dialogue Systems are computer-based systems developed to provide information and carry out simple tasks using speech as the interaction mode. Examples include travel information and reservation, weather forecast information, directory information and product order. Multimodal Dialogue Systems aim to overcome the limitations of spoken dialogue systems which use speech as the only communication means, while Multilingual Systems allow interaction with users that speak different languages. Presents a clear snapshot of the structure of a standard dialogue system, by addressing its key components in the context of multilingual and multimodal interaction and the assessment of spoken, multilingual and multimodal systems In addition to the fundamentals of the technologies employed, the development and evaluation of these systems are described Highlights recent advances in the practical application of spoken dialogue systems This comprehensive overview is a must for graduate students and academics in the fields of speech recognition, speech synthesis, speech processing, language, and human–computer interaction technolgy. It will also prove to be a valuable resource to system developers working in these areas.

Deep Learning Approaches for Spoken and Natural Language Processing

Deep Learning Approaches for Spoken and Natural Language Processing PDF Author: Virender Kadyan
Publisher: Springer Nature
ISBN: 3030797783
Category : Technology & Engineering
Languages : en
Pages : 171

Get Book Here

Book Description
This book provides insights into how deep learning techniques impact language and speech processing applications. The authors discuss the promise, limits and the new challenges in deep learning. The book covers the major differences between the various applications of deep learning and the classical machine learning techniques. The main objective of the book is to present a comprehensive survey of the major applications and research oriented articles based on deep learning techniques that are focused on natural language and speech signal processing. The book is relevant to academicians, research scholars, industrial experts, scientists and post graduate students working in the field of speech signal and natural language processing and would like to add deep learning to enhance capabilities of their work. Discusses current research challenges and future perspective about how deep learning techniques can be applied to improve NLP and speech processing applications; Presents and escalates the research trends and future direction of language and speech processing; Includes theoretical research, experimental results, and applications of deep learning.

Speech-to-Speech Translation

Speech-to-Speech Translation PDF Author: Yutaka Kidawara
Publisher: Springer Nature
ISBN: 9811505950
Category : Computers
Languages : en
Pages : 103

Get Book Here

Book Description
This book provides the readers with retrospective and prospective views with detailed explanations of component technologies, speech recognition, language translation and speech synthesis. Speech-to-speech translation system (S2S) enables to break language barriers, i.e., communicate each other between any pair of person on the glove, which is one of extreme dreams of humankind. People, society, and economy connected by S2S will demonstrate explosive growth without exception. In 1986, Japan initiated basic research of S2S, then the idea spread world-wide and were explored deeply by researchers during three decades. Now, we see S2S application on smartphone/tablet around the world. Computational resources such as processors, memories, wireless communication accelerate this computation-intensive systems and accumulation of digital data of speech and language encourage recent approaches based on machine learning. Through field experiments after long research in laboratories, S2S systems are being well-developed and now ready to utilized in daily life. Unique chapter of this book is end-2-end evaluation by comparing system’s performance and human competence. The effectiveness of the system would be understood by the score of this evaluation. The book will end with one of the next focus of S2S will be technology of simultaneous interpretation for lecture, broadcast news and so on.

Automatic Speech Recognition on Mobile Devices and over Communication Networks

Automatic Speech Recognition on Mobile Devices and over Communication Networks PDF Author: Zheng-Hua Tan
Publisher: Springer Science & Business Media
ISBN: 1848001436
Category : Technology & Engineering
Languages : en
Pages : 408

Get Book Here

Book Description
The advances in computing and networking have sparked an enormous interest in deploying automatic speech recognition on mobile devices and over communication networks. This book brings together academic researchers and industrial practitioners to address the issues in this emerging realm and presents the reader with a comprehensive introduction to the subject of speech recognition in devices and networks. It covers network, distributed and embedded speech recognition systems.