Automatic Speech Recognition and Translation for Low Resource Languages

Automatic Speech Recognition and Translation for Low Resource Languages PDF Author: L. Ashok Kumar
Publisher: John Wiley & Sons
ISBN: 1394214170
Category : Computers
Languages : en
Pages : 428

Get Book Here

Book Description
AUTOMATIC SPEECH RECOGNITION and TRANSLATION for LOW-RESOURCE LANGUAGES This book is a comprehensive exploration into the cutting-edge research, methodologies, and advancements in addressing the unique challenges associated with ASR and translation for low-resource languages. Automatic Speech Recognition and Translation for Low Resource Languages contains groundbreaking research from experts and researchers sharing innovative solutions that address language challenges in low-resource environments. The book begins by delving into the fundamental concepts of ASR and translation, providing readers with a solid foundation for understanding the subsequent chapters. It then explores the intricacies of low-resource languages, analyzing the factors that contribute to their challenges and the significance of developing tailored solutions to overcome them. The chapters encompass a wide range of topics, ranging from both the theoretical and practical aspects of ASR and translation for low-resource languages. The book discusses data augmentation techniques, transfer learning, and multilingual training approaches that leverage the power of existing linguistic resources to improve accuracy and performance. Additionally, it investigates the possibilities offered by unsupervised and semi-supervised learning, as well as the benefits of active learning and crowdsourcing in enriching the training data. Throughout the book, emphasis is placed on the importance of considering the cultural and linguistic context of low-resource languages, recognizing the unique nuances and intricacies that influence accurate ASR and translation. Furthermore, the book explores the potential impact of these technologies in various domains, such as healthcare, education, and commerce, empowering individuals and communities by breaking down language barriers. Audience The book targets researchers and professionals in the fields of natural language processing, computational linguistics, and speech technology. It will also be of interest to engineers, linguists, and individuals in industries and organizations working on cross-lingual communication, accessibility, and global connectivity.

Automatic Speech Recognition and Translation for Low Resource Languages

Automatic Speech Recognition and Translation for Low Resource Languages PDF Author: L. Ashok Kumar
Publisher: John Wiley & Sons
ISBN: 1394214170
Category : Computers
Languages : en
Pages : 428

Get Book Here

Book Description
AUTOMATIC SPEECH RECOGNITION and TRANSLATION for LOW-RESOURCE LANGUAGES This book is a comprehensive exploration into the cutting-edge research, methodologies, and advancements in addressing the unique challenges associated with ASR and translation for low-resource languages. Automatic Speech Recognition and Translation for Low Resource Languages contains groundbreaking research from experts and researchers sharing innovative solutions that address language challenges in low-resource environments. The book begins by delving into the fundamental concepts of ASR and translation, providing readers with a solid foundation for understanding the subsequent chapters. It then explores the intricacies of low-resource languages, analyzing the factors that contribute to their challenges and the significance of developing tailored solutions to overcome them. The chapters encompass a wide range of topics, ranging from both the theoretical and practical aspects of ASR and translation for low-resource languages. The book discusses data augmentation techniques, transfer learning, and multilingual training approaches that leverage the power of existing linguistic resources to improve accuracy and performance. Additionally, it investigates the possibilities offered by unsupervised and semi-supervised learning, as well as the benefits of active learning and crowdsourcing in enriching the training data. Throughout the book, emphasis is placed on the importance of considering the cultural and linguistic context of low-resource languages, recognizing the unique nuances and intricacies that influence accurate ASR and translation. Furthermore, the book explores the potential impact of these technologies in various domains, such as healthcare, education, and commerce, empowering individuals and communities by breaking down language barriers. Audience The book targets researchers and professionals in the fields of natural language processing, computational linguistics, and speech technology. It will also be of interest to engineers, linguists, and individuals in industries and organizations working on cross-lingual communication, accessibility, and global connectivity.

Multilingual Techniques for Low Resource Automatic Speech Recognition

Multilingual Techniques for Low Resource Automatic Speech Recognition PDF Author: Ekapol Chuangsuwanich
Publisher:
ISBN:
Category :
Languages : en
Pages : 143

Get Book Here

Book Description
Out of the approximately 7000 languages spoken around the world, there are only about 100 languages with Automatic Speech Recognition (ASR) capability. This is due to the fact that a vast amount of resources is required to build a speech recognizer. This often includes thousands of hours of transcribed speech data, a phonetic pronunciation dictionary or lexicon which spans all words in the language, and a text collection on the order of several million words. Moreover, ASR technologies usually require years of research in order to deal with the specific idiosyncrasies of each language. This makes building a speech recognizer on a language with few resources a daunting task. In this thesis, we propose a universal ASR framework for transcription and keyword spotting (KWS) tasks that work on a variety of languages. We investigate methods to deal with the need of a pronunciation dictionary by using a Pronunciation Mixture Model that can learn from existing lexicons and acoustic data to generate pronunciation for new words. In the case when no dictionary is available, a graphemic lexicon provides comparable performance to the expert lexicon. To alleviate the need for text corpora, we investigate the use of subwords and web data which helps im- prove KWS spotting results. Finally, we reduce the need for speech recordings by using bottleneck (BN) features trained on multilingual corpora. We first propose the Low-rank Stacked Bottleneck architecture which improves ASR performance over previous state-of-the-art systems. We then investigate a method to select data from various languages that is most similar to the target language in a data-driven manner, which helps improve the eectiveness of the BN features. Using techniques described and proposed in this thesis, we are able to more than double the KWS performance for a low-resource language compared to using standard techniques geared towards rich resource domains.

Multilingual Speech Processing

Multilingual Speech Processing PDF Author: Tanja Schultz
Publisher: Elsevier
ISBN: 0080457622
Category : Computers
Languages : en
Pages : 540

Get Book Here

Book Description
Tanja Schultz and Katrin Kirchhoff have compiled a comprehensive overview of speech processing from a multilingual perspective. By taking this all-inclusive approach to speech processing, the editors have included theories, algorithms, and techniques that are required to support spoken input and output in a large variety of languages. Multilingual Speech Processing presents a comprehensive introduction to research problems and solutions, both from a theoretical as well as a practical perspective, and highlights technology that incorporates the increasing necessity for multilingual applications in our global community. Current challenges of speech processing and the feasibility of sharing data and system components across different languages guide contributors in their discussions of trends, prognoses and open research issues. This includes automatic speech recognition and speech synthesis, but also speech-to-speech translation, dialog systems, automatic language identification, and handling non-native speech. The book is complemented by an overview of multilingual resources, important research trends, and actual speech processing systems that are being deployed in multilingual human-human and human-machine interfaces. Researchers and developers in industry and academia with different backgrounds but a common interest in multilingual speech processing will find an excellent overview of research problems and solutions detailed from theoretical and practical perspectives. State-of-the-art research with a global perspective by authors from the USA, Asia, Europe, and South Africa The only comprehensive introduction to multilingual speech processing currently available Detailed presentation of technological advances integral to security, financial, cellular and commercial applications

Exploiting Resources from Closely-related Languages for Automatic Speech Recognition in Low-resource Languages from Malaysia

Exploiting Resources from Closely-related Languages for Automatic Speech Recognition in Low-resource Languages from Malaysia PDF Author: Sarah Flora Samson Juan
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Get Book Here

Book Description
Languages in Malaysia are dying in an alarming rate. As of today, 15 languages are in danger while two languages are extinct. One of the methods to save languages is by documenting languages, but it is a tedious task when performed manually.Automatic Speech Recognition (ASR) system could be a tool to help speed up the process of documenting speeches from the native speakers. However, building ASR systems for a target language requires a large amount of training data as current state-of-the-art techniques are based on empirical approach. Hence, there are many challenges in building ASR for languages that have limited data available.The main aim of this thesis is to investigate the effects of using data from closely-related languages to build ASR for low-resource languages in Malaysia. Past studies have shown that cross-lingual and multilingual methods could improve performance of low-resource ASR. In this thesis, we try to answer several questions concerning these approaches: How do we know which language is beneficial for our low-resource language? How does the relationship between source and target languages influence speech recognition performance? Is pooling language data an optimal approach for multilingual strategy?Our case study is Iban, an under-resourced language spoken in Borneo island. We study the effects of using data from Malay, a local dominant language which is close to Iban, for developing Iban ASR under different resource constraints. We have proposed several approaches to adapt Malay data to obtain pronunciation and acoustic models for Iban speech.Building a pronunciation dictionary from scratch is time consuming, as one needs to properly define the sound units of each word in a vocabulary. We developed a semi-supervised approach to quickly build a pronunciation dictionary for Iban. It was based on bootstrapping techniques for improving Malay data to match Iban pronunciations.To increase the performance of low-resource acoustic models we explored two acoustic modelling techniques, the Subspace Gaussian Mixture Models (SGMM) and Deep Neural Networks (DNN). We performed cross-lingual strategies using both frameworks for adapting out-of-language data to Iban speech. Results show that using Malay data is beneficial for increasing the performance of Iban ASR. We also tested SGMM and DNN to improve low-resource non-native ASR. We proposed a fine merging strategy for obtaining an optimal multi-accent SGMM. In addition, we developed an accent-specific DNN using native speech data. After applying both methods, we obtained significant improvements in ASR accuracy. From our study, we observe that using SGMM and DNN for cross-lingual strategy is effective when training data is very limited.

Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information

Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information PDF Author: Ngoc Thang Vu
Publisher:
ISBN: 9783844028928
Category :
Languages : en
Pages : 203

Get Book Here

Book Description


Handbook of Standards and Resources for Spoken Language Systems

Handbook of Standards and Resources for Spoken Language Systems PDF Author: Dafydd Gibbon
Publisher: Walter de Gruyter
ISBN: 9783110153651
Category : Computers
Languages : en
Pages : 244

Get Book Here

Book Description
This handbook provides easy access to current practice and requirements in the main spoken language technologies.

Speech Input and Output Assessment

Speech Input and Output Assessment PDF Author: A. Fourcin
Publisher:
ISBN:
Category : Language Arts & Disciplines
Languages : en
Pages : 304

Get Book Here

Book Description


Spoken Language Understanding

Spoken Language Understanding PDF Author: Gokhan Tur
Publisher: John Wiley & Sons
ISBN: 1119993946
Category : Language Arts & Disciplines
Languages : en
Pages : 443

Get Book Here

Book Description
Spoken language understanding (SLU) is an emerging field in between speech and language processing, investigating human/ machine and human/ human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances and its applications are vast, from voice search in mobile devices to meeting summarization, attracting interest from both commercial and academic sectors. Both human/machine and human/human communications can benefit from the application of SLU, using differing tasks and approaches to better understand and utilize such communications. This book covers the state-of-the-art approaches for the most popular SLU tasks with chapters written by well-known researchers in the respective fields. Key features include: Presents a fully integrated view of the two distinct disciplines of speech processing and language processing for SLU tasks. Defines what is possible today for SLU as an enabling technology for enterprise (e.g., customer care centers or company meetings), and consumer (e.g., entertainment, mobile, car, robot, or smart environments) applications and outlines the key research areas. Provides a unique source of distilled information on methods for computer modeling of semantic information in human/machine and human/human conversations. This book can be successfully used for graduate courses in electronics engineering, computer science or computational linguistics. Moreover, technologists interested in processing spoken communications will find it a useful source of collated information of the topic drawn from the two distinct disciplines of speech processing and language processing under the new area of SLU.

Multilingual Speech Recognition

Multilingual Speech Recognition PDF Author: Ulla Uebler
Publisher:
ISBN: 9783897225022
Category : Automatic speech recognition
Languages : en
Pages : 186

Get Book Here

Book Description


Language Modeling for Automatic Speech Recognition of Inflective Languages

Language Modeling for Automatic Speech Recognition of Inflective Languages PDF Author: Gregor Donaj
Publisher: Springer
ISBN: 9783319416052
Category : Technology & Engineering
Languages : en
Pages : 71

Get Book Here

Book Description
This book covers language modeling and automatic speech recognition for inflective languages (e.g. Slavic languages), which represent roughly half of the languages spoken in Europe. These languages do not perform as well as English in speech recognition systems and it is therefore harder to develop an application with sufficient quality for the end user. The authors describe the most important language features for the development of a speech recognition system. This is then presented through the analysis of errors in the system and the development of language models and their inclusion in speech recognition systems, which specifically address the errors that are relevant for targeted applications. The error analysis is done with regard to morphological characteristics of the word in the recognized sentences. The book is oriented towards speech recognition with large vocabularies and continuous and even spontaneous speech. Today such applications work with a rather small number of languages compared to the number of spoken languages.