Author: K. Sreenivasa Rao
Publisher: Springer
ISBN: 3319177257
Category : Technology & Engineering
Languages : en
Pages : 128
Book Description
This book discusses the contribution of excitation source information in discriminating languages. The authors focus on the excitation source component of speech for enhancing language identification (LID) performance. Language-specific features are extracted using two different modes: (i) implicit processing of the linear prediction (LP) residual and (ii) explicit parameterization of the LP residual. In the implicit processing approach, excitation source features are derived from the LP residual, the Hilbert envelope (magnitude) of the LP residual, and the phase of the LP residual; in the explicit parameterization approach, the LP residual signal is processed in the spectral domain to extract the relevant language-specific features. The source features extracted in these two modes are then combined to enhance the performance of LID systems. The proposed excitation source features are also investigated for LID in noisy background environments. Each chapter provides the motivation for exploring a specific feature for the LID task, then discusses the methods to extract those features, and finally suggests appropriate models to capture the language-specific knowledge from the proposed features. Finally, the book discusses various combinations of spectral and source features, and the desired models to enhance the performance of LID systems.
Language Identification Using Excitation Source Features
Author: K. Sreenivasa Rao
Publisher: Springer
ISBN: 3319177257
Category : Technology & Engineering
Languages : en
Pages : 128
Book Description
This book discusses the contribution of excitation source information in discriminating languages. The authors focus on the excitation source component of speech for enhancing language identification (LID) performance. Language-specific features are extracted using two different modes: (i) implicit processing of the linear prediction (LP) residual and (ii) explicit parameterization of the LP residual. In the implicit processing approach, excitation source features are derived from the LP residual, the Hilbert envelope (magnitude) of the LP residual, and the phase of the LP residual; in the explicit parameterization approach, the LP residual signal is processed in the spectral domain to extract the relevant language-specific features. The source features extracted in these two modes are then combined to enhance the performance of LID systems. The proposed excitation source features are also investigated for LID in noisy background environments. Each chapter provides the motivation for exploring a specific feature for the LID task, then discusses the methods to extract those features, and finally suggests appropriate models to capture the language-specific knowledge from the proposed features. Finally, the book discusses various combinations of spectral and source features, and the desired models to enhance the performance of LID systems.
Language Identification Using Spectral and Prosodic Features
Author: K. Sreenivasa Rao
Publisher: Springer
ISBN: 3319171631
Category : Technology & Engineering
Languages : en
Pages : 106
Book Description
This book discusses the impact of spectral features extracted from frame level, glottal closure regions, and pitch-synchronous analysis on the performance of language identification systems. In addition to spectral features, the authors explore prosodic features such as intonation, rhythm, and stress for discriminating between languages. They present how the proposed spectral and prosodic features capture language-specific information from two complementary aspects, showing how developing a language identification (LID) system using the combination of spectral and prosodic features enhances the accuracy of identification and improves the robustness of the system. The book provides methods to extract the spectral and prosodic features at various levels, and also suggests appropriate models for developing robust LID systems according to specific spectral and prosodic features. Finally, the book discusses various combinations of spectral and prosodic features, and the desired models to enhance the performance of LID systems.
Speech Recognition Using Articulatory and Excitation Source Features
Author: K. Sreenivasa Rao
Publisher: Springer
ISBN: 3319492209
Category : Technology & Engineering
Languages : en
Pages : 100
Book Description
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Advances in Speech and Music Technology
Author: Anupam Biswas
Publisher: Springer Nature
ISBN: 9813368810
Category : Technology & Engineering
Languages : en
Pages : 463
Book Description
This book features original papers from the 25th International Symposium on Frontiers of Research in Speech and Music (FRSM 2020), jointly organized by National Institute of Technology, Silchar, India, during 8–9 October 2020. The book is organized in five sections, reflecting both the technological advancement and the interdisciplinary nature of speech and music processing. The first section contains chapters covering the foundations of both vocal and instrumental music processing. The second section includes chapters on computational techniques used in the speech and music domain. A great deal of research is being performed within the music information retrieval domain, which is potentially interesting for most users of computers and the Internet. Therefore, the third section is dedicated to chapters on music information retrieval. The fourth section contains chapters on brain signal analysis and human cognition or perception of speech and music. The final section consists of chapters on spoken language processing and applications of speech processing.
Emotion Recognition using Speech Features
Author: K. Sreenivasa Rao
Publisher: Springer Science & Business Media
ISBN: 1461451434
Category : Technology & Engineering
Languages : en
Pages : 134
Book Description
“Emotion Recognition Using Speech Features” provides coverage of emotion-specific features present in speech. The authors also discuss suitable models for capturing emotion-specific information for distinguishing different emotions. The content of this book is important for designing and developing natural and sophisticated speech systems. In this Brief, Drs. Rao and Koolagudi lead a discussion of how emotion-specific information is embedded in speech and how to acquire emotion-specific knowledge using appropriate statistical models. Additionally, the authors provide information about exploiting multiple pieces of evidence derived from various features and models. The acquired emotion-specific knowledge is useful for synthesizing emotions. Features include discussion of: • Global and local prosodic features at syllable, word and phrase levels, helpful for capturing emotion-discriminative information; • Exploiting complementary evidence obtained from excitation sources, vocal tract systems and prosodic features in order to enhance the emotion recognition performance; • Proposed multi-stage and hybrid models for improving the emotion recognition performance. This brief is for researchers working in areas related to speech-based products such as mobile phone manufacturing companies, automobile companies, and entertainment products as well as researchers involved in basic and applied speech processing research.
Digital Transformation of Collaboration
Author: Aleksandra Przegalinska
Publisher: Springer Nature
ISBN: 3030489930
Category : Computers
Languages : en
Pages : 307
Book Description
This proceedings volume focuses on the emerging concept of Collaborative Innovation Networks (COINs). COINs are at the core of collaborative knowledge networks, distributed communities taking advantage of the wide connectivity and the support of communication technologies, spanning beyond the organizational perimeter of companies on a global scale. The book presents the refereed conference papers from the 7th International Conference on COINs, October 8-9, 2019, in Warsaw, Poland. It includes papers for both application areas of COINs: (1) optimizing organizational creativity and performance, and (2) discovering and predicting new trends by identifying COINs on the Web through online social media analysis. Papers at COINs19 combine a wide range of interdisciplinary fields such as social network analysis, group dynamics, design and visualization, information systems and the psychology and sociality of collaboration, and intercultural analysis through the lens of online social media. They cover the most recent advances in areas from leadership and collaboration, trend prediction and data mining, to social competence and Internet communication.
Extraction and Representation of Prosody for Speaker, Speech and Language Recognition
Author: Leena Mary
Publisher: Springer Science & Business Media
ISBN: 1461411599
Category : Technology & Engineering
Languages : en
Pages : 70
Book Description
Extraction and Representation of Prosodic Features for Speech Processing Applications deals with prosody from a speech processing point of view, with topics including: • The significance of prosody for speech processing applications; • Why prosody needs to be incorporated in speech processing applications; • Different methods for extraction and representation of prosody for applications such as speech synthesis, speaker recognition, language recognition and speech recognition. This book is for researchers and students at the graduate level.
Prosodic Featured Based Automatic Language Identification
Author: Niraj Singh
Publisher: Educreation Publishing
ISBN:
Category : Computers
Languages : en
Pages : 136
Book Description
Humans inherently have the ability to differentiate languages as a part of their intelligence. Language Identification (LID) was science fiction in the 1970s, but today it has been deployed in practical use. The prosodic features of speech are relatively simple in structure and are considered very effective in some Language Recognition (LR) or LID tasks, even though these features are influenced by numerous factors, such as the speaker's way of speaking, culture, and background. The book includes a series of experiments on several speech corpora with different classification and/or identification techniques. A few review questions are included at the end of each chapter, and the end of the book offers a short list of projects for research scholars in addition to a set of MCQs and important questions. This book motivates the development of a multilingual LID system that can be widely used for the betterment of mankind, particularly in the fields of intelligence (police/military) services and medical care. In overview, the book explores various experimental datasets for performance analysis of LID systems with news speech and natural conversation speech; Joint Factor Analysis for LR on prosodic feature models; and automatic LID using an i-vector based prosodic system.
Multilingual Phone Recognition in Indian Languages
Author: K.E Manjunath
Publisher: Springer Nature
ISBN: 303080741X
Category : Technology & Engineering
Languages : en
Pages : 113
Book Description
The book presents current research and developments in multilingual speech recognition. The author presents a Multilingual Phone Recognition System (Multi-PRS), developed using a common multilingual phone-set derived from the International Phonetic Alphabet (IPA) based transcription of six Indian languages: Kannada, Telugu, Bengali, Odia, Urdu, and Assamese. The author shows how the performance of Multi-PRS can be improved using tandem features. The book compares Monolingual Phone Recognition Systems (Mono-PRS) with Multi-PRS, and baseline systems with tandem systems. Methods are proposed to predict Articulatory Features (AFs) from spectral features using Deep Neural Networks (DNN). Multitask learning is explored to improve the prediction accuracy of AFs. The AFs are then explored to improve the performance of Multi-PRS using the lattice rescoring and tandem methods of combination. The author goes on to develop and evaluate the Language Identification followed by Monolingual phone recognition (LID-Mono) and common multilingual phone-set based multilingual phone recognition systems.
Extraction of Prosody for Automatic Speaker, Language, Emotion and Speech Recognition
Author: Leena Mary
Publisher: Springer
ISBN: 3319911716
Category : Technology & Engineering
Languages : en
Pages : 70
Book Description
This updated book expands upon prosody for recognition applications of speech processing. It covers the importance of prosody for speech processing applications; builds on why prosody needs to be incorporated in speech processing applications; and presents methods for extraction and representation of prosody for applications such as speaker recognition, language recognition and speech recognition. The updated book also includes information on the significance of prosody for emotion recognition and various prosody-based approaches for automatic emotion recognition from speech.