Genome-Wide Prediction of Intrinsic Disorder; Sequence Alignment of Intrinsically Disordered Proteins

Author: Uros Midic
Publisher:
ISBN:
Category :
Languages : en
Pages : 154

Book Description
Intrinsic disorder (ID) is defined as a lack of stable tertiary and/or secondary structure under physiological conditions in vitro. Intrinsically disordered proteins (IDPs) are highly abundant in nature. IDPs possess a number of crucial biological functions, being involved in regulation, recognition, signaling and control; that is, their functional repertoire complements the functions of ordered proteins. Intrinsically disordered regions (IDRs) of IDPs have a different amino-acid composition than structured regions and proteins. This fact has been exploited for development of predictors of ID; the best predictors currently achieve around 80% per-residue accuracy. Earlier studies revealed that some IDPs are associated with various human diseases, including cancer, cardiovascular disease, amyloidoses, neurodegenerative diseases, diabetes and others. We developed a methodology for prediction and analysis of abundance of intrinsic disorder on the genome scale, which combines data from various gene and protein databases, and utilizes several ID prediction tools. We used this methodology to perform a large-scale computational analysis of the abundance of (predicted) ID in transcripts of various classes of disease-related genes. We further analyzed the relationships between ID and the occurrence of alternative splicing and Molecular Recognition Features (MoRFs) in human disease classes. An important, never-before-addressed issue with such genome-wide applications of ID predictors is that, for less-studied organisms, in addition to the experimentally confirmed protein sequences, there is a large number of putative sequences, which have been predicted with automated annotation procedures and lack experimental confirmation. In the human genome, these predicted sequences have significantly higher predicted disorder content.
I investigated the hypothesis that this discrepancy is not genuine, and that it is due to incorrectly annotated parts of the putative protein sequences that exhibit some similarities to confirmed IDRs, which lead to high predicted ID content. I developed a procedure to create synthetic nonsense peptide sequences by translation of non-coding regions of genomic sequences and translation of coding regions with incorrect codon alignment. I further trained several classifiers to discriminate between confirmed sequences and synthetic nonsense sequences, and used these predictors to estimate the abundance of incorrectly annotated regions in putative sequences, as well as to explore the link between such regions and intrinsic disorder. Sequence alignment is an essential tool in modern bioinformatics. Substitution matrices, such as the BLOSUM family, contain 20x20 parameters which are related to the evolutionary rates of amino acid substitutions. I explored various strategies for extension of sequence alignment to utilize the (predicted) disorder/structure information about the sequences being aligned. These strategies employ an extended 40-symbol alphabet which contains 20 symbols for amino acids in ordered regions and 20 symbols for amino acids in IDRs, as well as expanded 40x40 and 40x20 matrices. The new matrices exhibit significant and substantial differences in the substitution scores for IDRs and structured regions. Tests on a reference dataset show that 40x40 matrices perform worse than the standard 20x20 matrices, while 40x20 matrices, used in a scenario where ID is predicted for a query sequence but not for the target sequences, have at least comparable performance. However, I also demonstrate that the variations in performance between 20x20 and 40x20 matrices are insignificant compared to the variation in obtained matrices that occurs when the underlying algorithm for calculation of substitution matrices is changed.

Flexible Viruses

Author: Vladimir Uversky
Publisher: John Wiley & Sons
ISBN: 0470618310
Category : Science
Languages : en
Pages : 532

Book Description
This book provides up-to-date information on experimental and computational characterization of the structural and functional properties of viral proteins, which are widely involved in regulatory and signaling processes. With chapters by leading research groups, it features current information on the structural and functional roles of intrinsic disorder in viral proteomes. It systematically addresses measles, HIV, influenza, potato virus, forest virus, bovine virus, hepatitis, and rotavirus as well as viral genomics. After analyzing the unique features of each class of viral proteins, it presents future directions for research and disease management.

The Specificity of Serological Reactions

Author: Karl Landsteiner
Publisher: Courier Corporation
ISBN: 0486151441
Category : Science
Languages : en
Pages : 372

Book Description
Nobel prizewinner's account of experiments he and colleagues carried out on antigens and serological reactions with simple compounds. Exceptionally broad coverage of basic immunology. Extensive bibliography.

Large-scale Characterization of Intrinsic Disorder and High-throughput Prediction Of RNA, DNA and Protein Binding Mediated by Intrinsic Disorder

Author: Zhenling Peng
Publisher:
ISBN:
Category : Bioinformatics
Languages : en
Pages : 141

Book Description
Intrinsically disordered proteins lack stable 3D structures in vivo, are functionally important, and are very common in nature. In the past three decades, many studies focused on prediction of intrinsic disorder from protein sequence, estimation of its abundance, and analyses of its functional roles. However, these studies were limited in their scope; for example, they focused only on one of many functional and structural aspects. We performed a first-of-its-kind comprehensive and detailed analysis of abundance, functional roles, and cellular localizations of intrinsic disorder in complete proteomes. We show that intrinsic disorder is abundant across all kingdoms of life including viruses, is involved in crucial cellular processes, such as translation, transcription, metabolism, regulation, and signaling, and is preferentially located in the ribosome and nucleus. We also mapped intrinsic disorder into eukaryotic, bacterial and archaean cells. These observations motivated us to further analyze two protein families: ribosomal proteins and proteins involved in programmed cell death. We performed analysis across multiple species, which shows that intrinsic disorder is enriched and performs a variety of important cellular functions in ribosomal and cell death proteins. These two studies reveal that intrinsic disorder is involved in the interactions between proteins, RNAs, and DNAs. The prediction and characterization of these interactions for ordered proteins (i.e., proteins with stable 3D structures in vivo) recently attracted significant attention. However, there are no methods that target these functions/interactions mediated by intrinsic disorder. Development of such methods is now possible by using the curated functional annotations of intrinsic disorder from the DisProt database.
Utilizing these data, we developed the first computational prediction method, DisoRDPbind, that predicts protein-protein, -RNA and -DNA interactions mediated by intrinsic disorder. Our method utilizes a logistic regression algorithm and a custom-designed and empirically selected set of descriptors of the input protein sequence. Empirical assessment using two benchmark datasets and large-scale predictions on four eukaryotic proteomes suggests that DisoRDPbind provides good predictive quality, differs from the methods focused on predictions for ordered proteins, and is computationally efficient enough to annotate these interactions in whole proteomes.

Intrinsically Disordered Proteins

Author: Vladimir N. Uversky
Publisher: Springer
ISBN: 3319089218
Category : Science
Languages : en
Pages : 73

Book Description
In this brief, Vladimir Uversky discusses the paradigm-shifting phenomenon of intrinsically disordered proteins (IDPs) and hybrid proteins containing ordered domains and functional IDP regions (IDPRs). Beginning with an introduction to the concept of protein intrinsic disorder, Uversky then goes on to describe the peculiar amino acid sequences of IDPs, their structural heterogeneity, typical functions and disorder-based binding modes. In the final sections, Uversky discusses IDPs in human diseases and as potential drug targets. This volume provides a snapshot to researchers entering the field as well as providing a current overview for more experienced scientists in related areas.

Biological Sequence Analysis

Author: Richard Durbin
Publisher: Cambridge University Press
ISBN: 113945739X
Category : Science
Languages : en
Pages : 372

Book Description
Probabilistic models are becoming increasingly important in analysing the huge amount of data being produced by large-scale DNA-sequencing efforts such as the Human Genome Project. For example, hidden Markov models are used for analysing biological sequences, linguistic-grammar-based probabilistic models for identifying RNA secondary structure, and probabilistic evolutionary models for inferring phylogenies of sequences from different organisms. This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods, and more generally to probabilistic methods of sequence analysis. Written by an interdisciplinary team of authors, it aims to be accessible to molecular biologists, computer scientists, and mathematicians with no formal knowledge of the other fields, and at the same time present the state-of-the-art in this new and highly important field.

Genome-Wide Prediction and Analysis of Protein-Protein Functional Linkages in Bacteria

Author: Vijaykumar Yogesh Muley
Publisher: Springer Science & Business Media
ISBN: 1461447054
Category : Science
Languages : en
Pages : 66

Book Description
Using genome sequencing, one can predict possible interactions among proteins. There are very few titles that focus on protein-protein interaction predictions in bacteria. The authors describe these methods and further highlight their use to predict various biological pathways and the complexity of the cellular response to various environmental conditions. Topics include analysis of complex genome-scale protein-protein interaction networks, effects of reference genome selection on prediction accuracy, and genome sequence templates to predict protein function.

Dancing Protein Clouds: Intrinsically Disordered Proteins in the Norm and Pathology

Author: Vladimir Uversky
Publisher: Academic Press
ISBN: 012816851X
Category :
Languages : en
Pages : 426

Book Description
"Dancing protein clouds: Intrinsically disordered proteins in the norm and pathology" represents a set of selected studies on a variety of research topics related to intrinsically disordered proteins. Topics in this update include structural and functional characterization of several important intrinsically disordered proteins, such as 14-3-3 proteins and their partners, as well as proteins from muscle sarcomere; representation of the intrinsic disorder-related concept of the protein structure-function continuum; discussion of the role of intrinsic disorder in phenotypic switching; consideration of the role of intrinsically disordered proteins in the pathogenesis of neurodegenerative diseases and cancer; discussion of the roles of intrinsic disorder in functional amyloids; demonstration of the usefulness of the analysis of translational diffusion of unfolded and intrinsically disordered proteins; consideration of various computational tools for evaluation of functions of intrinsically disordered regions; and discussion of the role of shear stress in the amyloid formation of intrinsically disordered regions in the brain. Provides some recent studies on intrinsically disordered proteins and their functions, as well as on the involvement of intrinsically disordered proteins in the pathogenesis of various diseases. Contains numerous illustrative materials (color figures, diagrams, and tables) to help readers delve into the information provided. Includes contributions from recognized experts in the field.

Fuzziness

Author: Monika Fuxreiter
Publisher: Springer Science & Business Media
ISBN: 1461406595
Category : Medical
Languages : en
Pages : 210

Book Description
Detailed characterization of fuzzy interactions will be of central importance for understanding the diverse biological functions of intrinsically disordered proteins in complex eukaryotic signaling networks. In this volume, Peter Tompa and Monika Fuxreiter have assembled a series of papers that address the issue of fuzziness in molecular interactions. These papers provide a broad overview of the phenomenon of fuzziness and provide compelling examples of the central role played by fuzzy interactions in regulation of cellular signaling processes and in viral infectivity. These contributions summarize the current state of knowledge in this new field and will undoubtedly stimulate future research that will further advance our understanding of fuzziness and its role in biomolecular interactions.

Intrinsically Disordered Proteins

Author: Birthe B. Kragelund
Publisher: Humana
ISBN: 9781071605264
Category : Science
Languages : en
Pages : 951

Book Description
This edition details methods to study intrinsically disordered proteins (IDPs) including recent topics such as extremely high-affinity disordered complexes, kinetics that evade established concepts, liquid-liquid phase separation, and novel disorder-driven allosteric mechanisms. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and tips on troubleshooting and avoiding known pitfalls. Authoritative and cutting-edge, Intrinsically Disordered Proteins: Methods and Protocols aims to help scientists with different backgrounds to further their investigations into these fascinating and dynamic molecules. Chapter 24 is available open access under a CC BY 4.0 license via link.springer.com. Chapters 40 and 42 are available open access under a Creative Commons Attribution 4.0 International License via link.springer.com.