Investigating Molecular Recognition Through Large-scale Analysis of Protein Sequences and Structures

Languages : en

Book Description
The objective of this project is to study protein sequence-structure relationships through large-scale computational analysis of gene sequences and crystal structure in the databanks. The results of this analysis will be used to help better understand molecular recognition.

Invitation to Protein Sequence Analysis Through Probability and Information

Author: Daniel J. Graham
Publisher: CRC Press
ISBN: 0429647883
Category : Science
Languages : en
Pages : 245

Book Description
This book explores the remarkable information correspondences and probability structures of proteins. Correspondences are pervasive in biochemistry and bioinformatics: proteins share homologies, folding patterns, and mechanisms. Probability structures are just as paramount: folded state graphics reflect Angstrom-scale maps of electron density. The author explores protein sequences (primary structures), both individually and in sets (systems) with the help of probability and information tools. This perspective will enhance the reader’s knowledge of how an important class of molecules is designed and put to task in natural systems, and how we can approach class members in hands-on ways.

Molecular Biology of the Cell

ISBN: 9780815332183
Category : Cells
Languages : en

Genome-Wide Prediction of Intrinsic Disorder; Sequence Alignment of Intrinsically Disordered Proteins

Author: Uros Midic
Languages : en
Pages : 154

Book Description
Intrinsic disorder (ID) is defined as a lack of stable tertiary and/or secondary structure under physiological conditions in vitro. Intrinsically disordered proteins (IDPs) are highly abundant in nature. IDPs possess a number of crucial biological functions, being involved in regulation, recognition, signaling and control; that is, their functional repertoire complements the functions of ordered proteins. Intrinsically disordered regions (IDRs) of IDPs have a different amino-acid composition than structured regions and proteins. This fact has been exploited for the development of predictors of ID; the best predictors currently achieve around 80% per-residue accuracy. Earlier studies revealed that some IDPs are associated with various human diseases, including cancer, cardiovascular disease, amyloidoses, neurodegenerative diseases, diabetes and others. We developed a methodology for prediction and analysis of the abundance of intrinsic disorder on the genome scale, which combines data from various gene and protein databases and utilizes several ID prediction tools. We used this methodology to perform a large-scale computational analysis of the abundance of (predicted) ID in transcripts of various classes of disease-related genes. We further analyzed the relationships between ID and the occurrence of alternative splicing and Molecular Recognition Features (MoRFs) in human disease classes.
An important, never-before-addressed issue with such genome-wide applications of ID predictors is that, for less-studied organisms, in addition to the experimentally confirmed protein sequences there is a large number of putative sequences, which have been predicted with automated annotation procedures and lack experimental confirmation. In the human genome, these predicted sequences have significantly higher predicted disorder content. I investigated the hypothesis that this discrepancy is an artifact, caused by incorrectly annotated parts of the putative protein sequences that resemble confirmed IDRs and therefore yield high predicted ID content. I developed a procedure to create synthetic nonsense peptide sequences by translation of non-coding regions of genomic sequences and translation of coding regions with incorrect codon alignment. I further trained several classifiers to discriminate between confirmed sequences and synthetic nonsense sequences, and used these predictors to estimate the abundance of incorrectly annotated regions in putative sequences, as well as to explore the link between such regions and intrinsic disorder.
Sequence alignment is an essential tool in modern bioinformatics. Substitution matrices, such as the BLOSUM family, contain 20x20 parameters which are related to the evolutionary rates of amino acid substitutions. I explored various strategies for extending sequence alignment to utilize the (predicted) disorder/structure information about the sequences being aligned. These strategies employ an extended 40-symbol alphabet, which contains 20 symbols for amino acids in ordered regions and 20 symbols for amino acids in IDRs, as well as expanded 40x40 and 40x20 matrices. The new matrices exhibit significant and substantial differences in the substitution scores for IDRs and structured regions. Tests on a reference dataset show that 40x40 matrices perform worse than the standard 20x20 matrices, while 40x20 matrices, used in a scenario where ID is predicted for a query sequence but not for the target sequences, have at least comparable performance. However, I also demonstrate that the variations in performance between 20x20 and 40x20 matrices are insignificant compared to the variation in obtained matrices that occurs when the underlying algorithm for calculation of substitution matrices is changed.
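The extended 40-symbol alphabet mentioned in the description can be pictured with a small sketch. This is only an illustration under stated assumptions, not the encoding used in the thesis: the function name `encode_extended`, the `O`/`D` mask format, and the choice of uppercase letters for ordered residues and lowercase letters for disordered residues are all hypothetical conventions chosen for readability.

```python
# Illustrative sketch only: the names and the O/D mask format are assumptions,
# not taken from the thesis.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# 20 symbols for residues in ordered regions (uppercase) plus
# 20 symbols for residues in disordered regions (lowercase).
EXTENDED_ALPHABET = AMINO_ACIDS + AMINO_ACIDS.lower()


def encode_extended(sequence: str, disorder_mask: str) -> str:
    """Re-encode a protein sequence in the 40-symbol alphabet.

    disorder_mask has one character per residue:
    'D' for (predicted) disordered, 'O' for ordered.
    """
    if len(sequence) != len(disorder_mask):
        raise ValueError("sequence and disorder_mask must have equal length")
    encoded = []
    for residue, flag in zip(sequence.upper(), disorder_mask.upper()):
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unsupported residue: {residue!r}")
        if flag not in "OD":
            raise ValueError(f"unsupported mask character: {flag!r}")
        encoded.append(residue.lower() if flag == "D" else residue)
    return "".join(encoded)


if __name__ == "__main__":
    # Residues 3 and 4 are marked as disordered in this made-up example.
    print(encode_extended("MKTAY", "OODDO"))
```

With sequences encoded this way, a 40x40 matrix would be indexed by two extended symbols, while a 40x20 matrix would pair an extended query symbol with a plain target residue, matching the scenario in which disorder is predicted only for the query.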

Bioinformatics of Genome Regulation and Structure II

Author: Nikolay Kolchanov
Publisher: Springer Science & Business Media
ISBN: 0387294554
Category : Science
Languages : en
Pages : 545

Book Description
The last 15 years in the development of biology were marked by the accumulation of unprecedentedly large arrays of experimental data. The information was amassed at exceptionally high rates owing to the advent of highly efficient experimental technologies: high-throughput genomic sequencing; functional genomics technologies that allow investigation of the expression dynamics of large groups of genes using expression DNA chips; proteomics methods that make it possible to analyze the protein compositions of cells, tissues, and organs, assess the dynamics of the cell proteome, and reconstruct networks of protein-protein interactions; and metabolomics, in particular high-resolution mass spectrometry studies of cell metabolites and of the distribution of metabolic fluxes in cells, with concurrent investigation of the dynamics of thousands of metabolites in an individual cell. Analysis, comprehension, and use of the tremendous volumes of experimental data reflecting the intricate processes underlying the functioning of molecular genetic systems are unfeasible in principle without a systems approach and the involvement of state-of-the-art information and computer technologies and efficient mathematical methods for data analysis and simulation of biological systems and processes. The need to solve these problems gave rise to a new science: postgenomic bioinformatics, or systems biology in silico.

Biological Sequence Analysis

Author: Richard Durbin
Publisher: Cambridge University Press
ISBN: 113945739X
Category : Science
Languages : en
Pages : 372

Book Description
Probabilistic models are becoming increasingly important in analysing the huge amount of data being produced by large-scale DNA-sequencing efforts such as the Human Genome Project. For example, hidden Markov models are used for analysing biological sequences, linguistic-grammar-based probabilistic models for identifying RNA secondary structure, and probabilistic evolutionary models for inferring phylogenies of sequences from different organisms. This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods, and more generally of probabilistic methods of sequence analysis. Written by an interdisciplinary team of authors, it aims to be accessible to molecular biologists, computer scientists, and mathematicians with no formal knowledge of the other fields, and at the same time to present the state of the art in this new and highly important field.

Sequence — Evolution — Function

Author: Eugene V. Koonin
Publisher: Springer Science & Business Media
ISBN: 1475737831
Category : Science
Languages : en
Pages : 482

Book Description
Sequence - Evolution - Function is an introduction to the computational approaches that play a critical role in the emerging new branch of biology known as functional genomics. The book provides the reader with an understanding of the principles and approaches of functional genomics and of the potential and limitations of computational and experimental approaches to genome analysis. Sequence - Evolution - Function should help bridge the "digital divide" between biologists and computer scientists, allowing biologists to better grasp the peculiarities of the emerging field of Genome Biology and to learn how to benefit from the enormous amount of sequence data available in the public databases. The book is non-technical with respect to the computer methods for genome analysis and discusses these methods from the user's viewpoint, without addressing mathematical and algorithmic details. Prior practical familiarity with the basic methods for sequence analysis is a major advantage, but a reader without such experience will be able to use the book as an introduction to these methods. This book is perfect for introductory level courses in computational methods for comparative and functional genomics.

Research at the Intersection of the Physical and Life Sciences

Author: National Research Council
Publisher: National Academies Press
ISBN: 0309147514
Category : Science
Languages : en
Pages : 122

Book Description
Traditionally, the natural sciences have been divided into two branches: the biological sciences and the physical sciences. Today, an increasing number of scientists are addressing problems lying at the intersection of the two. These problems are most often biological in nature, but examining them through the lens of the physical sciences can yield exciting results and opportunities. For example, one area producing effective cross-discipline research opportunities centers on the dynamics of systems. Equilibrium, multistability, and stochastic behavior (concepts familiar to physicists and chemists) are now being used to tackle issues associated with living systems such as adaptation, feedback, and emergent behavior. Research at the Intersection of the Physical and Life Sciences discusses how some of the most important scientific and societal challenges can be addressed, at least in part, by collaborative research that lies at the intersection of traditional disciplines, including biology, chemistry, and physics. This book describes how some of the mysteries of the biological world are being addressed using tools and techniques developed in the physical sciences, and identifies five areas of potentially transformative research. Work in these areas would have significant impact in both research and society at large by expanding our understanding of the physical world and by revealing new opportunities for advancing public health, technology, and stewardship of the environment. This book recommends several ways to accelerate such cross-discipline research. Many of these recommendations are directed toward those administering the faculties and resources of our great research institutions, and toward the stewards of our research funders, making this book an excellent resource for academic and research institutions, scientists, universities, and federal and private funding agencies.

Sequences of Proteins of Immunological Interest

Category : Amino acid sequence
Languages : en
Pages : 1246

Book Description
Tabulation and analysis of amino acid and nucleic acid sequences of precursors, V-regions, C-regions, J-chain, T-cell receptors for antigen, T-cell surface antigens, β2-microglobulins, major histocompatibility antigens, Thy-1, complement, C-reactive protein, thymopoietin, integrins, post-gamma globulin, α2-macroglobulins, and other related proteins.

A Journey Through 50 Years of Structural Bioinformatics in Memoriam of Cyrus Chothia

Author: Alfredo Iacoangeli
Publisher: Frontiers Media SA
ISBN: 2889747050
Category : Science
Languages : en
Pages : 131

Book Description
The cover image for this Research Topic was designed by Claire Marks.