Machine Learning Algorithms for Characterization and Prediction of Protein Structural Properties

Author: Maxim V Shapovalov
Publisher:
ISBN:
Category :
Languages : en
Pages : 164

Book Description
Proteins are large biomolecules that serve as the functional building blocks of living organisms. There are about 22,000 protein-coding genes in the human genome. Each gene encodes a unique protein sequence, typically 100-1000 residues long, built from a 20-letter alphabet of amino acids. Each protein folds up into a unique 3D shape that enables it to perform its function. Each protein structure consists of some number of helical segments, extended segments called sheets, and loops that connect these elements. In the last two decades, machine learning methods coupled with exponentially expanding biological knowledge databases and computational power have enabled significant progress in the field of computational biology. In this dissertation, I carry out machine learning research on three major interconnected problems to advance protein structural biology as a field. A separate chapter in this dissertation is devoted to each problem. After the three chapters I conclude this doctoral research with a summary and directions for our future work. Chapter 1 describes the design, training and application of a convolutional neural network (SecNet) that achieves 84% accuracy on the 60-year-old problem of predicting protein secondary structure given a protein sequence. Our accuracy is 2-3% better than any previous result; accuracy had risen only 5% in the last 20 years. We identified the key factors for successful prediction in a detailed ablation study. A paper submitted for publication includes our secondary-structure prediction software, data set generation, and training and testing protocols [1]. Chapter 2 describes the design and development of a protocol for clustering beta turns, i.e. short structural motifs responsible for U-turns in protein loops. We identified 18 turn types, 11 of which are newly described [2]. We also developed a turn library and cross-platform software for turn assignment in new structures. In Chapter 3 I build upon the results from these two problems and predict geometries in loops of unknown structure with custom Residual Neural Networks (ResNet). I demonstrate solid results on (a) locating turns and predicting the 18 types and (b) predicting backbone torsion angles in loops. Given the recent progress in machine learning, these two results provide a strong foundation for successful loop modeling and encourage us to develop a new loop structure prediction program, a critical step in protein structure prediction and modeling.

Machine Learning In Bioinformatics Of Protein Sequences: Algorithms, Databases And Resources For Modern Protein Bioinformatics

Author: Lukasz Kurgan
Publisher: World Scientific
ISBN: 9811258597
Category : Science
Languages : en
Pages : 378

Book Description
Machine Learning in Bioinformatics of Protein Sequences guides readers around the rapidly advancing world of cutting-edge machine learning applications in the protein bioinformatics field. Edited by bioinformatics expert Dr Lukasz Kurgan, with contributions by a dozen accomplished researchers, this book provides a holistic view of structural bioinformatics by covering a broad spectrum of algorithms, databases and software resources for the efficient and accurate prediction and characterization of functional and structural aspects of proteins. It spotlights key advances, including deep neural networks and natural language processing-based sequence embedding, and covers a wide range of predictions comprising tertiary structure, secondary structure, residue contacts, intrinsic disorder, protein-, peptide- and nucleic acid-binding sites, hotspots, post-translational modification sites, and protein function. This volume is loaded with practical information that identifies and describes leading predictive tools, useful databases, webservers, and modern software platforms for the development of novel predictive tools.

Introduction to Protein Structure Prediction

Author: Huzefa Rangwala
Publisher: John Wiley & Sons
ISBN: 111809946X
Category : Science
Languages : en
Pages : 611

Book Description
A look at the methods and algorithms used to predict protein structure. A thorough knowledge of the function and structure of proteins is critical for the advancement of biology and the life sciences as well as the development of better drugs, higher-yield crops, and even synthetic bio-fuels. To that end, this reference sheds light on the methods used for protein structure prediction and reveals the key applications of modeled structures. This indispensable book covers the applications of modeled protein structures and unravels the relationship between pure sequence information and three-dimensional structure, which continues to be one of the greatest challenges in molecular biology. With this resource, readers will find an all-encompassing examination of the problems, methods, tools, servers, databases, and applications of protein structure prediction, and they will acquire unique insight into the future applications of the modeled protein structures. The book begins with a thorough introduction to the protein structure prediction problem and is divided into four themes: a background on structure prediction, the prediction of structural elements, tertiary structure prediction, and functional insights.
Within those four sections, the following topics are covered: databases and resources that are commonly used for protein structure prediction; the structure prediction flagship assessment (CASP) and the Protein Structure Initiative (PSI); definitions of recurring substructures and the computational approaches used for solving sequence problems; difficulties with contact map prediction and how sophisticated machine learning methods can solve those problems; structure prediction methods that rely on homology modeling, threading, and fragment assembly; hybrid methods that achieve high-resolution protein structures; parts of the protein structure that may be conserved and used to interact with other biomolecules; how the loop prediction problem can be used for refinement of the modeled structures; and the computational model that detects the differences between a protein structure and its modeled mutant. Whether working in the field of bioinformatics or molecular biology research or taking courses in protein modeling, readers will find the content in this book invaluable.

A Metaheuristic Approach to Protein Structure Prediction

Author: Nanda Dulal Jana
Publisher: Springer
ISBN: 3319747754
Category : Technology & Engineering
Languages : en
Pages : 243

Book Description
This book introduces characteristic features of the protein structure prediction (PSP) problem. It focuses on systematic selection and improvement of the most appropriate metaheuristic algorithm to solve the problem based on a fitness landscape analysis, rather than on the nature of the problem, which was the focus of methodologies in the past. Protein structure prediction is concerned with the question of how to determine the three-dimensional structure of a protein from its primary sequence. Recently a number of successful metaheuristic algorithms have been developed to determine the native structure, which plays an important role in medicine, drug design, and disease prediction. This interdisciplinary book consolidates the concepts most relevant to protein structure prediction (PSP) through global non-convex optimization. It is intended for graduate students from fields such as computer science, engineering, bioinformatics and as a reference for researchers and practitioners.

Machine Learning Meets Quantum Physics

Author: Kristof T. Schütt
Publisher: Springer Nature
ISBN: 3030402452
Category : Science
Languages : en
Pages : 473

Book Description
Designing molecules and materials with desired properties is an important prerequisite for advancing technology in our modern societies. This requires both the ability to calculate accurate microscopic properties, such as energies, forces and electrostatic multipoles of specific configurations, as well as efficient sampling of potential energy surfaces to obtain corresponding macroscopic properties. Tools that can provide this are accurate first-principles calculations rooted in quantum mechanics, and statistical mechanics, respectively. Unfortunately, they come at a high computational cost that prohibits calculations for large systems and long time-scales, thus presenting a severe bottleneck both for searching the vast chemical compound space and the stupendously many dynamical configurations that a molecule can assume. To overcome this challenge, recently there have been increased efforts to accelerate quantum simulations with machine learning (ML). This emerging interdisciplinary community encompasses chemists, material scientists, physicists, mathematicians and computer scientists, joining forces to contribute to the exciting hot topic of progressing machine learning and AI for molecules and materials. The book that has emerged from a series of workshops provides a snapshot of this rapidly developing field. It contains tutorial material explaining the relevant foundations needed in chemistry, physics as well as machine learning to give an easy starting point for interested readers. In addition, a number of research papers defining the current state-of-the-art are included. The book has five parts (Fundamentals, Incorporating Prior Knowledge, Deep Learning of Atomistic Representations, Atomistic Simulations and Discovery and Design), each prefaced by editorial commentary that puts the respective parts into a broader scientific context.

Computational Methods for Protein Structure Prediction and Modeling

Author: Ying Xu
Publisher: Springer Science & Business Media
ISBN: 0387683720
Category : Science
Languages : en
Pages : 408

Book Description
Volume One of this two-volume sequence focuses on the basic characterization of known protein structures, and structure prediction from protein sequence information. Eleven chapters survey the field, covering key topics in modeling, force fields, classification, computational methods, and structure prediction. Each chapter is a self-contained review covering definition of the problem and historical perspective; mathematical formulation; computational methods and algorithms; performance results; existing software; strengths, pitfalls, challenges, and future research.

Feature Representation and Learning Methods With Applications in Protein Secondary Structure

Author: Zhibin Lv
Publisher: Frontiers Media SA
ISBN: 2889715558
Category : Science
Languages : en
Pages : 112

Book Description


Prediction of Protein Secondary Structure

Author: Yaoqi Zhou
Publisher: Humana
ISBN: 9781493964048
Category : Science
Languages : en
Pages : 0

Book Description
This thorough volume explores predicting one-dimensional functional properties, functional sites in particular, from protein sequences, an area which is getting more and more attention. Beginning with secondary structure prediction based on sequence only, the book continues by exploring secondary structure prediction based on evolution information, prediction of solvent accessible surface areas and backbone torsion angles, model building, global structural properties, functional properties, as well as visualizing interior and protruding regions in proteins. Written for the highly successful Methods in Molecular Biology series, the chapters include the kind of detail and implementation advice to ensure success in the laboratory. Practical and authoritative, Prediction of Protein Secondary Structure serves as a vital guide to numerous state-of-the-art techniques that are useful for computational and experimental biologists.

Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics

Author: Yi Pan
Publisher: John Wiley & Sons
ISBN: 1118345789
Category : Medical
Languages : en
Pages : 534

Book Description
Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics: An in-depth look at the latest research, methods, and applications in the field of protein bioinformatics. This book presents the latest developments in protein bioinformatics, introducing for the first time cutting-edge research results alongside novel algorithmic and AI methods for the analysis of protein data. In one complete, self-contained volume, Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics addresses key challenges facing both computer scientists and biologists, arming readers with tools and techniques for analyzing and interpreting protein data and solving a variety of biological problems. Featuring a collection of authoritative articles by leaders in the field, this work focuses on the analysis of protein sequences, structures, and interaction networks using both traditional algorithms and AI methods. It also examines, in great detail, data preparation, simulation, experiments, evaluation methods, and applications. Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics highlights protein analysis applications such as protein-related drug activity comparison; incorporates salient case studies illustrating how to apply the methods outlined in the book; tackles the complex relationship between proteins from a systems biology point of view; relates the topic to other emerging technologies such as data mining and visualization; and includes many tables and illustrations demonstrating concepts and performance figures. Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics is an essential reference for bioinformatics specialists in research and industry, and for anyone wishing to better understand the rich field of protein bioinformatics.

Nucleic Acid and Protein Sequence Analysis

Author: Martin J. Bishop
Publisher: Oxford University Press, USA
ISBN:
Category : Language Arts & Disciplines
Languages : en
Pages : 446

Book Description