Computational Analysis of RNA-binding Protein Target-site Selection and Function

Author: Xiao Li
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description

Computational Analysis and Prediction of RNA-protein Interactions

Author: Michael Uhl
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
Abstract: This dissertation is about the computational analysis and prediction of RNA-protein interactions. Ribonucleic acids (RNAs) and proteins are both essential for the control of gene expression in our cells. Gene expression is the process by which a functional gene product, namely a protein or an RNA, is produced from a gene, starting from the gene region on the DNA with the transcription of an RNA. Once regarded primarily as a messenger that transmits protein information, RNA has moved further into the biomedical spotlight in recent years, thanks to its increasingly uncovered roles in regulating gene expression. In addition, RNA has showcased its therapeutic potential, as famously demonstrated by the groundbreaking success of RNA vaccines in the COVID-19 pandemic. However, RNAs rarely function on their own: in humans, more than 1,500 different RNA-binding proteins (RBPs) are involved in controlling the various stages of an RNA's life cycle, creating a highly complex regulatory interplay between RNAs and proteins. It is therefore of fundamental importance to study these RNA-protein interactions, in order to deepen our understanding of gene expression. Over the last decade, CLIP-seq has become the dominant experimental method to identify the set of cellular RNA binding sites for an RBP of interest. However, analysing the resulting CLIP-seq data can be challenging, as there are many analysis steps and CLIP-seq protocol variants available, each requiring specific adaptations to the analysis workflow. Consequently, there is a need for analysis guidelines, providing easy access to tools, as well as the constant improvement of tools and workflows to increase the accuracy of the analysis results. The first set of works included in this thesis (publications P1, P4, and P5) deals with these topics, by providing a review article on CLIP-seq data analysis, as well as two articles on how to further improve CLIP-seq data analysis.
Publication P1 supplies readers with an overview of tools and protocols, as well as guidelines to conduct a successful analysis, drawing largely from our own experience with analysing CLIP-seq data. Publication P4 demonstrates the issues current binding site identification tools have with CLIP-seq data from RBPs that bind to processed RNAs, and that the integration of RNA processing information improves the resulting binding site quality. On top of this, publication P5 presents Peakhood, the first tool that utilizes RNA processing information in order to increase the quality of RBP binding sites identified from CLIP-seq data. A natural drawback of experimental methods is that a target RNA needs to be sufficiently expressed in the observed cells for an RNA-protein interaction to be detected. Hence, since gene expression is a dynamic process that differs between cell types, time points, and conditions, a CLIP-seq experiment cannot recover the complete set of cellular RBP binding sites. This creates a demand for computational methods which can learn the binding properties of an RBP from existing CLIP-seq data, in order to predict RBP binding sites on any given target RNA. Besides interacting with proteins, RNAs can also interact with other RNAs, further increasing the amount of possible regulatory interactions between RNAs and proteins. In this regard, long non-coding RNAs (lncRNAs), a large class of non-protein-coding RNAs whose functions are still vastly unexplored, have become especially important, as it has been shown that they can engage in RNA-RNA interactions, whose regulatory mechanisms also include RNA-protein interactions. As such mechanistic studies are typically slow and expensive, computational tools that combine RNA-protein and RNA-RNA interaction predictions to infer potential mechanisms could be of great help, e.g., by screening a set of target RNAs and proteins and suggesting plausible mechanisms for experimental validation. 
The second set of works included in this thesis (publications P2 and P3) thus deals with the computational prediction of RNA-protein interactions, RNA-RNA interactions and the functional mechanisms that can be inferred from these interactions. Publication P2 introduces MechRNA, the first tool to infer functional mechanisms of lncRNAs based on their predicted interactions with RBPs and other RNAs, as well as gene expression data. We demonstrated MechRNA's capability to identify formerly described lncRNA mechanisms and experimentally validated one prediction, underlining its value for functional lncRNA studies. Finally, publication P3 presents RNAProt, a flexible and performant RBP binding site prediction tool based on recurrent neural networks. Compared to other popular deep learning methods, RNAProt achieves state-of-the-art predictive performance, as well as superior runtime efficiency. In addition, it is more feature-rich than any other available method, including the support of user-defined predictive features. We further showed that its visualizations agree with known RBP binding preferences, and demonstrated that its additional predictive features can increase the specificity of predictions.

Computational Analysis of the Interplay Between RNA Structure and Function

Author: Elan A. Shatoff
Publisher:
ISBN:
Category : Molecular structure
Languages : en
Pages : 0

Book Description
RNA is ubiquitous in the cellular environment, and it can function in innumerable ways with a variety of interaction partners. An RNA molecule's structure, in particular the set of base pairing interactions between the nucleotides of the molecule known as secondary structure, can help determine its function. Since most proteins can only bind to either single-stranded or double-stranded RNA, RNA secondary structure can also help determine where and how RNA-protein binding interactions occur. In this work I investigate computational models for RNA-protein interactions in a variety of different contexts. In Chapter 2 I probe the effect of single nucleotide variations on RNA-protein binding as mediated by RNA secondary structure. Single nucleotide variations are single nucleotide changes in an organism's genome that can often cause disease, and may do so through a number of different mechanisms. In this work we propose that sequence changes can affect accessibility to protein binding sites through changes in secondary structure, even when these sequence changes occur tens of nucleotides outside of protein binding sites. We find that single nucleotide variations can have a many-fold effect on the binding affinity of proteins for RNA, and characterize the genome-wide effect of single nucleotide variations on HuR binding. HuR is a single-stranded RNA-binding protein that binds to AU-rich sequences, and has links to diseases such as cancer. We also find an asymmetry in this effect for HuR, indicating that this effect may be under selection. Following the previous work, which utilizes a model incorporating single-stranded RNA-binding proteins into RNA secondary structure folding, I introduce a model for incorporating double-stranded RNA-binding proteins (dsRBPs) into RNA secondary structure partition function calculations in Chapter 3. The dsRBPs are an important but understudied class of proteins that are involved in a wide range of processes.
We implement our model in the ViennaRNA package, and validate it by calculating a number of experimental observables for transactivation response element RNA-binding protein. We find that RNA secondary structure can have a many-fold effect on the effective binding affinity of dsRBPs, and show that calculated affinities for pre-miRNA-like constructs correlate with experimentally measured processing rates. Our model provides a novel method for interrogating the interplay between dsRBPs and RNA secondary structure. In Chapter 4 I study RNA-protein interactions in a different context, and investigate the role of Shine-Dalgarno (SD) sequences in translation in the Bacteroidetes. The Bacteroidetes are a phylum of bacteria known to rarely use SD sequences, but after performing a survey of SD usage in the phylum we find that certain ribosomal protein genes utilize them, particularly rpsU. A cryo-electron microscopy structure of the ribosome from Flavobacterium johnsoniae, a member of the Bacteroidetes, also shows that S21, which is encoded by the ribosomal open reading frame rpsU, sequesters the anti-Shine-Dalgarno (ASD) sequence. In our survey of SD sequences we also find covariation between the SD sequence of rpsU and the ASD sequence. These observations suggest an autoregulatory model for S21 in the Bacteroidetes.
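To make the notion of secondary structure as a set of base pairs concrete, here is a minimal sketch of Nussinov-style base-pair maximization. It is a deliberately simpler, swapped-in technique and not the thesis's model: the work described above uses partition function calculations in the ViennaRNA package, whereas this sketch only counts the largest number of nested pairs. The function name `nussinov`, the `min_loop` parameter, and the example sequences are illustrative assumptions.

```python
def nussinov(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Illustrative only: counts pairs instead of computing free energies,
    and assumes at least `min_loop` unpaired bases inside every hairpin.
    """
    # Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))  # 3: a three-pair stem closing an AAA loop
```

A single nucleotide change can alter the result: replacing the final C with A in the example removes the outermost G-C pair, which is the kind of structure-mediated effect the abstract describes, although real predictions rely on thermodynamic models.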

The Biology of mRNA: Structure and Function

Author: Marlene Oeffinger
Publisher: Springer Nature
ISBN: 3030314340
Category : Science
Languages : en
Pages : 318

Book Description
The book provides an overview of the different aspects of gene regulation from an mRNA-centric viewpoint, including how mRNA is assembled and self-assembles in a complex consisting of RNA and proteins, and how its ability to be translated at the right time and place depends on many processes acting on the mRNAs, leading to a properly folded complex. This book shows how new technologies have led to a better understanding of these processes and their connected diseases. The book is written for scientists in fundamental and applied biomedical research working on different aspects of gene regulation. It is also aimed at readers who are not directly involved in these fields but want to gain a better understanding of mRNA biology.

High-Resolution Profiling of Protein-RNA Interactions

Author: Mathias Munschauer
Publisher: Springer
ISBN: 3319162535
Category : Technology & Engineering
Languages : en
Pages : 140

Book Description
The work reported in this book represents an excellent example of how creative experimentation and technology development, complemented by computational data analysis, can yield important insights that further our understanding of biological entities from a systems perspective. The book describes how the study of a single RNA-binding protein and its interaction sites led to the development of the novel ‘protein occupancy profiling’ technology that for the first time captured the mRNA sequence space contacted by the ensemble of expressed RNA binders. Application of protein occupancy profiling to eukaryotic cells revealed that extensive sequence stretches in 3’ UTRs can be contacted by RBPs and that evolutionary conservation as well as negative selection act on protein-RNA contact sites, suggesting functional importance. Comparative analysis of the RBP-bound sequence space has the potential to unravel putative cis-acting RNA elements without a priori knowledge of the bound regulators. Here, Dr. Munschauer provides a comprehensive introduction to the field of post-transcriptional gene regulation, examines state-of-the-art technologies, and combines the conclusions from several journal articles into a coherent and logical story from the frontiers of systems-biology inspired life science. This thesis, submitted to the Department of Biology, Chemistry and Pharmacy at Freie Universität Berlin, was selected as outstanding work by the Berlin Institute for Medical Systems Biology at the Max-Delbrueck Center for Molecular Medicine, Germany.

Dissecting Regulatory Interactions of RNA and Protein

Author: Marvin Jens
Publisher: Springer
ISBN: 3319070827
Category : Technology & Engineering
Languages : en
Pages : 115

Book Description
The work described in this book is an excellent example of interdisciplinary research in systems biology. It shows how concepts and approaches from the field of physics can be efficiently used to answer biological questions and reports on a novel methodology involving creative computer-based analyses of high-throughput biological data. Many of the findings described in the book, which are the result of collaborations between the author (a theoretical scientist) and experimental biologists and between different laboratories, have been published in high-quality peer-reviewed journals such as Molecular Cell and Nature. However, while those publications address different aspects of post-transcriptional gene regulation, this book provides readers with a complete, coherent and logical view of the research project as a whole. The introduction presents post-transcriptional gene regulation from a distinct angle, highlighting aspects of information theory and evolution and laying the groundwork for the questions addressed in the subsequent chapters, which concern the regulation of the transcriptome as the primary functional carrier of active genetic information.

Functional and Computational Analysis of RNA-binding Proteins and Their Roles in Cancer

Author: Yarden Katz
Publisher:
ISBN:
Category :
Languages : en
Pages : 241

Book Description
This work is concerned with mRNA processing in mammalian cells and proceeds in two parts. In the first part, I introduce a computational framework for inferring the abundances of mRNA isoforms using high-throughput RNA sequencing data. This framework was applied to study the targets of the ubiquitous splicing factor hnRNP H in human cells. In the second part, I describe an experimental study of the Musashi (hnRNP-like) family of RNA-binding proteins in stem cells and cancer cells, which incorporates computational analyses that rely heavily on the framework developed in part one. In sum, this work provides a computational framework of general use in global analyses of RNA processing and its protein regulators, as well as functional insights into a family of poorly understood RNA-binding proteins. Several related analyses and techniques developed as part of the thesis are described in Appendices A-C. Appendix A describes a study of activity-dependent gene expression and mRNA processing in the mouse olfactory bulb. It uses computational techniques developed in part one of the thesis. Appendix B describes a technique for quantitative visualization of alternative splicing from RNA sequencing data and its integration into a genome browser. Appendix C describes a method for clonal analysis of neural stem cell growth and differentiation in culture using live imaging and 'microdot' plates, developed as part of the work presented in part one of the thesis.

RNA-protein Interactions

Author: Kiyoshi Nagai
Publisher: Oxford University Press, USA
ISBN:
Category : Medical
Languages : en
Pages : 302

Book Description
The study of RNA-protein interactions is crucial to understanding the mechanisms and control of gene expression and protein synthesis. The realization that RNAs are often far more biologically active than was previously appreciated has stimulated a great deal of new research in this field. Uniquely, in this book, the world's leading researchers have collaborated to produce a comprehensive and current review of RNA-protein interactions for all scientists working in this area. Timely, comprehensive, and authoritative, this new Frontiers title will be invaluable for all researchers in molecular biology, biochemistry and structural biology.

Analyzing Microarray Gene Expression Data

Author: Geoffrey J. McLachlan
Publisher: John Wiley & Sons
ISBN: 0471726125
Category : Mathematics
Languages : en
Pages : 366

Book Description
A multi-discipline, hands-on guide to microarray analysis of biological processes. Analyzing Microarray Gene Expression Data provides a comprehensive review of available methodologies for the analysis of data derived from the latest DNA microarray technologies. Designed for biostatisticians entering the field of microarray analysis as well as biologists seeking to more effectively analyze their own experimental data, the text features a unique interdisciplinary approach and a combined academic and practical perspective that offers readers the most complete and applied coverage of the subject matter to date. Following a basic overview of the biological and technical principles behind microarray experimentation, the text provides a look at some of the most effective tools and procedures for achieving optimum reliability and reproducibility of research results, including: an in-depth account of the detection of genes that are differentially expressed across a number of classes of tissues; extensive coverage of both cluster analysis and discriminant analysis of microarray data and the growing applications of both methodologies; a model-based approach to cluster analysis, with emphasis on the use of the EMMIX-GENE procedure for the clustering of tissue samples; the latest data cleaning and normalization procedures; and the uses of microarray expression data for providing important prognostic information on the outcome of disease.

Biological Sequence Analysis

Author: Richard Durbin
Publisher: Cambridge University Press
ISBN: 113945739X
Category : Science
Languages : en
Pages : 372

Book Description
Probabilistic models are becoming increasingly important in analysing the huge amount of data being produced by large-scale DNA-sequencing efforts such as the Human Genome Project. For example, hidden Markov models are used for analysing biological sequences, linguistic-grammar-based probabilistic models for identifying RNA secondary structure, and probabilistic evolutionary models for inferring phylogenies of sequences from different organisms. This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods, and more generally of probabilistic methods of sequence analysis. Written by an interdisciplinary team of authors, it aims to be accessible to molecular biologists, computer scientists, and mathematicians with no formal knowledge of the other fields, and at the same time to present the state of the art in this new and highly important field.
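As a rough illustration of how a hidden Markov model can be applied to a biological sequence, the sketch below decodes the most probable state path with the Viterbi algorithm. It is not taken from the book: the two states (`AT_rich`, `GC_rich`), all probabilities, and the function name `viterbi` are made-up assumptions chosen only to keep the example small, and the code assumes a non-empty sequence and strictly positive probabilities.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its log-probability) for a discrete HMM.

    Works in log space to avoid numerical underflow on long sequences.
    """
    # scores[t][s]: best log-probability of any path that ends in state s at time t.
    first = observations[0]
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][first]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on the best path ending at time t
    for obs in observations[1:]:
        prev = scores[-1]
        col, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            col[s] = (prev[best_prev] + math.log(trans_p[best_prev][s])
                      + math.log(emit_p[s][obs]))
            ptr[s] = best_prev
        scores.append(col)
        back.append(ptr)
    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: scores[-1][s])
    path = [last]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, scores[-1][last]


if __name__ == "__main__":
    # Hypothetical toy model: two compositional regimes along a DNA sequence.
    states = ("AT_rich", "GC_rich")
    start_p = {"AT_rich": 0.5, "GC_rich": 0.5}
    trans_p = {"AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
               "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9}}
    emit_p = {"AT_rich": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
              "GC_rich": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4}}
    path, log_prob = viterbi("AATTGCGCGC", states, start_p, trans_p, emit_p)
    print(path)  # four AT_rich labels followed by six GC_rich labels
```

With these toy parameters, a single switch between states costs a factor of 0.1/0.9 relative to staying, which is outweighed by the better emission probabilities over the six G/C bases, so the decoded path changes state exactly once.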