Approaches to Analysis of Chromosome Conformation Capture Data

Author: Joachim Wolff
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description

Computational Approaches for the Analysis of Chromosome Conformation Capture Data and Their Application to Study Long-Range Gene Regulation

Author: Bryan R. Lajoie
Publisher:
ISBN:
Category :
Languages : en
Pages : 680

Book Description


Hi-C Data Analysis

Author: Silvio Bicciato
Publisher: Humana
ISBN: 9781071613924
Category : Science
Languages : en
Pages : 0

Book Description
This volume details a comprehensive set of methods and tools for Hi-C data processing, analysis, and interpretation. Chapters cover applications of Hi-C to address a variety of biological problems, with a specific focus on state-of-the-art computational procedures adopted for the data analysis. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and tips on troubleshooting and avoiding known pitfalls. Authoritative and cutting-edge, Hi-C Data Analysis: Methods and Protocols aims to help computational and molecular biologists working in the field of chromatin 3D architecture and transcription regulation.

STATISTICAL ANALYSIS OF CHROMOSOME CONFORMATION DATA AND OTHER OMIC DATA.

Author: Frank Shen
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
High-throughput genomic data have driven many key advances in biology in the last quarter century. But the ability to generate increasing amounts of high-throughput data has advanced faster than the methods to analyze these data. One such example is the data generated by various chromosome conformation capture methods. These techniques give biologists insights into the physical organization of DNA in the cell, which may reveal previously unrecognized methods that cells use to regulate and change their activities. But existing analysis methods have a hard time handling the high variability, high dimensionality, and low number of replicates present in the data. This thesis will focus on two different approaches to handle the issues with the most recent type of chromosome conformation capture data, Hi-C data. In addition, there is another project that attempts to apply methodology developed for high-throughput genomic data to other types of high-dimensional problems. Finally, this thesis will conclude with an applied analysis of another type of high-throughput biological data, proteomics data. Hi-C chromosome conformation data may reveal novel mechanisms for gene regulation, but analyzing Hi-C data is difficult due to high variability and numerous biases. We introduce a new Hi-C probabilistic model to expand upon existing methods for detecting chromatin interactions. To improve detection of small-scale functional interactions, we incorporate larger chromatin structures into our hierarchical model, HierCM, to prevent these structures from distorting estimates of baseline contact rates. In addition, when there are replicate samples, this method accounts for technical and biological variation in the Hi-C procedure, producing improved peak estimation. Nonnegative matrix factorization (NMF) also has potential applications for understanding the structures in Hi-C data. NMF is a machine learning technique which can be used for dimension reduction and unsupervised soft clustering.
It became popular because it is supposed to provide sparse decomposition without an explicit sparsity-based penalty. Most NMF formulations are only loosely associated with likelihood-based models. We seek to provide an explicit probability model. This change makes it easier to incorporate multiple layers of variation into the model, which allows one to separate the contributions of different sources of variation. This adjustment can also be used to test for reproducibility in Hi-C data. However, NMF suffers from a lack of identifiability, making the method computationally unstable. Thus, this probabilistic NMF formulation cannot be relied upon for likelihood ratio testing on reproducibility or consistent differentiation between multiple sources of variation. Another common issue with high-throughput biological data is the extremely high dimensionality, which makes it difficult to detect true associations amidst random noise. Several data mining tools, such as SVM-RFE (Support Vector Machine-Recursive Feature Elimination) and Random Forest (RF), have sprung up to handle such analyses. SVM (Support Vector Machines) and RF were created for clustering and prediction, but they have also been used for variable selection. Irreproducible Discovery Rate (IDR) has been proposed as a method to better identify important variables in high-dimensional biological data. We explore its use on large, sparse, high-dimensional datasets to increase the accuracy and consistency of variable importance measures used in data mining. We find that IDR does not generally improve prediction accuracy or variable selection accuracy, because the data need to be subdivided to provide multiple variable selection scores for IDR analysis. The reduction in data size reduces the accuracy of each set of machine learning variable selection scores, to the point where IDR cannot always compensate for the reduction in accuracy.
Finally, we performed a practical analysis on high-dimensional proteomics data generated from the pig gut. We analyze the different proteomics signatures between pigs that have been fed a standard control diet, a high fat diet, or a high fat diet combined with various types of potato supplements. Inflammatory markers showed that a high fat diet increased inflammation in the pig gut, but the potato supplements counteracted the increased inflammation. We used several classification methods to confirm there were significant differences in the gut proteome among the different diets, and the proteins with the largest differences were shown to have a high correlation with the inflammatory markers. The correlation between the differentially present proteins and the inflammatory markers, combined with a pathway analysis of the differentially present proteins, provides insight into the mechanism that allows some potato supplements to counteract the effects of the high fat diet.
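
For readers unfamiliar with the nonnegative matrix factorization mentioned in this description, the following is a minimal sketch of the classic multiplicative-update formulation (squared-error objective). It is not the probabilistic model proposed in the thesis; the function names, the toy block matrix, and the rank are illustrative assumptions only.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius distance between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix v (n x m) into w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # Update h: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # Update w: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy symmetric "contact map" with two blocks (hypothetical data).
toy = [[5.0, 5.0, 0.0, 0.0],
       [5.0, 5.0, 0.0, 0.0],
       [0.0, 0.0, 3.0, 3.0],
       [0.0, 0.0, 3.0, 3.0]]
w_fit, h_fit = nmf(toy, rank=2)
```

Each row of `w_fit` can be read as a soft membership of that bin in the two components. Different seeds can give different factors, which is one face of the identifiability problem the description raises.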

Statistical Topology of Genome Analysis

Author: Maxime Guiffo Pouokam
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
Genomes from bacteria to eukaryotes are intricately organized by the mutual interplay between the three-dimensional (3D) folding of their genome and their functional cell activities. In this dissertation, we propose exhaustive computational and statistical approaches to analyze chromosome conformation capture (CCC) data to investigate the 3D structure of the genome at both the level of CCC interaction counts between genomic loci and that of the 3D physical reconstruction structure. In this work, we use the yeast Saccharomyces cerevisiae (S. cerevisiae) as a model system. Our first result identifies the Rabl configuration, an evolutionarily conserved feature of the 3D nuclear organization, characterized by the clustering of centromeres on one side of the nuclear envelope and the telomeres at the antipodal side, as an essential player in the simplification of the entanglement of chromatin fibers. In our approach, we introduced a new geometrical invariant termed the linking proportion that can capture the entanglement between pairs of chromosomes. Next, we showcase a novel approach of statistical topology whereby agreement between chromatin configuration reconstructions, which includes reproducibility of chromatin configurations and evaluation of chromatin reconstruction algorithms, can be evaluated. Our proposed approach makes use of the linking proportion together with methods of statistical inference to reach the important conclusion that multidimensional scaling methods fail to preserve chromosomal topology. Finally, we present Smooth3D, a novel approach of inferring the 3D genome configuration structure from the CCC interaction counts based on cubic spline approximation. Smooth3D produces the 3D chromosomal trajectory from the CCC interaction counts via B-spline curve fitting using a least-squares algorithm. Our method estimates both the parameter of the counts-to-distance transfer function and the 3D chromosomal trajectory.

Capturing Chromosome Conformation

Author: Beatrice Bodega
Publisher: Humana
ISBN: 9781071606636
Category : Science
Languages : en
Pages : 322

Book Description
This detailed book collects methods based on the evolution of the chromosome conformation capture (3C) technique and other complementary approaches to dissect chromatin conformation with an emphasis on dissection of nuclear compartmentalization and visualization in imaging. Written for the highly successful Methods in Molecular Biology series, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and tips on troubleshooting and avoiding known pitfalls. Authoritative and practical, Capturing Chromosome Conformation: Methods and Protocols serves as an ideal guide for researchers working to further understand 3D genome organization.

The Barley Genome

Author: Nils Stein
Publisher: Springer
ISBN: 3319925288
Category : Science
Languages : en
Pages : 400

Book Description
This book presents an overview of the state-of-the-art in barley genome analysis, covering all aspects of sequencing the genome and translating this important information into new knowledge in basic and applied crop plant biology and new tools for research and crop improvement. Unlimited access to a high-quality reference sequence is removing one of the major constraints in basic and applied research. This book summarizes the advanced knowledge of the composition of the barley genome, its genes and the much larger non-coding part of the genome, and how this information facilitates studying the specific characteristics of barley. One of the oldest domesticated crops, barley is the small grain cereal species that is best adapted to the highest altitudes and latitudes, and it exhibits the greatest tolerance to most abiotic stresses. With comprehensive access to the genome sequence, barley’s importance as a genetic model in comparative studies on crop species like wheat, rye, oats and even rice is likely to increase.

Analysis of Chromosome Conformation Data and Application to Cancer

Author: Nicolas Servant
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
Chromatin is not randomly arranged in the nucleus. Instead, the nuclear organization is tightly controlled at several levels of organization. Recent studies have explored how the genome is organized to ensure proper gene regulation within a constrained nuclear space. However, the impact of the epigenome, and in particular the three-dimensional topology of chromatin and its implication in cancer progression, remains largely unexplored. As an example, recent studies have started to demonstrate that defects in the folding of the genome can be associated with oncogene activation. Although the exact mechanisms are not yet fully understood, this demonstrates that chromatin organization is an important factor in tumorigenesis, and that a systematic exploration of three-dimensional cancer genomes could improve our knowledge of cancer biology in the near future. High-throughput chromosome conformation capture methods are now widely used to map chromatin interactions within regions of interest or across the genome. The Hi-C technique, empowered by next-generation sequencing, was designed to explore intra- and inter-chromosomal contacts at the whole-genome scale and therefore offers detailed insights into the spatial arrangement of complete genomes. The aim of this project was to develop computational methods and tools that can extract relevant information from Hi-C data, in particular in a cancer-specific context. The presented work is divided into three parts. First, as with many sequencing applications, the Hi-C technique generates a huge amount of data. Managing these data requires optimized bioinformatics workflows able to process them in reasonable time and space. To answer this need, we developed HiC-Pro, an optimized and flexible pipeline to process Hi-C data from raw sequencing reads to normalized contact maps. HiC-Pro maps reads, detects valid ligation products, and generates and normalizes intra- and inter-chromosomal contact maps.
In addition, HiC-Pro is compatible with all current Hi-C-based protocols.
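
The description above mentions normalized contact maps without naming the normalization. As an illustration only, here is a minimal sketch of matrix balancing, a family of methods commonly applied to Hi-C maps. It uses a damped (square-root) update and is not taken from HiC-Pro; the function name, parameters, and toy matrix are hypothetical.

```python
import math


def balance_contact_map(counts, n_iter=200, tol=1e-9):
    """Rescale a symmetric contact matrix so all covered rows share one sum.

    Returns the balanced matrix and the per-bin bias factors, where
    counts[i][j] is approximately bias[i] * bias[j] * balanced[i][j].
    """
    n = len(counts)
    balanced = [[float(x) for x in row] for row in counts]
    bias = [1.0] * n
    for _ in range(n_iter):
        row_sums = [sum(row) for row in balanced]
        covered = [s for s in row_sums if s > 0]
        if not covered:
            break  # empty matrix, nothing to balance
        mean_sum = sum(covered) / len(covered)
        # Square-root damping keeps the symmetric update from oscillating;
        # bins with no contacts are left untouched.
        step = [math.sqrt(s / mean_sum) if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= step[i] * step[j]
            bias[i] *= step[i]
        if max(abs(f - 1.0) for f in step) < tol:
            break
    return balanced, bias


# Hypothetical observed counts: a uniform-coverage map distorted by
# per-bin biases of 1, 2 and 0.5.
observed = [[2.0, 2.0, 0.5],
            [2.0, 8.0, 1.0],
            [0.5, 1.0, 0.5]]
balanced_map, bin_bias = balance_contact_map(observed)
row_totals = [sum(row) for row in balanced_map]
```

After balancing, every row of `balanced_map` has the same total and `bin_bias` recovers the relative per-bin distortion. Real pipelines usually also filter low-coverage bins first, a step this sketch omits.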

Modeling the 3D Conformation of Genomes

Author: Guido Tiana
Publisher: CRC Press
ISBN: 1351387006
Category : Science
Languages : en
Pages : 370

Book Description
This book provides a timely summary of physical modeling approaches applied to biological datasets that describe conformational properties of chromosomes in the cell nucleus. Chapters explain how to convert raw experimental data into 3D conformations, and how to use models to better understand biophysical mechanisms that control chromosome conformation. The coverage ranges from introductory chapters to modeling aspects related to polymer physics, and data-driven models for genomic domains, the entire human genome, epigenome folding, chromosome structure and dynamics, and predicting 3D genome structure.

Plant Chromatin Dynamics

Author: Marian Bemer
Publisher: Humana
ISBN: 9781493984510
Category : Science
Languages : en
Pages : 0

Book Description
This volume provides a comprehensive collection of protocols that can be used to study plant chromatin structure and composition. Chapters divided into three sections detail the profiling of chromatin features in relation to epigenetic regulation, investigate the interaction between chromatin modifications and gene regulation, and explore the 3D spatial organization of the chromatin inside the nucleus. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and tips on troubleshooting and avoiding known pitfalls. Authoritative and cutting-edge, Plant Chromatin Dynamics: Methods and Protocols aims to ensure successful results in the further study of this vital field.