Computational Genetic Approaches for the Dissection of Complex Traits

Author: Nicholas A. Furlotte
Publisher:
ISBN:
Category :
Languages : en
Pages : 105

Book Description
Over the past two decades, major technological innovations have transformed the field of genetics, allowing researchers to examine the relationship between genetic and phenotypic variation at an unprecedented level of granularity. As a result, genetics has increasingly become a data-driven science, demanding effective statistical procedures and efficient computational methods and necessitating a new interface that some refer to as computational genetics. In this dissertation, I focus on a few problems existing within this interface. First, I introduce a method for calculating gene coexpression in a way that is robust to statistical confounding introduced through expression heterogeneity. Heterogeneity in experimental conditions causes separate microarrays to be more correlated than expected by chance. This additional correlation between arrays induces correlation between gene expression measurements, in effect causing spurious gene coexpression. By formulating the problem of calculating coexpression in a linear mixed-model framework, I show how it is possible to account for the correlation between microarrays and produce coexpression values that are robust to expression heterogeneity. Second, I introduce a meta-analysis technique that allows for genome-wide association studies to be combined across populations that are known to contain population structure. This development was motivated by a specific problem in mouse genetics, the aim of which is to utilize multiple mouse association studies jointly. I show that by combining the studies using meta-analysis, while accounting for population structure, the proposed method achieves increased statistical power and increased association resolution. Next, I will introduce a computational and statistical procedure for performing genome-wide association using longitudinal measurements. I show that by accounting for the genetic and environmental correlation between measurements originating from the same individual, it is possible to increase association power. Finally, I will introduce a statistical and computational construct called the matrix-variate linear mixed-model (mvLMM), which is used for multiple phenotype genome-wide association. I show how the application of this method results in increased association power over single trait mapping and leads to a dramatic reduction in computational time over classical multiple phenotype optimization procedures. For example, where a classically-based approach takes hours to perform parameter optimization for moderate sample sizes, mvLMM takes minutes. This technique is both a generalization of and an improvement on the previously proposed longitudinal analysis technique, and its innovation has the potential to impact many current problems in the field of computational genetics.
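
The confounding described in this abstract can be illustrated with a toy simulation. The sketch below is not the dissertation's linear mixed-model method. It only assumes, for illustration, that each array carries a shared offset (here called `batch`) added to every gene measured on it, and shows that two otherwise independent genes then appear coexpressed. The function names, the sample size, and the offset's standard deviation are all hypothetical choices.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_spurious_coexpression(n_arrays=2000, batch_sd=1.0, seed=0):
    """Return (correlation without array effect, correlation with it).

    Two genes are drawn independently; a per-array offset (an assumed
    stand-in for heterogeneity in experimental conditions) is then added
    to both genes on each array.
    """
    rng = random.Random(seed)
    batch = [rng.gauss(0.0, batch_sd) for _ in range(n_arrays)]
    gene1 = [rng.gauss(0.0, 1.0) for _ in range(n_arrays)]
    gene2 = [rng.gauss(0.0, 1.0) for _ in range(n_arrays)]
    clean = pearson(gene1, gene2)
    confounded = pearson(
        [g + b for g, b in zip(gene1, batch)],
        [g + b for g, b in zip(gene2, batch)],
    )
    return clean, confounded


if __name__ == "__main__":
    clean, confounded = simulate_spurious_coexpression()
    print(f"independent genes:        r = {clean:.3f}")
    print(f"with shared array effect: r = {confounded:.3f}")
```

With unit-variance gene noise and a unit-variance array offset, the expected confounded correlation is 0.5 even though the genes share no biology, which is the kind of artifact the abstract says the mixed-model formulation is designed to remove.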

Computational Methods for Genetics of Complex Traits

Author:
Publisher: Academic Press
ISBN: 0123808634
Category : Science
Languages : en
Pages : 211

Book Description
The field of genetics is rapidly evolving, and new medical breakthroughs are occurring as a result of advances in knowledge gained from genetics research. This thematic volume of Advances in Genetics looks at Computational Methods for Genetics of Complex Traits. It explores the latest topics in neural circuits and behavior research in zebrafish, Drosophila, C. elegans, and mouse models; includes methods for testing with ethical, legal, and social implications; and critically analyzes future prospects.

Computational Genetic Approaches for Understanding the Genetic Basis of Complex Traits

Author: Eun Yong Kang
Publisher:
ISBN:
Category :
Languages : en
Pages : 273

Book Description
Recent advances in genotyping and sequencing technology have enabled researchers to collect an enormous amount of high-dimensional genotype data. These large-scale genomic data provide an unprecedented opportunity for researchers to study and analyze the genetic factors of human complex traits. One of the major challenges in analyzing these high-throughput genomic data is the need for effective and efficient computational methodologies. In this thesis, I introduce several methodologies for analyzing these genomic data, which facilitate our understanding of the genetic basis of complex human traits. First, I introduce a method for inferring biological networks from high-throughput data containing both genetic variation information and gene expression profiles from genetically distinct strains of an organism. For this problem, I use causal inference techniques to infer the presence or absence of causal relationships between yeast gene expression levels in the framework of graphical causal models. In particular, I utilize the prior biological knowledge that genetic variations affect gene expression, but not vice versa, which allows us to direct the subsequent edges between two gene expression levels. Predicting both the presence and the absence of causal relationships between gene expression levels can help distinguish between direct and indirect effects of variation on gene expression levels. I demonstrate the utility of our approach by applying it to a data set containing 112 yeast strains, in which the proposed method identifies the known "regulatory hotspot" in yeast. Second, I introduce an efficient pairwise identity-by-descent (IBD) association mapping method, which utilizes importance sampling to improve efficiency and enables approximation of extremely small p-values. Two individuals are IBD at a locus if they have identical alleles inherited from a common ancestor. One popular approach to finding the association between IBD status and disease phenotype is the pairwise method, where one compares the IBD rate of case/case pairs to the background IBD rate to detect excessive IBD sharing between cases. One challenge of the pairwise method is computational efficiency. In the pairwise method, one uses permutation to approximate p-values because it is difficult to analytically obtain the asymptotic distribution of the statistic. Since the p-value threshold for genome-wide association studies (GWAS) is necessarily low due to multiple testing, one must perform a large number of permutations, which can be computationally demanding. I present Fast-Pairwise to overcome the computational challenges of the traditional pairwise method by utilizing importance sampling to improve efficiency and enable approximation of extremely small p-values. Using the WTCCC type 1 diabetes data, I show that Fast-Pairwise can successfully pinpoint a gene known to be associated with the disease within the MHC region. Finally, I introduce a novel meta-analytic approach to identify gene-by-environment interactions by aggregating multiple studies with varying environmental conditions. Identifying environmentally specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally dependent effects. In this project, I jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. I apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in high-density lipoprotein (HDL) cholesterol, many of which show significant evidence of involvement in gene-by-environment interactions.
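
The idea that an interaction shows up as heterogeneity across studies can be sketched with a standard textbook statistic. The example below is not the thesis's random effects method. It assumes only that each study reports an effect estimate and a standard error for the same locus, and it computes the inverse-variance pooled effect together with Cochran's Q, which under the null of a common effect is approximately chi-squared with k - 1 degrees of freedom for k studies. The function name and the example numbers are hypothetical.

```python
def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect pooled estimate, its standard error, and Cochran's Q.

    betas           -- per-study effect estimates at one locus
    standard_errors -- per-study standard errors of those estimates
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = (1.0 / total_weight) ** 0.5
    # Q measures how far the study effects scatter around the pooled effect,
    # relative to what their standard errors alone would explain.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q


if __name__ == "__main__":
    # Hypothetical locus with the same effect in every environment.
    print(inverse_variance_meta([0.2, 0.2, 0.2], [0.1, 0.1, 0.1]))
    # Hypothetical locus whose effect appears in only one environment.
    print(inverse_variance_meta([0.5, 0.0], [0.1, 0.1]))
```

A large Q relative to its degrees of freedom suggests that the effect differs between studies, which is the signal the abstract interprets as a possible gene-by-environment interaction when the studies differ mainly in their environmental conditions.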

Molecular Dissection of Complex Traits

Author: Andrew H. Paterson
Publisher: CRC Press
ISBN: 1420049380
Category : Science
Languages : en
Pages : 320

Book Description
In the past 10 years, contemporary geneticists using new molecular tools have been able to resolve complex traits into individual genetic components and describe each such component in detail. Molecular Dissection of Complex Traits summarizes the state of the art in molecular analysis of complex traits (QTL mapping), placing new developments in thi

Experimental and Computational Approaches for Genetic Dissection of Complex Phenotypes

Author: Hani Goodarzi
Publisher:
ISBN:
Category :
Languages : en
Pages : 240

Book Description


Computational Approaches to Understanding the Genetic Architecture of Complex Traits

Author: Brielin C. Brown
Publisher:
ISBN:
Category :
Languages : en
Pages : 90

Book Description
Advances in DNA sequencing technology have resulted in the ability to generate genetic data at costs unimaginable even ten years ago. This has resulted in a tremendous amount of data, with large studies providing genotypes of hundreds of thousands of individuals at millions of genetic locations. This rapid increase in the scale of genetic data necessitates the development of computational methods that can analyze this data rapidly without sacrificing statistical rigor. The low cost of DNA sequencing also provides an opportunity to tailor medical care to an individual's unique genetic signature. However, this type of precision medicine is limited by our understanding of how genetic variation shapes disease. Our understanding of so-called complex diseases is particularly poor, and most identified variants explain only a tiny fraction of the variance in the disease that is expected to be due to genetics. This is further complicated by the fact that most studies of complex disease go directly from genotype to phenotype, ignoring the complex biological processes that take place in between. Herein, we discuss several advances in the field of complex trait genetics. We begin with a review of computational and statistical methods for working with genotype and phenotype data, as well as a discussion of methods for analyzing RNA-seq data in an effort to bridge the gap between genotype and phenotype. We then describe our methods for 1) improving power to detect common variants associated with disease, 2) determining the extent to which different world populations share similar disease genetics, and 3) identifying genes which show differential expression between the two haplotypes of a single individual. Finally, we discuss opportunities for future investigation in this field.

Molecular and Computational Approaches to Identification of Genes Underlying Complex Traits

Author: Martin L. Jirout
Publisher:
ISBN:
Category :
Languages : en
Pages : 236

Book Description
Understanding the genetic architecture of complex traits is of great interest to the biomedical community. HXB/BXH recombinant inbred (RI) strains, derived from the spontaneously hypertensive rat (SHR) and normotensive Brown Norway (BN.Lx), are an important genomic resource for complex trait analysis by means of genetic linkage mapping. The power and accuracy of quantitative trait locus (QTL) analysis critically depend on the quality of the genetic map. To maximize the potential of the HXB/BXH RI strains for complex trait mapping, the latest available genotype information was used to construct a new genetic linkage map. Further, gene expression profiling and biochemical phenotyping in the adrenal glands of the HXB/BXH rats were performed to address the possible link between the dysregulated catecholamine biosynthesis in the SHR and the development of hypertension. Expression levels and enzyme activities of the two main catecholamine biosynthetic enzymes, Dbh and Pnmt, were found to be regulated from their genic regions (i.e., in cis). Pnmt re-sequencing revealed promoter polymorphisms, which resulted in a decreased response of the transfected SHR promoter to glucocorticoid stimulation. Dbh activity was negatively correlated with systolic blood pressure in RI strains, and Pnmt activity was negatively correlated with heart rate. These heritable changes in enzyme expression suggest primary genetic mechanisms for regulation of catecholamine action and blood pressure control in the SHR. In a separate analysis, genetic determinants of gene expression in the adrenal gland were explored. The adrenal transcriptome assayed via microarrays was subjected to expression quantitative trait locus (eQTL) mapping. Significant clustering of trans-eQTLs was observed, implying that groups of genes are jointly regulated from a single locus. A novel multivariate distance-matrix regression analysis (MDMR) method was applied to identify cis-eQTL genes whose expression profiles strongly correlate with those of the trans-eQTL cluster genes. The resulting genes, Rbm16 and Prp4b, are involved in pre-mRNA processing and as such present leading candidates for further studies aimed at a better understanding of the quantitative genetics of gene expression. In conclusion, an important genomic resource was enhanced and then utilized to identify genetic loci controlling key aspects of catecholamine physiology and differences in global gene expression.

Computational Methods for Disease Diagnosis and Understanding the Genetics of Complex Traits

Author: Lisa Gai
Publisher:
ISBN:
Category :
Languages : en
Pages : 99

Book Description
An ever-increasing wealth of biological data has become available in recent years, and with it, the potential to understand complex traits and extract disease-relevant information from these many forms of data through computational methods. Understanding the genetic architecture behind complex traits can help us understand disease risk and adverse drug reactions, and can guide the development of treatment strategies. Many variants identified by genome-wide association studies (GWAS) have been found to affect multiple traits, either directly or through shared pathways. Analyzing multiple traits at once can increase power to detect shared variant effects from publicly available GWAS summary statistics. Use of multiple traits may also improve accuracy when estimating variant effects, which can be used in polygenic scores to stratify individuals by disease risk. This dissertation presents a method, CONFIT, for combining GWAS in multiple traits for variant discovery, and explores a few potential multi-trait methods for estimating polygenic scores. Computational methods can also be used to identify patients already suffering from disease who would benefit from treatment. Towards this end, this dissertation also presents work on deep learning to detect patients with orbital disease from image data with high accuracy and recall.

Computational Methods to Analyze Large-scale Genetic Studies of Complex Human Traits

Author: Huwenbo Shi
Publisher:
ISBN:
Category :
Languages : en
Pages : 163

Book Description
Large-scale genome-wide association studies (GWAS) have produced a rich resource of genetic data over the past decade, creating a need for computational and statistical methods that analyze these data. This dissertation presents four statistical methods that model the correlation structure between genetic variants and its effect on GWAS summary association statistics to help understand the genetic basis of complex human traits and diseases. The first method employs the multivariate Bernoulli distribution to model haplotype data, allowing for higher-order interactions among genetic variants, and shows better accuracy in predicting DNase I hypersensitivity status. The second method partitions heritability into small regions on the genome using GWAS summary statistics data, while accounting for complex correlation structures among genetic variants, and uncovers the genetic architectures of complex human traits and diseases. Extending the second method to pairs of traits, the third method partitions genetic correlation into small genomic regions using GWAS summary statistics data, and provides insights into the shared genetic basis between pairs of traits. Finally, the fourth method dissects population-specific and shared causal genetic variants of complex traits in two continental populations, using GWAS summary statistics data obtained from samples of different ethnicities, and reveals differences in the genetic architectures of the two continental populations.

Computational Approaches on Identifying Genetic Variants Underlying Complex Disease

Author: Jiayin Wang
Publisher:
ISBN:
Category :
Languages : en
Pages : 93

Book Description