Controlling for Hidden Factors in High Dimensional EQTL Studies

Author: Chuan Gao
Publisher:
ISBN:
Category :
Languages : en
Pages : 184

Book Description
Finding genetic variants that regulate gene expression now plays a central role in the analysis of mechanism in biological systems. This will also increasingly be the case as large amounts of gene expression and genetic marker data are generated by next-generation sequencing technologies. While the unprecedented scale of these data is providing the opportunity for scientists to answer basic questions about biological systems, the properties of these data raise analysis challenges, particularly in terms of covariate modeling. For example, expression levels of thousands of genes are usually measured in batches and different batches may be measured under different conditions, which creates the well-known batch effect. Besides this artificially created factor that can affect the quality of the measurement, expression data often reflect environmental regulators that change the gene expression levels, such as smoking and drug use. These sources of confounding need to be addressed either before or during analysis of data. In this thesis, I address the analysis issues raised by a particular type of confounding in high-dimensional data: hidden factor effects. Hidden factors are defined as factors that contribute to variation in a large number of measured variables where there is no direct information concerning the factors in the data. It is critical to correct for the hidden factors because if ignored, they can lead to either high false positive rates or reduced power. To tackle this issue, I propose to use a statistical model that combines multivariate ridge regression and factor analysis to infer both the fixed effects and the hidden confounding. The method is unique in the sense that it employs the multivariate regression components to infer the associations between the response Y and the covariate X, while it maintains efficiency by sharing the same data reduction property with the factor analysis model.
Compared to other models that address the same issue, this model can successfully partition the covariance structure of the hidden factors, which dramatically improves the power and the accuracy of detecting the real associations between X and Y. I also used the model to address the hidden factors issues in the analysis of data on gene expression levels measured in the airway of the lung in a sample of people, in the context of a genome association study, referred to as an expression Quantitative Trait Loci (eQTL) analysis. I show that the method successfully eliminates the false positives caused by spurious structures (hidden factors) and greatly improves the power to detect true genetic determinants (the eQTL) that regulate gene expression in the lung airway. I also apply the method to a challenging Genotype-Environment Interaction (GEI) analysis, where GEI effects are defined as the dependence of genotype-phenotype relationships on environmental factors. I show that despite the small sample size and the highly complicated data structure, my method can identify a large number of interesting GEI associations, many of which have been verified independently by other studies as highly relevant to lung disease and lung function. These GEI associations contain more information than a typical eQTL because they help to identify genetic regulators that show different behavior under different environmental pressures, which serve as an interesting set of gene candidates for clinical scientists.
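The description gives the model only at a high level, so the following is a minimal sketch of the general idea rather than the thesis's actual method. It assumes an additive form Y = XB + F + noise, where F is a low-rank hidden-factor term, and it alternates between a ridge fit of Y on X and a truncated SVD of the residuals. The function name `ridge_with_hidden_factors`, the penalty `lam`, the rank `k`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_with_hidden_factors(Y, X, k, lam=1.0, n_iter=20):
    """Alternate a ridge fit of Y on X with a rank-k fit of the residuals.

    Illustrative stand-in only: a truncated SVD plays the role of the
    factor-analysis component described in the abstract.
    """
    p = X.shape[1]
    F = np.zeros_like(Y, dtype=float)  # low-rank hidden-factor term
    # Ridge "hat" operator: (X'X + lam*I)^{-1} X'
    ridge_solver = np.linalg.inv(X.T @ X + lam * np.eye(p)) @ X.T
    for _ in range(n_iter):
        B = ridge_solver @ (Y - F)              # fixed effects given F
        R = Y - X @ B                           # residual expression
        U, s, Vt = np.linalg.svd(R, full_matrices=False)
        F = (U[:, :k] * s[:k]) @ Vt[:k]         # best rank-k fit to R
    return B, F

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n, p, q, k = 100, 5, 50, 2                      # samples, SNPs, genes, factors
X = rng.integers(0, 3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
B_true = np.zeros((p, q))
B_true[0, 0] = 1.0                              # one true SNP-gene association
Z = rng.normal(size=(n, k))                     # hidden factors
L = rng.normal(size=(k, q))                     # factor loadings
Y = X @ B_true + Z @ L + 0.1 * rng.normal(size=(n, q))

B_hat, F_hat = ridge_with_hidden_factors(Y, X, k)
```

In this sketch `B_hat` holds the estimated SNP-by-gene effects and `F_hat` the estimated hidden-factor contribution to each expression value. A real analysis would also need to choose `k` and `lam` from the data, which this example does not attempt.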

Dissertation Abstracts International

Author:
Publisher:
ISBN:
Category : Dissertations, Academic
Languages : en
Pages : 906

Book Description


Genome-Wide Association Studies

Author: Krishnarao Appasani
Publisher: Cambridge University Press
ISBN: 1107042763
Category : Medical
Languages : en
Pages : 449

Book Description
Experts from academia and industry highlight the potential of genome-wide association studies from basic science to clinical and biotechnological/pharmaceutical applications.

A Guide to QTL Mapping with R/qtl

Author: Karl W. Broman
Publisher: Springer
ISBN: 9781461417088
Category : Science
Languages : en
Pages : 400

Book Description
Comprehensive discussion of QTL mapping concepts and theory. Detailed instructions on the use of the R/qtl software, the most featured and flexible software for QTL mapping. Two case studies illustrate QTL analysis in its entirety.

Batch Effects and Noise in Microarray Experiments

Author: Andreas Scherer
Publisher: John Wiley & Sons
ISBN: 9780470685990
Category : Science
Languages : en
Pages : 272

Book Description
Batch Effects and Noise in Microarray Experiments: Sources and Solutions looks at the issue of technical noise and batch effects in microarray studies and illustrates how to alleviate such factors whilst interpreting the relevant biological information. Each chapter focuses on sources of noise and batch effects before starting an experiment, with examples of statistical methods for detecting, measuring, and managing batch effects within and across datasets provided online. Throughout the book the importance of standardization and the value of standard operating procedures in the development of genomics biomarkers is emphasized. Key Features: A thorough introduction to Batch Effects and Noise in Microarray Experiments. A unique compilation of review and research articles on handling of batch effects and technical and biological noise in microarray data. An extensive overview of current standardization initiatives. All datasets and methods used in the chapters, as well as colour images, are available on www.the-batch-effect-book.org, so that the data can be reproduced. An exciting compilation of state-of-the-art review chapters and latest research results, which will benefit all those involved in the planning, execution, and analysis of gene expression studies.

Bioinformatics and Computational Biology Solutions Using R and Bioconductor

Author: Robert Gentleman
Publisher: Springer Science & Business Media
ISBN: 0387293620
Category : Computers
Languages : en
Pages : 478

Book Description
Full four-color book. Some of the editors created the Bioconductor project and Robert Gentleman is one of the two originators of R. All methods are illustrated with publicly available data, and a major section of the book is devoted to fully worked case studies. Code underlying all of the computations that are shown is made available on a companion website, and readers can reproduce every number, figure, and table on their own computers.

Gene Network Inference

Author: Alberto Fuente
Publisher: Springer Science & Business Media
ISBN: 3642451616
Category : Science
Languages : en
Pages : 135

Book Description
This book presents recent methods for Systems Genetics (SG) data analysis, applying them to a suite of simulated SG benchmark datasets. Each of the chapter authors received the same datasets to evaluate the performance of their method to better understand which algorithms are most useful for obtaining reliable models from SG datasets. The knowledge gained from this benchmarking study will ultimately allow these algorithms to be used with confidence for SG studies e.g. of complex human diseases or food crop improvement. The book is primarily intended for researchers with a background in the life sciences, not for computer scientists or statisticians.

Long-Range Control of Gene Expression

Author: Veronica van Heyningen
Publisher: Academic Press
ISBN: 0080877818
Category : Science
Languages : en
Pages : 415

Book Description
Long-Range Control of Gene Expression covers the current progress in understanding the mechanisms for genomic control of gene expression, which has grown considerably in the last few years as insight into genome organization and chromatin regulation has advanced. Discusses the evolution of cis-regulatory sequences in Drosophila. Includes information on genomic imprinting and imprinting defects in humans. Includes a chapter on epigenetic gene regulation in cancer.

Bayesian Inference for Gene Expression and Proteomics

Author: Kim-Anh Do
Publisher: Cambridge University Press
ISBN: 052186092X
Category : Mathematics
Languages : en
Pages : 437

Book Description
Expert overviews of Bayesian methodology, tools and software for multi-platform high-throughput experimentation.

eQTL Analysis

Author: Xinghua Mindy Shi
Publisher: Humana
ISBN: 9781071600283
Category : Science
Languages : en
Pages : 252

Book Description
This volume details state-of-the-art eQTL analysis, giving interdisciplinary researchers both theoretical and practical guidance on eQTL analysis and interpretation. Chapters guide readers through methods and tools for eQTL and QTL analysis and the usage of such analysis in various scenarios. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and tips on troubleshooting and avoiding known pitfalls. Authoritative and cutting-edge, eQTL Analysis: Methods and Protocols aims to ensure successful results in the further study of this vital field.