Statistical Methods for High-Dimensional, Spatially-Distributed Microbiome Data from Next-Generation Sequencing

Author: Neal Steven Grantham
Publisher:
ISBN:
Category :
Languages : en
Pages : 90

Book Description

Statistical Analysis of Next Generation Sequencing Data

Author: Somnath Datta
Publisher: Springer
ISBN: 3319072129
Category : Medical
Languages : en
Pages : 438

Book Description
Next Generation Sequencing (NGS) is the latest high-throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis of NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized medicine.
About the editors: Somnath Datta is Professor and Vice Chair of Bioinformatics and Biostatistics at the University of Louisville. He is a Fellow of the American Statistical Association, a Fellow of the Institute of Mathematical Statistics and an Elected Member of the International Statistical Institute. He has contributed to numerous research areas in Statistics, Biostatistics and Bioinformatics. Dan Nettleton is Professor and Laurence H. Baker Endowed Chair of Biological Statistics in the Department of Statistics at Iowa State University. He is a Fellow of the American Statistical Association and has published research on a variety of topics in statistics, biology and bioinformatics.

Statistical Methods for High Dimensional Count and Compositional Data with Applications to Microbiome Studies

Author: Yuanpei Cao
Publisher:
ISBN:
Category :
Languages : en
Pages : 202

Book Description
Next generation sequencing (NGS) technologies make very large-scale studies of microbiomes possible without in vitro cultivation. One approach to sequencing-based microbiome studies is to sequence specific genes (often the 16S rRNA gene) to produce a profile of the diversity of bacterial taxa. Alternatively, the NGS-based sequencing strategy known as shotgun metagenomics provides further insights at the molecular level, such as species/strain quantification, gene function analysis and association studies. Such studies generate large-scale, high-dimensional count and compositional data, which are the focus of this dissertation.

Statistical Analysis of Microbiome Data

Author: Somnath Datta
Publisher: Springer Nature
ISBN: 3030733513
Category : Medical
Languages : en
Pages : 349

Book Description
Microbiome research has focused on microorganisms that live within the human body and their effects on health. During the last few years, the quantification of microbiome composition in different environments has been facilitated by the advent of high throughput sequencing technologies. The statistical challenges include computational difficulties due to the high volume of data; normalization and quantification of metabolic abundances, relative taxa and bacterial genes; high-dimensionality; multivariate analysis; the inherently compositional nature of the data; and the proper utilization of complementary phylogenetic information. This has resulted in an explosion of statistical approaches aimed at tackling the unique opportunities and challenges presented by microbiome data. This book provides a comprehensive overview of the state of the art in statistical and informatics technologies for microbiome research. In addition to reviewing demonstrably successful cutting-edge methods, particular emphasis is placed on examples in R that rely on available statistical packages for microbiome data. With its wide-ranging approach, the book benefits not only trained statisticians in academia and industry involved in microbiome research, but also other scientists working in microbiomics and in related fields.

Computational and Statistical Methods for Extracting Biological Signal from High-Dimensional Microbiome Data

Author: Gibraan Rahman
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
Next-generation sequencing (NGS) has effected an explosion of research into the relationship between genetic information and a variety of biological conditions. One of the most exciting areas of study is how the trillions of microbial species that we share this Earth with affect our health. However, the process of extracting useful biological insights from this breadth of data is far from trivial. There are numerous statistical and computational considerations in addition to the already complex and messy biological problems. In this thesis, I describe my work on developing and implementing software to tackle the complex world of statistical microbiome analysis.
In the first part of this thesis, we review the applications and challenges of performing dimensionality reduction on microbiome data comprising thousands of microbial taxa. When dealing with this high dimensionality, it is imperative to be able to get an overview of the community structure in a lower dimensional space that can be both visualized and interpreted. We review the statistical considerations for dimensionality reduction and the existing tools and algorithms that can and cannot address them. This includes discussions about sparsity, compositionality, and phylogenetic signal. We also make recommendations about tools and algorithms to consider for different use-cases.
In the second part of this thesis, we present a new software package, Evident, designed to assist researchers with statistical analysis of microbiome effect sizes and power analysis. Effect sizes of statistical tests are not widely reported in microbiome datasets, limiting the interpretability of community differences such as alpha and beta diversity. As more large microbiome studies are produced, researchers have the opportunity to mine existing datasets to get a sense of the effect size for different biological conditions. These, in turn, can be used to perform power analysis prior to designing an experiment, allowing researchers to better allocate resources. We show how Evident is scalable to dozens of datasets and provides easy calculation and exploration of effect sizes and power analysis from existing data.
In the third part of this thesis, we describe a novel investigation into the joint microbiome and metabolome axis in colorectal cancer. In most cases of sporadic colorectal cancers (CRC), tumorigenesis is a multistep process driven by genomic alterations in concert with dietary influences. In addition, mounting evidence has implicated the gut microbiome as an effector in the development and progression of CRC. While large meta-analyses have provided mechanistic insight into disease progression in CRC patients, study heterogeneity has limited causal associations. To address this limitation, multi-omics studies on genetically controlled cohorts of mice were performed to distinguish genetic and dietary influences. Diet was identified as the major driver of microbial and metabolomic differences, with reductions in alpha diversity and widespread changes in cecal metabolites seen in mice fed a high-fat diet (HFD). Similarly, the levels of non-classic amino acid conjugated forms of the bile acid cholic acid (AA-CAs) increased with HFD. We show that these AA-CAs signal through the nuclear receptor FXR and membrane receptor TGR5 to functionally impact intestinal stem cell growth. In addition, the poor intestinal permeability of these AA-CAs supports their localization in the gut. Moreover, two cryptic microbial strains, Ileibacterium valens and Ruminococcus gnavus, were shown to have the capacity to synthesize these AA-CAs. This multi-omics dataset from CRC mouse models supports diet-induced shifts in the microbiome and metabolome in disease progression with potential utility in directing future diagnostic and therapeutic developments.
In the fourth part of this thesis, we demonstrate a new framework for performing differential abundance analysis using customized statistical modeling. As we learn more about the relationship between the microbiome and biological conditions, experimental protocols are becoming increasingly complex. For example, meta-analyses, interventions, and longitudinal studies are being used to better understand the dynamic nature of the microbiome. However, statistical methods to analyze these relationships are lacking, especially in the field of differential abundance. Finding biomarkers associated with conditions of interest must be performed with statistical care when dealing with these kinds of experimental designs. We present BIRDMAn, a software package integrating probabilistic programming with Stan to build custom models for analyzing microbiome data. We show that, on both simulated and real datasets, BIRDMAn is able to extract novel biological signals that are missed by existing methods.
These chapters, taken together, advance our knowledge of statistical analysis of microbiome data and provide tools and references for researchers looking to perform analysis on their own data.

Statistical Methods for Human Microbiome Data Analysis

Author: Jun Chen
Publisher:
ISBN:
Category :
Languages : en
Pages : 107

Book Description


Computational Methods for Next Generation Sequencing Data Analysis

Author: Ion Mandoiu
Publisher: John Wiley & Sons
ISBN: 1119272165
Category : Computers
Languages : en
Pages : 464

Book Description
Introduces readers to core algorithmic techniques for next-generation sequencing (NGS) data analysis and discusses a wide range of computational techniques and applications. This book provides an in-depth survey of some of the recent developments in NGS and discusses mathematical and computational challenges in various application areas of NGS technologies. The 18 chapters featured in this book have been authored by bioinformatics experts and represent the latest work in leading labs actively contributing to the fast-growing field of NGS. The book is divided into four parts: Part I focuses on computing and experimental infrastructure for NGS analysis, including chapters on cloud computing, modular pipelines for metabolic pathway reconstruction, pooling strategies for massive viral sequencing, and high-fidelity sequencing protocols. Part II concentrates on analysis of DNA sequencing data, covering the classic scaffolding problem, detection of genomic variants, including insertions and deletions, and analysis of DNA methylation sequencing data. Part III is devoted to analysis of RNA-seq data. This part discusses algorithms and compares software tools for transcriptome assembly along with methods for detection of alternative splicing and tools for transcriptome quantification and differential expression analysis. Part IV explores computational tools for NGS applications in microbiomics, including a discussion on error correction of NGS reads from viral populations, methods for viral quasispecies reconstruction, and a survey of state-of-the-art methods and future trends in microbiome analysis.
Computational Methods for Next Generation Sequencing Data Analysis reviews computational techniques such as new combinatorial optimization methods, data structures, high performance computing, machine learning, and inference algorithms; discusses the mathematical and computational challenges in NGS technologies; and covers NGS error correction, de novo genome and transcriptome assembly, variant detection from NGS reads, and more. This text is a reference for biomedical professionals interested in expanding their knowledge of computational techniques for NGS data analysis. The book is also useful for graduate and post-graduate students in bioinformatics.

Statistical Methods for High Dimensional Data in Microbiome Research

Author: Sven Kleine Bardenhorst
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description


Statistical Methods for the Analysis of Genomic Data

Author: Hui Jiang
Publisher: MDPI
ISBN: 3039361406
Category : Science
Languages : en
Pages : 136

Book Description
In recent years, technological breakthroughs have greatly enhanced our ability to understand the complex world of molecular biology. Rapid developments in genomic profiling techniques, such as high-throughput sequencing, have brought new opportunities and challenges to the fields of computational biology and bioinformatics. Furthermore, by combining genomic profiling techniques with other experimental techniques, many powerful approaches (e.g., RNA-Seq, ChIP-Seq, single-cell assays, and Hi-C) have been developed in order to help explore complex biological systems. As a result of the increasing availability of genomic datasets, in terms of both volume and variety, the analysis of such data has become a critical challenge as well as a topic of great interest. Therefore, statistical methods that address the problems associated with these newly developed techniques are in high demand. This book includes a number of studies that highlight the state-of-the-art statistical methods for the analysis of genomic data and explore future directions for improvement.

Big Data in Omics and Imaging

Author: Momiao Xiong
Publisher: CRC Press
ISBN: 1315353415
Category : Mathematics
Languages : en
Pages : 595

Book Description
Big Data in Omics and Imaging: Association Analysis addresses the recent development of association analysis and machine learning for both population and family genomic data in the sequencing era. It is unique in that it presents both hypothesis testing and a data mining approach to holistically dissecting the genetic structure of complex traits and to designing efficient strategies for precision medicine. The general frameworks for association analysis and machine learning, developed in the text, can be applied to genomic, epigenomic and imaging data.
Features: bridges the gap between the traditional statistical methods and computational tools for small genetic and epigenetic data analysis and the modern advanced statistical methods for big data; provides tools for high dimensional data reduction; discusses searching algorithms for model and variable selection, including randomization algorithms, proximal methods and matrix subset selection; provides real-world examples and case studies; and will have an accompanying website with R code.
The book is designed for graduate students and researchers in genomics, bioinformatics, and data science. It represents the paradigm shift of genetic studies of complex diseases: from shallow to deep genomic analysis, from low-dimensional to high-dimensional, multivariate to functional data analysis with next-generation sequencing (NGS) data, and from homogeneous populations to heterogeneous population and pedigree data analysis. Topics covered are: advanced matrix theory, convex optimization algorithms, generalized low rank models, functional data analysis techniques, deep learning principles and machine learning methods for modern association, interaction, pathway and network analysis of rare and common variants, biomarker identification, disease risk and drug response prediction.