Computational Methods for Comparative Analysis of Microbiome Related to Human Diseases

Author: Wontack Han
Category : Bioinformatics
Languages : en

Book Description
Microbial organisms play key roles in the health and diseases of their human hosts. Recent advances in genome sequencing have produced large collections of sequencing data from microbial species and have expanded microbiome research from characterizing the microbial communities associated with different environments and hosts to applications related to human health and disease. Computational methods have been developed to identify microbial markers from microbiome datasets derived from cohorts of patients with different diseases. Predictive models based on these markers (features) have been built to discriminate host phenotypes, such as diseased vs. healthy and cancer immunotherapy responder vs. non-responder. In this dissertation, I developed computational methods for comparative analysis of metagenomes from raw sequencing data and machine learning (ML) approaches for building predictive models of host phenotype from the identified microbial markers. First, I implemented a subtractive assembly method (called CoSA) for comparative metagenomics that directly detects differential reads between two groups of metagenomes, from which microbial marker genes can be assembled and characterized. Second, I report the curation of a repository of microbial marker genes and of predictive models built from these markers for microbiome-based prediction of host phenotype, along with a computational pipeline (named Mi2P) for using the repository. Lastly, I exploited locality-sensitive hashing (LSH) as a clustering algorithm to group billions of k-mers with similar abundance profiles across multiple samples into k-mer co-abundance groups (kCAGs), improving the characterization of differential microbial markers. The overall goal of my research is to develop fast and efficient approaches for identifying microbial marker genes and to make them available for building predictive models for microbiome-based host phenotype prediction.
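
To make the LSH-based grouping idea concrete, the following is a minimal sketch, not the dissertation's implementation. The description does not say which LSH family is used, so the sketch assumes random-hyperplane hashing (a standard LSH scheme for cosine similarity) applied to mean-centered abundance profiles. The function names, the `n_bits` parameter, and the toy k-mers are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, n_samples, seed=0):
    """Draw n_bits random Gaussian hyperplanes in sample space (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_bits)]


def signature(profile, hyperplanes):
    """Hash one abundance profile to a tuple of bits.

    The profile is mean-centered first, so the cosine similarity that the
    hyperplanes approximate corresponds to the correlation between profiles.
    """
    mean = sum(profile) / len(profile)
    centered = [x - mean for x in profile]
    return tuple(
        1 if sum(h * x for h, x in zip(plane, centered)) >= 0 else 0
        for plane in hyperplanes
    )


def group_kmers(abundance, n_bits=8, seed=0):
    """Bucket k-mers whose profiles share the same LSH signature.

    abundance maps each k-mer to its abundance in every sample (same order).
    Each returned list is one candidate co-abundance group.
    """
    n_samples = len(next(iter(abundance.values())))
    planes = make_hyperplanes(n_bits, n_samples, seed)
    buckets = defaultdict(list)
    for kmer, profile in abundance.items():
        buckets[signature(profile, planes)].append(kmer)
    return list(buckets.values())


# Toy data (hypothetical): abundance of three 4-mers across four samples.
toy = {
    "ACGT": [1.0, 2.0, 3.0, 4.0],
    "CGTA": [2.0, 4.0, 6.0, 8.0],  # same trend as ACGT across samples
    "TTTT": [4.0, 3.0, 2.0, 1.0],  # opposite trend
}
for group in group_kmers(toy):
    print(group)
```

In this toy input, the two k-mers whose abundances rise together across samples receive the same signature, while the k-mer with the opposite trend falls into a different bucket. Each k-mer is hashed once, so the cost grows linearly with the number of k-mers, which is why hashing is attractive when billions of k-mers make all-pairs comparison impractical. A practical pipeline would likely need several hash tables and abundance normalization, neither of which is shown here.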