Phylogenetic Tree Reconstruction with Protein Linkage

Author: Junjie Yu
Publisher: Open Dissertation Press
ISBN: 9781361307441
Category :
Languages : en
Pages :

Book Description
This dissertation, "Phylogenetic Tree Reconstruction With Protein Linkage" by Junjie Yu (于俊杰), was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to Creative Commons: Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting in order to facilitate the ease of printing and reading of the dissertation. All rights not granted by the above license are retained by the author.
Abstract: Phylogenetic tree reconstruction for a set of species is an important problem for understanding the evolutionary history of the species. Existing algorithms usually represent each species as a binary string, with each bit indicating whether a particular gene/protein exists in the species. Given the topology of a phylogenetic tree with each leaf representing a species (a binary string of equal length) and each internal node representing a hypothetical ancestor, the Fitch-Hartigan algorithm and the Sankoff algorithm are two polynomial-time algorithms that assign binary strings to internal nodes such that the total Hamming distance between adjacent nodes in the tree is minimized. However, these algorithms oversimplify the evolutionary process by considering only the number of protein insertions/deletions (Hamming distance) between two species and by assuming that the evolutionary history of each protein is independent. Since the function of a protein may depend on the existence of other proteins, the evolutionary histories of functionally dependent proteins should be similar, i.e. functionally dependent proteins should usually be present (or absent) in a species at the same time. Thus, in addition to the Hamming distance, three protein linkage distances for some pairs/sets of proteins are introduced: whole block linkage distance, partial block linkage distance, and pairwise linkage distance.
It is proved that the problem of finding binary strings for the internal nodes of a phylogenetic tree that minimize the sum of the Hamming distance and the linkage distance is NP-hard. In this thesis, a general algorithm for the phylogenetic tree reconstruction with protein linkage problem is introduced. It runs in $O(4^m \cdot n)$ time for whole/partial block linkage distance and $O(4^m \cdot (m+n))$ time for pairwise linkage distance (compared to the straightforward $O(4^m \cdot m \cdot n)$ or $O(4^m \cdot m^2 \cdot n)$ time algorithm), where n is the number of species and m is the length of the binary string (number of proteins). It is further shown, by experiments, that our algorithm using linkage information can construct more accurate trees (better matches with the trees constructed by biologists) than the algorithms using only Hamming distance.
DOI: 10.5353/th_b4961816
Subjects: Phylogeny; Combinatorial analysis
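The Hamming-distance baseline mentioned in the abstract can be illustrated with a minimal sketch of the Fitch bottom-up pass on presence/absence strings. This is not the dissertation's algorithm: it assumes a rooted binary tree, handles only the plain Hamming cost (no linkage distance), and returns only the minimum cost rather than the ancestral strings. The names `fitch_site`, `fitch_score`, `example_tree`, and `example_leaves` are hypothetical and chosen for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).
# Assumption: the tree is rooted and strictly binary.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, leaves: Dict[str, str], site: int) -> Tuple[Set[str], int]:
    """Fitch bottom-up pass for one bit position.

    Returns the set of candidate states at the root of `tree` and the
    minimum number of 0/1 changes needed inside `tree` for this position.
    """
    if isinstance(tree, str):
        # Leaf: the observed bit is the only candidate, at zero cost.
        return {leaves[tree][site]}, 0
    left_set, left_cost = fitch_site(tree[0], leaves, site)
    right_set, right_cost = fitch_site(tree[1], leaves, site)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: keep both options and pay for one change.
    return left_set | right_set, left_cost + right_cost + 1


def fitch_score(tree: Tree, leaves: Dict[str, str]) -> int:
    """Minimum total Hamming distance over all tree edges, summed per position."""
    length = len(next(iter(leaves.values())))
    return sum(fitch_site(tree, leaves, site)[1] for site in range(length))


# Hypothetical data: four species, three proteins (1 = present, 0 = absent).
example_tree: Tree = (("A", "B"), ("C", "D"))
example_leaves = {"A": "110", "B": "100", "C": "011", "D": "001"}

if __name__ == "__main__":
    print(fitch_score(example_tree, example_leaves))
```

Each bit position is scored independently here, which is exactly the independence assumption the abstract criticizes. A linkage-aware method would have to score linked positions jointly.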

Phylogenetic Tree Reconstruction with Protein Linkage

Author: Junjie Yu
Publisher:
ISBN:
Category : Combinatorial analysis
Languages : en
Pages : 120

Book Description


Phylogenetic Tree Reconstruction with Protein Linkage

Author: Junjie Yu
Publisher:
ISBN:
Category : Combinatorial analysis
Languages : en
Pages : 0

Book Description


Protein Folding and Phylogenetic Tree Reconstruction Using Stochastic Approximation Monte Carlo

Author: Sooyoung Cheon
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Recently, the stochastic approximation Monte Carlo algorithm has been proposed by Liang et al. (2005) as a general-purpose stochastic optimization and simulation algorithm. An annealing version of this algorithm was developed for real small protein folding problems. The numerical results indicate that it outperforms simulated annealing and conventional Monte Carlo algorithms as a stochastic optimization algorithm. We also propose one method for the use of secondary structures in protein folding. The predicted protein structures are rather close to the true structures. Phylogenetic trees have been used in biology for a long time to graphically represent evolutionary relationships among species and genes. An understanding of evolutionary relationships is critical to appropriate interpretation of bioinformatics results. The use of the sequential structure of phylogenetic trees in conjunction with stochastic approximation Monte Carlo was developed for phylogenetic tree reconstruction. The numerical results indicate that it has a capability of escaping from local traps and achieving a much faster convergence to the global likelihood maxima than other phylogenetic tree reconstruction methods, such as BAMBE and MrBayes.

The Phylogenetic Handbook

Author: Marco Salemi
Publisher: Cambridge University Press
ISBN: 9780521803908
Category : Computers
Languages : en
Pages : 456

Book Description

Ancestral Sequence Reconstruction

Author: David A Liberles
Publisher: Oxford University Press
ISBN: 0199299188
Category : Science
Languages : en
Pages : 267

Book Description
Ancestral sequence reconstruction is a technique of growing importance in molecular evolutionary biology and comparative genomics. As a powerful tool for testing evolutionary and ecological hypotheses, as well as uncovering the link between sequence and molecular phenotype, there are potential applications in a range of fields. Ancestral Sequence Reconstruction starts with a historical overview of the field, before discussing the potential applications in drug discovery and the pharmaceutical industry. This is followed by a section on computational methodology, which provides a detailed discussion of the available methods for reconstructing ancestral sequences (including their advantages, disadvantages, and potential pitfalls). Purely computational applications of the technique are then covered, including whole proteome reconstruction. Further chapters provide a detailed discussion on taking computationally reconstructed sequences and synthesizing them in the laboratory. The book concludes with a description of the scientific questions where experimental ancestral sequence reconstruction has been utilized to provide insights and inform future research. This research level text provides a first synthesis of the theories, methodologies and applications associated with ancestral sequence recognition, while simultaneously addressing many of the hot topics in the field. It will be of interest and use to both graduate students and researchers in the fields of molecular biology, molecular evolution, and evolutionary bioinformatics.

Phylogenetic Implications of the Effect of Nucleotide Bias on Amino Acid Composition

Author: Peter G. Foster
Publisher:
ISBN:
Category : Amino acids
Languages : en
Pages : 156

Book Description


Phylogenetic Trees and Molecular Evolution

Author: David R. Bickel
Publisher: Springer Nature
ISBN: 3031119584
Category : Science
Languages : en
Pages : 112

Book Description
This book serves as a brief introduction to phylogenetic trees and molecular evolution for biologists and biology students. It does so by presenting the main concepts in a variety of ways: first visually, then in a history, next in a dice game, and finally in simple equations. The content is primarily designed to introduce upper-level undergraduate and graduate students of biology to phylogenetic tree reconstruction and the underlying models of molecular evolution. A unique feature also of interest to experienced researchers is the emphasis on simple ways to quantify the uncertainty in the results more fully than is possible with standard methods.

Phylogenetic Trees Made Easy

Author: Barry G. Hall
Publisher: Sinauer Associates, Incorporated
ISBN:
Category : Science
Languages : en
Pages : 258

Book Description
Barry G. Hall helps beginners get started in creating phylogenetic trees from protein or nucleic acid sequence data.

Structure-aided Detection of Functional Innovation in Protein Phylogenies

Author: Jeremy Bruce Adams
Publisher:
ISBN:
Category :
Languages : en
Pages : 123

Book Description
Detection of positive selection in proteins is both a common and powerful approach for investigating the molecular basis of adaptation. In this thesis, I explore the use of protein three-dimensional (3D) structure to assist in prediction of historical adaptations in proteins. Building on a method first introduced by Wagner (Genetics, 2007, 176: 2451-2463), I present a novel framework called Adaptation3D for detecting positive selection by integrating sequence, structural, and phylogenetic information for protein families. Adaptation3D identifies possible instances of positive selection by reconstructing historical substitutions along a phylogenetic tree and detecting branch-specific cases of spatially clustered substitution. The Adaptation3D method was capable of identifying previously characterized cases of positive selection in proteins, as demonstrated through an analysis of the pathogenesis-related protein 5 (PR-5) phylogeny. It was then applied on a phylogenomic scale in an analysis of thousands of vertebrate protein phylogenetic trees from the Selectome database. Adaptation3D's reconstruction of historical mutations in vertebrate protein families revealed several evolutionary phenomena. First, clustered mutation is widespread and occurs significantly more often than expected by chance. Second, numerous top-scoring cases of predicted positive selection are consistent with existing literature on vertebrate protein adaptation. Third, in the vertebrate lineage, clustered mutation has occurred disproportionately in proteins from certain families and functional categories such as zinc-finger transcription factors (TFs). Finally, by separating paralogous and orthologous lineages, it was found that TF paralogs display significantly elevated levels of clustered mutation in their DNA-binding sites compared to orthologs, consistent with historical DNA-binding specificity divergence in newly duplicated TFs.
Ultimately, Adaptation3D is a powerful framework for reconstructing structural patterns of historical mutation, and provides important insights into the nature of protein adaptation.