Combinatorial Models of Synteny Conservation in Genomes

Award Month: April - May, 2010

Genomic rearrangements are large-scale evolutionary events that disrupt gene order along chromosomes. The computational analysis of gene orders, their structure and their evolution relies on combinatorial models and algorithms designed in terms of signed sequences. This project is centered on three problems: the detection of conserved gene clusters, the assignment of evolutionary relations in the presence of multigene families, and the computation of evolutionary hypotheses.

Detecting conserved gene clusters is a difficult problem with applications in very applied domains, such as pathogenomics. The first part of the project aims at designing methods that (1) detect highly rearranged clusters, (2) discriminate between clusters conserved due to evolutionary pressure and clusters conserved due to phylogenetic proximity, and (3) are computationally efficient enough to process large datasets.

In the second part, our goal is to develop a gene matching strategy that is based not on an evolutionary model but on the conservation of local synteny, and that also takes into account the sequence alignment results used to define gene families and a statistical model of the significance of synteny conservation.

The third main part of the project deals with the analysis of gene order datasets produced using the methods developed in the two previous sub-projects for phylogenomic analysis, including computing gene order phylogenies, ancestral gene orders and statistics on genome rearrangements. Attention is also given to the problem of generating “gene order” datasets for eukaryotic genomes, where genes alone do not cover enough of the genome to be reliable markers. We investigate two classical approaches, whole-genome alignment and comparative mapping, and a new method based on virtual hybridization.

Our approach to most of the above problems relies on sound and well-understood combinatorial models for the analysis of signed permutations and sequences, such as, but not limited to, common intervals and max-gap clusters. An important focus is on designing and implementing efficient algorithms based on these models.
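
To make the notion of a common interval concrete, here is a minimal sketch in Python. It is an illustration only, not the project's own algorithm. It assumes two genomes given as signed permutations of the same gene set, ignores gene orientation (common intervals depend only on gene content), and uses a simple quadratic enumeration. The function name `common_intervals` is hypothetical.

```python
def common_intervals(p, q):
    """Return index pairs (i, j), 0-based and inclusive with i < j, such that
    the genes p[i..j] also occupy consecutive positions in q.

    Assumption: p and q are signed permutations of the same gene set,
    e.g. [3, -1, 2]; signs (orientations) are ignored.
    """
    # Position of each gene in q, ignoring its sign.
    pos = {abs(gene): k for k, gene in enumerate(q)}
    result = []
    for i in range(len(p)):
        lo = hi = pos[abs(p[i])]
        for j in range(i + 1, len(p)):
            k = pos[abs(p[j])]
            lo = min(lo, k)
            hi = max(hi, k)
            # p[i..j] holds j - i + 1 distinct genes; they are contiguous
            # in q exactly when their positions span a window of that size.
            if hi - lo == j - i:
                result.append((i, j))
    return result


if __name__ == "__main__":
    p = [1, 2, 3, 4, 5]
    q = [3, -1, 2, 5, 4]
    for i, j in common_intervals(p, q):
        print(sorted(abs(g) for g in p[i:j + 1]))
```

On this toy input the conserved gene sets are {1, 2}, {1, 2, 3}, {1, 2, 3, 4, 5} and {4, 5}. Max-gap clusters relax the contiguity requirement by allowing a bounded number of intervening genes between cluster members. Efficient algorithms avoid the quadratic scan used here.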

About Project Leader: Dr. Cedric Chauve

Dr. Cedric Chauve is an associate professor in the Department of Mathematics at Simon Fraser University. Before coming to SFU in 2007, he was a professor in the Computer Science Department of the University of Quebec at Montreal (UQAM). Besides his membership at the IRMACS Centre, Dr. Chauve is a member of the SFU Discrete Mathematics Group, the Bioinformatics Training Program for Health Research (where he is associate faculty), the Comparative Genomics Laboratory (CGL, UQAM), and the Laboratoire de Combinatoire et Informatique Mathematiques (LaCIM, UQAM). His current research deals primarily with mathematical and computational questions arising from comparative genomics, including problems on genome rearrangements, gene family evolution, and the computational analysis of RNA. Part of Dr. Chauve's scientific work is in enumerative combinatorics and algorithm design.