Analysis of the Positively Selected and Non-Positively Selected Non-Protein Coding Sequences of Chromosome 16 by Kyle Tretina, Dr. Pattle P. Pun

INTRODUCTION

For some combination of egoistic or religious reasons, it is commonly assumed that organisms form a hierarchy of increasing complexity, with Homo sapiens at the apex. One paradoxical finding of genomics is that organismal complexity is generally inversely related to gene density. If theories that correlate an organism's complexity with its number of genes are good models, then higher organisms can be expected to have an additional source of complexity. In this study, several databases are used to investigate the role of non-coding DNA in Homo sapiens.

Figure 1. The G-value paradox (Shabalina and Spiridonov, 2004)

References
Shabalina, Svetlana A., and Nikolay A. Spiridonov. "The Mammalian Transcriptome and the Function of Non-coding DNA Sequences." Genome Biol. 5.4 (2004): 105.

GOAL

Determine the role of non-coding DNA in gene regulation by looking at the functions of non-coding SNPs on chromosome 16 that are positively selected or non-positively selected. This will be accomplished by mining and combining data from several databases, then compiling the data into a single database for analysis.

METHODS

1) HapMap Database → selection data → list of Chr16 SNPs
2) UCSC Genome Database Mirror → SNP flanking sequence
3) TRANSFAC → related transcription factor data for each SNP flanking sequence
4) PReMod → confirm results

RESULTS

Sequence identifiers were gathered for non-coding regions on chromosome 16 that the International HapMap Project predicts to be recently positively or non-positively selected, and an onsite, partial mirror of a human genome database was created. The identifiers were then used to mine the corresponding sequences, which were sent to our collaborators at the University of Hong Kong, where a proprietary database called TRANSFAC will be used to gather regulatory information about the functions of these sequences. In the meantime, a database was prepared and several programs were written to look for patterns in the data, and further comparisons of these data with another database (PReMod) were investigated. So far, 24% of the 25,622 non-positively selected sequences have been matched with TRANSFAC data, as have all 4,750 positively selected sequences. These matches were gathered manually, but the automation implemented more recently will standardize the data format, reduce the number of errors, and make the analysis easier and more reproducible.

Figure 2. An example of the sequences from the Genome Browser website

IMPLICATIONS

If non-coding DNA is found to contribute directly to the complexity of an organism, this could affect current views of the central dogma of molecular biology. Also, according to the neutral theory, non-coding DNA is often considered to evolve only by chance. Using the HapMap data, this project may provide evidence that non-coding DNA is not only subject to selection but also plays vital roles in the information flow of Homo sapiens. Finally, views on the complexity of humans must be reconciled with views of their place in the hierarchy of known life.

SNP Selection   Total    No Sites      Unique         Matches in Other Dataset   Entries to Be Looked Up
Non-Positive    25,594   1,611 (6%)    3,218 (13%)    20,765 (81%)               82 (<1%)
Positive        33,770   2,437 (7%)    361 (1.0%)     30,972 (92%)               10,641 (32%)

Table 2. A summary of the TRANSFAC data gathered manually for all of the sequences

Graph 1. The distribution of the positively selected SNPs used in the study across human chromosome 16 (the non-positively selected SNPs show a similar pattern)
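The percentages in Table 2 can be re-derived from the raw counts. The sketch below is only an illustration: it assumes the table's columns read Total, No Sites, Unique, Matches in Other Dataset, and Entries to Be Looked Up, and the names `TABLE_2`, `shares`, and `parts_sum_to_total` are hypothetical helpers, not part of the programs written for the study.

```python
# Counts copied from Table 2. The column names are an assumed reading
# of the table header, not something the poster states explicitly.
TABLE_2 = {
    "Non-Positive": {"total": 25594, "no_sites": 1611, "unique": 3218,
                     "matches_in_other_dataset": 20765, "to_look_up": 82},
    "Positive": {"total": 33770, "no_sites": 2437, "unique": 361,
                 "matches_in_other_dataset": 30972, "to_look_up": 10641},
}

# The three categories assumed to partition each row's total.
PARTS = ("no_sites", "unique", "matches_in_other_dataset")


def shares(row):
    """Return every non-total count as a percentage of the row total."""
    total = row["total"]
    return {name: 100.0 * count / total
            for name, count in row.items() if name != "total"}


def parts_sum_to_total(row):
    """True when the three category counts add up to the row total."""
    return sum(row[name] for name in PARTS) == row["total"]


if __name__ == "__main__":
    for label, row in TABLE_2.items():
        print(label, "parts sum to total:", parts_sum_to_total(row))
        for name, pct in shares(row).items():
            print(f"  {name}: {pct:.1f}%")
```

Under this reading, the No Sites, Unique, and Matches in Other Dataset counts add up exactly to each row's total, while Entries to Be Looked Up is a separate tally measured against the same total.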

