Analyzing Phenotypic Variations and Polymorphism
Phillip Tao
Advisor: Professor Eleazar Eskin
Grad Student: Emrah Kostem
Polymorphism and SNP
A polymorphism is a variation between the genomes of two individuals or chromosomes.
Many types:
– Deletion *
– Duplication *
– Inversion *
– Single Nucleotide Polymorphism (SNP)
* These often cause major diseases, such as Cri du Chat and Charcot-Marie-Tooth disease, when large portions of the genome are affected [4]. However, their effect on small portions of the genome (especially non-coding portions) has been mostly unstudied.
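To make the SNP case concrete, here is a minimal Ruby sketch that lists the positions where two sequences differ by a single base. It assumes the two sequences are already aligned, have equal length, and contain no gaps; the method name `snp_positions` is hypothetical and not part of any existing tool.

```ruby
# Hypothetical helper: positions (0-based) at which two pre-aligned,
# equal-length sequences carry different bases.
# Assumption: no gaps, so every difference is a single-base substitution.
def snp_positions(seq_a, seq_b)
  unless seq_a.length == seq_b.length
    raise ArgumentError, "sequences must have equal length"
  end

  (0...seq_a.length).select { |i| seq_a[i] != seq_b[i] }
end

# Toy usage with made-up sequences.
p snp_positions("ACGTAC", "ACCTAG")
```

Deletions, duplications, and inversions shift or reorder bases, so a position-by-position comparison like this one is not enough for them.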
Problem
Polymorphisms can be used to identify predisposition toward certain illnesses [1].
Use the mouse genome to identify problem-causing alleles.
There is a lot of data to analyze: 8.27 million SNPs in mice [2] and an estimated 10 million in humans [1].
Non-SNP polymorphisms are very difficult to identify.
– They can be very long (1.5 million base pairs [3]).
Past Methods and My Idea
Polymorphisms have been used to try to find causes for diseases such as cancer [5][6][7].
Those studies chose a disease first, found an area of the genome believed to influence the disease, then looked for polymorphisms in that area.
My approach is to map the entire set of polymorphisms for a given genome (the mouse genome), then look for correlations.
Proposal, step 1
Develop a way to identify non-SNP polymorphisms:
1. Find the start of the problem area
2. Use a marker sequence to find the end of the problem area *
3. Identify the problem
* Idea borrowed from Restriction Fragment Length Polymorphism (RFLP)
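The three steps can be sketched in Ruby as follows. This is only one possible reading of the proposal, not the project's actual algorithm: it assumes a reference and a sample sequence that agree up to the problem area, and a marker sequence that occurs exactly once in each. As in RFLP, the change in distance to the marker is used to tell a deletion from an insertion or duplication. All method names and the classification rules are hypothetical, and the inversion test only fires when the inverted segment runs right up to the marker.

```ruby
# Step 1: first index at which the sample diverges from the reference
# (nil if one sequence is a prefix of the other).
def divergence_start(reference, sample)
  limit = [reference.length, sample.length].min
  (0...limit).find { |i| reference[i] != sample[i] }
end

# Reverse complement of a DNA string, used to recognise inversions.
def reverse_complement(seq)
  seq.reverse.tr("ACGT", "TGCA")
end

# Steps 2 and 3: locate the marker in both sequences, then classify the
# problem from the change in the marker's position.
# Assumption: the marker occurs exactly once in each sequence.
def classify_region(reference, sample, marker)
  start = divergence_start(reference, sample)
  return { type: :none } if start.nil?

  ref_end = reference.index(marker)
  smp_end = sample.index(marker)
  if ref_end.nil? || smp_end.nil? || (ref_end < start && smp_end < start)
    return { type: :unknown, start: start }
  end

  delta = smp_end - ref_end
  type =
    if delta < 0
      :deletion
    elsif delta > 0
      :insertion_or_duplication
    elsif start < ref_end &&
          sample[start...smp_end] == reverse_complement(reference[start...ref_end])
      :inversion
    else
      :substitution
    end

  { type: type, start: start, length_change: delta }
end

# Toy usage with made-up sequences; "GATTACA" plays the role of the marker.
p classify_region("AAAACCGTACGATTACA", "AAAACCGATTACA", "GATTACA")
```

Note that the reported start can land a few bases late when the bases after a deletion happen to match the reference, which is why the size is taken from the marker positions rather than from the start index.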
Proposal, step 2
Use Ruby to develop a tool to model the data easily and flexibly.
Develop algorithms to find correlations between polymorphisms and trait variations.
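As a rough illustration of what such a correlation search could look like, the Ruby sketch below scores each polymorphism against a quantitative trait with the Pearson correlation coefficient and ranks them by strength. Pearson correlation is only a stand-in here, since the proposal does not yet fix a statistic. The 0/1 allele coding, the method names, and the data are all assumptions made for the example.

```ruby
# Pearson correlation coefficient between two equal-length numeric arrays.
# Assumption: a zero-variance input (where r is undefined) is scored as 0.0.
def pearson(xs, ys)
  if xs.empty? || xs.length != ys.length
    raise ArgumentError, "arrays must have equal, non-zero length"
  end

  n = xs.length.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  cov    = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x  = xs.sum { |x| (x - mean_x)**2 }
  var_y  = ys.sum { |y| (y - mean_y)**2 }
  return 0.0 if var_x.zero? || var_y.zero?

  cov / Math.sqrt(var_x * var_y)
end

# Rank polymorphisms by the absolute value of their correlation with a trait.
# genotypes: hash of polymorphism id => array of 0/1 allele codes per strain.
# trait:     array of trait measurements, one per strain, in the same order.
def rank_polymorphisms(genotypes, trait)
  genotypes.map { |id, alleles| [id, pearson(alleles, trait)] }
           .sort_by { |_id, r| -r.abs }
end

# Toy data (made up): four strains, three polymorphisms, one trait.
genotypes = {
  "poly_a" => [0, 0, 1, 1],
  "poly_b" => [0, 1, 0, 1],
  "poly_c" => [1, 1, 1, 1]
}
trait = [1.0, 1.0, 3.0, 3.0]

rank_polymorphisms(genotypes, trait).each do |id, r|
  puts format("%s  r = %.2f", id, r)
end
```

With millions of polymorphisms, a real analysis would also need to correct for multiple testing and for relatedness between strains, which this sketch ignores.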
Schedule
A basic but functional version of the Ruby on Rails (RoR) website should be finished, and a working but slow and inefficient version of the polymorphism finder should be done.
By the end of Winter Quarter, the final version of the polymorphism finder and a better version of the RoR website should be finished.
By the end of the project, an algorithm should be found to analyze the correlation between certain polymorphisms and trait variations.
Resources
tmlhttp:// tml
http://circ.ahajournals.org/cgi/content/abstract/circulationaha;102/2/