Aliya Sadeque
BIOC 599 Supervisory Committee Meeting
Wednesday, December 19, 2007
Outline
  About me
  Thesis project blueprint
  Course selection
Curriculum Vitae
  Queen’s University
  Bachelor of Science (Honours) in Biochemistry, Minor in Computing
  Graduated May 2007
Previous Coursework
  Undergraduate Level Biochemistry:
    Proteins and Enzymes
    Physical Biochemistry
    Metabolism
    Molecular Biology
    Introductory Biochemistry Laboratory
    Protein Structure and Function
    Current Topics in Biochemistry
    Biochemistry of the Cell
    Advanced Molecular Biology
Previous Coursework
  Undergraduate Level Computing:
    Database Management Systems
    Neural and Genetic Computing
    Introduction to Data Mining
    System Level Programming
    Operating Systems
  Undergraduate Level Mathematics:
    Introduction to Statistics
    Discrete Math for Computer Scientists
    Modeling Techniques in Biology
Thesis Project Blueprint
  Context
    Why is this work necessary?
    What kind of tools have been used to address it?
    Longest Common Subsequence
  Part I: Explore LCSs in poxvirus
    Visualization
    Threshold frequency equation
  Part II: Develop an interface for use by biologists
Background
  “Promoter sequences might be identified as conserved islands in a divergent sea”
  Observed: 42-bp sequence showing “unusually high degree of sequence conservation” (Brunetti et al.)
  Are these claims reasonable? How can they be tested?
Tools
  Alignment
  0-mismatch suffix tree
  Longest Common Subsequence algorithm
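For reference on the last tool listed, here is a minimal sketch of the textbook dynamic-programming formulation of the longest common subsequence of two sequences. The slides do not say which LCS variant or implementation the project uses, so the function name `longest_common_subsequence` and the toy inputs are illustrative assumptions, not the project's code.

```python
def longest_common_subsequence(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (textbook DP).

    Hypothetical helper for illustration; not taken from the slides.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one subsequence.
    out = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    # Toy strings, not real genome data.
    print(longest_common_subsequence("AGGTAB", "GXTXAYB"))
```

This version needs time and memory proportional to the product of the two sequence lengths, so it is only practical for short regions; whole-genome comparisons would need a more economical approach.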
Visualization
Threshold Frequency
  Figure 1. Table showing number of hits resulting from LCS trials with varying values of n and k (subsequence length and error number, respectively).
  [Table columns: k = 1, k = 2, k = 3; each with "length" and "# solutions"]
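The slide does not define a "hit", so the sketch below rests on an assumption: a hit is a length-n window of one sequence that matches some length-n window of another sequence with at most k mismatches (Hamming distance). The names `hamming` and `count_hits` and the toy sequences are hypothetical; the sketch only shows how hit counts could be tabulated as n and k vary.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_hits(query: str, target: str, n: int, k: int) -> int:
    """Count length-n windows of query found in target with <= k mismatches.

    Assumed definition of a "hit"; the slides do not specify one.
    """
    target_windows = [target[j:j + n] for j in range(len(target) - n + 1)]
    hits = 0
    for i in range(len(query) - n + 1):
        window = query[i:i + n]
        if any(hamming(window, t) <= k for t in target_windows):
            hits += 1
    return hits


if __name__ == "__main__":
    # Toy sequences, not real poxvirus data.
    query, target = "ACGTAC", "ACGAAC"
    for k in (1, 2, 3):
        for n in (3, 4, 5):
            print(f"k={k} n={n} hits={count_hits(query, target, n, k)}")
```

This brute-force comparison of every window pair is only meant for small inputs; it makes the roles of n and k explicit rather than being efficient.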
User Interface
  Design with usability in mind
  Selection of inputs – What kind of genomes can/will this tool be used for?
  Format of results – How should these be presented in order to allow interpretation?
    Visualization
    Further processing of output
Timeline
  Part I: Poxvirus LCS data collection and analysis (2 months)
  Part II: Interface (4-6 months)
Course Selection
  BIOC completed
  MICR Virology
  Courses to sit in on:
    Biochemistry courses?
    Computing courses?
      Data mining
      Bioinformatics
      Statistics