Published by Lora Pearson. Modified over 9 years ago.
1
Modeling molecular evolution
Jodi Schwarz and Marc Smith
Vassar College
Biol/CS353 Bioinformatics
2
Team-taught Biol and CompSci course
7 students:
  – CS experience: 3 yes, 4 no
  – Bio experience: 5 yes, 2 no
Project-based course; no exams
Worked in Biol/CS pairs on projects
I3U near end of course; last project before independent research projects
3
Common approach for all projects
Biological question
Algorithm design
  – Step-by-step approach to complete a task or solve the problem
Implementation
  – The actual programming “script” that will carry out the steps of the algorithm
Evaluation of implementation and algorithm
Revision or augmentation
4
I3U: added an experimental component to our basic approach
Previous projects focused on pattern finding, mining whole genome data
Goal of I3U:
  – Model a biological/evolutionary process
  – Test the model with empirical data
  – Perform computational experiments
5
Model molecular evolution
Step 1: Model the effect of random vs targeted nucleotide substitutions on a protein sequence
  – What do we mean by random?
  – Determine the similarity of the original protein sequence to the “evolved” sequence
Step 2: Assess the real nt diversity at positions 1, 2, 3 of codons in real homologs (HSP70)
  – Construct alignment of homologs and determine nt diversity at each position
Evaluate the models using the empirical data
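Step 1 can be sketched in a few lines. The course used Perl scripts built on instructor starter code that is not shown here; the version below is a minimal Python sketch, not that code. All function names, the example ancestor sequence, and the choice of "random" as "uniform over sites, always changing the base" are assumptions made for illustration. Similarity is measured as the fraction of identical amino acids between the translated "ancestral" and "evolved" sequences.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(dna, n_subs, rng, codon_position=None):
    """Apply n_subs substitutions at distinct sites.

    codon_position=None picks sites uniformly (the "random" model, an assumption);
    codon_position=1, 2 or 3 restricts sites to that position within each codon.
    """
    sites = [
        i for i in range(len(dna))
        if codon_position is None or i % 3 == codon_position - 1
    ]
    seq = list(dna)
    for i in rng.sample(sites, n_subs):
        # Always change the base, so every chosen site really differs.
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)


def protein_identity(dna_a, dna_b):
    """Fraction of identical amino acids between two equal-length coding sequences."""
    prot_a, prot_b = translate(dna_a), translate(dna_b)
    matches = sum(a == b for a, b in zip(prot_a, prot_b))
    return matches / len(prot_a)


def mean_identity(ancestor, n_subs, runs=100, codon_position=None, seed=0):
    """Average protein identity over repeated simulated 'evolution' runs."""
    rng = random.Random(seed)
    scores = [
        protein_identity(ancestor, substitute(ancestor, n_subs, rng, codon_position))
        for _ in range(runs)
    ]
    return sum(scores) / runs


if __name__ == "__main__":
    ancestor = "ATGGCTAAAGGTCCTGCAGTTGGTATCGATCTGGGTACC"  # made-up example sequence
    print("random:", mean_identity(ancestor, 6))
    print("3rd position:", mean_identity(ancestor, 6, codon_position=3))
```

Comparing the two printed averages is the computational experiment: if third-position substitutions are mostly synonymous, the targeted model should preserve more of the protein than the random one.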
6
Learning goals
CS students: To apply their knowledge of data structures and algorithms to a biological domain
Biology students: To apply their knowledge of the biology to design algorithms
For the collaboration:
  – To become familiar with modeling a biological process: a simple model must be constructed and tested first
  – To test the model using empirical data
7
Assessment
Assignments
  – Alignment assignment
  – 2 Perl scripts
      Model random vs targeted substitution pattern
      Determine the codon nt diversity in HSP70 genes
  – Output from the 2 Perl scripts
      Raw output
      Graphs summarizing data
Observation
  – Collaboration
  – Critical thinking
8
Example student results
Effect of random vs targeted substitutions on a protein sequence (compared the “ancestral” sequence to the “evolved” sequence), 100 runs
[Figure panels: random substitutions; substitutions targeted to the 3rd position]
9
Example student results of empirical data
Average diversity by nucleotide position within codons:
  Codon position 1: 1.50
  Codon position 2: 1.29
  Codon position 3: 2.32
Most variation occurs in position 3
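The slide does not define "diversity". One reading that fits the reported range is the average number of distinct nucleotides observed per alignment column, grouped by codon position. The Python sketch below uses that definition as an assumption; it is not the students' Perl script. It also assumes the alignment starts in frame at column 0 and that gaps do not shift the reading frame. The function name and the toy alignment are hypothetical.

```python
def diversity_by_codon_position(alignment):
    """Average number of distinct nucleotides per column, for codon positions 1-3.

    `alignment` is a list of equal-length aligned coding sequences, in frame
    from column 0 (an assumption). Gap characters ('-') are not counted as a
    nucleotide, and columns that contain only gaps are skipped.
    """
    totals = [0, 0, 0]
    columns_seen = [0, 0, 0]
    length = min(len(seq) for seq in alignment)
    for col in range(length):
        observed = {seq[col].upper() for seq in alignment} - {"-"}
        if not observed:
            continue
        position = col % 3  # 0, 1, 2 correspond to codon positions 1, 2, 3
        totals[position] += len(observed)
        columns_seen[position] += 1
    return [
        totals[p] / columns_seen[p] if columns_seen[p] else 0.0
        for p in range(3)
    ]


if __name__ == "__main__":
    toy_alignment = ["ATGGCT", "ATGGCC", "ATGGCA"]  # made-up, not HSP70 data
    for pos, value in enumerate(diversity_by_codon_position(toy_alignment), start=1):
        print(f"Codon position {pos}: {value:.2f}")
```

Under this definition a value of 1.0 means every sequence agrees at that position and 4.0 means all four nucleotides appear in every column.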
10
Collaboration across disciplines
How we tried to teach collaboration:
  – We defined the meaning of collaboration
      CS students do not need to become biologists and vice versa
      Each person contributes a different set of expertise
      Learning how to speak each other’s language
      Communication
  – We modeled it
      Overt reliance on each other’s expertise
      Spontaneous discussions
  – Giving students lots of experience collaborating: several shifts in pairs over the semester
11
Assessment of collaboration
Attitude: reluctant vs eager
At beginning (self) vs. during project (experience)

Gradational Assessment of Collaboration
Score  Self       Experience
0      reluctant  avoided
1      eager      problems
2      reluctant  positive
3      eager      positive

Student  Score
A        0
B        1
C        2
D        3
E        3
F        3
G        3

Team              Team score
A+C               2
B+F               4
E+G               6
D (worked alone)  3
12
Likert Scale (1-5)
1 how a genomics approach crosses levels of biological organization
2 how genomic-level science is conducted
3 how computational approaches are deployed to answer genomic questions
4 how to find potential functional/evolutionary patterns in DNA/protein sequence
5 independently use bioinformatic tools to address biological/genomic questions
6 examine the output of a bioinformatic analysis and relate it to a biological question
7 provide one or more clear examples of how genomics uses an interdisciplinary approach
Most improvement: questions that are explicitly bioinformatic
Least: questions that are more broadly about genomics (CS)
13
What worked well
Overall approach was great: question, algorithm, implementation, analysis, iteration
Use of starter code allowed students to
  – Undertake much more sophisticated projects
  – See examples of more advanced algorithm/code
Encountering unanticipated results and problems
  – Gaps in alignments not in groups of 3
  – Spontaneous discussions leading to AHA moments
Students enjoyed the modeling process
  – One student’s final project focused on modeling molecular evolution
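The "gaps not in groups of 3" problem matters because a gap run whose length is not a multiple of 3 shifts the reading frame, so later columns no longer line up with codon positions 1, 2, 3. A small Python check along the following lines could flag such runs before computing per-position diversity. This is an illustrative sketch with a hypothetical function name, not something taken from the course materials.

```python
import re


def frame_breaking_gaps(aligned_seq):
    """Return (start, length) for each gap run whose length is not a multiple of 3."""
    return [
        (match.start(), len(match.group()))
        for match in re.finditer(r"-+", aligned_seq)
        if len(match.group()) % 3 != 0
    ]


if __name__ == "__main__":
    # Made-up aligned sequence: one 3-column gap (in frame) and one 2-column gap.
    print(frame_breaking_gaps("ATG---GCT--A"))
```

An empty list means every gap in that sequence preserves the codon frame.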
14
What didn’t work as well
Some collaborations are not successful
Ran out of time: insufficient analysis and reflection
For the I3U: Assessment strategy not well developed
  – Can we retroactively extract more informative assessment?
16
Assessing biology knowledge
Algorithm development
  – Ability to help partner understand the difference between mutation and selection
  – Ability to recognize assumptions of model
  – Ability to use the empirical data to evaluate model
17
Assessing the CS
Variables
  – Abstraction: representing information as data
  – Types of data: predefined, atomic, aggregate
  – Scope: declaration, initialization, mutation
Algorithms
  – Control flow: unconditional, conditional, repetition
  – Input/Output and regex (pattern matching)
  – Top-down design: subroutines
  – To reuse or not to reuse (code)?
Incremental development / experimentation
Elegance: readability and maintainability
18
Biological question
  – What pattern of nucleotide substitution occurs in protein-coding genes?
Algorithm
  – What do we know about mutation, nt/AA sequences?
  – Assumptions
Implementation
  – Instructors provided “starter code”
  – Students read and ran the code to see what it did
  – Pairs discussed how to add to and refine it, and did so
Evaluation
  – Analyze the CS: Did it run and did it do the job we asked?
  – Analyze the biology: Did it accurately represent the biological process?
Testing the models against empirical evidence
  – Aligned HSP70 genes and evaluated the pattern of substitution
Which model most closely matched the biology?