Published by Oliver York. Modified over 8 years ago.
1
Bioinformatics Computing 1
CMP 807 – Day 1
Kevin Galens
2
Today’s Objectives
Overview/Introduction
What is Bioinformatics?
Molecular Biology Overview
Unix Introduction
Sequence Alignment Introduction
3
Course Objectives
Define Bioinformatics
Review/Learn basic Molecular Biology
Develop Unix skills
Understand:
  Sequence alignment
  Gene finding
  Structure analysis
Utilize bioinformatics software
  Web-based/Local
Understand Data Storage Techniques
4
Textbook
Developing Bioinformatics Computer Skills, Gibas and Jambeck, ISBN: 1-56592-664-1
Bioinformatics: Sequence and Genome Analysis, David W. Mount, ISBN: 0-87969-608-7
O’Reilly Books: Oreilly.com
5
Course Requirements
80% Attendance
Ask Questions
Have Fun
6
Introduction What is Bioinformatics?
8
Introduction
Bioinformatics – “the science of using information to understand biology”
Combination of:
  “Wet Lab” sciences: Biology, Chemistry
  “Theoretical” sciences: Physics, Mathematics, Computer Science
  Information Technology
9
Introduction
What you need to know to be a bioinformaticist
Molecular Biology/Biochemistry
  DNA->RNA->Protein (Central Dogma)
Unix
Programming
  Perl
  Java/C/C++
  Python
Web Development
Database Management
How to adapt
10
Introduction
What Bioinformaticists do
Create/manage databases
  DNA/RNA/Protein sequence/structure
  Microarray
  Phylogenetic
Develop computational analysis methods
Assist ‘wet-lab’ scientists
  Usable interfaces
Everything
11
Bioinformatics Introduction Questions?
12
Molecular Biology An introduction/Review
13
What is the Central Dogma?
14
What is DNA?
Deoxyribonucleic acid
Polymer of nucleotides:
  Adenine (A)
  Thymine (T)
  Guanine (G)
  Cytosine (C)
Double helix (show PDB: 142D)
15
What is DNA Replication?
Copies the DNA molecule (new strand is synthesized 5’->3’)
Occurs before cell division
Passage of genetic information
Propagation/source of mutation
dnai.org
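Because A pairs with T and G pairs with C, the strand copied during replication can be derived from its template. The sketch below is only an illustration of that base-pairing rule, not a model of the replication machinery; the function name and the example sequence are made up for this slide.

```python
# Watson-Crick base pairing: A<->T, G<->C
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    # Complement each base, then reverse so the result also reads 5'->3'.
    return strand.upper().translate(COMPLEMENT)[::-1]

template = "ATGGCC"  # hypothetical example sequence
print(reverse_complement(template))  # GGCCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.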
16
What is RNA?
Ribonucleic acid
Polymer of nucleotides:
  Adenine (A)
  Uracil (U) – substitute for T
  Guanine (G)
  Cytosine (C)
Single chain
Varied structures (PDB: 1EVV – tRNA)
17
What is Transcription?
DNA -> complementary RNA
Genes – transcribed DNA
dnai.org
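A minimal sketch of "DNA -> complementary RNA", assuming plain strings for sequences. Both function names are hypothetical. The RNA is complementary to the template strand, so it carries the same sequence as the coding strand with U in place of T.

```python
def transcribe_coding(coding_strand: str) -> str:
    """mRNA from the coding strand (5'->3'): same sequence, T replaced by U."""
    return coding_strand.upper().replace("T", "U")

def transcribe_template(template_strand: str) -> str:
    """mRNA from the template strand (given 5'->3'): complement with U, reversed."""
    pairs = str.maketrans("ACGT", "UGCA")
    return template_strand.upper().translate(pairs)[::-1]

# Hypothetical example: the two strands of the same short stretch of DNA
print(transcribe_coding("ATGGCC"))    # AUGGCC
print(transcribe_template("GGCCAT"))  # AUGGCC
```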
18
What is a protein?
Polymer of amino acids
Encoded by DNA via mRNA
Synthesized at the ribosome
Enzymatic/structural roles
PDB: 1GZX – hemoglobin
19
What is translation?
Protein synthesis
mRNA -> protein
Ribosome
dnai.org
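The mRNA -> protein step can be sketched as a table lookup, three bases (one codon) at a time. This assumes the standard genetic code and a simplified reading rule (start at the first AUG, stop at the first stop codon); the function name and example sequences are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGGCCUGGUAA"))  # MAW (Met-Ala-Trp, then stop)
```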
20
Molecular Biology Review Questions?
21
UNIX
22
What is UNIX?
Operating System
Name derives from UNICS: Uniplexed Information and Computing Service
1970s – Bell Labs
Multiuser Environment
23
Why do we use UNIX?
Multiuser abilities
Network capabilities
Process chaining
Easy text file manipulation
Software development capabilities
Free!
24
UNIXploration
Command Line
Shell – command line environment
  bash (Bourne-again shell)
  csh (C shell)
  tcsh (improved version of the C shell)
25
Important UNIX Commands
man – view the manual page for a given command
apropos – search man pages
ls – list contents of a directory
pwd – report current directory
cd – change directory
more/less – page through text
clear – clear the terminal
26
Important UNIX Commands
> – redirect output (standard output)
< – redirect input (standard input)
| – pipe (send one command's output to the next command's input)
cat – concatenate files/input to standard output
grep – print lines of a file that match a pattern
cut – extract selected fields or columns from each line
find – search for files
sort – sort lines of a file
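To show what "process chaining" means, here is a rough Python imitation of a pipeline such as grep, then cut, then sort. These are simplified stand-ins written for this slide, not the real commands: `grep` here does fixed-string matching only, and `cut` handles a single field. The FASTA-style text is made-up example data.

```python
def grep(pattern, lines):
    """Keep lines containing pattern (fixed-string match only)."""
    return [line for line in lines if pattern in line]

def cut(lines, delimiter, field):
    """Return the 1-based field of each line; lines without the delimiter pass through."""
    result = []
    for line in lines:
        if delimiter not in line:
            result.append(line)
            continue
        parts = line.split(delimiter)
        result.append(parts[field - 1] if field <= len(parts) else "")
    return result

# Hypothetical input, standing in for the contents of a sequence file
text = ">seq2 coelacanth\nACGT\n>seq1 pelican\nTTGA\n"
lines = text.splitlines()

# Each step consumes the previous step's output, like commands joined by '|'
headers = sorted(cut(grep(">", lines), " ", 1))
print(headers)  # ['>seq1', '>seq2']
```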
27
Text Editors
vi/vim
u – undo
i – insert
A – append at end of line
: – enter command line
:help – view help page
:q – quit
ZZ – save/quit
Esc – exit insert mode
x – delete character
dw – delete word
dd – delete line
28
UNIX Exercise
Create ~/software and ~/bin
Install BLAST
ftp ftp.ncbi.nih.gov
29
Fundamentals of Sequence Alignment
30
Global Alignment: Needleman-Wunsch
What is global alignment?
  Uses whole length of both sequences
  Result: 1 optimal alignment
Needleman-Wunsch:
  Utilize a 2-d matrix
Scenario:
  Align: COELACANTH and PELICAN
  Match: +1
  Mismatch: -1
  Gap: -1
31
Global Alignment: Needleman-Wunsch
33
Resulting alignment:
COELACANTH
P-ELICAN--
or
COELACANTH
-PELICAN--
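The matrix fill and traceback for this scenario can be sketched as follows. This is a minimal teaching version, assuming the slide's scores (+1 match, -1 mismatch, -1 gap) with a linear gap penalty; the function name is made up. When several alignments tie for the best score, the traceback returns just one of them, and which one depends on the order in which moves are checked.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment. Returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell is the best of: diagonal (match/mismatch), up (gap in b), left (gap in a).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback from the bottom-right corner to the top-left corner.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))

best, top, bottom = needleman_wunsch("COELACANTH", "PELICAN")
print(best)    # 0
print(top)
print(bottom)
```

The score of 0 matches both alignments shown above: 5 matches, 2 mismatches, and 3 gaps.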
34
Local Alignment: Smith-Waterman
What is a local alignment?
  Find the highest scoring substring
  No assumption on sequence length
Smith-Waterman:
  Use a 2-d matrix
Scenario:
  Align: COELACANTH and PELICAN
  Match: +1
  Mismatch: -1
  Gap: -1
35
Local Alignment: Smith-Waterman
37
Resulting alignment:
ELACAN
ELICAN
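A minimal sketch of the local version, under the same assumed scores (+1 match, -1 mismatch, -1 gap). It differs from the global method in two ways: cell scores never drop below 0, and the traceback starts at the highest-scoring cell and stops at the first 0. The function name is made up for illustration.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment. Returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Traceback from the best cell until a cell with score 0 is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))

print(smith_waterman("COELACANTH", "PELICAN"))  # (4, 'ELACAN', 'ELICAN')
```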
38
Sequence Alignment
More sophisticated scoring: Substitution Matrix
PAMX (Point Accepted Mutation)
  Scaled according to evolutionary distance of closely related proteins
  PAM1 = 1% of amino acid positions have changed
  PAM250 – most common
BLOSUMX (BLOcks SUbstitution Matrix)
  Scaled according to more distantly related proteins
  BLOSUM62 – based on proteins with <=62% identity
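In code, a substitution matrix simply replaces the fixed +1/-1 rule with a lookup on the pair of residues. The sketch below uses only a handful of BLOSUM62 entries for illustration; a real matrix covers every pair of the 20 amino acids, and the function names and example peptides are made up.

```python
# A few BLOSUM62 entries, for illustration only (not the full matrix).
BLOSUM62_SUBSET = {
    ("A", "A"): 4,
    ("W", "W"): 11,
    ("I", "L"): 2,   # chemically similar residues score above zero
    ("D", "E"): 2,
    ("A", "W"): -3,  # dissimilar residues are penalized
}

def substitution_score(x, y, matrix):
    """Look up a pair in either order, since the matrix is symmetric."""
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]

def score_ungapped(a, b, matrix):
    """Sum the substitution scores of two equal-length, already aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(substitution_score(x, y, matrix) for x, y in zip(a, b))

# Hypothetical peptides: A/A = 4, I/L = 2, W/W = 11
print(score_ungapped("AIW", "ALW", BLOSUM62_SUBSET))  # 17
```

Under the simple +1/-1 scheme the same pair would score 1, so the matrix rewards the rare W match and treats I/L as a conservative change rather than a plain mismatch.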
39
Sequence Alignment Intro Questions?