Published by Rafe Shelton. Modified over 8 years ago.
1
1 Bioinformatics Tools for Genotyping. Frances Tong; Dr. Garry Larson, Ph.D., City of Hope Department of Molecular Medicine. Southern California Bioinformatics Institute, Summer 2003. Funded by the National Science Foundation and the National Institutes of Health.
2
2 Overview of Summer Program. Learn ASP and VBScript. Learn the biology. Programming Project I: writing code for mining online genetic data. Programming Project II: writing a program to graph linkage disequilibrium data.
3
3 Intro to ASP & VBScript. ASP: Microsoft Active Server Pages * server-generated web pages * similar to CGI but easier * works well with databases. VBScript: Microsoft Visual Basic Scripting Edition * scripting language to enhance HTML web pages * default language of ASP.
4
4 Hello World! Sample ASP file (one line only!)
5
5 Genetic Mapping of ASPs. ASPs: affected sibling pairs. Identify genes associated with cancer in patients and siblings who both have cancer (breast, prostate, lung or colon). Determine allele-sharing statistics of susceptibility genes. Look at gene-gene interactions. => Provide information on a person’s genetic risk of developing cancer.
6
6 DNA Marker Genotyping. Genetic marker: a polymorphic gene or section of DNA with an identifiable physical location on a chromosome, used to trace inheritance. Ex.: microsatellite and SNP markers.
7
7 Programming Project I: Tag Selection for Markers. Markers need a unique identifier (like social security numbers for people). Chromosome locations are relative and change frequently (UCSC). Use ASP to automate data mining and ease the generation of a unique 50 base-pair tag for each marker in the database. Tags will be used to locate markers in the genome.
8
8 UCSC Genome Browser
9
9 Marker Tag Selection. Three input options: submit the sequence surrounding a simple repeat; submit the accession number for a microsatellite; submit the accession number for a SNP.
10
10 Output. Link to UCSC browser; chromosome; sequence start position; sequence end position; input sequence with repeats highlighted in blue.
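The repeat highlighting in the output could be approximated as follows. This is a minimal Python sketch, not the ASP code behind the slides: the function name `find_simple_repeats` and the thresholds (repeat units of 2 to 4 bases, at least 5 copies) are assumptions chosen for illustration.

```python
import re

def find_simple_repeats(seq, min_unit=2, max_unit=4, min_copies=5):
    """Return (start, end, unit, copies) for each tandem repeat in seq.

    Positions are 0-based, end-exclusive. Thresholds are illustrative
    assumptions, not values taken from the original program.
    """
    # A short unit (tried shortest first) followed by at least
    # min_copies - 1 further copies of the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), match.end(), unit, copies))
    return hits

# Hypothetical input: a (CA)8 microsatellite with short flanks.
example = "GGTT" + "CA" * 8 + "GGTT"
repeats = find_simple_repeats(example)
```

The reported start and end positions mark the span a front end would highlight; the flanking sequence outside that span is where a tag would be chosen.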
11
11 Choosing a 50 bp tag. Copy and paste the chosen tag sequence into the form, then send the sequence to UCSC.
12
12 UCSC BLAT Results. BLAT is similar to BLAST: it searches the genome for alignments to the submitted sequence.
13
13 List of markers and their tags
14
14 Convert to FASTA format. FASTA format: a header line ">name" followed by the sequence on the next line. The program converts the marker tag file into FASTA format automatically.
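The conversion step can be sketched in a few lines of Python. The original program was written in ASP and its input file layout is not shown, so the `(name, sequence)` pair input, the function name `tags_to_fasta`, and the marker names below are assumptions.

```python
def tags_to_fasta(tags, width=60):
    """Convert (marker_name, tag_sequence) pairs to FASTA text.

    Each record is a '>name' header line followed by the sequence,
    wrapped at `width` characters per line.
    """
    lines = []
    for name, seq in tags:
        lines.append(">" + name)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "".join(line + "\n" for line in lines)

# Hypothetical marker names and tag sequences, for illustration only.
fasta_text = tags_to_fasta([
    ("marker_1", "ACGTACGTAC"),
    ("marker_2", "TTGACCAGTA"),
])
```

A 50 bp tag fits on one sequence line at the default width; the wrapping only matters for longer inputs.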
15
15 Check tag selection. The program sends the FASTA file to UCSC BLAT.
16
16 Linkage Disequilibrium. A condition in which two polymorphisms are found together on the same chromosome at a greater frequency than that predicted from the product of their individual frequencies.
17
17 Two SNPs and their base frequencies
SNP 1 (G/A): G = 0.88, A = 0.12
SNP 2 (T/C): T = 0.75, C = 0.25
Expected frequencies of each combination (product of the individual base frequencies):
G and T: (0.88)(0.75) = 0.66
G and C: (0.88)(0.25) = 0.22
A and T: (0.12)(0.75) = 0.09
A and C: (0.12)(0.25) = 0.03
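The expected frequencies on this slide are the products of the individual base frequencies. A minimal Python sketch of that arithmetic follows; the function name `expected_haplotype_freqs` and the dictionary layout are assumptions made for illustration.

```python
from itertools import product

def expected_haplotype_freqs(site1, site2):
    """Expected frequency of each base pair if the two sites are independent."""
    return {
        (a, b): fa * fb
        for (a, fa), (b, fb) in product(site1.items(), site2.items())
    }

# Base frequencies taken from the slide.
snp1 = {"G": 0.88, "A": 0.12}
snp2 = {"T": 0.75, "C": 0.25}
expected = expected_haplotype_freqs(snp1, snp2)

for (a, b), freq in expected.items():
    print(f"{a} and {b}: {freq:.2f}")
```

Because each site's frequencies sum to 1, the four expected values also sum to 1.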
18
18 Expected vs. observed frequencies
Pair    Expected  Observed
G & T   0.66      0.54
G & C   0.22      0.20
A & T   0.09      0.24
A & C   0.03      0.02
IF the observed frequency of 2 variants together > the expected frequency => LINKAGE DISEQUILIBRIUM
A and T together are in linkage disequilibrium
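The rule on this slide (observed greater than expected) can be checked mechanically. The sketch below uses the slide's numbers; the function name `excess_pairs` and the rounding to two decimals are assumptions for illustration.

```python
# Values copied from the slide's table.
expected_freq = {"G&T": 0.66, "G&C": 0.22, "A&T": 0.09, "A&C": 0.03}
observed_freq = {"G&T": 0.54, "G&C": 0.20, "A&T": 0.24, "A&C": 0.02}

def excess_pairs(expected, observed):
    """Return pairs seen together more often than expected, with the excess."""
    return {
        pair: round(observed[pair] - expected[pair], 2)
        for pair in expected
        if observed[pair] > expected[pair]
    }

flagged = excess_pairs(expected_freq, observed_freq)
print(flagged)
```

Only the A & T pair exceeds its expected frequency, which matches the slide's conclusion.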
19
19 A Quantitative Measure of LD. One of the most common measures of linkage disequilibrium is r². It is a squared correlation coefficient => the correlation of alleles at two sites. Special case: r² = 1 (“perfect LD”) ~ Exactly two out of the four possible haplotypes are observed. ~ Markers NOT separated by recombination.
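The slide does not spell out a formula. The sketch below assumes the standard definition of the squared correlation between two biallelic sites, r² = D² / (pA(1 - pA) pB(1 - pB)) with D = h_AB - pA·pB, computed from the four haplotype frequencies; the function and argument names are hypothetical.

```python
def r_squared(h_AB, h_Ab, h_aB, h_ab):
    """Squared allelic correlation between two biallelic sites.

    Arguments are the frequencies (or counts) of the four haplotypes;
    they are normalised to sum to 1 before use.
    """
    total = h_AB + h_Ab + h_aB + h_ab
    if total <= 0:
        raise ValueError("haplotype frequencies must sum to a positive value")
    h_AB, h_Ab, h_aB = h_AB / total, h_Ab / total, h_aB / total
    p_A = h_AB + h_Ab          # frequency of allele A at site 1
    p_B = h_AB + h_aB          # frequency of allele B at site 2
    d = h_AB - p_A * p_B       # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom

# Only two of the four haplotypes observed: the "perfect LD" special case.
perfect = r_squared(0.6, 0.0, 0.0, 0.4)
# All four haplotypes at their independent frequencies: no LD.
none = r_squared(0.25, 0.25, 0.25, 0.25)
```

With only two haplotypes present the value is 1 up to floating-point rounding, and with independent sites it is 0.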
20
20 Programming Project II. Program that helps visualize linkage disequilibrium by graphing scores such as r². Each pair of markers has such a score => pairwise comparisons.
[Figure: grid of Marker 1, Marker 2 and Marker 3 against themselves, with 1 on the diagonal and the pairwise scores (0.7, 0.2) mirrored on both sides of the diagonal. Symmetric!]
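The symmetry noted on this slide means each pairwise score fills two cells of the grid. A minimal Python sketch follows; the function name `build_symmetric_matrix` is hypothetical, and the assignment of 0.7 and 0.2 to particular marker pairs is an assumption, since the slide layout does not make it recoverable.

```python
def build_symmetric_matrix(markers, pair_scores):
    """Fill an n x n grid from pairwise scores, mirroring across the diagonal.

    Cells with no score stay None; the diagonal is set to 1.0
    (each marker compared with itself).
    """
    index = {marker: i for i, marker in enumerate(markers)}
    n = len(markers)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0
    for (a, b), score in pair_scores.items():
        i, j = index[a], index[b]
        matrix[i][j] = score
        matrix[j][i] = score   # same score on the other side of the diagonal
    return matrix

markers = ["Marker 1", "Marker 2", "Marker 3"]
# Illustrative pairing of the two scores shown on the slide.
scores = {("Marker 1", "Marker 2"): 0.7, ("Marker 1", "Marker 3"): 0.2}
grid = build_symmetric_matrix(markers, scores)
```

Storing one score per unordered pair and mirroring it on read keeps the input file half the size of the displayed grid.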
21
21 Sample data for graphing. Read data by row: the pairwise comparison of marker 1 and marker 7 yields two different kinds of measurements.
22
22 GOLD: Graphical Overview of Linkage Disequilibrium. Existing program from the Univ. of Michigan to graph linkage disequilibrium: http://www.sph.umich.edu/csg/abecasis/GOLD/ Graphs are based on a chromosomal position scale. Works very well for long-range pattern analysis, but individual measurements are hard to distinguish.
23
23 Comparison of Program Output (same input file). Output from GOLD: difficult to see individual points on graph. Output from LD Color (my program): easier to distinguish individual points.
24
24 LD Color Program. Program written in ASP to graphically depict linkage disequilibrium in human genetic data. Each pairwise comparison of markers is color coded by numerical range, for each of the different LD measures. Complete program: 4 files; >1,000 lines of code.
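Color coding by numerical range amounts to a lookup from score to color bin. The sketch below is in Python rather than the ASP of the original program; the function name `color_for`, the range boundaries and the color names are all assumptions for illustration.

```python
def color_for(score, ranges, default="white"):
    """Return the color of the first range containing score.

    ranges is a list of (low, high, color) with both ends inclusive,
    so a score on a shared boundary takes the earlier range's color.
    """
    for low, high, color in ranges:
        if low <= score <= high:
            return color
    return default

# Hypothetical user-defined ranges for an LD score between 0 and 1.
ld_ranges = [
    (0.0, 0.2, "blue"),
    (0.2, 0.8, "yellow"),
    (0.8, 1.0, "red"),
]

cell_colors = [color_for(s, ld_ranges) for s in (0.05, 0.7, 1.0)]
```

Keeping the ranges as plain data is what lets the user redefine colors and cut-offs without changing the drawing code.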
25
25 Program Features. Data input: file uploading or text pasting. Allows variable file formats for input. User-defined colors and ranges. Switch between different measures of LD. View actual data on graph or just the colors. Change size of graph. Option to select specific rows of data.
26
26 Upload your file Paste data
27
27 Specify marker columns
28
28 Choose a label for the numerical input data
29
29 Choose measure of linkage disequilibrium. Specify which column the data is located in.
30
30 Same as before => used to specify data for other side of diagonal
31
31 Choose to display data on graph
32
32 Choose different sizes for the graph
33
33 Select only the markers you want graphed by choosing rows. Default: all are graphed.
34
34 Specify the ranges for the colors you want graphed.
35
35 Manual
36
36 Color Legend
37
37 Sample: Symmetric
38
38 Sample: Big Size!
39
39 Sample: Data On, Asymmetric
40
40 Sample: Row Select
41
41 Future Directions for LD Color. Mouseover tag on each cell of the graph to show marker ID (JavaScript). Ability to accept more kinds of file formats. Better form validation and error checking. More functionality and linking to outside sources.
42
42 Acknowledgements. Dr. Garry Larson, Ph.D.; Dave Ko, City of Hope Senior Programmer Analyst; Louis Geller, City of Hope Senior Research Associate; Dr. Ted Krontiris, M.D., Ph.D., Principal Investigator; the rest of the Krontiris Lab. Southern California Bioinformatics Institute: Dr. Jamil Momand, Dr. Nancy Warter-Perez, Dr. Sandra Sharp & Dr. Wendie Johnston, Jackie Leung & rest of SoCalBSI staff. Fellow interns. NSF & NIH.