Problem Set I review BIOL221T: Advanced Bioinformatics for Biotechnology Irene Gabashvili, PhD.

Tissue Specificity & Top Tissues
Life is a complex orchestration of genes that must be expressed at the right time, place, and level. Basic cellular functions require the expression of certain genes in all cells and tissues (that is, in a ubiquitous manner), while specialized functions require restricted expression of other genes in a single or small number of cells and tissues (that is, tissue-specific expression).

Tissue Specificity vs Tissues with Most Frequent Expression
Not always the same.
Tissue specificity: tissues expressing the gene above the median value.
OMIM: just lists a few tissues where the gene is found.
Microarray-based expression data.
Expressed sequence tag (EST)-based expression data: see Stanford Source, UniGene.
RT-PCR data: literature and commercial software; no good databases.

MTHFR: lymphoma; cardiac muscle (next probe)

MTHFR: Pancreas, liver

Stanford Source: MTHFR: lymph

GeneCards: MTHFR: heart, lung

Stanford Source Calculation (EST example)
Clones for a gene were isolated from skeletal muscle (8 unique clones) and cardiac muscle (2 unique clones).
Number of all clones isolated from skeletal muscle: 16000, so the frequency is 8/16000 = 0.0005.
Number of all clones isolated from cardiac muscle: 10000, so the frequency is 2/10000 = 0.0002.
Normalized gene expression is calculated by dividing each frequency by the sum of the frequencies (0.0005 + 0.0002 = 0.0007).
Skeletal muscle = 0.0005/0.0007 = 71%
Cardiac muscle = 0.0002/0.0007 = 29%
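The EST arithmetic on this slide can be reproduced with a short script. This is only a sketch of the calculation as described here, not Stanford Source's actual code; the function name `normalized_expression` and the dictionary layout are assumptions made for illustration.

```python
def normalized_expression(gene_clones, total_clones):
    """Return each tissue's share of the summed per-tissue clone frequencies.

    gene_clones:  unique clones of the gene found in each tissue library
    total_clones: all clones isolated from each tissue library
    """
    # Frequency of the gene within each tissue library
    freqs = {t: gene_clones[t] / total_clones[t] for t in gene_clones}
    # Normalize so that the shares across tissues sum to 1
    total = sum(freqs.values())
    return {t: f / total for t, f in freqs.items()}


# Numbers taken from the slide
gene_clones = {"skeletal muscle": 8, "cardiac muscle": 2}
total_clones = {"skeletal muscle": 16000, "cardiac muscle": 10000}

shares = normalized_expression(gene_clones, total_clones)
for tissue, share in shares.items():
    print(f"{tissue}: {share:.0%}")
```

Dividing by the library size first matters: the raw clone counts (8 vs 2) would suggest an 80/20 split, while the size-corrected shares are about 71% and 29%.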

Tissue-Specificity Calculation
2 unique clones for gene X were isolated from cardiac muscle.
Out of the clones isolated from cardiac muscle, there are 9999 genes represented by only one clone and one gene (gene X) represented by 2 clones.
This gene is tissue-specific: its clone count (2) is above the median count per gene (1).
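A minimal sketch of this check, assuming the "above the median value" criterion from the earlier slide is applied to per-gene clone counts within one tissue library. The gene labels (`gene_0`, `gene_X`, ...) and the function name `is_tissue_specific` are hypothetical.

```python
from statistics import median


def is_tissue_specific(gene, clone_counts):
    """True if the gene's clone count is strictly above the library median."""
    return clone_counts[gene] > median(clone_counts.values())


# Cardiac muscle library from the slide: 9999 genes with one clone each,
# plus gene X with two clones (gene names are placeholders).
clone_counts = {f"gene_{i}": 1 for i in range(9999)}
clone_counts["gene_X"] = 2

print(is_tissue_specific("gene_X", clone_counts))   # gene X: 2 clones
print(is_tissue_specific("gene_0", clone_counts))   # a single-clone gene
```

With a median of one clone per gene, a single extra clone is enough to pass the threshold, so the criterion is very sensitive in sparsely sampled libraries.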

dbSNP queries
SLC19A1[gene] AND human[orgn]
SLC19A1[gene] AND human[orgn] AND snp[snp_class]
SLC19A1[gene] AND human[orgn] AND "coding nonsynonymous"[FUNC]
SLC19A1[gene] AND human[orgn] AND "coding synonymous"[FUNC]
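Query strings of this shape are easy to assemble programmatically. The helper below is a hypothetical convenience function (`dbsnp_query` is not part of any NCBI tool); it only builds the text and does not contact dbSNP. The field tags are the ones used on the slides, and whether the current dbSNP search still accepts each of them has not been verified here.

```python
def dbsnp_query(gene, organism="human", **filters):
    """Build a dbSNP search string such as
    'GENE[gene] AND human[orgn] AND "value"[field]'.

    Extra keyword arguments become quoted field filters, in the order given.
    """
    terms = [f"{gene}[gene]", f"{organism}[orgn]"]
    for field, value in filters.items():
        terms.append(f'"{value}"[{field}]')
    return " AND ".join(terms)


print(dbsnp_query("SLC19A1"))
print(dbsnp_query("SLC19A1", snp_class="snp"))
print(dbsnp_query("SLC19A1", FUNC="coding nonsynonymous"))
print(dbsnp_query("SLC19A1", FUNC="coding synonymous"))
```

Quoting every filter value keeps multi-word values such as "coding nonsynonymous" intact as a single phrase.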

dbSNP queries
ADRB1[gene] AND human[orgn] = 48
ADRB1[gene] AND human[orgn] AND "snp"[snp_class] = 40
ADRB1[gene] AND human[orgn] AND "in-del"[snp_class] = 5
ADRB1[gene] AND human[orgn] AND heterozygous[snp_class] = 0

ADRB1[gene] AND human[orgn] AND mixed[snp_class] = 0
ADRB1[gene] AND human[orgn] AND microsatellite[snp_class] = 3
ADRB1[gene] AND human[orgn] AND "multinucleotide polymorphism"[snp_class] = 0
ADRB1[gene] AND human[orgn] AND "named locus"[snp_class] = 0
ADRB1[gene] AND human[orgn] AND "no variation"[snp_class] = 0

ADRB1 SNP summary
48 SNP records in total
40 true SNPs
5 insertion-deletions (in-dels)
3 microsatellites
No other types
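As a quick consistency check, the per-class counts reported for the ADRB1 queries should add up to the total returned by the unfiltered query. The sketch below uses only the numbers from the slides; the variable names are illustrative.

```python
# Counts per snp_class as reported on the slides for ADRB1 (human)
class_counts = {
    "snp": 40,
    "in-del": 5,
    "heterozygous": 0,
    "mixed": 0,
    "microsatellite": 3,
    "multinucleotide polymorphism": 0,
    "named locus": 0,
    "no variation": 0,
}
total_records = 48  # result of the unfiltered ADRB1[gene] AND human[orgn] query

# Keep only the classes that actually occur
observed = {cls: n for cls, n in class_counts.items() if n > 0}

print(observed)
print("classes account for all records:",
      sum(class_counts.values()) == total_records)
```

If the sum fell short of the total, some records would belong to a class that was not queried.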

Type of variation
SNP[snp_class]: true single nucleotide polymorphism
In-del: insertion/deletion polymorphism ('-'/'+')
Heterozygous: variation has unknown sequence composition but is observed to be heterozygous
Microsatellite: simple sequence repeat
Named: allele sequences defined by name tag instead of raw sequence, e.g., (Alu)/
No-variation: invariant region in surveyed sequence
Multiple nucleotide polymorphism: all alleles have the same length, where length > 1

Definitions
Homozygote: has two identical alleles at a particular locus (for a given gene).
Heterozygote: has two different alleles at a particular locus.
Hemizygote: has only one copy of a gene that normally occurs as a pair. Example: a male is hemizygous for genes on the X chromosome.

Definitions
Heterozygous genotype: occurs when the two alleles at a particular gene locus are different. A heterozygous genotype may include one normal allele and one mutation, or two different mutations. The latter is called a compound heterozygote.
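These definitions translate directly into a small classifier. This is a sketch for illustration only: the function name `classify_genotype`, the allele labels, and the convention of passing a single allele for a hemizygous locus are all assumptions.

```python
def classify_genotype(alleles, normal_allele=None):
    """Classify a genotype from the alleles present at one locus.

    alleles:       sequence of 1 allele (hemizygous) or 2 alleles
    normal_allele: optional label of the normal allele, used to tell a
                   compound heterozygote (two different mutations) apart
                   from an ordinary heterozygote
    """
    if len(alleles) == 1:
        return "hemizygous"
    if len(alleles) != 2:
        raise ValueError("expected one or two alleles")
    first, second = alleles
    if first == second:
        return "homozygous"
    if normal_allele is not None and normal_allele not in (first, second):
        return "compound heterozygous"
    return "heterozygous"


print(classify_genotype(["A", "A"]))                       # two identical alleles
print(classify_genotype(["A", "G"], normal_allele="A"))    # normal + mutation
print(classify_genotype(["G", "T"], normal_allele="A"))    # two different mutations
print(classify_genotype(["A"]))                            # e.g. X-linked locus in a male
```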

Heterozygous SNP vs Average Heterozygosity

More on dbSNP
An ss number is the unique ID number assigned to each submitted SNP. Once aligned and processed, submissions are clustered, and a "reference SNP cluster", or "refSNP", is created and given a unique rs ID number.
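The relationship between ss and rs numbers can be pictured as a many-to-one grouping. The toy sketch below assumes, for illustration, that submissions mapping to the same chromosome and position end up in the same cluster; it is not dbSNP's actual pipeline, and every ID and coordinate in it is made up.

```python
from collections import defaultdict


def cluster_submissions(submissions):
    """Group submitted SNPs (ss IDs) by mapped location into demo clusters.

    submissions: iterable of (ss_id, chromosome, position) tuples
    Returns a dict mapping a placeholder cluster ID to its sorted ss IDs.
    """
    by_location = defaultdict(list)
    for ss_id, chrom, pos in submissions:
        by_location[(chrom, pos)].append(ss_id)
    # Assign placeholder cluster IDs in order of genomic location
    return {
        f"rs_demo{i}": sorted(ss_ids)
        for i, (_, ss_ids) in enumerate(sorted(by_location.items()), start=1)
    }


# Hypothetical submissions: two labs report the same site, one reports another
submissions = [
    ("ss_demoA", "chr10", 1000),
    ("ss_demoB", "chr10", 1000),
    ("ss_demoC", "chr10", 2500),
]
print(cluster_submissions(submissions))
```

The point of the sketch is the cardinality: several ss records can share one rs record, but each ss record belongs to exactly one cluster.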

Drugs
Some proteins are drug targets.
Example: glimepiride (an antidiabetic) targets KCNJ11 as a blocker; other drugs act on their targets as antagonists or agonists.
Some drugs regulate the activity of proteins indirectly.
Diazoxide activates KCNJ11.
Glucocorticoid decreases expression of Kcnj11 mRNA.
Regulates binding of KCNJ11.
Some drugs are even more indirectly associated with SNPs in proteins causing sensitivities.

Haplotypes
Haplotypes are groups of linked SNPs that tend to be inherited together.
Haplotype blocks are regions of closely located SNPs that are inherited as blocks.
More generally, a haplotype is a set of closely linked genes that tends to be inherited together as a unit. A haplotype may refer to only one locus or to an entire genome.
See the HapMap project.

Haplotype block names
Sometimes different for different populations/families.
Still "in progress".
Sometimes linked via dbSNP (haplotype-tagged); also available from other variation sites.
Haplotype analysis of ABCB1 revealed 2 major haplotypes, ABCB1*1 and ABCB1*13. ABCB1*13 contains T1236, T2677T, T3435, and 3 intronic variants.