Introduction to Bioinformatics


1 Introduction to Bioinformatics
Topic 1 Introduction to Bioinformatics and Sequence Analysis

2 Session 1 Learning Outcomes: the scope of bioinformatics; the origins and growth of DNA databases; evidence of evolution from bioinformatics; example sequence analysis and displays using human Factor IX.

3 Bioinformatics: Concerns the generation, visualization, analysis, storage, and retrieval of large quantities of biological information.

4 GenBank growth: How much data are we talking about?
The amount of DNA sequence data in public databases. NCBI: US National Center for Biotechnology Information. DDBJ: DNA Data Bank of Japan. EBI: European Bioinformatics Institute. The contents of these databases are synchronized.

5 What data? The Human Genome Project. Sequencing projects now come from scientists in numerous fields of biology, medicine, agriculture, ecology, history, energy, and forensics. Here are some examples that you can explore according to your own interests:

6 Sequencing the genomes of 1000 people to identify genetic variants found in at least 1% of the human population.

7 Sequencing the genomes of 1001 strains that differ in phenotype, including adaptation to growth in a wide variety of conditions.

8 https://genome10k.soe.ucsc.edu/
An effort to sequence the genomes of 10,000 species, one from each genus.

9

10 Metagenomics database

11 Cancer genome atlas

12

13 ANNOTATION: The information describing gene and protein sequences, including the structures, similarities, functions, and predictions associated with these sequences.

14 WITNESSING EVOLUTION THROUGH BIOINFORMATICS
Random mutation in sequences is a common phenomenon. Advantageous: retained by the organism and passed on to future populations. Deleterious: quickly eliminated from the population. Neutral: may or may not be retained.

15 Recent evolutionary changes to plants & animals
About 10,000 years ago humans moved from a hunter-gatherer lifestyle to practicing agriculture and domesticating animals. Traits selected during domestication: cows: milk production; horses: speed or strength; sheep: wool quantity and quality; poultry: more breast meat; fish: speed of maturation.

16 LARGE SOURCES OF HUMAN SEQUENCE VARIATION
Sequencing the human genome for the first time was costly and slow. The cost of resequencing declined sharply because the first sequence can be used as a template. Resequencing shows considerable differences between individual people.

17 Single nucleotide polymorphisms (SNPs):
The human genome is about 3.2 billion bp. Approximately 3 million nucleotides differ between two individual genomes. The common differences are those found in about 1% or more of the population.
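
To put the two figures on this slide in proportion, the arithmetic can be sketched as follows. The constant names are illustrative, and the inputs are the rounded values from the slide, so the results are only approximate.

```python
# Rounded figures taken from the slide (approximate values).
GENOME_SIZE_BP = 3_200_000_000   # about 3.2 billion base pairs
DIFFERENCES_BP = 3_000_000       # about 3 million differing nucleotides

# Fraction of positions that differ between two individual genomes.
fraction_different = DIFFERENCES_BP / GENOME_SIZE_BP

# Average spacing between differences, in base pairs.
bp_per_difference = GENOME_SIZE_BP / DIFFERENCES_BP

print(f"Fraction of positions that differ: {fraction_different:.2%}")
print(f"Roughly one difference every {bp_per_difference:.0f} bp")
```

With these inputs, a little under 0.1% of positions differ, which is on the order of one difference per 1000 bp.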

18 Copy number variations (CNVs):
When your DNA sequence is compared with the human “standard genome”, thousands of DNA segments, ranging from 1000 to several million nucleotides in length, turn out to be present once, present in multiple copies, or absent from your genome.

19 RECENT EVOLUTIONARY CHANGES TO HUMAN POPULATIONS
Africa (50,000 years ago) → Middle East → Europe → Neanderthals. Eastern Europe → Lithuania.

20 Examples of genetic changes associated with adaptation (diet and lifestyle):
Skin color: African, Indian, Southern European, Northern European. Near the poles, paler skin allows vitamin D to be made. Near the equator, darker skin blocks damaging UV light. Sequence variation is found in a number of genes, one of which is SLC24A5.

21 Other examples: (self study)
Lactose intolerance; digestion of starch; malaria resistance and sickle cell anemia; life at high altitude.

22 DNA SEQUENCE IN DATABASES

23

24 Two types of DNA sequences are available in databases:
Genomic DNA and cDNA.

25 Genomic DNA assembly

26

27

28

29 cDNA:

30 SEQUENCE ANALYSIS AND DATABASE DISPLAY
The sequence of the mRNA for human Factor IX. Accession number: NM_000133

31 Applying two rules for describing the human Factor IX mRNA sequence:
Coding regions begin with ATG. Coding regions end with one of three terminator sequences: TAA, TGA, or TAG.

32 Coding regions are read in triplets. The remaining regions are the 5’ and 3’ untranslated regions (UTRs).
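
The two rules and the triplet reading frame can be sketched in a few lines of Python. This is a minimal illustration, not the method used to annotate NM_000133: the function name `split_mrna` and the toy sequence are hypothetical, and taking the first ATG as the start is a simplifying assumption that does not hold for every real mRNA.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}

def split_mrna(mrna):
    """Split an mRNA (written as DNA letters) into 5' UTR, coding region, 3' UTR.

    Assumption: translation starts at the first ATG and ends at the first
    in-frame stop codon. The stop codon is kept as part of the coding region.
    """
    seq = mrna.upper()
    start = seq.find("ATG")
    if start == -1:
        raise ValueError("no ATG start codon found")
    # Step through the sequence one triplet at a time, in frame with the ATG.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            end = i + 3
            return seq[:start], seq[start:end], seq[end:]
    raise ValueError("no in-frame stop codon found")

# Toy sequence (not Factor IX): 5' UTR + coding region + 3' UTR.
toy = "GGCACC" + "ATGGCTTGTTAA" + "CCTTAG"
utr5, cds, utr3 = split_mrna(toy)
print(utr5, cds, utr3)
```

Note that the TAG at the end of the toy 3' UTR is ignored, because the scan stops at the first in-frame terminator.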

33

34 Coding region triplets are translated into amino acids.
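
As a sketch of this step, the standard genetic code can be built as a lookup table and applied triplet by triplet. The names `CODON_TABLE` and `translate` are illustrative, and the input is a toy coding region rather than the Factor IX sequence.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a coding region into one-letter amino acids, stopping at a terminator."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":        # TAA, TGA, or TAG
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGCTTGTTAA"))  # prints MAC
```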

35 The protein sequence of human Factor IX (461 amino acids)

36 Pairwise alignment: Factor IX gene. A single mutation, changing a G to a T at coordinate 25531, results in hemophilia B, a severe bleeding disorder.
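
Database coordinates such as the one on this slide are counted from 1, while Python strings are indexed from 0. The sketch below shows how a single-base substitution could be applied at a 1-based coordinate. The function name `apply_point_mutation` and the toy sequence are hypothetical; the example does not reproduce the real Factor IX gene or the effect of the real mutation.

```python
def apply_point_mutation(seq, coordinate, ref, alt):
    """Return seq with the base at a 1-based coordinate changed from ref to alt."""
    index = coordinate - 1          # convert 1-based coordinate to 0-based index
    if not 0 <= index < len(seq):
        raise IndexError("coordinate lies outside the sequence")
    if seq[index] != ref:
        raise ValueError(f"expected {ref} at coordinate {coordinate}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]

# Toy sequence only: change the G at coordinate 7 to a T.
toy_gene = "ATGGCTGGATAA"
mutant = apply_point_mutation(toy_gene, 7, "G", "T")
print(toy_gene)
print(mutant)
```

Checking the reference base before substituting guards against off-by-one mistakes when converting between the two coordinate conventions.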

37 Alignment of human (Query) and chimpanzee (Subject/Subjct) Factor IX proteins
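
An alignment display of this kind places a match line between the Query and Sbjct rows. The sketch below builds such a line and a percent identity for two sequences that are already aligned to the same length. The two peptides are invented toy sequences, not the human or chimpanzee Factor IX proteins, and the match line marks identities only, without the "+" symbols that protein alignment tools typically add for similar residues.

```python
def match_line(query, subject):
    """Return a line showing identical residues; mismatches and gaps become spaces."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        q if q == s and q != "-" else " "
        for q, s in zip(query, subject)
    )

def percent_identity(query, subject):
    """Percentage of alignment columns where both sequences have the same residue."""
    line = match_line(query, subject)
    matches = sum(1 for ch in line if ch != " ")
    return 100.0 * matches / len(query)

# Toy aligned peptides that differ at one position.
query = "MKTAYIAKQR"
sbjct = "MKTAYLAKQR"

print("Query ", query)
print("      ", match_line(query, sbjct))
print("Sbjct ", sbjct)
print(f"Identity: {percent_identity(query, sbjct):.1f}%")
```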

38 Factor IX has five major domains
Signal peptide: directs the protein to the ER of liver cells, from where it is secreted into the blood; cleaved by signal peptidase. Gla domain: 12 Gla residues in the second domain. Epidermal growth factor-like domain: binds Ca++. Activation: the protein is activated by cleaving it into 2 peptides. Function: cleaves the Factor X protein in the clotting cascade pathway.

39 The entire 38000 nt gene is shown as the black arrow F9.

40 Location of the Factor IX gene on chromosome X.

41 THE END

