Biological Database Search (Group 8) 連威森 497542342 王鼎 497542108 黃智楹 497542469 張鈞淵 497542720


1 Biological Database Search (Group 8) 連威森 497542342 王鼎 497542108 黃智楹 497542469 張鈞淵 497542720

2 Outline & Division of Work Exercise 1: Finding public biological databases (黃智楹) Exercise 2: The GeneCards database (連威森) Exercise 3: Identification of disease genes (王鼎 + 張鈞淵)

3 Exercise 1: Finding public biological databases

4 2011 NAR Database Summary Paper Category List: Nucleotide Sequence Databases; RNA sequence databases; Protein sequence databases; Structure Databases; Genomics Databases (non-vertebrate); Metabolic and Signaling Pathways; Human and other Vertebrate Genomes; Human Genes and Diseases; Microarray Data and other Gene Expression Databases; Proteomics Resources; Other Molecular Biology Databases; Organelle databases; Plant databases; Immunological databases

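5 Nature of the MGI database The database chosen here is MGI (Mouse Genome Informatics). It mainly provides biological data on the laboratory mouse, including genome, pathway, and expression data, among others, to support research on human cancer and other diseases and to suggest possible improvements.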

6 About MGI MGI comprises several broad database projects: the Mouse Genome Database (MGD) Project, the Gene Expression Database (GXD) Project, the Mouse Tumor Biology (MTB) Database Project, the Gene Ontology (GO) Project at MGI, and the MouseCyc Project at MGI. These entries can be found under About MGI (see the area marked in red in the figure below).

7 About MGI Includes the gene database, gene expression, mouse tumor research, the Gene Ontology (GO) Project, and so on.

8 About MGI (contents of each database) MGD: contains gene characterization, nomenclature, mapping, gene homologies among mammals, sequence links, phenotypes, allelic variants and mutants, and strain data. GXD: contains different types of gene expression information from the mouse and provides a searchable index of published experiments on endogenous gene expression during development. MTB: contains data on the frequency, incidence, genetics, and pathology of neoplastic disorders, emphasizing data on tumors that develop characteristically in different genetically defined strains of mice.

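9 Features What sets this database apart is that its data concern the laboratory mouse. Beyond general data, it also provides tumor research data, so its ultimate purpose is still to help address human diseases such as cancer.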

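10 Suggested search approach We suggest entering the main page via Home and going directly from the icons to the database you want, which makes searching faster (see the next two figures). Clicking Home leads to a page with many icons, each of which links to one type of data.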

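11 After clicking Home ... Home Page

12 Each icon links to one type of data

13 Home Page Each icon links to one type of data

14 Genes After clicking through to Access Data, you can enter search constraints on this page and run a query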

15 Exercise 2: The GeneCards database

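16 GeneCards GeneCards is a knowledge platform that collects and presents integrated information on human genes, their products, and related diseases. It was developed jointly by the genome research center and the bioinformatics center of the Weizmann Institute in Israel. It offers a complete navigation aid system, expert-suggested hints, and spell checking, which makes it a convenient biomedical resource tool. GeneCards has currently reached version 3.0 and holds data for 67398 genes, of which 30105 have been approved by the HUGO Gene Nomenclature Committee.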

17 Searching for the HBA2 gene with GeneCards Summaries The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5'- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3'. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5' untranslated regions and the introns, but they differ significantly over the 3' untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. (provided by RefSeq)

18 Genomic Location / Function Function: Involved in oxygen transport from the lung to the various peripheral tissues. Genomic location: Start: 140,887 bp from pter; End: 141,750 bp from pter; Size: 864 bases; Orientation: plus strand
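
The size on this slide is consistent with inclusive, 1-based coordinates: 141,750 - 140,887 + 1 = 864. A minimal Python check follows; the helper name `span_length` is an illustrative assumption and not part of GeneCards.

```python
def span_length(start: int, end: int) -> int:
    """Length of a feature given inclusive, 1-based start and end coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# HBA2 coordinates as listed on the slide (bp from pter).
hba2_size = span_length(140_887, 141_750)
print(hba2_size)  # 864
```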

19 Proteins Recommended Name: Hemoglobin subunit alpha Size: 142 amino acids; 15258 Da Subunit: Heterotetramer of two alpha chains and two beta chains in adult hemoglobin A (HbA); two alpha chains and two delta chains in adult hemoglobin A2 (HbA2); two alpha chains and two epsilon chains in early embryonic hemoglobin Gower-2; two alpha chains and two gamma chains in fetal hemoglobin F (HbF) Miscellaneous: Gives blood its red color Sequence caution: Sequence=BAD97112.1; Type=Erroneous initiation

20 Exercise 3: Identification of disease genes

21 Problem A laboratory has generated an EST library from a hemochromatosis patient and wants to identify the gene(s) causing the phenotype. http://www.chelationtherapyonline.com/anatomy/images/cycle.gif

22 Hemochromatosis Hemochromatosis is the most common form of iron overload disease. Primary hemochromatosis, also called hereditary hemochromatosis, is an inherited disease. Secondary hemochromatosis is caused by anemia, alcoholism, and other disorders. Without treatment, the disease can cause the liver, heart, and pancreas to fail.

23 Outline - the steps for solving the problem 1. Compare an EST from a hemochromatosis patient to the human genome (using BLAST). 2. Identify the gene(s) aligning to the ESTs and download their sequences (using Map Viewer). 3. Identify whether the ESTs contain any known nucleotide variations (single nucleotide polymorphisms) (using dbSNP). 4. Determine whether a mutant form of the gene is known to cause a phenotype (using OMIM).

24 Step 1. Compare ESTs to the human genome (using BLAST)

25 Expressed sequence tag (EST) An expressed sequence tag or EST is a short sub-sequence of a transcribed cDNA sequence. Because these clones consist of DNA that is complementary to mRNA, the ESTs represent portions of expressed genes. http://www.nature.com/nrg/journal/v4/n6/images/nrg1085-i1.gif http://en.wikipedia.org/wiki/Expressed_sequence_tag
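
To make the "short sub-sequence" idea concrete, the sketch below locates an EST inside a longer cDNA by exact string matching. Both sequences and the function name `locate_est` are invented for illustration. Real ESTs contain sequencing errors, so in practice an alignment tool such as BLAST is used rather than exact matching.

```python
def locate_est(cdna: str, est: str) -> int:
    """Return the 0-based offset of est within cdna, or -1 if it is absent."""
    return cdna.upper().find(est.upper())


# Hypothetical sequences, invented purely for illustration.
cdna = "ATGGCCTTGACCGTAAGCTTGGAATAA"
est = "GACCGTAAGC"

offset = locate_est(cdna, est)
print(offset)  # 8
```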

26 Contig (in DNA sequencing) In shotgun DNA sequencing projects, a contig is a set of overlapping DNA segments derived from a single genetic source. A contig in this sense can be used to deduce the original DNA sequence of the source. http://en.wikipedia.org/wiki/Contig http://bioweb.uwlax.edu/GenWeb/Molecular/Seq_Anal/Genomics/Contig.gif
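
The way overlapping segments yield the original sequence can be sketched with a simple suffix-prefix merge. This is a toy illustration under strong assumptions: the fragments are hypothetical, error-free, on the same strand, and already in left-to-right order. Real assemblers must handle all of those cases. The function names are illustrative only.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Longest suffix of left equal to a prefix of right (0 if shorter than min_overlap)."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_fragments(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two overlapping fragments, keeping the shared region only once."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        raise ValueError("fragments do not overlap")
    return left + right[size:]


# Hypothetical fragments, assumed to be listed in left-to-right order.
fragments = ["ATGGCCTTGA", "CTTGACCGTA", "CGTAAGCTTG"]

contig = fragments[0]
for fragment in fragments[1:]:
    contig = merge_fragments(contig, fragment)

print(contig)  # ATGGCCTTGACCGTAAGCTTG
```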

27 Step 1. (1/4)

28 Step 1. (2/4)

29 Step 1. (3/4)

30 Step 1. (4/4)

31 Step 1. Questions On which chromosome is the EST located? On which contig? Answer: The EST is located on the Homo sapiens chromosome 6 genomic contig, GRCh37.p2 reference primary assembly. Is the EST sequence 100% identical to the genomic sequence? Answer: No. There is one nucleotide difference (G/A) between the contig NT_007592 and the EST sequence.
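
The identity question can be checked by hand with a position-by-position comparison. The sketch below assumes two gap-free, equal-length aligned sequences. The sequences are hypothetical stand-ins, not the actual EST or NT_007592, and which sequence carries G and which carries A is also an assumption.

```python
def compare_aligned(est: str, genomic: str):
    """Return (percent identity, mismatches) for two gap-free, equal-length sequences.

    Each mismatch is a tuple (1-based position, EST base, genomic base).
    """
    if not est or len(est) != len(genomic):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = [
        (position, est_base, genomic_base)
        for position, (est_base, genomic_base) in enumerate(zip(est, genomic), start=1)
        if est_base != genomic_base
    ]
    identity = 100.0 * (len(est) - len(mismatches)) / len(est)
    return identity, mismatches


# Hypothetical aligned sequences differing at a single position.
est_seq = "ATGGCCTTGACCGTAAGCTT"
genomic_seq = "ATGGCCTTGGCCGTAAGCTT"

identity, mismatches = compare_aligned(est_seq, genomic_seq)
print(identity)    # 95.0
print(mismatches)  # [(10, 'A', 'G')]
```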

32 Step 2. Identify the gene(s) aligning the ESTs and download their sequences (using Map Viewer).

33 Single-nucleotide polymorphism (SNP) A SNP is a DNA sequence variation occurring when a single nucleotide (A, T, C, or G) in the genome differs between members of a biological species or between paired chromosomes in an individual. http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism

34 Step 2. (1/4)

35 Step 2. (2/4)

36 Step 2. (3/4)

37 Step 2. (4/4)

38 Step 2. Question Note the orientation of the gene. Is the gene on the forward strand or the reverse strand?
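
Strand orientation can be illustrated with a reverse-complement check: a sequence that matches the reference only after reverse complementation lies on the reverse (minus) strand. The reference sequence and the helper names below are hypothetical, and exact matching stands in for the alignment that Map Viewer and BLAST actually report.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strand_of(query: str, reference: str) -> str:
    """Return 'plus', 'minus', or 'not found' for an exact match of query in reference."""
    if query in reference:
        return "plus"
    if reverse_complement(query) in reference:
        return "minus"
    return "not found"


# Hypothetical forward-strand reference, invented for illustration.
reference = "ATGGCCTTGACCGTAAGCTTGGAATAA"

print(strand_of("GACCGTAAGC", reference))  # plus
print(strand_of("GCTTACGGTC", reference))  # minus
```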

39 Step 3. Identify whether the ESTs contain any known nucleotide variations (using dbSNP).

40 Step 4. Determine whether a mutant form of the gene is known to cause a phenotype (using OMIM)

41 Sources http://www3.oup.co.uk/nar/database/c/ http://www.genecards.org/ http://www.ncbi.nlm.nih.gov/ http://www.ncbi.nlm.nih.gov/genome/seq/BlastGen/BlastGen.cgi?taxid=9606

