Introduction to Genome Annotation Tools

1 Introduction to Genome Annotation Tools

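2 What is a genome? The genome (in Chinese, 基因體 or 基因組) refers broadly to the complete set of genes in the cell nucleus of a given organism.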

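3 Human Genome Project: formally established in the early 1990s, led by the US National Institutes of Health (NIH) and the Department of Energy (DOE), and drawing on the resources of the molecular biology community across the United States.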

4 International Consortium Completes Human Genome Project
US National Institutes of Health (NIH), Department of Energy (DOE)

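5 Goals and applications of the Human Genome Project
Complete the sequencing of all 3 × 10^9 bases of the human genome
Develop new biotechnology
Bioinformatics
Establish experimental animal models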

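6 Why annotate a genome? (Genome annotation)
Without annotation, a genome sequence is only a string of G, A, T, and C, which is of little direct use to most biologists. Once a genome has been fully sequenced, the question biologists most urgently want answered is what this four-letter sequence actually means. Broadly speaking, genome annotation is the task of marking up every piece of meaningful information contained in the DNA sequence.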

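7 Example: SARS genome annotation. After the SARS RNA genome was sequenced, the first task was to mark the positions and functions of the genes on the sequence. This work is called genome annotation, and the sequence alignment techniques of bioinformatics can be applied to it. The annotation relies on the gene function annotations of other coronaviruses already in the databases: sequence similarity and the regions where it occurs are used to infer the gene positions of the important SARS structural proteins, such as the spike protein (S), membrane protein (M), small membrane protein (E), and nucleocapsid protein, as well as of non-structural proteins (NSPs) such as the polymerase.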
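As a rough illustration of how sequence similarity can be scored, the sketch below computes a global alignment score with the Needleman-Wunsch recurrence. It is not the method used for the SARS annotation; the function name `nw_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of sequences a and b (Needleman-Wunsch).

    The scoring parameters are illustrative assumptions, not values
    taken from any particular alignment tool.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(nw_score("GATTACA", "GATACA"))
```

A higher score against an annotated reference gene suggests, but does not prove, that the query region encodes a related protein.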

8 Central Dogma of Molecular Biology

9 Structure of an idealized gene
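[Figure: structure of an idealized gene. Labels, in 5' to 3' order: enhancer; CCAAT box; TATA box; transcription start site; 5' untranslated region; ATG initiation codon; exons separated by introns, each intron beginning with GT and ending with AG; stop codon (TAA, TGA, or TAG); 3' untranslated region; AATAAA poly(A) signal; poly(A) tail. An arrow marks the direction of transcription.]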
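The signals named in the figure can be located by simple string matching, as in the minimal sketch below. The CCAAT and AATAAA motifs come from the slide; using `TATAAA` for the TATA box is an assumption (a commonly quoted consensus), and the function name `scan_motifs` is hypothetical. Real promoters vary, so exact matching only yields candidates.

```python
import re

# Motif strings: CCAAT and AATAAA are from the slide; TATAAA is an
# assumed consensus for the TATA box.
MOTIFS = {
    "CCAAT box": "CCAAT",
    "TATA box": "TATAAA",
    "poly(A) signal": "AATAAA",
}


def scan_motifs(dna):
    """Return 0-based start positions of each motif on the given strand."""
    dna = dna.upper()
    hits = {}
    for name, motif in MOTIFS.items():
        # A lookahead pattern reports overlapping occurrences as well.
        hits[name] = [m.start() for m in re.finditer(f"(?={motif})", dna)]
    return hits


if __name__ == "__main__":
    print(scan_motifs("GGCCAATCTTATAAAGGATGCCCTGAAATAAAGG"))
```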

10 Genome annotation
Promoter: a DNA region involved in and necessary for the initiation of transcription, including the RNA polymerase binding site, the start point of transcription, and various other sites at which transcription regulatory proteins may bind.
Enhancer: a type of control site in DNA, present in the control region of many eukaryotic genes, whose regulation by specific regulatory proteins dramatically increases the level of transcription.

11 Genome annotation
Exon: a block of DNA encoding part of a polypeptide chain (or of a tRNA or rRNA).
Intron: a non-coding nucleotide sequence that is transcribed into RNA but subsequently removed by RNA splicing.
Splicing junction: the junction between an exon and an intron in a primary transcript from a eukaryotic split gene.
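To make the exon/intron definitions concrete, the sketch below removes introns from a transcript given their coordinates and checks the GT...AG boundary pattern shown in the idealized gene figure. The function name `splice` and the 0-based, end-exclusive coordinate convention are assumptions for this example; the sequence is written in the DNA alphabet, as on the slide.

```python
def splice(primary, introns):
    """Join the exons of `primary` after removing the given introns.

    `introns` is a list of (start, end) pairs, 0-based and end-exclusive
    (a convention assumed for this sketch). Each intron must begin with
    GT and end with AG.
    """
    primary = primary.upper()
    exons = []
    pos = 0
    for start, end in sorted(introns):
        intron = primary[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} does not follow the GT...AG pattern")
        exons.append(primary[pos:start])  # exon preceding this intron
        pos = end
    exons.append(primary[pos:])  # final exon
    return "".join(exons)


if __name__ == "__main__":
    print(splice("ATGGCGGTAAGTCCCAGTTCTAA", [(6, 17)]))
```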

12 Genome annotation tools
Nucleic acid sequence analysis: Nucleic Acid Conformation, Translation, ORF Finder
Gene Function Prediction

13 Nucleic acid sequence analysis: Nucleic Acid Conformation
Sequence: FGF9_5UTR, FGF9_3UTR

14 ExPASy Molecular Biology Server
Nucleic acid sequence analysis: Translation (ExPASy Molecular Biology Server). Six reading frames: +1, +2, +3, -1, -2, -3. Sequence: FGF9_coding
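The idea of six reading frames can be sketched in a few lines: three offsets on the given strand and three on its reverse complement, each translated with the standard genetic code. This is a minimal illustration, not the ExPASy implementation; the function names are hypothetical, and codons containing characters other than A, C, G, T are rendered as `X` by assumption.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return translations keyed by frame name: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```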

15 Translate - Translates a nucleotide sequence to a protein sequence

16 Transeq - Nucleotide to protein translation at EBI (EMBOSS)

17 BCM - Nucleotide to protein translation at MBS

18 Backtranslation - Translates a protein sequence back to a nucleotide sequence
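Because the genetic code is degenerate, back-translation has no single answer. The sketch below returns one possible nucleotide sequence and counts how many sequences would encode the same protein. Picking the first codon in table order is an arbitrary assumption made here; actual tools typically consult a codon usage table for the target organism. The function names are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (and '*' for stop) to all codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def back_translate(protein):
    """Return one nucleotide sequence encoding `protein` (first codon listed)."""
    return "".join(CODONS_FOR[aa][0] for aa in protein.upper())


def count_encodings(protein):
    """Number of distinct nucleotide sequences that encode `protein`."""
    return prod(len(CODONS_FOR[aa]) for aa in protein.upper())


if __name__ == "__main__":
    print(back_translate("MLS"), count_encodings("MLS"))
```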

19 Nucleic acid sequence analysis: ORF finder. Open reading frames in the six frames +1, +2, +3, -1, -2, -3. Sequence: FGF9_minigene
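A minimal ORF search over the six frames might look like the sketch below: in each frame it reports every stretch that starts at an ATG and runs to the first in-frame stop codon. This is only an approximation of what a tool such as ORF Finder does; the function name, the `min_codons` threshold, and the coordinate convention (0-based, on the strand being read) are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(dna, min_codons=2):
    """Return (frame, start, end, sequence) for each ATG...stop ORF.

    Coordinates are 0-based and end-exclusive on the strand being read,
    so minus-frame positions refer to the reverse complement.
    """
    dna = dna.upper()
    strands = {"+": dna, "-": dna.translate(COMPLEMENT)[::-1]}
    orfs = []
    for sign, seq in strands.items():
        for offset in range(3):
            start = None
            for i in range(offset, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    # Length counts the stop codon as well.
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((f"{sign}{offset + 1}", start, i + 3, seq[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGGCCTGATT"):
        print(orf)
```

An ATG with no downstream in-frame stop codon is not reported, which is a design choice of this sketch rather than a rule of ORF finding.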

20 ORF Finder : open reading frames finder

21 GENSCAN: Identification of complete gene structures in genomic DNA

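22 Homework 1
1. Retrieve the sequence NM_ from GenBank, and write down the gene name together with the positions and lengths of its 5' UTR, 3' UTR, and coding sequence.
2. Use online tools to predict whether the 5' and 3' UTRs of this sequence are likely to form secondary structure.
3. Use online tools to predict the length of the translation of this sequence, and write out its amino acid sequence.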

23 Introduction to Proteome Annotation Tools

24 Genomics vs. Proteomics
[Figure: Genome ("Genomics"): DNA, mRNA. Proteome ("Proteomics"): proteins, cell functions.]

25 Proteome and Proteomics
The proteome is the expressed protein complement of a genome.
Proteomics is functional genomics at the protein level.

26 Generalized proteomics schema
Body fluids, tissues, cells; protein resolution (2D electrophoresis); protein identification (mass spectrometry); protein characterization (interaction assay)

27 Proteomics: Structural and Functional
Functional: differential display, protein identification, protein characterization, protein-protein interaction, pathway ID/elucidation, protein function
Structural assay: X-ray crystallization, NMR, molecular modeling
Functional assay: 2D PAGE, 3D IMAGE

28 Proteome annotation: Biochemical Property, Signal Prediction, Pattern Finding, ...

29 Protein sequence analysis at the ExPASy Server

30 Protein sequence analysis at CBS (The Center for Biological Sequence Analysis)

31 PDB - An Information Portal to Biological Macromolecular Structures

32 TBI – Taiwan Bioinformatic Institute

33 Protein sequence analysis
Pattern and profile searches: InterProScan, SMART, MOTIF
Post-translational modification prediction: SignalP, SecretomeP, NetPhosK
Topology prediction: PSORT, CELLO
Structure prediction: MMDB
Others: EMBL WWW Gateway to Isoelectric Point Service
Sequence: FGF9_peptide
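As an example of a biochemical property that such services report, the sketch below estimates an isoelectric point by finding the pH at which the computed net charge is zero. The pKa values are approximate and chosen only for illustration; different tools use different pKa sets and will give somewhat different answers. The function names are hypothetical.

```python
# Approximate pKa values, assumed for illustration only.
POSITIVE_PKA = {"Nterm": 7.5, "K": 10.0, "R": 12.0, "H": 5.98}
NEGATIVE_PKA = {"Cterm": 3.55, "D": 4.05, "E": 4.45, "C": 9.0, "Y": 10.0}


def net_charge(protein, ph):
    """Net charge of the peptide at the given pH (Henderson-Hasselbalch)."""
    counts = {"Nterm": 1, "Cterm": 1}  # one free terminus of each kind
    for aa in protein.upper():
        counts[aa] = counts.get(aa, 0) + 1
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        charge += counts.get(group, 0) / (1 + 10 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        charge -= counts.get(group, 0) / (1 + 10 ** (pka - ph))
    return charge


def isoelectric_point(protein, tol=1e-4):
    """Bisection on pH 0-14; net charge decreases as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(protein, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    print(round(isoelectric_point("ACDEKR"), 2))
```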

34 Homework 2
Using the protein sequence from Homework 1 and online tools, answer the following:
Isoelectric point
Prediction of protein localization sites in cells
Domains or motifs
Homologous proteins in other species

