Introduction to Systems Biology. 詹鎮熊, Postdoctoral Researcher, Department of Computer Science and Information Engineering, National Taiwan University.


1 Introduction to Systems Biology  詹鎮熊, Postdoctoral Researcher, Department of Computer Science and Information Engineering, National Taiwan University

2 What is a system?

3 Features of a system  Components  Interrelated components  Boundary  Purpose  Environment  Interfaces  Input  Output  Constraints

4 Examples of Systems

5 Life's Complexity Pyramid  Components  Building blocks  Functional modules  System  Z. N. Oltvai and A.-L. Barabási, Science 298, 763 (2002)

6 Ecosystem  Community  Population  Organism  Organ system  Organism  Tissue  Cell  Molecule  Atom  Biosphere

7 Organism – Cell – Organelle – Molecules  The human body is made up of trillions of cells  Each cell has: 46 chromosomes  2 meters of DNA  3 billion bases (A, T, G, C)  20,000 to 30,000 genes

8 The Central Dogma

9 Bottom-up  From genes to phenotypes  If the genome can be fully sequenced, can we resolve all the secrets hidden in the DNA?

10 The -omics (-ome) era

11 Genomics (Genome)  Human Genome Project  Other Genome Projects Mouse Fly Dog Worm Bacteria … Most recently … Cat

12 Human genome project  Sequence the whole genome of several individuals  Competition between Celera and NIH  Took over a decade  Draft in 2000, completed in 2003

13 The next stage: HapMap  HapMap is a catalog of common genetic variants that occur in human beings  It describes: what these variants are where they occur in our DNA and how they are distributed among people within populations and among populations in different parts of the world

14 Single Nucleotide Polymorphism (SNP)

15 Personalized genome  James Watson (454 Life Sciences)  Craig Venter (Venter Institute)  23andMe (backed by Google, focus on social/family relationships)  Navigenics (focus on medical conditions)  Personal Genome Project (PGP, Harvard)

16 Proteomics (Proteome)  Categorize all proteins (and their relationships) in a temporally and spatially confined system Identities of these proteins Quantities Variants of these proteins  Alternative splicing forms  Post-translational modifications (Phosphorylation, Methylation, Ubiquitination, … )

17 Proteomics

18 Mass Spectrometry

19

20 Fluorescence Resonance Energy Transfer (FRET)  Co-localization (interaction) between protein-protein and protein-DNA pairs

21 Transcriptome  Identify all transcription factors (TF) functioning in a specific temporally and spatially confined system  Identify all genes regulated by specific TFs  ChIP-chip  TRANSFAC database

22 Chromatin Immunoprecipitation (ChIP)  A well-established procedure used to investigate interactions between DNA-binding proteins and DNA in vivo

23 ChIP-chip

24 Transcription Factor Binding Motifs

25 Interactome  Catalog all interactions (protein-protein or protein-DNA) within an organism Yeast Two-Hybrid Co-immunoprecipitation (co-IP) Mass Spectrometry FRET …

26 Yeast Two-hybrid

27 Metabolomics (Metabolome)  "Systematic study of the unique chemical fingerprints that specific cellular processes leave behind"  Collection of all metabolites in a biological organism

28 Analytical methods for metabolomics  Separation Gas Chromatography (GC) High performance liquid chromatography (HPLC) Capillary electrophoresis (CE)  Detection Mass Spectrometry Nuclear magnetic resonance (NMR) spectroscopy

29 Glycomics  Oligosaccharide  Glycoprotein/Proteoglycan Proteins attached to oligosaccharides Important to cell recognition  Cancer targeting  Influenza

30 Model Organisms  Yeast (S. cerevisiae)  Worm (C. elegans)  Fruit Fly (D. melanogaster)  Mouse (M. musculus)

31 Monitoring the System  High-throughput monitoring of gene expression Microarray Protein microarray GC/HPLC/MS/tandem MS  Phenotype/Disease

32 Microarray

33 Protein Microarray

34 Phenotypes  Lethality Synthetic lethal  Developmental  Morphological  Behavioral  Diseases

35 Genotypes and Phenotypes genotype + environment → phenotype genotype + environment + random-variation → phenotype

36 Importance of Computer Models  Interactions in the cell are too complex to handle with pen and paper  With high-throughput tools, biology shifts from descriptive to predictive  Computers are required to store, process, assemble, and model all high-throughput data into networks

37 Types of Computer Models  Chemical Kinetic Model Defined by concentrations of different molecular species in the cell Represented with a number of equations Some processes may be stochastic  Simplified Discrete Circuit Network with nodes and arrows Nodes represent quantity or other attributes Directed edges represent effect of nodes on other nodes
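A minimal sketch of the two model types on this slide, under stated assumptions: the kinetic model tracks the concentrations of one mRNA and one protein with simple Euler integration, and the discrete circuit is a three-node Boolean network. The rate constants, the gene names A/B/C, and the wiring are hypothetical and chosen only for illustration.

```python
def simulate_kinetic(k_tx=2.0, k_deg_m=0.2, k_tl=1.0, k_deg_p=0.1,
                     dt=0.01, steps=10000):
    """Euler integration of a two-species chemical kinetic model.

    dm/dt = k_tx - k_deg_m * m        (mRNA: constant synthesis, first-order decay)
    dp/dt = k_tl * m - k_deg_p * p    (protein: made from mRNA, first-order decay)
    All rate constants are illustrative assumptions.
    """
    m, p = 0.0, 0.0
    for _ in range(steps):
        dm = k_tx - k_deg_m * m
        dp = k_tl * m - k_deg_p * p
        m += dm * dt
        p += dp * dt
    return m, p


def step_boolean(state, rules):
    """Synchronous update of a Boolean network: every node reads the old state."""
    return {node: rule(state) for node, rule in rules.items()}


# Hypothetical circuit: A activates B, B activates C, C represses A.
RULES = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
```

Calling `simulate_kinetic()` lets the concentrations relax toward the steady state implied by the rates (mRNA near k_tx / k_deg_m), while repeatedly applying `step_boolean` to a starting state such as `{"A": True, "B": False, "C": False}` cycles the circuit through its on/off states. The kinetic form keeps quantitative detail; the Boolean form keeps only who affects whom.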

38 Different Mathematical Formulations  Differential Equations Linear (ordinary) Partial Stochastic  S-Systems Power-law formulation Captures complicated dynamics Parameter estimation is computationally intensive
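To make the power-law formulation concrete, here is a small sketch of an S-system, in which each variable follows dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The function names, the Euler integrator, and the two-variable example parameters are assumptions made for illustration, not part of the original slides.

```python
def s_system_derivatives(x, alpha, g, beta, h):
    """Right-hand side of an S-system.

    dX_i/dt = alpha[i] * prod_j x[j]**g[i][j] - beta[i] * prod_j x[j]**h[i][j]
    """
    n = len(x)
    derivs = []
    for i in range(n):
        production = alpha[i]
        degradation = beta[i]
        for j in range(n):
            production *= x[j] ** g[i][j]
            degradation *= x[j] ** h[i][j]
        derivs.append(production - degradation)
    return derivs


def simulate_s_system(x0, alpha, g, beta, h, dt=0.001, steps=20000):
    """Plain Euler integration; concentrations are assumed to stay positive."""
    x = list(x0)
    for _ in range(steps):
        dx = s_system_derivatives(x, alpha, g, beta, h)
        x = [xi + dxi * dt for xi, dxi in zip(x, dx)]
    return x


# Hypothetical two-variable system: X2 inhibits production of X1,
# X1 drives production of X2, and both decay linearly.
ALPHA = [2.0, 1.0]
G = [[0.0, -0.5], [1.0, 0.0]]
BETA = [1.0, 1.0]
H = [[1.0, 0.0], [0.0, 1.0]]
```

Even this tiny system has 2 rate constants and 4 exponents per variable, which hints at why parameter estimation becomes expensive as the number of variables grows.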

39

40 Model details  Selection of genes, gene products, and other molecules to be included  Cellular compartments: nucleus, Golgi, or other organelles  Too much detail may introduce more noise  A minimal model able to predict system properties (mRNA level, growth rate, etc.) is sufficient

41 Construct Model from Global Patterns  Microarray gene expression patterns: Up-regulated/down-regulated  Gene expression profiles under different conditions: Tumor/normal, cell cycle, drug treatment, …  Methods: Bayesian inference Machine learning (clustering, classification) …
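The up-regulated/down-regulated call on this slide can be sketched as a log2 fold-change threshold between two conditions. The gene names, intensity values, and the threshold of 1.0 (a two-fold change) are hypothetical; real analyses would also normalize the arrays and test for statistical significance.

```python
import math


def classify_genes(tumor, normal, threshold=1.0):
    """Label each gene by its log2 fold change (tumor vs. normal).

    `tumor` and `normal` map gene name -> positive expression intensity.
    The threshold of 1.0 (two-fold change) is an illustrative assumption.
    """
    labels = {}
    for gene in tumor:
        log2_fc = math.log2(tumor[gene] / normal[gene])
        if log2_fc >= threshold:
            labels[gene] = "up"
        elif log2_fc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical intensities for three genes.
TUMOR = {"GENE_A": 800.0, "GENE_B": 100.0, "GENE_C": 210.0}
NORMAL = {"GENE_A": 200.0, "GENE_B": 400.0, "GENE_C": 200.0}
```

The resulting labels are the kind of global pattern that clustering or Bayesian methods would then take as input.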

42 Framework for Systems Biology

43 Tools for Simulation  E-cell  Cell Illustrator  Virtual Cell  Standardizing efforts: BioJake SBML (systems biology markup language) Facilitate the exchange of models

44 E-Cell System  Software for constructing object models equivalent to a cell system or a part of the cell system  Employs the Structured Variable-Process model (previously called the Substance-Reactor model, or SRM)  Objects: Variables, Processes, Systems

45 Cell Illustrator

46 Computational Databases  Protein-protein interaction DIP, BIND, MIPS, MINT, IntAct, POINT, BioGRID  Protein-DNA interaction TRANSFAC, SCPD  Metabolic pathways KEGG, EcoCyc, WIT, Reactome  Gene Expression GEO, ArrayExpress, GNF, NCI60, commercial  Gene Ontology

47 Network Biology  The entities within a system form intertwined complex networks Genes Proteins Metabolites External factors …

48 Gene (Transcription) Regulatory Network

49 Protein-Protein Interaction Network

50 Metabolic Pathways

51 KEGG metabolic pathway

52

53

54 Gene Ontology  The Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism  Annotations Molecular Function Cellular Components Biological Processes

55

56

57 Challenges of Databases  Provide information other than simple entries (e.g. PPI with functional annotation or binding strength)  Data maintenance – update  Integration with other databases

58 Applications Target identification and drug discovery

59 Disease Gene Identification  From networks  From literature  From microarray  Quantitative Trait Loci (QTL)  Genome-Wide Association Study (GWAS)  Endeavour  Systems biology (integrated) approaches?

60 Drug Targets

61 Gene identification from network  Nodes Hubs  Edges (interactions) Define critical genes from connected edges? Shortest path, alternative path? Weights  Metabolic pathways as well
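As a rough sketch of the node and edge ideas on this slide, the code below finds hubs by degree and a shortest path by breadth-first search on a toy undirected interaction network. The protein names P1 to P6, the edge list, and the degree cutoff of 3 are hypothetical.

```python
from collections import deque

# Hypothetical toy interaction network (undirected edges).
EDGES = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5"), ("P5", "P6")]


def build_graph(edges):
    """Adjacency sets for an undirected graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def hubs(graph, min_degree=3):
    """Nodes with at least `min_degree` neighbours (cutoff is an assumption)."""
    return sorted(n for n, nbrs in graph.items() if len(nbrs) >= min_degree)


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nbr in sorted(graph[node]):
            if nbr not in visited:
                visited.add(nbr)
                queue.append(path + [nbr])
    return None
```

Edge weights, mentioned on the slide, are not modelled here; with weighted edges a method such as Dijkstra's algorithm would replace the breadth-first search.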

62 Gene identification from literature  OMIM (Online Mendelian Inheritance in Man) Single gene disease Complex disease  Defects identified, target for drugs and cures

63 Gene identification from microarray  Up-regulated genes  Down-regulated genes  Too many?  Cluster of genes  Regulator (transcription factors) for the important clusters

64 Quantitative Trait Loci (QTL)  Region of DNA that is associated with a particular phenotypic trait  The phenotypic characteristic varies in degree and can be attributed to the interaction of two or more genes  A QTL may not be the gene itself, but a DNA sequence that is closely linked with the target gene

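65 Quantitative Trait Loci  LOD (logarithm of odds) score: log10 of the ratio between the likelihood of the data if a locus is linked to the trait (phenotype) and the likelihood if it is unlinked  Expression QTL (eQTL): combine microarray gene expression data with QTL mapping (identify transcription regulatory elements as QTL)  cM: centimorgan, the genetic distance over which the expected recombination frequency is 1%; in humans, roughly 1,000,000 bases on average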
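A small worked example of a LOD score, assuming the simplest textbook setting: two-point linkage with phase-known meioses, where k of n meioses are recombinant and theta is the assumed recombination fraction. The function name and the example counts are hypothetical; QTL mapping software uses more elaborate likelihoods.

```python
import math


def lod_score(n_meioses, n_recombinants, theta):
    """Two-point LOD score for phase-known data (simplifying assumption).

    LOD(theta) = log10( (1 - theta)^(n - k) * theta^k / 0.5^n )
    where k recombinants were observed among n informative meioses.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    n, k = n_meioses, n_recombinants
    # Work in log space to avoid underflow for large n.
    log_linked = (n - k) * math.log10(1.0 - theta) + k * math.log10(theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked
```

For instance, `lod_score(10, 1, 0.1)` compares "linked at a 10% recombination fraction" against "unlinked" for 1 recombinant in 10 meioses; at theta = 0.5 the two hypotheses coincide and the score is zero. A LOD of 3 or more is the conventional threshold for declaring linkage.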

66

67

68 Genome-Wide Association Studies (GWAS)  Genome-wide association studies (GWAS) rely on newly available research tools and technologies to rapidly and cost-effectively analyze genetic differences between people with specific illnesses, such as diabetes or heart disease, compared to healthy individuals.

69

70 Keys to success of GWAS  Population Resource Large sample size required for significant detection  SNP Map and Genotyping High-throughput genotyping  IT and Analysis Tool Storage and analysis (1000 microarrays for billions of data points)
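The point about sample size can be illustrated with a basic allelic association test: a Pearson chi-square on a 2x2 table of allele counts (cases vs. controls), compared against a Bonferroni-corrected threshold. This is a simplified sketch; the allele counts and the figure of 1,000,000 tested SNPs are hypothetical, and real studies also adjust for population structure and other confounders.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). For 1 df the upper-tail probability equals
    erfc(sqrt(chi2 / 2)).
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = numerator / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Same allele frequencies (30% in cases vs. 25% in controls), two sample sizes.
SMALL_STUDY = (60, 140, 50, 150)          # 200 alleles per group
LARGE_STUDY = (6000, 14000, 5000, 15000)  # 20,000 alleles per group
```

With identical allele frequencies, `allelic_chi_square(*SMALL_STUDY)` yields a p-value far above any useful threshold, while `allelic_chi_square(*LARGE_STUDY)` falls below `bonferroni_threshold(0.05, 1_000_000)`, which is why large cohorts are needed once hundreds of thousands of SNPs are tested at once.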

71 What have GWAS found?  Genes associated with risks of: type 2 diabetes Parkinson's disease heart disorders Obesity prostate cancer …

72 An integrated approach: Endeavour  Genes can obtain various scores regarding their association with disease  These scores include those mentioned above  The various ranks of these genes according to different scores are determined  With a consensus scoring scheme (data fusion), the resulting prediction accuracy could be improved
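The idea of ranking genes separately by each score and then fusing the ranks can be sketched as follows. Note the assumption: this uses a plain mean of ranks as a stand-in for the consensus step, which is simpler than the order-statistics fusion used by Endeavour itself. The gene names, data sources, and score values are hypothetical.

```python
def rank_genes(scores):
    """Map gene -> rank, where rank 1 is the highest score (ties broken by name)."""
    ordered = sorted(scores, key=lambda gene: (-scores[gene], gene))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def fuse_by_mean_rank(score_tables):
    """Order genes by their average rank across all score tables.

    Assumes every table scores the same set of genes. Mean rank is a
    simplified substitute for a proper consensus scheme.
    """
    ranks = [rank_genes(table) for table in score_tables]
    genes = list(ranks[0])
    mean_rank = {g: sum(r[g] for r in ranks) / len(ranks) for g in genes}
    return sorted(genes, key=lambda g: (mean_rank[g], g))


# Hypothetical per-source scores for three candidate genes.
NETWORK = {"G1": 0.9, "G2": 0.5, "G3": 0.1}
LITERATURE = {"G1": 0.2, "G2": 0.8, "G3": 0.1}
EXPRESSION = {"G1": 0.7, "G2": 0.6, "G3": 0.9}
```

Working on ranks rather than raw scores sidesteps the fact that scores from different data sources are not on a common scale.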

73 Aerts, et al. (2006)

74

75 Toward personalized medicine

76 Targeted therapy  Using antibodies against biomarkers (of cancer or infectious agents)  Requires prior knowledge of patient response (through lab tests or biochips)

77 Gene therapy  Replace or inhibit genes in patients  Vectors Adeno-associated virus (AAV)  Silencing the disease gene RNAi microRNA

78 RNA interference

79 Putting It All Together

80 Network of Networks  Gene regulation (protein-DNA)  Protein-protein interaction  Metabolic pathway  How … ?

81 Questions?

