Core 2: Bioinformatics CBio-Berkeley. Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction.


1 Core 2: Bioinformatics CBio-Berkeley

2 Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress Discussion

3 Berkeley group: genomics Formerly BDGP (Berkeley Drosophila Genome Project) Informatics –Genome sequencing, analysis and annotation –Genomic application development –Database development FlyBase Generic Model Organism Database

4 Apollo

5 GBrowse

6 In-situ expression database

7 Genomics applications GadFly –analysis and annotation database –pipeline software BOP –computational analysis integration CGL –Comparative Genomics Software Library

8 SO and SOFA Sequence Ontology for Feature Annotation Ontology for genomics –Sequence feature classes: mRNA, intron, UTR, sequence_variant, … –Sequence feature relations exon part_of transcript polypeptide derives_from mRNA

9 Chado Model organism relational database schema –FlyBase, GMOD Modules –sequence annotations –expression –map –genotype –phenotype –ontology/cv –… Generic schema –Uses ontologies for strong typing

10 Berkeley group: GO Gene Ontology - Informatics –Database, web portal –Ontology editing tools –Ontology QC and integration –OBO

11 OBO-Edit (formerly DAG-Edit)

12 AmiGO and GO Database

13 Obol Problem: large ontologies of composite terms are difficult to manage Solution: partial automation (reasoners) Requires logical definitions –how do we obtain them? Solution: Obol –Parses logical definitions from class names –Logical definitions can be reasoned over to detect errors and support automation –Integrates OBO ontologies
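As a toy illustration of parsing logical definitions from class names, the sketch below splits a name such as "regulation of cell growth" into a genus, a relation and a differentiating class. The rules, the `parse_class_name` function and the choice of relations are assumptions made for illustration only; they are not Obol's actual grammar or implementation.

```python
import re
from typing import NamedTuple, Optional


class LogicalDefinition(NamedTuple):
    genus: str        # the more general class the term specialises
    relation: str     # relation linking the term to the differentiating class
    differentia: str  # the differentiating class, taken from the name


# Hypothetical (pattern, genus, relation) rules, for illustration only.
RULES = [
    (re.compile(r"^regulation of (.+)$"), "regulation", "regulates"),
    (re.compile(r"^(.+) development$"), "development", "has_participant"),
]


def parse_class_name(name: str) -> Optional[LogicalDefinition]:
    """Return a genus-differentia definition if the name matches a rule."""
    normalised = name.strip().lower()
    for pattern, genus, relation in RULES:
        match = pattern.match(normalised)
        if match:
            return LogicalDefinition(genus, relation, match.group(1))
    return None  # no rule applies; the name is left unparsed


print(parse_class_name("regulation of cell growth"))
print(parse_class_name("wing development"))
```

Names that match no rule return `None`, which is one simple way to surface terms that still need a hand-written definition.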

14 OBO Relations Ontology Common relations used across ontologies must mean the same thing –is_a –part_of –derives_from –has_participant –… OBO relations ontology provides precise definitions –defines class-level relations in terms of their instances http://obo.sourceforge.net/relationship –collaboration with core5, Manchester & others
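As an example of defining a class-level relation in terms of instances, class-level part_of can be written roughly as below. This is a simplified sketch: time indices are omitted, and the symbols (inst for instantiation, bold part_of for the instance-level relation) are notation chosen here rather than taken from the slides.

```latex
C \;\mathit{part\_of}\; D
\;\iff\;
\forall c \,\bigl( \mathrm{inst}(c, C) \rightarrow
\exists d \,( \mathrm{inst}(d, D) \wedge c \;\mathbf{part\_of}\; d ) \bigr)
```

Read aloud: every instance of C is part of some instance of D. The converse need not hold, so D may have instances with no C part.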

15 Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress Open questions

16

17 Core 2 specific aims Aims 1. Capture and describe data 2. Reconcile annotation and ontology changes 3. Store, view and compare annotations 4. Link disease genes First round –phenotypes: Fly and Zebrafish –HIV clinical trial data

18 Aim 1: Capture and describe data Phenotype data capture –OBO-Edit plug-ins –Combine classes from multiple ontologies PATO, anatomical ontologies –NLP tools? Clinical trial data capture –what are the appropriate tools?

19 Aim 1: Capture and describe data Zebrafish, fly –PATO: Phenotype and trait ontology phenotype ‘primitives’ –‘Entity-Attribute-Value’ model –Phenotype ontologies –Genetic data –Orthologs Clinical trial data –generic instance model –what are the appropriate ontologies here?

20 PATO An ontology of attributes and attribute values –e.g. morphology, structure, placement Current status of PATO? –needs work to conform to sound ontology principles definitions formalisation of attributes –working with core3-cambridge (Gkoutos) and core5 (Neuhaus)

21 Phenotype annotation Entity-attribute structured annotations –Entity term; PATO term
brain FBbt:00005095 ; fused PATO:0000642
gut MA:0000917 ; dysplastic PATO:0000640
tail fin ZDB:020702-16 ; ventralized PATO:0000636
kidney ZDB:020702-16 ; hypertrophied PATO:0000636
midface ZDB:020702-16 ; hypoplastic PATO:0000636
Pre-composed phenotype terms –Mammalian Phenotype Ontology
“increased activated B-cell number” MPO:0000319
“pink fur hue” MPO:0000374
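A structured entity-attribute annotation can be sketched as a pair of ontology terms. The `Term` and `PhenotypeAnnotation` classes below are hypothetical names introduced for illustration; only the identifiers and labels come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    id: str     # ontology identifier, e.g. "PATO:0000642"
    label: str  # human-readable name


@dataclass(frozen=True)
class PhenotypeAnnotation:
    entity: Term     # term from an anatomical (or other entity) ontology
    attribute: Term  # PATO term describing the entity

    def __str__(self) -> str:
        # Render in the "entity ; PATO term" layout used on the slide.
        return (f"{self.entity.label} {self.entity.id} ; "
                f"{self.attribute.label} {self.attribute.id}")


fused_brain = PhenotypeAnnotation(
    entity=Term("FBbt:00005095", "brain"),
    attribute=Term("PATO:0000642", "fused"),
)
print(fused_brain)
```

Because the two halves are kept separate, annotations can be grouped either by entity or by attribute, which a pre-composed term such as “pink fur hue” does not allow without further decomposition.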

22 Example (Fly)
Entity | Attribute | Value | Background/Environment
embryo | viability | lethal | Scer\GAL4[hs.PB]
dorsal cuticle | shape | abnormal |
… | … | … | …
wing vein L2 | shape | branched | temperature sensitive
Gene: Jra Allele: Jra[bZIP.Scer\UAS] Allele Description: defects in head and dorsal cuticle. Scer\GAL4[hs.PB] induces….. A481G bZIP

23 Genotype-Phenotype datamodel Need to model complex genotypes Environment Phenotype –E-A-V is not enough Relational attributes Complex phenotypes Measurements and assays –CSHL 2005 Phenotype meeting

24 Aim 2: Reconcile annotation and ontology changes Ontology evolution can trigger annotation changes Identifiers –all classes and annotations will have stable identifiers –Cores 1 and 2 to decide on identifier model LSID URNs OntoTrack
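One way stable identifiers help here: when a class is obsoleted, annotations that point at it can be found and either redirected or flagged. The sketch below assumes a simple mapping from obsolete identifiers to replacements; the `EX:` identifiers, the data layout and the `reconcile` function are all hypothetical, and the identifier model itself is left open by the slide.

```python
from typing import Dict, List, Optional, Tuple

Annotation = Tuple[str, str]  # (annotation ID, ontology term ID)

# Hypothetical record of ontology changes: obsolete term ID -> replacement ID,
# or None when no replacement has been designated.
obsoleted: Dict[str, Optional[str]] = {
    "EX:0000001": "EX:0000010",
    "EX:0000002": None,
}

# Hypothetical annotations.
annotations: List[Annotation] = [
    ("ann1", "EX:0000001"),
    ("ann2", "EX:0000002"),
    ("ann3", "EX:0000003"),
]


def reconcile(
    annotations: List[Annotation],
    obsoleted: Dict[str, Optional[str]],
) -> Tuple[List[Annotation], List[Annotation]]:
    """Split annotations into current ones and ones needing manual review."""
    current: List[Annotation] = []
    needs_review: List[Annotation] = []
    for ann_id, term_id in annotations:
        if term_id not in obsoleted:
            current.append((ann_id, term_id))             # term unchanged
        elif obsoleted[term_id] is not None:
            current.append((ann_id, obsoleted[term_id]))  # follow replacement
        else:
            needs_review.append((ann_id, term_id))        # no replacement known
    return current, needs_review


current, needs_review = reconcile(annotations, obsoleted)
print(current)
print(needs_review)
```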

25 Aim 3: Store, view and compare annotations OBO: ontologies OBD: data annotated using ontologies –genotype-phenotype –clinical trials –others

26 OBD: A Database for OBO Data warehouse –collected from MODs and other sources Annotation versioning Generic data model –Any data typed by OBO classes can be stored Specific annotation data views –Clinical trial data view –Phenotype data view Chado-compliant Entity-attribute-(value) model

27

28 Key technologies ‘Semantic Web’ database technology –ontology-aware ontologies are part of meta-model higher level query languages –SPARQL, SeRQL, … tool interoperability –Protégé-OWL, Jena, … –SQL compatibility optionally layered on relational model –Standards? Maturity? Many implementations –Sesame, Kowari, …
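To give a feel for what a higher-level, triple-based query expresses, the sketch below implements a single triple-pattern match over an in-memory list. It is a pure-Python stand-in, not a call into Sesame, Kowari or Jena; the `match` function and the sample triples (borrowed from statements elsewhere in this deck) are assumptions for illustration.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Toy triple store.
TRIPLES: List[Triple] = [
    ("wing development", "has_quality", "abnormal"),
    ("flight", "has_quality", "intermittent"),
    ("exon", "part_of", "transcript"),
]


def match(
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a variable."""
    for triple in TRIPLES:
        if all(q is None or q == v for q, v in zip((s, p, o), triple)):
            yield triple


# Roughly what a SPARQL pattern such as  { ?s :has_quality ?o }  asks for.
for subject, _, value in match(p="has_quality"):
    print(subject, "->", value)
```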

29 Aim 3: Store, view and compare annotations Browsing –AmiGO-2 Advanced visualization –work with core 1 (University of Victoria)

30 Comparing annotations process vs state –regulatory processes: acidification of midgut has_quality reduced rate midgut has_quality low acidity development vs behavior –wing development has_quality abnormal –flight has_quality intermittent granularity (scale) –chemical vs molecular vs cell vs tissue vs anatomical part

31 Integrating anatomical ontologies Annotations should be comparable between species –phenotype annotations are composed of anatomical terms Multiple species-centric anatomical ontologies –Problem: how do we compare across species? –XSPAN (Bard et al): creating mappings –Core 1: ontology mappings

32 Aim 4: Linking disease genes Homology data –Orthologous genes Genomic data –SNPs, sequence variants Ontologies –Disease ontologies –Semantic similarity –Ontology integration Obol, XSPAN
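The slide does not say which semantic similarity measure is intended. As one simple possibility, the sketch below scores two terms by the overlap of their ancestor sets (a Jaccard index); the toy ontology, `PARENTS`, `ancestors` and `similarity` are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical parent links (e.g. is_a) in a toy ontology.
PARENTS: Dict[str, Set[str]] = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B"},
    "E": {"B", "C"},
}


def ancestors(term: str) -> Set[str]:
    """Return the term itself plus everything reachable via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def similarity(t1: str, t2: str) -> float:
    """Jaccard index of the two terms' ancestor sets (1.0 means identical)."""
    a, b = ancestors(t1), ancestors(t2)
    return len(a & b) / len(a | b)


print(similarity("D", "E"))
```

Terms that share most of their ancestry score close to 1, so two phenotype or disease terms from different sources can be ranked by how closely they sit in an integrated ontology.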

33 Linking disease to phenotype Relationship of phenotype to diseases and disorders –essentialist –statistical Disease ontologies –OBO disease ontology (Northwestern) –EVOC disease ontology (EVOC) –Others Disease ontology workshop (core 5) –November 2006

34 Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress Open questions

35 Software lifecycle Software is developed in phases Different phases require interaction with different cores Iterative “Agile” methodology –fast cycles –involve ‘customer’ (core3) at all phases

36

37 Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress

38 Meetings –CSHL November 2005 Phenotype ontology meeting Phenotype tools workshop –Berkeley, UVic, Core 3 OBO-Edit complex class plug-in Phenotype browser prototype Genotype-Phenotype datamodel

39 OBO-Edit complex class plug-in Combinatorial composition of classes Current use-cases: –plant anatomical structures –integrating GO and OBO-Cell Ideal for phenotype classes –extend to make ‘phenotype’ plug-in

40 OBD Progress Genotype-Phenotype data model defined Prototype implemented; evaluating technologies

41 Phenotype browser Experimental branch of AmiGO code Allows browsing and querying of combinatorial phenotype annotations Experimental dataset Demo –http://yuri.lbl.gov/amigo/obd

