Published by Mabel Wilcox. Modified over 9 years ago.
1
Core 2: Bioinformatics CBio-Berkeley
2
Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress Discussion
3
Berkeley group: genomics Formerly BDGP (Berkeley Drosophila Genome Project) Informatics –Genome sequencing, analysis and annotation –Genomic application development –Database development FlyBase Generic Model Organism Database
4
Apollo
5
GBrowse
6
In-situ expression database
7
Genomics applications GadFly –analysis and annotation database –pipeline software BOP –computational analysis integration CGL –Comparative Genomics Software Library
8
SO and SOFA Sequence Ontology for Feature Annotation Ontology for genomics –Sequence feature classes: mRNA, intron, UTR, sequence_variant, … –Sequence feature relations exon part_of transcript polypeptide derives_from mRNA
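The typed relations on this slide can be pictured as a small labelled graph. The following is a minimal sketch, not SO's actual file format or API: the `EDGES` list and the `objects_of` helper are hypothetical names, and only the two relations quoted on the slide are included.

```python
# Each edge is (subject class, relation, object class).
# Only the two statements from the slide are encoded here.
EDGES = [
    ("exon", "part_of", "transcript"),
    ("polypeptide", "derives_from", "mRNA"),
]


def objects_of(term, relation, edges=EDGES):
    """Return every class that `term` points to via `relation`."""
    return [obj for subj, rel, obj in edges if subj == term and rel == relation]


if __name__ == "__main__":
    print(objects_of("exon", "part_of"))
    print(objects_of("polypeptide", "derives_from"))
```

Keeping the relation as an explicit field is what lets a tool treat part_of and derives_from differently instead of as generic links.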
9
Chado Model organism relational database schema –FlyBase, GMOD Modules –sequence annotations –expression –map –genotype –phenotype –ontology/cv –… Generic schema –Uses ontologies for strong typing
10
Berkeley group: GO Gene Ontology - Informatics –Database, web portal –Ontology editing tools –Ontology QC and integration –OBO
11
OBO-Edit (formerly DAG-Edit)
12
AmiGO and GO Database
13
Obol Problem: large ontologies of composite terms are difficult to manage Solution: partial automation (reasoners) Requires logical definitions –how do we obtain them? Solution: Obol –Parses logical definitions from class names –Logical definitions can be reasoned over to detect errors and support automation –Integrates OBO ontologies
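To make "parses logical definitions from class names" concrete, here is a toy sketch of the idea. It is not Obol's grammar or implementation: the single "X of Y" pattern, the `LogicalDef` type, and the relation name `acts_on` are all assumptions chosen for illustration.

```python
import re
from typing import NamedTuple, Optional


class LogicalDef(NamedTuple):
    genus: str        # the more general class
    relation: str     # hypothetical relation name, for illustration only
    differentia: str  # the class that distinguishes this term


# Toy pattern: "<genus> of <differentia>", split at the first " of ".
PATTERN = re.compile(r"^(?P<genus>.+?) of (?P<diff>.+)$")


def parse_name(name: str) -> Optional[LogicalDef]:
    """Split a composite class name into a genus-differentia definition."""
    match = PATTERN.match(name)
    if match is None:
        return None  # name is not composite under this toy pattern
    return LogicalDef(match.group("genus"), "acts_on", match.group("diff"))


if __name__ == "__main__":
    print(parse_name("regulation of cell growth"))
    print(parse_name("mitochondrion"))
```

A real parser would need many patterns and would resolve each fragment to an ontology class rather than keeping it as a string.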
14
OBO Relations Ontology Common relations used across ontologies must mean the same thing –is_a –part_of –derives_from –has_participant –… OBO relations ontology provides precise definitions –defines class-level relations in terms of their instances http://obo.sourceforge.net/relationship –collaboration with core5, Manchester & others
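As an illustration of defining a class-level relation in terms of instances, class-level part_of can be sketched as follows. This is a simplified, time-free rendering; the symbol inst(x, A), read "x is an instance of class A", is notation introduced here, and the published definitions are more careful (for example about time).

```latex
\[
A \;\textit{part\_of}\; B
\;\equiv\;
\forall x \,\bigl(\mathrm{inst}(x, A) \rightarrow
\exists y \,(\mathrm{inst}(y, B) \wedge x \;\textit{part\_of}\; y)\bigr)
\]
```

Read this way, "exon part_of transcript" says that every exon instance is part of some transcript instance, not that every transcript has an exon.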
15
Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress Open questions
17
Core 2 specific aims Aims 1. Capture and describe data 2. Reconcile annotation and ontology changes 3. Store, view and compare annotations 4. Link disease genes First round –phenotypes: Fly and Zebrafish –HIV clinical trial data
18
Aim 1: Capture and describe data Phenotype data capture –OBO-Edit plug-ins –Combine classes from multiple ontologies PATO, anatomical ontologies –NLP tools? Clinical trial data capture –what are the appropriate tools?
19
Aim 1: Capture and describe data Zebrafish, fly –PATO: Phenotype and trait ontology phenotype ‘primitives’ –‘Entity-Attribute-Value’ model –Phenotype ontologies –Genetic data –Orthologs Clinical trial data –generic instance model –what are the appropriate ontologies here?
20
PATO An ontology of attributes and attribute values –e.g. morphology, structure, placement Current status of PATO? –needs work to conform to sound ontology principles definitions formalisation of attributes –working with core3-cambridge (Gkoutos) and core5 (Neuhaus)
21
Phenotype annotation
Entity-attribute structured annotations
–Entity term; PATO term
brain FBbt:00005095 ; fused PATO:0000642
gut MA:0000917 ; dysplastic PATO:0000640
tail fin ZDB:020702-16 ; ventralized PATO:0000636
kidney ZDB:020702-16 ; hypertrophied PATO:0000636
midface ZDB:020702-16 ; hypoplastic PATO:0000636
Pre-composed phenotype terms
–Mammalian Phenotype Ontology
“increased activated B-cell number” MPO:0000319
“pink fur hue” MPO:0000374
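The "entity term ; PATO term" lines above have a regular shape, so they can be read into a small record type. This is a minimal sketch: `PhenotypeAnnotation` and `parse_annotation` are hypothetical names, and the input format is assumed to be exactly "label ID ; label ID" as written on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeAnnotation:
    entity_label: str
    entity_id: str
    quality_label: str
    quality_id: str


def parse_annotation(line: str) -> PhenotypeAnnotation:
    """Parse 'entity label ID ; quality label ID' into a record."""
    entity_part, quality_part = (part.strip() for part in line.split(";"))
    # The ID is the last whitespace-separated token; labels may contain spaces.
    entity_label, entity_id = entity_part.rsplit(" ", 1)
    quality_label, quality_id = quality_part.rsplit(" ", 1)
    return PhenotypeAnnotation(entity_label, entity_id, quality_label, quality_id)


if __name__ == "__main__":
    print(parse_annotation("brain FBbt:00005095 ; fused PATO:0000642"))
    print(parse_annotation("tail fin ZDB:020702-16 ; ventralized PATO:0000636"))
```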
22
Example (Fly)
Entity | Attribute | Value | Background/Environment
embryo | viability | lethal | Scer\GAL4[hs.PB]
dorsal cuticle | shape | abnormal |
… | … | … | …
wing vein L2 | shape | branched | temperature sensitive
Gene: Jra
Allele: Jra[bZIP.Scer\UAS]
Allele Description: defects in head and dorsal cuticle. Scer\GAL4[hs.PB] induces…..
[Figure labels: A481G, bZIP]
23
Genotype-Phenotype datamodel Need to model complex genotypes Environment Phenotype –E-A-V is not enough Relational attributes Complex phenotypes Measurements and assays –CSHL 2005 Phenotype meeting
24
Aim 2: Reconcile annotation and ontology changes Ontology evolution can trigger annotation changes Identifiers –all classes and annotations will have stable identifiers –Cores 1 and 2 to decide on identifier model LSID URNs OntoTrack
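One way to picture how an ontology change triggers annotation changes: when a class is obsoleted, annotations that use it either follow a declared replacement or are queued for a curator. The sketch below is an assumption-laden illustration, not the project's design; `reconcile`, the `replaced_by` mapping, and every `EX:` identifier are hypothetical.

```python
def reconcile(annotations, replaced_by):
    """Sort annotations into kept, auto-migrated, and needs-review lists.

    annotations: list of (subject, term_id) pairs.
    replaced_by: maps each obsoleted term_id to its replacement,
                 or to None when no replacement was declared.
    """
    kept, migrated, review = [], [], []
    for subject, term in annotations:
        if term not in replaced_by:
            kept.append((subject, term))                   # term still current
        elif replaced_by[term] is not None:
            migrated.append((subject, replaced_by[term]))  # follow replacement
        else:
            review.append((subject, term))                 # curator must decide
    return kept, migrated, review


if __name__ == "__main__":
    annotations = [("gene1", "EX:0000001"), ("gene2", "EX:0000002"), ("gene3", "EX:0000003")]
    replaced_by = {"EX:0000002": "EX:0000010", "EX:0000003": None}
    print(reconcile(annotations, replaced_by))
```

This only works if identifiers are stable, which is why the slide pairs ontology evolution with the choice of an identifier model.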
25
Aim 3: Store, view and compare annotations OBO: ontologies OBD: data annotated using ontologies –genotype-phenotype –clinical trials –others
26
OBD: A Database for OBO Data warehouse –collected from MODs and other sources Annotation versioning Generic data model –Any data typed by OBO classes can be stored Specific annotation data views –Clinical trial data view –Phenotype data view Chado-compliant Entity-attribute-(value) model
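A generic data model in which "any data typed by OBO classes can be stored" might be pictured as a list of uniform statements plus a pattern-matching query. This is a sketch under stated assumptions and does not describe OBD's schema: the `Statement` type, the `match` helper, the relation names `has_entity` and `has_quality`, and the `phenotype:1` and `source` values are invented here. The two ontology IDs are the brain and fused examples from the phenotype annotation slide.

```python
from typing import List, NamedTuple, Optional


class Statement(NamedTuple):
    subject: str   # thing being described
    relation: str  # hypothetical relation name
    target: str    # usually an ontology class ID
    source: str    # where the statement was collected from


STATEMENTS = [
    Statement("phenotype:1", "has_entity", "FBbt:00005095", "example-source"),
    Statement("phenotype:1", "has_quality", "PATO:0000642", "example-source"),
]


def match(statements: List[Statement],
          subject: Optional[str] = None,
          relation: Optional[str] = None,
          target: Optional[str] = None) -> List[Statement]:
    """Return statements matching every field that is not None."""
    return [
        s for s in statements
        if (subject is None or s.subject == subject)
        and (relation is None or s.relation == relation)
        and (target is None or s.target == target)
    ]


if __name__ == "__main__":
    print(match(STATEMENTS, relation="has_quality"))
```

Specific views, such as a phenotype view, would then be queries over this one generic shape rather than separate tables.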
28
Key technologies ‘Semantic Web’ database technology –ontology-aware ontologies are part of meta-model higher level query languages –SPARQL, SeRQL, … tool interoperability –Protégé-OWL, Jena, … –SQL compatibility optionally layered on relational model –Standards? Maturity? Many implementations –Sesame, Kowari, …
29
Aim 3: Store, view and compare annotations Browsing –AmiGO-2 Advanced visualization –work with core 1 (University of Victoria)
30
Comparing annotations process vs state –regulatory processes: acidification of midgut has_quality reduced rate midgut has_quality low acidity development vs behavior –wing development has_quality abnormal –flight has_quality intermittent granularity (scale) –chemical vs molecular vs cell vs tissue vs anatomical part
31
Integrating anatomical ontologies Annotations should be comparable between species –phenotype annotations are composed of anatomical terms Multiple species-centric anatomical ontologies –Problem: how do we compare across species? –XSPAN (Bard et al): creating mappings –Core 1: ontology mappings
32
Aim 4: Linking disease genes Homology data –Orthologous genes Genomic data –SNPs, sequence variants Ontologies –Disease ontologies –Semantic similarity –Ontology integration Obol, XSPAN
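The slide lists semantic similarity without naming a measure. One simple, commonly used choice is the Jaccard overlap of two terms' ancestor sets; the sketch below uses that as a stand-in, not as the method the group adopted. The toy `IS_A` hierarchy and all `EX:` identifiers are hypothetical.

```python
# Toy is_a hierarchy: child -> list of parents (all IDs are made up).
IS_A = {
    "EX:renal_hypertrophy": ["EX:kidney_disease"],
    "EX:kidney_disease": ["EX:disease"],
    "EX:heart_disease": ["EX:disease"],
}


def ancestors_inclusive(term, is_a=IS_A):
    """Return the term together with everything reachable via is_a."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def jaccard_similarity(a, b, is_a=IS_A):
    """Shared ancestors divided by all ancestors of either term."""
    set_a = ancestors_inclusive(a, is_a)
    set_b = ancestors_inclusive(b, is_a)
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    print(jaccard_similarity("EX:renal_hypertrophy", "EX:kidney_disease"))
    print(jaccard_similarity("EX:renal_hypertrophy", "EX:heart_disease"))
```

Terms that share more of their ancestry score closer to 1, which gives a rough way to rank candidate disease terms against a phenotype term.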
33
Linking disease to phenotype Relationship of phenotype to diseases and disorders –essentialist –statistical Disease ontologies –OBO disease ontology (Northwestern) –EVOC disease ontology (EVOC) –Others Disease ontology workshop (core 5) –November 2006
34
Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress Open questions
35
Software lifecycle Software is developed in phases Different phases require interaction with different cores Iterative “Agile” methodology –fast cycles –involve ‘customer’ (core3) at all phases
37
Outline Berkeley group background Core 2 first round –what: aims, milestones –how: software lifecycle, interaction w/ other cores Current progress
38
Meetings –CSHL November 2005 Phenotype ontology meeting Phenotype tools workshop –Berkeley, UVic, Core 3 OBO-Edit complex class plug-in Phenotype browser prototype Genotype-Phenotype datamodel
39
OBO-Edit complex class plug-in Combinatorial composition of classes Current use-cases: –plant anatomical structures –integrating GO and OBO-Cell Ideal for phenotype classes –extend to make ‘phenotype’ plug-in
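Combinatorial composition can be pictured as a cross product of classes drawn from two ontologies. The sketch below illustrates the idea only and says nothing about how the OBO-Edit plug-in is implemented: `compose` and the generated label format are assumptions, while the entity and quality terms are taken from the phenotype annotation slide.

```python
from itertools import product

# (ID, label) pairs taken from the phenotype annotation examples.
ENTITIES = [("FBbt:00005095", "brain"), ("MA:0000917", "gut")]
QUALITIES = [("PATO:0000642", "fused"), ("PATO:0000640", "dysplastic")]


def compose(entities, qualities):
    """Pair every entity with every quality to form candidate classes."""
    return [
        {
            "entity": entity_id,
            "quality": quality_id,
            "label": f"{quality_label} {entity_label}",  # assumed label format
        }
        for (entity_id, entity_label), (quality_id, quality_label)
        in product(entities, qualities)
    ]


if __name__ == "__main__":
    for candidate in compose(ENTITIES, QUALITIES):
        print(candidate)
```

Not every combination is biologically meaningful, so in practice a curator or a reasoner would filter the candidates.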
40
OBD Progress Genotype-Phenotype data model defined Prototype implemented Evaluating technologies
41
Phenotype browser Experimental branch of AmiGO code Allows browsing and querying of combinatorial phenotype annotations Experimental dataset Demo –http://yuri.lbl.gov/amigo/obd