Published by Ahmad Woodley. Modified over 10 years ago.
1
An Introduction to DAS
Andy Jenkinson, EBI
2
Summary of Topics
- What is Data Integration?
- Problems in Data Integration
- An architectural overview of DAS
- Brief History of DAS
- The DAS Game
- Explore some examples
3
What is Data Integration?
4
All of These Are Data Integration
- Reading some papers so you can write a report
- Exploring some database websites so you can learn about a topic
- Downloading some data from different databases so you can analyse it
- Downloading some data from different databases so you can combine it with your own
6
Data Integration
“Automatic” data integration
- pulling in data from different locations
- processing it
- creating a resource derived from the data
- done via computers, not humans
- e.g. creating/updating a data warehouse
[Figure labels: Warehouse, PDB, Ensembl, UniProt]
7
Warehouse model
13
Distributed Annotation System
- Distributed
- Federated
- Client-Server architecture
- RESTful web services
14
Federation
Not federation:
- Web services
- SOAP
- REST
- GFF, etc.
Are federation:
- PSICQUIC
- BioMoby
- Semantic Web (sort of)
15
Warehouse model
16
DAS model
17
Architectural Overview
18
DAS
- Databases are all different
  - DAS is a uniform facet of a database: always the same
- Databases evolve
  - when the database changes, DAS stays the same
- Databases age
  - DAS data comes directly from the provider so is always fresh
- Databases are big
  - DAS uses real-time targeted queries
19
History
- Developed circa 1999 for sharing genome annotations
- Expanded 2004 onwards
  - more data types
  - better metadata
  - addition of Registry
- Generally pre-computed data used for visual display
20
To Summarise…
The Distributed Annotation System is…
- A network of biological data sources
- An example of federation
- A collection of REST web services
The DAS Protocol is…
- An integration platform
- A client-server protocol
- An agreed standard
21
DAS Architecture
- A client asks for data from many servers
  - HTTP requests
  - identically structured URLs, the same parameters
- Each server behaves in the same way
  - pre-defined set of behaviours
  - e.g. provide a sequence, provide annotations of a sequence
- Each server provides different data in the same format
  - DAS-XML
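The fan-out described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two server addresses are hypothetical, `query_all` and `fake_fetch` are names invented for this sketch, and the HTTP call is replaced by a stand-in function so the example runs offline. The point is that the client sends the same command and the same parameters to every server and only the base address changes.

```python
from typing import Callable, Dict, List


def query_all(sources: List[str], command: str, segment: str,
              fetch: Callable[[str], str]) -> Dict[str, str]:
    """Send the same command and segment to every source and collect the raw responses."""
    responses = {}
    for base in sources:
        # Identically structured URL for every server; only the base differs.
        url = f"{base.rstrip('/')}/{command}?segment={segment}"
        responses[base] = fetch(url)
    return responses


def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP GET (assumption: a real client would fetch DAS-XML here)."""
    return f"<response for {url}>"


# Hypothetical DAS sources, used only for illustration.
sources = [
    "http://example.org/das/source_a",
    "http://example.org/das/source_b",
]

results = query_all(sources, "features", "P15056", fake_fetch)
for base, body in results.items():
    print(base, "->", body)
```

Passing the fetch function in as a parameter keeps the sketch independent of any particular HTTP library; a real client would substitute its own request function.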
22
DAS Concepts
- Reference sequence
  - e.g. “chromosome X” or “NT_025741”
- Annotation (a.k.a. Feature)
  - information attached to a location in a sequence
  - e.g. “substitution at residue 326 of BRCA1”
- Non-positional annotation
  - information attached to the reference as a whole
  - e.g. “found in basolateral plasma membrane”
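One way to picture the difference between a positional and a non-positional annotation is a small record type. The sketch below is not the DAS data model; the class name, field names, and the `EXAMPLE_PROTEIN` reference are assumptions made for illustration. An annotation with a start and end is attached to a location, and one without is attached to the reference as a whole.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    reference: str                 # the reference sequence the annotation belongs to
    note: str                      # free-text description of the annotation
    start: Optional[int] = None    # None means "no location" in this sketch
    end: Optional[int] = None

    @property
    def is_positional(self) -> bool:
        """True when the annotation is attached to a location in the sequence."""
        return self.start is not None and self.end is not None


# Positional: attached to residue 326 of the reference.
substitution = Annotation("BRCA1", "substitution", start=326, end=326)

# Non-positional: attached to the reference as a whole (hypothetical reference name).
localisation = Annotation("EXAMPLE_PROTEIN", "found in basolateral plasma membrane")

print(substitution.is_positional, localisation.is_positional)
```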
23
DAS Concepts
- Reference source
  - server that provides “core” reference data
  - e.g. GRCh37 sequence data
- Annotation source
- DAS Registry
  - catalogue of DAS sources and their capabilities
- Client
  - Software that combines the data together
24
Architectural Overview
25
The Game!
26
And a real example http://www.ebi.ac.uk/dasty/
27
The DAS Protocol Defines 3 constraints
28
The DAS Protocol
Defines 3 constraints
- transport layer: HTTP
Data transport
- Standard HTTP
- Includes compression
- Some additional headers, e.g. to indicate DAS version
29
The DAS Protocol
Defines 3 constraints
- transport layer: HTTP
- query format: constrained REST URLs
Well-defined query URLs
- A client can issue a command
  http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=P15056
  - site prefix: http://www.ebi.ac.uk/das-srv/uniprot
  - das: das
  - source: uniprot
  - command: sequence
  - arguments: segment=P15056
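The URL anatomy on this slide can be checked mechanically. The following Python sketch splits the example URL into the parts the slide labels. The function name `split_das_url` is invented here, and treating the last `/das/` path segment as the boundary between site prefix and source is an assumption of this sketch rather than a rule taken from the slides.

```python
from urllib.parse import urlsplit, parse_qs


def split_das_url(url: str) -> dict:
    """Split a DAS query URL into site prefix, source, command and arguments."""
    parts = urlsplit(url)
    marker = "/das/"
    # Assumption: the last "/das/" in the path separates the prefix from the source.
    idx = parts.path.rfind(marker)
    if idx == -1:
        raise ValueError("no '/das/' segment found in URL path")
    source, _, command = parts.path[idx + len(marker):].partition("/")
    return {
        "site_prefix": f"{parts.scheme}://{parts.netloc}{parts.path[:idx]}",
        "source": source,
        "command": command,
        "arguments": parse_qs(parts.query),
    }


example = "http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=P15056"
pieces = split_das_url(example)
for name, value in pieces.items():
    print(f"{name}: {value}")
```

Note that `parse_qs` returns each argument as a list, because a query string may repeat the same parameter.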
30
The DAS Protocol
Defines 3 constraints
- transport layer: HTTP
- query format: constrained REST URLs
- response format: constrained XML
XML format
- server responds with a simple XML document
  MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADPAIPEEVWNIKQMIKLTQ...
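Because every server answers in the same constrained XML, a client needs only one parser per command. The Python sketch below reads a sequence response with the standard library. The sample document is an assumption: the element names `DASSEQUENCE` and `SEQUENCE`, the attribute names, and the truncated 20-residue sequence are illustrative and are not copied from the slide or from a live server.

```python
import xml.etree.ElementTree as ET

# Illustrative response only: element names, attributes and the truncated
# sequence are assumptions made for this sketch.
sample = """
<DASSEQUENCE>
  <SEQUENCE id="P15056" start="1" stop="20" version="1">
    MAALSGGGGGGAEPGQALFN
  </SEQUENCE>
</DASSEQUENCE>
"""


def parse_sequences(xml_text: str) -> list:
    """Return (id, residues) pairs for every SEQUENCE element in the document."""
    root = ET.fromstring(xml_text.strip())
    parsed = []
    for element in root.iter("SEQUENCE"):
        # Remove the whitespace and line breaks used to lay out the XML.
        residues = "".join((element.text or "").split())
        parsed.append((element.get("id"), residues))
    return parsed


parsed = parse_sequences(sample)
for seq_id, residues in parsed:
    print(seq_id, len(residues), residues)
```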
31
Try these
curl 'http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=P15056'
curl 'http://www.dasregistry.org/das/sources?capability=features&authority=UniProt' > /tmp/sources
more /tmp/sources
curl 'http://das.cbs.dtu.dk:9000/das/netphos/features?segment=P15056'
32
Tools to help
DAS client libraries:
- Bio::Das::Lite (Perl)
- JDAS, BioJava (Java)
- JsDAS (Javascript)
DAS servers:
- ProServer (Perl)
- MyDas, Dazzle (Java)
Example clients:
- Ensembl, Dalliance, MyKaryoView (genomic)
- Dasty, Pfam, SPICE, Jalview (protein)
33
Image Credits
- Flickr/muir.ceardach
- Flickr/Horia Varlan
- Flickr/Alessandro Pinna
- Fotopedia/Jean-Marie Hullot
- listicles.com/?p=3485
- heartattackgrill.com
- Olivier H. Beauchesne