Attie Bioinformatics Server Redesign


1 Attie Bioinformatics Server Redesign
Andrew Broman & Brian Yandell
October 2010

2 Attie Bioinformatics Server Redesign
overview
- scanone tool web page for biologists
  - Attie islet mRNA data
  - minimal functionality for now
- system architecture
  - scanone and other tools as services
  - authentication and authorization
  - access to and use of databases
  - interface to R and other analysis engines
  - collaboration with off-site scientists

3 Attie Bioinformatics Server Redesign
big picture
[diagram: user, services page, scanone service, and security (authenticate, authorize)]

4 Attie Bioinformatics Server Redesign
security modules
- authenticate: who is this?
- authorize: what can this person do/see?
- off-the-shelf tools
  - well tested
  - popular
  - easy to implement
- authenticate & authorize are service units
  - model-view-controller (MVC) architecture
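The two questions on this slide can be sketched as two separate checks. This is a minimal illustration only: the user table, the permission table, the function names, and the status strings are all hypothetical, and the slides call for off-the-shelf tools rather than hand-written code like this.

```python
import hmac

# Hypothetical tables; a real deployment would use an off-the-shelf tool.
USERS = {"analyst": "example-password"}
PERMISSIONS = {"analyst": {("UCLA", "scanone")}}  # (dataset, task) pairs


def authenticate(username, password):
    """Who is this? True only if the credentials match a known user."""
    expected = USERS.get(username)
    if expected is None:
        return False
    # Constant-time comparison on bytes avoids leaking timing information.
    return hmac.compare_digest(expected.encode(), password.encode())


def authorize(username, dataset, task):
    """What can this person do/see? True if the pair is permitted."""
    return (dataset, task) in PERMISSIONS.get(username, set())


def request_service(username, password, dataset, task):
    """Run both checks in order before handing off to a service."""
    if not authenticate(username, password):
        return "401 not authenticated"
    if not authorize(username, dataset, task):
        return "403 not authorized"
    return f"200 running {task} on {dataset}"
```

Keeping the two checks as separate functions mirrors the slide's point that authenticate and authorize are distinct service units.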

5 Attie Bioinformatics Server Redesign
scanone service unit
[diagram: selection form (Dataset: UCLA; Tissue: liver; Task: scanone) with plot and summary outputs, MongoDB, and the R analysis engine]

6 Attie Bioinformatics Server Redesign
service philosophy
- each service is self-contained, modular
  - designed by the IT team or provided by other locations
- each service can contain other services
- use URLs to find data, code, etc.
  - could be anywhere
  - allows expansion to multiple centers
- REpresentational State Transfer (REST)
  - key design idiom
  - stateless client-server architecture
  - web services are resources identified by URLs
  - RESTful Web Services (2007) by Richardson and Ruby
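To make "resources identified by URLs" concrete, here is a small sketch of how a URL alone could name a service, a dataset, and its parameters. The URL layout (`/services/<service>/<dataset>?tissue=...`) and the host names are assumptions made for illustration; the slides do not specify a URL scheme.

```python
from urllib.parse import urlparse, parse_qs


def resolve(url):
    """Map a resource URL to the service, dataset, and parameters it names.

    Assumes a hypothetical layout: /services/<service>/<dataset>?key=value
    """
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 3 or segments[0] != "services":
        raise ValueError(f"not a service resource: {url}")
    _, service, dataset = segments
    # parse_qs returns lists of values; keep the first value for each key.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "service": service,
            "dataset": dataset, "params": params}


# The same resource served from two hypothetical centers.
main = resolve("http://main.example.org/services/scanone/UCLA?tissue=liver")
mirror = resolve("http://mirror.example.org/services/scanone/UCLA?tissue=liver")
```

Because each request carries everything needed to identify the resource, the server keeps no per-client state, and only the host differs between the two centers.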

7 benefits of service architecture
- decoupled/modular
  - easier to create new tools
  - easier to test & modify isolated parts of the system
- scalable
  - any isolated service can be moved to a new server
  - no need to alter the rest of the system
  - enables remote mirrors to be transparent to user
- understandable
  - architecture easy to grasp
  - isolated services easy to understand
  - easy to maintain/extend individual services

8 Attie Bioinformatics Server Redesign
MongoDB
- document-oriented database system
  - not relational (MySQL, Oracle, …)
  - DB is collection of documents
  - each document can have user-specified parts
- accommodates huge data files
  - quick access to desired components
- no schemas required: flexible data formats
  - GenePattern has only two data formats
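The document model can be illustrated without a database at all. The sketch below uses plain Python dictionaries as a stand-in for a collection; it is not MongoDB's API, and the field names and values are made up for the example.

```python
# A "collection" is just a list of documents; documents need not share fields.
# All field names and values here are invented for illustration.
collection = [
    {"dataset": "UCLA", "tissue": "liver", "type": "scanone",
     "lod": [1.2, 3.4, 0.8]},
    {"dataset": "UCLA", "tissue": "islet", "type": "annotation",
     "probe": "probe-1", "gene": "gene-A"},
]


def find(docs, **criteria):
    """Return documents whose fields match every given criterion."""
    return [doc for doc in docs
            if all(key in doc and doc[key] == value
                   for key, value in criteria.items())]


def project(doc, *fields):
    """Pull out only the desired components of a document."""
    return {name: doc[name] for name in fields if name in doc}


liver = find(collection, dataset="UCLA", tissue="liver")
lod_only = project(liver[0], "lod")
```

The two documents carry different fields, which is the "no schemas required" point; `project` stands in for fetching only the components a tool needs from a large document.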

9 Attie Bioinformatics Server Redesign
data and metadata
- metadata describes what data are
  - provenance/history of data creation/acquisition
  - type of data, size of data, other characteristics
  - small “flat” file
  - template to design new data
- data can be raw or processed
  - large data object
- save time/space by passing metadata to R
  - access data only as needed
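One way to read "pass metadata, access data only as needed" is a small handle that always carries the metadata but defers loading the large object. The class name, the loader callback, and the metadata fields below are hypothetical, and Python stands in for the R side described on the slide.

```python
class DataHandle:
    """Small metadata record plus a deferred loader for the large object."""

    def __init__(self, metadata, loader):
        self.metadata = metadata   # cheap to pass around
        self._loader = loader      # called only when the data are needed
        self._data = None
        self.load_count = 0        # tracked here only to show the laziness

    @property
    def data(self):
        if self._data is None:
            self._data = self._loader()
            self.load_count += 1
        return self._data


def load_expression_values():
    # Stand-in for reading a large object from the database.
    return list(range(1000))


handle = DataHandle(
    {"owner": "example-owner", "tissue": "islet", "type": "mRNA", "size": 1000},
    load_expression_values,
)
size_from_metadata = handle.metadata["size"]  # no load has happened yet
```

A tool that only needs to know the type or size of a dataset never triggers the load; the first access to `handle.data` loads the object once and reuses it afterward.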

10 scanone service MVC components
[diagram: view (Dataset: UCLA; Tissue: liver; Task: scanone; plot, summary), controller, and model (MongoDB, R analysis engine), connected by arrows labeled "modify view", "pass details", and "return objects"]

11 Attie Bioinformatics Server Redesign
project timeline

task                        duration   completion date
scanone HTML mockup         now        summer
    islet scanone results
database integration        1 wk       15 oct
    merge annotation, values (mRNA)
    expect speed, organization benefits
multiple tissues            2 wks      1 nov
    tissues plus clinical
MVC service architecture    2 wks      15 nov
security integration        1 wk       1 dec
    authenticate, authorize services
    communication between services
multiple services           4 wks      1 jan
    means, hotspots, qtlnet
multiple projects           4 wks      1 feb
    UCLA, Florida, yeast

12 MVC service architecture plans
- view (what you see)
  - extract from HTML mockup
  - modular redesign
- controller (how information is passed)
  - extract Ruby-on-Rails from HTML mockup
  - add communication features (RESTful API)
- model (how tasks are performed)
  - little modification needed
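The three roles can be sketched as three functions for a single scanone request. This is an illustration in Python, not the Ruby-on-Rails code the slide refers to; the function names, the form fields, and the placeholder summary text are assumptions, and the model here does no real analysis.

```python
def model(dataset, tissue, task):
    """Model: how the task is performed.

    Stand-in only; the real model would query MongoDB and call R.
    """
    return {"dataset": dataset, "tissue": tissue, "task": task,
            "summary": f"{task} summary for {dataset}/{tissue}"}


def view(result):
    """View: what the user sees, built from the objects the model returns."""
    header = (f"Dataset: {result['dataset']} | Tissue: {result['tissue']} | "
              f"Task: {result['task']}")
    return header + "\n" + result["summary"]


def controller(form):
    """Controller: passes details from the form to the model, then updates the view."""
    result = model(form["dataset"], form["tissue"], form["task"])
    return view(result)


page = controller({"dataset": "UCLA", "tissue": "liver", "task": "scanone"})
```

Since the view and the model never call each other directly, either can be replaced (for example, a new page layout or a different analysis engine) by changing only the controller's wiring.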

13 analyst pipeline integration
[diagram: R analysis engine and MongoDB, exchanging raw data and processed data through get and put]

14 analyst pipeline details
- R engine
  - analysis libraries housed at github.org
  - CHTC cluster offloads major workload
- get/put functions
  - automate with periodic revision
- standardized metadata sheet
  - owner, project, tissue, etc.
  - dropdown menu of data service type
    - scanone, peaks, causal
    - negotiated by IT team
- each data type will have MVC service architecture(s)
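A rough sketch of the get/put idea follows, with an in-memory dictionary standing in for MongoDB and Python standing in for R. The required field names, the `service_type` key, the storage keys, and the `process` helper are hypothetical; only the service types (scanone, peaks, causal) and the sheet fields (owner, project, tissue) come from the slide.

```python
REQUIRED_FIELDS = ("owner", "project", "tissue", "service_type")
SERVICE_TYPES = ("scanone", "peaks", "causal")  # the dropdown choices

_store = {}  # in-memory stand-in for the database


def put(key, data, metadata):
    """Store data with its metadata sheet, rejecting incomplete sheets."""
    missing = [name for name in REQUIRED_FIELDS if name not in metadata]
    if missing:
        raise ValueError(f"metadata sheet is missing: {missing}")
    if metadata["service_type"] not in SERVICE_TYPES:
        raise ValueError(f"unknown service type: {metadata['service_type']}")
    _store[key] = {"metadata": dict(metadata), "data": data}


def get(key):
    """Fetch the stored entry (metadata and data) for a key."""
    return _store[key]


def process(raw_key, processed_key, transform):
    """Pipeline step: get raw data, transform it, put the processed result."""
    entry = get(raw_key)
    put(processed_key, transform(entry["data"]), entry["metadata"])


sheet = {"owner": "example-owner", "project": "example-project",
         "tissue": "islet", "service_type": "scanone"}
put("raw/islet", [1.0, 2.0, 3.0], sheet)
process("raw/islet", "processed/islet", lambda values: [v * 2 for v in values])
```

Validating the sheet inside `put` means every stored object carries the same standardized metadata, which is what lets later services find and interpret it.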

15 Attie Bioinformatics Server Redesign
future enhancements
- ideas not fully formed yet
- use sockets to connect objects
  - save on I/O: don’t pass large objects, just open them
  - avoid CSV, PDF, PNG unless user wants them
    - plot, summary, result tables from R operations
- model passes socket information to tools
  - connect R and MongoDB database directly
- controller passes socket info from model to view
  - display results by opening RESTful resource

