1 BridgeDb Martijn van Iersel BiGCaT Maastricht

2 The 7 Virtues of Bioinformatics
  1. Solve a problem
  2. Start small
  3. Modularity
  4. Design for code re-use
  5. Open Source
  6. Attention to detail
  7. Eat your own dog-food

3 Solve a problem What problem are you solving?

4 Problem: Identifier Mapping
  Agilent reporter A46_P45789 -> ? -> Entrez Gene 3643

5 Solution: Conversion tools

6 Problem: Usability
  Check for double IDs
  Check for missing IDs
  Only 1000 at once
  Check alignment of Excel columns
  Manual, error-prone

7 Solution: Built-in Mapping
  Generic bioinformatics platforms should have identifier mapping built in.
  BioConductor, PathVisio, Cytoscape, ...
  Batteries included

8 Solution: Built-in Mapping Mapping service Entrez Gene 3643 Agilent reporter A46_P45789

9 Problem: Which mapping service?
  Synergizer, EnsMart, DAVID, CRONOS, AliasServer, MatchMiner, OntoTranslate

10 Solution: Abstraction Layer

11 interface IDMapper
  class IDMapperRdb: relational database
  class IDMapperFile: tab-delimited text
  class IDMapperBiomart: web service
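The abstraction layer on this slide can be illustrated with a minimal Python sketch. This is not the BridgeDb Java API: the class names `DictIDMapper` and `TextIDMapper`, the method `map_id`, and the helper `annotate` are hypothetical, and the sample pair reuses the identifiers from slide 4 only as example data. The point is that client code depends on the interface alone, so the backend can be swapped.

```python
from __future__ import annotations

import csv
import io
from abc import ABC, abstractmethod


class IDMapper(ABC):
    """Common interface: map one identifier to its equivalents."""

    @abstractmethod
    def map_id(self, identifier: str) -> set[str]:
        ...


class DictIDMapper(IDMapper):
    """Hypothetical in-memory backend, standing in for a database."""

    def __init__(self, table: dict[str, set[str]]):
        self._table = table

    def map_id(self, identifier: str) -> set[str]:
        return set(self._table.get(identifier, set()))


class TextIDMapper(IDMapper):
    """Hypothetical backend reading two-column tab-delimited text."""

    def __init__(self, text: str):
        self._table: dict[str, set[str]] = {}
        for row in csv.reader(io.StringIO(text), delimiter="\t"):
            if len(row) >= 2:
                self._table.setdefault(row[0], set()).add(row[1])

    def map_id(self, identifier: str) -> set[str]:
        return set(self._table.get(identifier, set()))


def annotate(mapper: IDMapper, identifiers: list[str]) -> dict[str, set[str]]:
    """Client code sees only the interface, never the backend."""
    return {i: mapper.map_id(i) for i in identifiers}
```

Passing either backend to `annotate` gives the same result for the same data, which is the property the abstraction layer is meant to provide.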

12 [Architecture diagram]
  Tools: PathVisio, WikiPathways, Cytoscape plugins (CyThesaurus, Network Merge)
  BridgeDb
  Mapping services: Internet webservices (BioMart, PICR, BridgeDb-REST), local database, tab-delimited text files
  BMC Bioinformatics. 2010 Jan 4;11(1):5

13 BridgeDb interface 1: Java; interface 2: REST

14 API Overview
  BridgeDb.connect(...)
  IDMapper.mapID(...)
  Xref.getUrl()
  DataSource.getUrl()

15 Easy & Flexible Code

16

17

18 BridgeDb interface 1: Java; interface 2: REST

19 REST API
  http://webservice.bridgedb.org/Human/xrefs/L/1234
  ILMN_1713029     Illumina
  3255967          Affy
  NP_001025186     RefSeq
  IPI00005930      IPI
  GO:0042752       GeneOntology
  NM_033282        RefSeq
  3255968          Affy
  94233            Entrez Gene
  ENSG00000122375  Ensembl Human
  234226_at        Affy
  A6NEB4           Uniprot/TrEMBL
  0001780601       Illumina
  GO:0008020       GeneOntology
  606665           OMIM
  A_23_P24234      Agilent
  14449            HUGO

20 REST API
  http://webservice.bridgedb.org/Human/xrefs/L/1234
  http://webservice.bridgedb.org/Human/search/ENSG00000122375
  http://webservice.bridgedb.org/Human/attributeSet
  http://webservice.bridgedb.org/Human/properties
  http://webservice.bridgedb.org/Human/targetDataSources
  http://webservice.bridgedb.org/Human/attributes/L/3643
  http://localhost:8183/Human/xrefs/L/3643
  http:// / / [ /... ]\
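The `xrefs` URLs above follow a visible shape: base address, organism, the word `xrefs`, a system code such as `L`, then the identifier. A small Python sketch can build such a URL and parse a reply. The helper names `xrefs_url` and `parse_xrefs` are hypothetical, and the reply format (one identifier and data source per line, separated by a tab) is an assumption based on the pairs shown on slide 19. No network request is made here.

```python
from urllib.parse import quote

BASE_URL = "http://webservice.bridgedb.org"  # base address as shown on the slide


def xrefs_url(organism: str, system_code: str, identifier: str,
              base: str = BASE_URL) -> str:
    """Build an xrefs URL in the shape used on the slide."""
    return "/".join([
        base,
        quote(organism),
        "xrefs",
        quote(system_code),
        quote(identifier, safe=""),
    ])


def parse_xrefs(body: str) -> list[tuple[str, str]]:
    """Parse a reply, ASSUMING 'identifier<TAB>data source' on each line."""
    pairs = []
    for line in body.splitlines():
        if not line.strip():
            continue  # skip blank lines
        identifier, _, source = line.partition("\t")
        pairs.append((identifier.strip(), source.strip()))
    return pairs
```

Passing `base="http://localhost:8183"` gives the local-server form from the slide.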

21 R Example

22 Types of Mapping Services
  Type                 Advantages
  Webservice           + always up-to-date
                       + no disk-space required
                       + no installation required
  Relational Database  + highly efficient
                       + versioned: updated only when you want to
  Flat file            + easy to customize

23 Available Mapping Services
  Name                               Type        Maintainer
  Gene Databases (Ensembl based)     Database    Us
  Metabolite databases (HMDB-based)  Database    Us
  BridgeWebservice                   Webservice  Us
  BioMart                            Webservice  EBI
  CRONOS                             Webservice  Helmholtz Zentrum
  Synergizer                         Webservice  Harvard Medical School
  PICR                               Webservice  EBI

24 Problem: Custom Microarrays Custom probe #QXZCY!34 ?

25 Solution: Stacking
  EnsMart + Custom table

26 Ensembl, Entrez, Custom microarray
  Relation defined by mapping source A
  Relation defined by mapping source B
  Inferred, transitive relationship
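The inferred, transitive relationship can be sketched in Python as a search across several stacked mapping sources. This is an illustration of the idea, not BridgeDb's implementation: the function name `stack_lookup` is hypothetical, each source is modelled as a plain dictionary, and the identifiers `probe_1`, `ensembl_1` and `entrez_1` are placeholders rather than real database entries.

```python
def stack_lookup(identifier, sources):
    """Follow mappings across all sources until no new identifiers appear.

    `sources` is a list of dicts, each mapping an identifier to a set of
    directly related identifiers (one dict per mapping source).
    """
    seen = {identifier}
    frontier = [identifier]
    while frontier:
        current = frontier.pop()
        for source in sources:
            for target in source.get(current, ()):
                if target not in seen:
                    seen.add(target)
                    frontier.append(target)
    seen.discard(identifier)  # report only the identifiers that were found
    return seen


# Source A: a custom table linking a custom probe to an Ensembl-style ID.
source_a = {"probe_1": {"ensembl_1"}}
# Source B: a general service linking the Ensembl-style ID to an Entrez-style ID.
source_b = {"ensembl_1": {"entrez_1"}}

# Neither source links probe_1 to entrez_1 directly; the stack infers it.
result = stack_lookup("probe_1", [source_a, source_b])
```

With only `source_a` in the stack, the lookup stops at `ensembl_1`; adding `source_b` is what makes `entrez_1` reachable.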

27 Comparison

28

29 CyThesaurus

30 MIRIAM Resources http://www.ebi.ac.uk/miriam/

31 Solution: MIRIAM Resources
  Regular expression for autodetection
  Pattern for generating URLs
  Link to documentation
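A registry entry that pairs a regular expression with a URL pattern could look like the Python sketch below. Everything in it is an assumption made for illustration: the class `DataSourceEntry`, the `$id` placeholder, the two regular expressions and the `example.org` URL templates are not copied from MIRIAM Resources.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSourceEntry:
    name: str
    id_pattern: re.Pattern  # used to autodetect the data source
    url_template: str       # "$id" is replaced by the identifier

    def matches(self, identifier: str) -> bool:
        return self.id_pattern.fullmatch(identifier) is not None

    def url_for(self, identifier: str) -> str:
        return self.url_template.replace("$id", identifier)


# Illustrative entries only; patterns and URLs are hypothetical.
REGISTRY = [
    DataSourceEntry("Ensembl Human", re.compile(r"ENSG\d{11}"),
                    "https://example.org/ensembl/$id"),
    DataSourceEntry("Entrez Gene", re.compile(r"\d+"),
                    "https://example.org/entrez/$id"),
]


def detect(identifier: str) -> list[str]:
    """Return the names of all data sources whose pattern fits the identifier."""
    return [entry.name for entry in REGISTRY if entry.matches(identifier)]
```

`detect` returns a list because a purely numeric identifier may fit the patterns of several databases, so autodetection can only narrow the candidates.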

32 The 7 Virtues of Bioinformatics
  1. Solve a problem
  2. Start small
  3. Eat your own dog-food
  4. Attention to detail
  5. Modularity
  6. Design for code re-use
  7. Open Source

33 A Question to Linus Torvalds
  Q: “Do you have any tips for people who want to undertake a large open source project?”
  A: “Nobody should start to undertake a large project. You start with a small trivial project, and you should never expect it to get large. … If it doesn't solve some fairly immediate need, it's almost certainly over-designed. … You need to get something half-way useful first, and then others will say "hey, that almost works for me", and they'll get involved in the project”

34 Also from Linus Torvalds “I'm right and anyone who disagrees is stupid and ugly” “My name is Linus Torvalds and I am your god.”

35 Code Re-Use Reinventing the wheel is one of the 7 Deadly sins of Bioinformatics

36 Code Re-Use

37 Q: How to design re-usable code?
  A: Actually use it in more than one project from the start
  bridgedb: Cytoscape, PathVisio

38 Modularity

39

40

41 Open source
  Public money -> Public code
  Reproducibility
  Academic ideal
  Trust
  Insurance against vendor lock-in

42 Open source Now where are all those free programmers?

43 Open Source
  Web site
  Version control
  Mailing list
  Bug tracker

44 http://www.helixsoft.nl/blog

45 Eat your own dog food

46 Are you named “alkfdjlkdsf”? Why not “Hélène O’Brian?” …or “Bobby Tables”?

47 Eat your own dog food
  Real data has missing values
  Real data has commas instead of dots
  Real data has duplicate identifiers
  Real data starts with “ID” in the first cell*
  *Which Excel doesn’t like
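A reader that tolerates the problems listed on this slide might look like the following Python sketch. The function name `read_measurements`, the two-column tab-delimited layout and the sample rows are assumptions for illustration, not code from the presentation.

```python
import csv
import io


def read_measurements(text: str):
    """Parse 'identifier<TAB>value' rows from messy real-world input.

    Returns (values, missing, duplicates). A value that is not numeric even
    after comma-to-dot conversion raises ValueError.
    """
    values: dict[str, float] = {}
    missing: list[str] = []
    duplicates: list[str] = []
    seen: set[str] = set()
    for index, row in enumerate(csv.reader(io.StringIO(text), delimiter="\t")):
        if not row or not row[0].strip():
            continue  # blank line or empty identifier cell
        identifier = row[0].strip()
        if index == 0 and identifier == "ID":
            continue  # header row starting with "ID"
        if identifier in seen:
            duplicates.append(identifier)  # keep the first occurrence only
            continue
        seen.add(identifier)
        raw = row[1].strip() if len(row) > 1 else ""
        if not raw:
            missing.append(identifier)  # identifier present, value absent
            continue
        values[identifier] = float(raw.replace(",", "."))  # decimal comma
    return values, missing, duplicates


sample = "ID\tvalue\n3643\t1,5\n94233\t\n3643\t2.0\n14449\t0.25\n"
values, missing, duplicates = read_measurements(sample)
```

Reporting missing and duplicate identifiers separately, instead of silently dropping them, lets the user decide how to handle each case.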

48 User friendliness

49

50 Hallway usability testing
  Grab a passer-by from the hallway and put them in front of your program.
  (We usually use students.)

51 Thanks
  Alex Pico (UCSF)
  Kristina Hanspers (UCSF)
  Isaac Ho (UCSF)
  Bruce Conklin (UCSF)
  Jianjiong Gao (U. Missouri)
  Thomas Kelder (BiGCaT, Maastricht)
  Chris Evelo (BiGCaT, Maastricht)
  Brian Turner (U. Toronto)
  Igor Rodchenkov (U. Toronto)
  http://www.bridgedb.org

52 Ways to run BridgeDb (1/3)

53 Ways to run BridgeDb (2/3)

54 Ways to run BridgeDb (3/3)

55 Open source Is it difficult?

56 Open source
  [bridgedb:/]
  @gladstone = rw
  @bigcat = rw

57 Open source
  [bridgedb:/]
  @gladstone = rw
  @bigcat = rw
  * = r

