I am not a PDBid I am a Biological Macromolecule Philip E. Bourne University of California San Diego


1 I am not a PDBid I am a Biological Macromolecule Philip E. Bourne University of California San Diego pbourne@ucsd.edu

2 Striving to be Recognized The “identity” of a macromolecular structure – functional and structural features and its broad role in a living system – is not established very easily by the majority of biologists. Given the technology available to us today surely it is time that this situation changed?

3 This is Not to Say that the Identity has not Improved Improved chemical description of polymers and monomers Remove sequence and taxonomic inconsistencies Improved representation of viruses Primary citation assignments REMARKS, SF files, NMR restraints…. Henrick et al. NAR 2008 36: D426-D433

4 For Example… Chemical Components Dictionary: –Model and idealized coordinates –Chemical descriptors (e.g. SMILES) and systematic names –Stereochemical assignments and aromatic bond assignments –IUPAC nomenclature for standard amino acids and nucleotides with the exception of the well-established convention for C-terminal atoms OXT and HXT –More conventional atom labeling –Removal of redundant ligands –Additional description of protonation states

5 This now sets the stage for the next stage of identity development

6 The Problem Can be Defined as A Need to Change the Workflow

7 Workflow Entry Point Sequence Literature Structure Function Pathway…

8 The best way to change the workflow is to remove the barrier between the literature (knowledge) and the PDB (data) How Can This Happen?

9 Possibility 1 – Proteopedia A Completely New Beginning Advantages –Anyone can contribute simply –Community consensus seems to support quality (e.g. Wikipedia) Disadvantages –Where is the reward? –Wiki format limited for providing a structural identity http://www.proteopedia.org Eran Hodis, Eric Martz, Jaime Prilusky, Joel L. Sussman

10 Possibility 2 - iSee Advantages –High quality annotation Disadvantages –Time consuming –Does not scale http://www.sgc.ox.ac.uk/iSee

11 Possibility 3 – Database and Literature Integration Advantages –Reward through publication –Potentially comprehensive –Retains full power of the database and literature Disadvantages –Literature accessibility –Harder to do

12 The Disadvantage of Literature Accessibility is Disappearing Slowly The NIH Public Access Policy is a Term and Condition of Award for all grants and cooperative agreements active in Fiscal Year 2008 (October 1, 2007- September 30, 2008) or beyond, and for all contracts awarded after April 7, 2008.

13 So What is the Policy for NIH Sponsored Research? You can only agree to a journal copyright policy if that policy allows you to deposit the paper in PubMed Central (PMC) The paper must be deposited in PMC How this happens depends on the journal

14 BioLit http://biolit.ucsd.edu Our Effort at Database-Literature Integration J.L.Fink, S. Kushch, P. Williams & P.E.Bourne 2008 BioLit: Integrating Biological Literature with Databases NAR 36(S2) W385-389 P.E.Bourne, J.L.Fink, M.Gerstein 2008 Open Access: Taking Full Advantage of the Content PLoS Comp. Biol. (Editorial) 4(3) e1000037

15 BioLit: Tools for New Modes of Scientific Dissemination BioLit integrates biological literature and biological databases and includes: –A database of journal text –Authoring tools to facilitate database storage of journal text –Tools to make static tables and figures interactive The Knowledge and Data Cycle: 0. Full text of PLoS papers stored in a database 1. A link brings up figures from the paper 2. Clicking the paper figure retrieves data from the PDB, which is analyzed 3. A composite view of journal and database content results 4. The composite view has links to pertinent blocks of literature text and back to the PDB http://biolit.ucsd.edu

16 How Much of the Structure Literature is Currently Found in the Accessible PMC? 74127 articles; 17161 were not parsable. 7% - 3814 PDBids out of 51633 are referenced in ?? PMC articles. 338 figures have legends that include PDBids
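Counts like these depend on recognizing PDBids in article text and figure legends. The following is a minimal sketch of one way to do that, not the BioLit implementation: the function name `find_pdb_ids` and the rule that an identifier must follow a "PDB" cue are assumptions made for illustration. Classic PDBids are four characters (a digit 1-9 followed by three alphanumerics), so a bare four-character match would also pick up years such as 2008.

```python
import re

# Assumption: an ID only counts when it follows a "PDB" cue such as
# "PDB 1TIM", "PDB ID: 4hhb" or "PDBid 4hhb". This avoids matching years.
PDB_ID = re.compile(
    r"\bPDB(?:\s*(?:ID|code|entry))?s?[:\s]+([1-9][A-Za-z0-9]{3})\b",
    re.IGNORECASE,
)

def find_pdb_ids(text):
    """Return distinct PDB-like IDs in text, upper-cased, in order of first appearance."""
    seen = []
    for match in PDB_ID.finditer(text):
        pdb_id = match.group(1).upper()
        if pdb_id not in seen:
            seen.append(pdb_id)
    return seen

legend = "Figure 2. Deoxyhemoglobin (PDB ID: 4hhb), first reported well before 2008."
print(find_pdb_ids(legend))
```

A cue-based pattern trades recall for precision: it misses identifiers that appear without any "PDB" marker nearby, which may be acceptable for a first pass over figure legends.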


20 ICTP Trieste, December 10, 2007

21 Where Can we Go From Here with BioLit? The Ideal Situation is to Capture Relationships as the Paper is Written

22 BioLit Plugin Project Rather than Post-processing the Document, the Author Controls the Semantic Tagging

23 Author Paper Word File in Docx format Publisher BioLit Plugin Project

24 Plugin Architecture

25 Context-Sensitive Data Access Display of information about database entries when the user clicks on the ID in the document Display of ontology terms related to terms in the document text, using local database search
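The local ontology search described on this slide could look roughly like the sketch below. It is illustrative only: the table name `ontology_terms`, its columns, and the helper names are hypothetical and not taken from the plugin. The three rows are sample Gene Ontology terms loaded into an in-memory SQLite database.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical ontology_terms table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ontology_terms (term_id TEXT PRIMARY KEY, name TEXT, ontology TEXT)"
    )
    conn.executemany(
        "INSERT INTO ontology_terms VALUES (?, ?, ?)",
        [
            ("GO:0005515", "protein binding", "GO"),
            ("GO:0003677", "DNA binding", "GO"),
            ("GO:0016301", "kinase activity", "GO"),
        ],
    )
    return conn

def related_terms(conn, text):
    """Return (term_id, name) pairs whose name occurs in text, ignoring case."""
    lowered = text.lower()
    rows = conn.execute(
        "SELECT term_id, name FROM ontology_terms ORDER BY term_id"
    ).fetchall()
    return [(term_id, name) for term_id, name in rows if name.lower() in lowered]

conn = build_demo_db()
print(related_terms(conn, "The domain shows DNA binding and kinase activity."))
```

Plain substring matching is the simplest possible lookup; a real tool would presumably need tokenization and synonym handling, which this sketch leaves out.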

26 Ontologies are Stored in a Local Database

27 User Configurable Selection Fully user-configurable ontology and database identifier selection All searches occur on the user’s desktop computer Desired ontologies are downloaded and installed automatically, and updated periodically BioLit installer XML file provides the application with the information needed to download and install ontologies.
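The slide does not show the layout of the installer XML file, so the sketch below assumes one. The element and attribute names (`ontologies`, `ontology`, `name`, `url`, `update_days`) and the example.org URLs are hypothetical. It only illustrates how an application might read such a file to learn which ontologies to download and how often to refresh them.

```python
import xml.etree.ElementTree as ET

# Hypothetical installer file; the real BioLit schema is not given in the slides.
SAMPLE_CONFIG = """
<ontologies>
  <ontology name="Gene Ontology" url="http://example.org/go.obo" update_days="30"/>
  <ontology name="MeSH" url="http://example.org/mesh.xml" update_days="90"/>
</ontologies>
"""

def read_ontology_sources(xml_text):
    """Parse the assumed config format into a list of dicts, one per ontology."""
    root = ET.fromstring(xml_text.strip())
    sources = []
    for node in root.findall("ontology"):
        sources.append(
            {
                "name": node.get("name"),
                "url": node.get("url"),
                # A missing update interval is treated as 0 (never auto-update).
                "update_days": int(node.get("update_days", "0")),
            }
        )
    return sources

for source in read_ontology_sources(SAMPLE_CONFIG):
    print(source["name"], source["url"], source["update_days"])
```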

28 Possibility 4. SciVee - A Different Kind of Learning Experience Why not listen to the enthusiastic author talk about the structure while you see the structure respond to their dialog?

29 YouTube for Scientists www.scivee.tv

30 Motivation

31 Pubcast – Video Integrated with the Full Text of the Paper

32 Pubcast - Making PSP Washington DC Feb. 2008

33 Channels – Just Like TV ICTP Trieste, December 2007

34 Professional Profile ICTP Trieste, December 2007

35 Create & Join Communities and Discussion Groups ICTP Trieste, December 2007

36 Finding What you Want Tag clouds generated automatically from MESH headings Full text of the papers indexed Browsing by audience type, subject, language etc.
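Generating a tag cloud from MeSH headings amounts to counting how many papers carry each heading and scaling a display size to that count. The sketch below shows one simple way to do this and is not SciVee's actual code: the function name `tag_cloud`, the linear scaling, and the font-size range are assumptions.

```python
from collections import Counter

def tag_cloud(papers, min_size=10, max_size=30):
    """Map each heading to a font size scaled linearly by how many papers use it.

    papers is a list of heading lists, one list per paper.
    """
    # set() ensures a heading repeated within one paper is counted once.
    counts = Counter(heading for headings in papers for heading in set(headings))
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for heading, n in counts.items():
        if span == 0:
            sizes[heading] = min_size  # all headings equally common
        else:
            sizes[heading] = min_size + (n - lowest) * (max_size - min_size) // span
    return sizes

papers = [
    ["Proteins", "Databases, Protein"],
    ["Proteins", "Models, Molecular"],
    ["Proteins", "Databases, Protein"],
]
print(tag_cloud(papers))
```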

37 SciVee – Viral Projects Sweetwater School District “Postercasts” Science video competitions “Pubumentaries”

38 Summary New modes of learning about structure are possible Number 6 never did get identified Time will tell whether a PDBid will become more than a number

39 Acknowledgements SciVee Team –Apryl Bailey –Tim Beck –Leo Chalupa –Marc Friedman –Alex Ramos –Willy Suwanto BioLit Team J. Lynn Fink Sergey Kushch Parker Williams Greg Quinn CT Watch 2007, 3(3) 26-31

40 Questions? pbourne@ucsd.edu


