Directly Upload Data From An ELN Into PubChem

Directly Upload Data From An ELN Into PubChem
Ben Shoemaker, Ben.Shoemaker@nih.gov
U.S. National Center for Biotechnology Information: NCBI / NLM / NIH

Maximize the impact of your research … with little effort
PubChem is a global resource for open chemistry
Data sources in PubChem found by internet search
Open Access mandates satisfied with PubChem
Data formats and web interfaces can impede upload
Programmatic access to data uploads facilitates ELN integration

PubChem Mission
PubChem is an open archive and a public resource with the primary aim to provide information on the biological activity of chemical substances.

PubChem is an archive
Substance: submitted; SID accession
Compound: derived; CID accession
BioAssay: submitted; AID accession
A Substance without a structure is not in Compound
Substance records keep provenance clear
The unique chemical structure content of PubChem Compound helps to group Substance records

Why does a user come to PubChem?
Search result from Google/Yahoo/Baidu
Purchase decision for molecule ‘X’
Publications about molecule/concept
Patents/Biological activities
What is known about the molecule?
  Physical properties, Pharmacology, Biological activity, Safety information, Spectroscopy, Toxicity, Pathways, etc.
Launching pad for associations to related databases
Image credit: http://blogs.egu.eu/network/palaeoblog/2012/10/31/why-bother-communicating/

Chemical information is everywhere now
PubChem is helping to improve accessibility to chemical information

PubChem growth
Sustained growth over 12 years of:
  Contributors
  Chemical substance descriptions
  Biological testing data
  Usage
Top-10 chemistry website (#5?)
~1.5M monthly unique users at peak
Heavy programmatic usage
  ~5% of unique IPs per month (~70K)
Serve millions of web hits per day
  2M-12M on average (0.5M interactive)

Benefit from PubChem by uploading data
Minimal startup time
Flexible interface
Spreadsheet data accepted via file or web interface

Upload chemicals
Draw or load structures
Enter annotations & synonyms
Link back to your site

Upload screening results
Spreadsheet load: file or web
Include all test results, e.g. an article table

Upload screening results
Add annotations
Specify targets
Database links

Annotate with controlled vocabularies
Include ontology terms such as from BAO, GO, MeSH

How can data loading be improved?
This works well, but…
Issues:
  Web interfaces and file formats must be learned
  Open access data requirements add yet another step to a lengthy and time-sensitive publishing process
  FTP uploads can be automated, but they require custom scripts that are hard to justify for a single use

How can data loading be improved?
Ideas:
  Ideally, a single “Make Public!” button would be added to existing end-user software
  This ‘publish’ button would require a standard implementation to make it simple to add
  Electronic Lab Notebooks (ELNs) would be good candidates for such functionality
Great, so how do we do that?

Build on public data: Programmatic access
Outside websites create novel platforms for increased exposure
REST: Easy, predictable access for research analysis
http://pubchem.ncbi.nlm.nih.gov/rest/pug/assay/activity/EC50/aids/JSON

PubChem Upload REST
Extend programmatic access to “pushing” data
Open suite of operations for loading data
Create standard syntax to simplify interface
Use secure login and key to restrict access

PubChem Upload REST
The URL path:
https://pubchem.ncbi.nlm.nih.gov/rest/upload/<domain specification>/<operation specification>/[?<operation_options>]
Domain: <domain specification> = substance | assay | account
Operation: login, upload, set_record, get_record, pending, list_records, commit, export_file, get_sidlist, list_archived, get_viewcode, set_viewcode, delete_viewcode
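
The URL pattern above can be sketched as a small builder. This is an illustration only: `build_upload_url` is a hypothetical helper, the optional format segment (`SDF`, `CSV`) is taken from the upload examples in this deck, and the sketch does not check which operations are valid in which domain.

```python
from urllib.parse import urlencode

BASE_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/upload"
DOMAINS = ("substance", "assay", "account")
OPERATIONS = (
    "login", "upload", "set_record", "get_record", "pending",
    "list_records", "commit", "export_file", "get_sidlist",
    "list_archived", "get_viewcode", "set_viewcode", "delete_viewcode",
)


def build_upload_url(domain, operation, input_format=None, **options):
    """Hypothetical helper: assemble an Upload REST URL from its parts."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    parts = [BASE_URL, domain, operation]
    if input_format:  # e.g. "SDF" or "CSV", as in the upload examples
        parts.append(input_format)
    url = "/".join(parts)
    if options:  # operation options become the query string
        url += "?" + urlencode(options)
    return url


print(build_upload_url("substance", "upload", "SDF", process=1))
print(build_upload_url("substance", "pending"))
```

Keeping the domain and operation lists in one place lets an ELN plug-in reject a mistyped operation before any request is sent.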

PubChem Upload REST Example
Let’s say that you have structure and annotation information for three chemicals including:
  Unique identifiers
  CAS registry numbers
  Common names
  SMILES
Tag list found on help page: https://pubchem.ncbi.nlm.nih.gov/upload
SDF, CSV and Excel accepted

PUBCHEM_EXT_DATASOURCE_REGID | PUBCHEM_SUBSTANCE_SYNONYM | PUBCHEM_SUBSTANCE_SYNONYM | PUBCHEM_EXT_DATASOURCE_SMILES
my_sub1 | 50-99-7 | D-Glucose, anhydrous | C(C1C(C(C(C(O1)O)O)O)O)O
my_sub2 | | | CCOC1=CC=CC=C1NC(=O)C2=CC3=CC=CC=C3C=C2O
my_sub3 | | | C1=CC=CC=C1
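
As a rough sketch, the three example records could be written as CSV text with Python's standard `csv` module. The synonym tag is repeated (once for the CAS number, once for the common name), following the header used in the URL-encoded upload example in this deck. `to_csv` is a hypothetical helper name.

```python
import csv
import io

HEADER = [
    "PUBCHEM_EXT_DATASOURCE_REGID",
    "PUBCHEM_SUBSTANCE_SYNONYM",   # CAS registry number
    "PUBCHEM_SUBSTANCE_SYNONYM",   # common name
    "PUBCHEM_EXT_DATASOURCE_SMILES",
]
ROWS = [
    ["my_sub1", "50-99-7", "D-Glucose, anhydrous", "C(C1C(C(C(C(O1)O)O)O)O)O"],
    ["my_sub2", "", "", "CCOC1=CC=CC=C1NC(=O)C2=CC3=CC=CC=C3C=C2O"],
    ["my_sub3", "", "", "C1=CC=CC=C1"],
]


def to_csv(header, rows):
    """Return CSV text; fields containing commas are quoted automatically."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(HEADER, ROWS))
```

Letting the `csv` module do the quoting matters here because the name "D-Glucose, anhydrous" contains a comma.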

PubChem Upload REST Example
Authenticate: Provide user credentials
Security key returned for subsequent operations

unix> curl -c cookie1.txt "https://pubchem.ncbi.nlm.nih.gov/rest/upload/account/login?login=MyLogin&password=test-password"
{ "Response": { "ResponseCode": "Pass", "UserId": "999" } }

Base: pubchem../rest/upload
Domain: account
Operation: login
Arguments: login, password
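
A minimal Python counterpart of the login step, using only the standard library, might look like the sketch below. `parse_response` and `login` are hypothetical helpers. Treating any `ResponseCode` other than `"Pass"` as a failure is an assumption, since the slides show only the success case. The cookie jar plays the role of curl's `-c cookie1.txt`. `login` is defined but not called here because it needs real credentials.

```python
import json
import urllib.request
from http.cookiejar import MozillaCookieJar
from urllib.parse import urlencode

LOGIN_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/upload/account/login"


def parse_response(body):
    """Return the "Response" object, raising if the call did not pass."""
    response = json.loads(body).get("Response", {})
    if response.get("ResponseCode") != "Pass":  # assumed failure signal
        raise RuntimeError(f"request failed: {response}")
    return response


def login(user, password, cookie_file="cookie1.txt"):
    """Log in and keep the session cookie for later calls (not run here)."""
    jar = MozillaCookieJar(cookie_file)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    query = urlencode({"login": user, "password": password})
    with opener.open(LOGIN_URL + "?" + query) as reply:
        response = parse_response(reply.read().decode("utf-8"))
    jar.save(ignore_discard=True)  # session cookies are otherwise skipped
    return opener, response


# The response shown on the slide, parsed offline:
SAMPLE = '{ "Response": { "ResponseCode": "Pass", "UserId": "999" } }'
print(parse_response(SAMPLE)["UserId"])
```

The returned opener carries the cookie, so later requests made through it are authenticated in the same way as `curl -b cookie1.txt`.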

PubChem Upload REST Example
Upload from file

unix> curl -b cookie1.txt -F "data=@test1.sdf" "https://pubchem.ncbi.nlm.nih.gov/rest/upload/substance/upload/SDF?process=1"

Base: pubchem../rest/upload
Domain: substance
Operation: upload
Arguments: process
Input: SDF
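
`curl -F` sends the file as a `multipart/form-data` request body. For an ELN without an HTTP helper library, that encoding can be sketched by hand as below. `encode_multipart` is a hypothetical helper, the field name `data` comes from the curl example, and the `application/octet-stream` part type is an assumption about what the server accepts.

```python
import uuid


def encode_multipart(field, filename, content):
    """Build a one-file multipart/form-data body and its Content-Type."""
    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
    )
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        disposition.encode("utf-8"),
        b"Content-Type: application/octet-stream\r\n\r\n",  # assumed type
        content,
        b"\r\n",
        f"--{boundary}--\r\n".encode("ascii"),
    ])
    return body, f"multipart/form-data; boundary={boundary}"


# Placeholder bytes stand in for the contents of test1.sdf.
body, content_type = encode_multipart("data", "test1.sdf", b"placeholder SDF")
print(content_type)
print(len(body), "bytes")
```

The body would then be POSTed to the upload URL with the returned value as the `Content-Type` header and the login cookie attached.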

PubChem Upload REST Example
Upload from a URL-encoded string

unix> curl --cookie "deposit_ses_key=8F565CD7-46E0-4939-9CB5-B3449C5B70A5" -d "data=PUBCHEM_EXT_DATASOURCE_REGID%2CPUBCHEM_SUBSTANCE_SYNONYM%2CPUBCHEM_SUBSTANCE_SYNONYM%2CPUBCHEM_EXT_DATASOURCE_SMILES%0Amy_sub1%2C50-99-7%2C%22D-Glucose%2C%20anhydrous%22%2CC%28C1C%28C%28C%28C%28O1%29O%29O%29O%29O%29O%0Amy_sub2%2C%2C%2CCCOC1%3DCC%3DCC%3DC1NC%28%3DO%29C2%3DCC3%3DCC%3DCC%3DC3C%3DC2O%0Amy_sub3%2C%2Cbenzene%2CC1%3DCC%3DCC%3DC1%0A" "https://pubchem.ncbi.nlm.nih.gov/rest/upload/substance/upload/CSV?process=1"

Base: pubchem../rest/upload
Domain: substance
Operation: upload
Arguments: process
Input: CSV
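
The encoded `data=` string does not need to be written by hand. A short sketch with `urllib.parse.quote` reproduces it from plain CSV text. `form_body` is a hypothetical helper, and the CSV text is decoded from the curl example above it (including the `benzene` synonym for `my_sub3`).

```python
from urllib.parse import quote

CSV_TEXT = (
    "PUBCHEM_EXT_DATASOURCE_REGID,PUBCHEM_SUBSTANCE_SYNONYM,"
    "PUBCHEM_SUBSTANCE_SYNONYM,PUBCHEM_EXT_DATASOURCE_SMILES\n"
    'my_sub1,50-99-7,"D-Glucose, anhydrous",C(C1C(C(C(C(O1)O)O)O)O)O\n'
    "my_sub2,,,CCOC1=CC=CC=C1NC(=O)C2=CC3=CC=CC=C3C=C2O\n"
    "my_sub3,,benzene,C1=CC=CC=C1\n"
)


def form_body(csv_text):
    """Percent-encode CSV text as the value of the 'data' form field."""
    # safe="" also encodes "/", so only letters, digits and "_.-~" survive.
    return "data=" + quote(csv_text, safe="")


print(form_body(CSV_TEXT))
```

SMILES strings are full of characters such as `=`, `(` and `)` that would otherwise be misread in a form body, which is why every reserved character is encoded.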

PubChem Upload REST Example
Check the status of your pending submissions

unix> curl -b cookie1.txt "https://pubchem.ncbi.nlm.nih.gov/rest/upload/substance/pending"
{"Response": {"ResponseCode": "Pass","PendingSubmissions": [{"UploadId": "40637","Date": "2016/02/08 16:25","Status": "V1","DataSet": "form-data.sdf","Records": "3"},{"UploadId": "40638","Date": "2016/02/08 17:06","Status": "V1","DataSet": "form-data.sdf","Records": "3"}]}}

Base: pubchem../rest/upload
Domain: substance
Operation: pending
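
An ELN would typically read the pending list to find the upload ID to commit. A small sketch that parses the response shown on this slide follows. `pending_uploads` is a hypothetical helper, and the meaning of the status code `V1` is not given in the slides, so it is passed through unchanged.

```python
import json

PENDING = (
    '{"Response": {"ResponseCode": "Pass","PendingSubmissions": ['
    '{"UploadId": "40637","Date": "2016/02/08 16:25","Status": "V1",'
    '"DataSet": "form-data.sdf","Records": "3"},'
    '{"UploadId": "40638","Date": "2016/02/08 17:06","Status": "V1",'
    '"DataSet": "form-data.sdf","Records": "3"}]}}'
)


def pending_uploads(body):
    """Return (upload_id, status, record_count) for each pending upload."""
    response = json.loads(body)["Response"]
    return [
        (item["UploadId"], item["Status"], int(item["Records"]))
        for item in response.get("PendingSubmissions", [])
    ]


for upload_id, status, records in pending_uploads(PENDING):
    print(upload_id, status, records)
```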

PubChem Upload REST Example
Commit your submission into the public PubChem database

unix> curl -b cookie1.txt "https://pubchem.ncbi.nlm.nih.gov/rest/upload/substance/commit?upload_id=40637"
{"Response": {"ResponseCode": "Pass","OperationStatus": [{"UploadId": "40637","CommitStatus": "Pass"}]}}

Base: pubchem../rest/upload
Domain: substance
Operation: commit
Arguments: upload_id
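
After a commit, a "Make Public!" button would want to confirm which uploads actually went through. A sketch against the response shown on this slide follows. `committed_ids` is a hypothetical helper, and reading any `CommitStatus` other than `"Pass"` as not committed is an assumption.

```python
import json

COMMIT_REPLY = (
    '{"Response": {"ResponseCode": "Pass","OperationStatus": '
    '[{"UploadId": "40637","CommitStatus": "Pass"}]}}'
)


def committed_ids(body):
    """Return the upload IDs whose commit is reported as passed."""
    response = json.loads(body)["Response"]
    return [
        entry["UploadId"]
        for entry in response.get("OperationStatus", [])
        if entry.get("CommitStatus") == "Pass"  # assumed success marker
    ]


print(committed_ids(COMMIT_REPLY))
```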

Maximize the impact of your research … with little effort
PubChem is a global resource for open chemistry
  12 years of growth
  Top 10 chemistry website
Data sources in PubChem found by internet search
Uploading is easy for small and large submissions
Programmatic access to data uploads facilitates ELN integration
Leverage PubChem’s impact on chemistry

Acknowledgments: The PubChem Team
Evan Bolton, Jie Chen, Tiejun Cheng, Gang Fu, Renata Geer, Asta Gindulyte, Lianyi Han, Jane He, Steve Bryant (PI), Siqian He, Sunghwan Kim, Paul Thiessen, Jiyao Wang, Yanli Wang, Bo Yu, Leonid Zaslavsky, Jian Zhang
All research supported by the Intramural Research Program of the NIH, National Library of Medicine.
Ben.Shoemaker@nih.gov
Special thanks: NCBI Help Desk and past PubChem group members.