Protein Sequences in a Major Scientific Database

1 Protein Sequences in a Major Scientific Database
2009 ~ 10 Million

2 Protein Sequences in a Major Scientific Database

3 Illinois Research Data Service
Heidi Imker, PhD Director, Research Data Service Associate Professor, University Library University of Illinois at Urbana-Champaign

4 Illinois Research Data Service (RDS)
Provide the Illinois research community with the expertise, tools, and infrastructure necessary to manage and steward research data.
Research Data Service (RDS)
OVCR, Provost, Tech Services, iSchool, NCSA, University Library

5 Who is the RDS?
Four Full-Time Staff
1 Director (Heidi Imker)
2 Data Curators (Elise Dunham and Elizabeth Wickes)
1 Repository Developer (Colleen Fallaw)
Part Time/Voluntary
Postdoc (with iSchool)
User Experience Specialist (from Tech Services)
Graduate Students
Lots of interaction within the Library and elsewhere

6 Where is the interest in data coming from?

7 Funders Researchers Publishers

8 Scientific Data
digital recorded factual material commonly accepted in the scientific community as necessary to validate research findings, including data sets used to support scholarly publications
but does not include ...
lab notebooks, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as lab specimens

9 Humanities Data
materials generated or collected during the course of conducting research, including citations, software code, algorithms, digital tools, documentation, databases, geospatial coordinates, reports, articles
but does not include…
preliminary analyses, paper drafts, plans for future research, peer review assessments, communications with colleagues, materials that must remain confidential until published, or information whose release would result in an invasion of personal privacy

10 Funders

11 OSTP Memo on Public Access
“requiring researchers to better account for and manage the digital data resulting from federally funded scientific research”
Data management plans will become compulsory
Providing public access to data will become more routine


13 Publishers


15 Example Publisher Data Policies
Science
“After publication, all data and materials necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science.”
Proceedings of the National Academy of Sciences (PNAS)
“As a publisher, PNAS must be able to archive the data essential to a published article. Where such archiving is not possible, deposition of data in public databases… only links to websites that are permanent public repositories, such as self-perpetuating online resources funded by government, academia, and industry, are permitted. Links to an author's personal webpage are not acceptable.”
Genome Research
“Genome Research will not publish manuscripts where data used and/or reported in the paper is not freely available in either a public database or on the Genome Research website. There are no exceptions.”

16 Researchers

17 (Public) Post Publication Review
Dedicated Sites
F1000Prime
PubMed Commons
ScienceOpen
PubPeer
Publons
The Winnower
Retraction Watch
Personal Blogs
Twitter

18 Post Publication Review
“The Irish Potato Famine Pathogen Phytophthora infestans Translocates the CRN8 Kinase into Host Plant Cells”
van Damme et al., PLoS Pathogens, 2012
Posted on PubPeer

19 Data Management Practices
complete records find understand back-up migrate

20 Data Management Reality
“I’ve lost two patent claims because we didn’t have complete records in lab notebooks.”
“I panic every time I try to find my students’ data.”
“I know I won’t be able to publish anything after my grad student leaves because it’s too hard to understand what he/she did.”
“We don’t have the assay results with substrate X, because my student didn’t back up their computer. And we’re out of substrate X.”
“The only reason I was able to do the analysis was because on a whim I decided to migrate some of my data off of floppy disks to CDs back in the 90s. I wish I had done all of it.”

21 Research Data Service’s role?
So what is the Research Data Service’s role?

22 Data Management Consultation
Walk through the research project and data management planning process
Help determine what data can or cannot be made accessible
Help determine what data can reasonably be preserved
Identify resources available on campus
Identify resources available elsewhere

23 Data Management Plan Review
Use or
We’ll request:
link to the application instructions
proposal abstract (confidential!)
deadline that the application needs to be OSP
We provide:
review in 1-2 business days
guidance including suggested resources, revisions, or language
We don’t provide:
a DMP written wholesale
funding guarantees

24 Data Management Workshops
Three-part series at Main Library
Introduction to Data Management
Data and Process Documentation and Organization
Data Publication and Sharing: Why, What, and How
Single session on-the-road
“Smart and Simple Data Management”
Generally minutes
Love to tailor for specific groups and disciplines

25 Data Sharing and Storage
Illinois Data Bank
Quickly and easily upload a dataset of any type and format
Publicly available with a persistent link and identifier (no link rot!)
Managed and maintained by the University Library (3 copies!)
Active Data Storage
Private, scalable storage from a few TB to over a PB
$8 / TB / Month
Managed and stored by NCSA
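
To put the Active Data Storage rate in context, here is a rough cost sketch in Python. It is not an official pricing tool. It assumes billing is linear in TB with no minimums or discounts and that 1 PB is counted as 1000 TB; neither assumption comes from the slide, and the function names are illustrative only.

```python
# Rough cost sketch for the Active Data Storage rate quoted on the slide.
# Assumptions (not from the slide): billing is linear in TB, with no
# minimums or discounts, and 1 PB is counted as 1000 TB.

RATE_USD_PER_TB_MONTH = 8  # "$8 / TB / Month" from the slide
TB_PER_PB = 1000           # assumed decimal units


def monthly_cost(terabytes):
    """Estimated monthly cost in USD for the given number of TB."""
    return terabytes * RATE_USD_PER_TB_MONTH


def annual_cost(terabytes):
    """Estimated yearly cost in USD (12 months at the same rate)."""
    return 12 * monthly_cost(terabytes)


if __name__ == "__main__":
    for label, tb in [("5 TB", 5), ("1 PB", TB_PER_PB)]:
        print(f"{label}: ${monthly_cost(tb):,}/month, ${annual_cost(tb):,}/year")
```

Under these assumptions, 5 TB comes to $40 a month and a full PB to $8,000 a month.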

26 Opportunities in your area?
Every single grad student at Illinois to attend a “Smart and Simple Data Management” workshop.
Every researcher writing a grant to know that we can help improve their Data Management Plan.
No research data being posted to websites that are not intended to provide long-term, stable access and preservation.

27 What can we do for you?
We have a variety of blurbs and text about the RDS and the Illinois Data Bank on hand for newsletters.
We have collateral and swag.
We can engage with you on Twitter.
We can give a workshop or seminar ranging from 5 minutes to 2 hours with essentially no notice needed.
We are charming lunch and dinner guests.

28 Contacts
walk: 310-312 Main Library
click:
call: (217)
tweet: @ILresearchdata
