Published by Rebecca Barber. Modified over 7 years ago.
1
Data publishing wakes up the sleeping data: real practices in China Scientific Data
Zhang Lili, Computer Network Information Center, CAS, Sep 2016
2
Data Sharing Issues: Privacy concerns; Data abuse
Teachprivacy.com, Articulate.com
1 Data are generated by scientists and funded by the public, so who owns the data? There are already good examples of funding agencies that take data management plans into account for research projects, and in China we have long had direct funding for data production and sharing in SDB.
2 Another question is in the ivory tower of science: privacy concerns and data abuse.
3
What is data publishing
Data publishing (also data publication) is the act of releasing data in published form for (re)use by others (Wikipedia, 2016).
Model/classification (Bryan Lawrence et al., 2011):
Standalone Data Publication
Data Publication by Proxy
Appendix Data
Journal Driven Data Archival
Overlay Publication
4
Contents
Why publish data
What data to publish
How to publish data
5
Why publish data
For data authors/owners
Incentives: Gain more reputation in the scholarly community through citation; enhanced exchanges, more attention, and more suggestions
Supports: Save effort on long-term data preservation and ongoing services; have a third party vouch for the trustworthiness of your data
For data curators
Academic recognition
For data readers/users
Trustworthy data; data quality control and clear data description; easy and safe to use; less need to worry about intellectual property disputes, thanks to citation
For journals/presses
Cooperation between data journals and traditional journals (joint publishing, reduced falsification) and repositories (professional lifelong data services)
6
What data to publish
Items: Key points (Notes)
SCOPE: Popularity; Scarcity; Sustainability (Topics, areas, fields, methods, skills)
FORM: Easy to understand; Friendly access standards
CONTENTS: Capacity; Richness; Quality control
VALUE/UTILITY*: Reusable-reliability; Innovation in data process; To support the validation of scientific research; Consistency/stability (Partly unpredictable)
OTHERS: Background information in life-long data cycle
7
How to publish data Li J, Wu C, Zhang L et al. Survey and analysis of scientific data publishing. China Scientific Data 1 (2016), DOI: /csdata
8
Right time to publish data
Published research papers with datasets available
Need for third-party help with ongoing maintenance of existing datasets
Existing datasets that tend to sleep
9
About China Scientific Data
Hosted by the Chinese Academy of Sciences
Jointly published by CNIC, CAS & ICSU CODATA China Committee (CN /N, ISSN )
An academic journal publishing multidisciplinary data papers
Online bilingual versions
Aims to promote standardized data openness and citation and to make data findable, accessible, intelligible and reusable (FAIR)
SCOPE: Data papers describing (but not limited to) the following:
Datasets or data products generated from major scientific activities
Derived datasets or data products refined from raw data
Datasets linked to existing publications
10
Key points in writing
Clear intellectual property rights relationship maintenance
Complete and accurate author name list
Authorization to publish datasets and data papers under CC BY 4.0
Completeness of data files and storage records
Confirm that the stored data files are complete and match the description given in the data paper.
Make sure the data have been deposited in the most suitable, accessible and reliable data repository.
Rigorous experiments and high data quality
Were the data generated in a rigorous and methodical way?
Data validation and quality control
Sufficient background information (such as spatial and temporal information) according to the specific discipline, as well as suggestions for applying the data
Integrity of the description
Sufficiently specific explanations of the research method and data processing steps, together with the data source, process, software used and data file types, so that the work is easy to reproduce.
All information necessary for dataset reuse or integration.
Consistency among metadata, datasets, other data descriptions and the data paper.
Last sentence of the first paragraph: Some background information can be optional, for example for a dataset about one kind of fruit that has taken root only within one quite small area.
Summary: Good data is the premise, and a carefully written paper is a necessity.
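The check that stored data files are complete and match the description in the data paper can be partly automated. What follows is only a minimal sketch under stated assumptions: the manifest format (file name mapped to a SHA-256 digest), the function names `sha256_of`, `check_dataset` and `demo`, and the sample file names are all hypothetical illustrations, not anything China Scientific Data prescribes.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_dataset(data_dir: Path, described: dict[str, str]) -> dict[str, list[str]]:
    """Compare files on disk with the files described in the data paper.

    `described` maps a relative file name to its expected SHA-256 digest.
    This manifest format is an assumption made for illustration only.
    """
    on_disk = {
        p.relative_to(data_dir).as_posix()
        for p in data_dir.rglob("*")
        if p.is_file()
    }
    # Described in the paper but absent from storage.
    missing = sorted(set(described) - on_disk)
    # Present in storage but never mentioned in the paper.
    undocumented = sorted(on_disk - set(described))
    # Present in both, but the content differs from what was described.
    mismatched = sorted(
        name
        for name in on_disk & set(described)
        if sha256_of(data_dir / name) != described[name]
    )
    return {"missing": missing, "undocumented": undocumented, "mismatched": mismatched}


def demo() -> dict[str, list[str]]:
    """Build a throwaway dataset with made-up file names and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "stations.csv").write_text("id,lat,lon\n1,39.9,116.4\n")
        (root / "notes.txt").write_text("scratch file\n")
        described = {
            "stations.csv": sha256_of(root / "stations.csv"),
            "readings.csv": "0" * 64,  # described in the paper, never deposited
        }
        return check_dataset(root, described)


if __name__ == "__main__":
    print(demo())
```

In the demo, `readings.csv` is reported as missing and `notes.txt` as undocumented, which are the two kinds of mismatch between storage and description that an author would want to resolve before submission.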
11
How to publish data template www.sciencedb.cn www.csdata.org
Log in to CSDATA (registration/browsing)
Download the template and example
Submit the data paper and datasets using the template
12
If I have a dataset/datasets
Kept by myself
Archived by the funding agency
Shared online through my website
Stored in a long-term repository for re-analysis
2016 International Training Workshop for Developing Countries on Big Data for Science, held by CODATA China and CAS in July 2016
13
Why not write a data paper!
OPEN ACCESS
Open access for data papers & datasets
Open for reuse
Easy access, lifelong data curation
Efficient processing & rapid dissemination
At least 15 days for open, within 3 months for publication
High quality guaranteed
Strict review process (peer review/crowd rating/voting)
14
Thank you!
15
Tough points under discussion
Originality
Creativity is preferred in data papers
Derived data worthy of publication: DataA + DataB = My data?
Data flow publication
Similar data papers: ongoing dataflow and data papers
Published data size
Larger datasets are presumed to attract more citations
Less chance to publish similar small datasets