Introduction
- An Open Source, Open Data international collaboration, based entirely on the internet.
- Started following a CECAM meeting in Zaragoza: http://neptuno.unizar.es/events/qcdatabases2010/
- Motivated by 3 key ideas:
  - Scientific data (and ideally codes too) should be "open".
  - A standard data model/format is a Very Good Thing.
  - Universally accessible and open databases of the results of calculations are scientifically highly valuable.
- Create a useful infrastructure and consolidate the model around the tools: the "if you build it, they will come" approach.

Ideas
Standard data model:
- Different codes can interoperate to create complex workflows.
- Tools (e.g. GUIs) can operate on the input and output of any code supporting the format.
- If a semantic model underlies the format, data can easily be validated.
Open results databases:
- Allow codes to be easily validated and benchmarked.
- Are essential for the development of new methods.
- Avoid costly duplication of results.
- Provide a valuable resource for data mining.
- Offer an easy, automated way of generating and archiving supporting information for publications.
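To illustrate the validation point above, here is a minimal Python sketch that checks a small CML molecule. It is not the project's own validator. The element and attribute names (`molecule`, `atomArray`, `atom`, `elementType`, `x3`, `y3`, `z3`) and the namespace come from CML; the sample water geometry, the `validate_molecule` function and the deliberately tiny `KNOWN_ELEMENTS` set are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"

# Illustrative input only; the coordinates are not taken from the slides.
SAMPLE = """<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.469"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.469"/>
  </atomArray>
</molecule>"""

# Hypothetical, deliberately incomplete list of accepted element symbols.
KNOWN_ELEMENTS = {"H", "C", "N", "O"}


def validate_molecule(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{CML_NS}}}molecule":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    atoms = root.findall(f"{{{CML_NS}}}atomArray/{{{CML_NS}}}atom")
    if not atoms:
        problems.append("no atoms found")

    seen_ids = set()
    for atom in atoms:
        atom_id = atom.get("id")
        if atom_id is None:
            problems.append("atom without id")
        elif atom_id in seen_ids:
            problems.append(f"duplicate atom id: {atom_id}")
        else:
            seen_ids.add(atom_id)

        if atom.get("elementType") not in KNOWN_ELEMENTS:
            problems.append(f"{atom_id}: unknown elementType")

        # Every Cartesian coordinate must be present and numeric.
        for axis in ("x3", "y3", "z3"):
            try:
                float(atom.get(axis, ""))
            except ValueError:
                problems.append(f"{atom_id}: bad or missing {axis}")
    return problems


if __name__ == "__main__":
    print(validate_molecule(SAMPLE))
```

Because every code writes the same structure, one set of checks like this can serve all of them. A schema or a semantic dictionary would replace the hand-written rules in practice.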

Approach
- Modular tools, so the same technology is used for creating community databases and for indexing local files on a desktop (personal use without forcing data openness).
- Where possible, use existing tools, protocols and technologies, and collaborate with other open source projects.
- CML as the data format.
- JUMBO-Converters (Java) convert legacy output formats (e.g. Gaussian, NWChem log files) to CML and perform other transformations (looking at ANTLR as a complementary approach).
- Lensfield scours filestores, converts and organises files.
- RESTful system for uploading and aggregation.
- EMMA embargo system can control what is published from local to external repositories.
- Repositories expose Atom feeds for aggregation/indexing/status updates.
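The scan-and-convert step described above could look roughly like the following Python sketch. It is not Lensfield or JUMBO-Converters, which have their own implementations. As a stand-in for a real log-file parser it converts the simple XYZ coordinate format to a minimal CML molecule; the `CONVERTERS` registry and the `xyz_to_cml` and `convert_tree` functions are hypothetical names introduced for this example.

```python
import os
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
# Serialise CML elements with a default namespace rather than an ns0: prefix.
ET.register_namespace("", CML_NS)


def _q(tag):
    """Qualify a tag name with the CML namespace."""
    return f"{{{CML_NS}}}{tag}"


def xyz_to_cml(xyz_text, mol_id):
    """Convert XYZ text (atom count, comment, then 'El x y z' lines) to CML."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])
    root = ET.Element(_q("molecule"), {"id": mol_id})
    atom_array = ET.SubElement(root, _q("atomArray"))
    for i, line in enumerate(lines[2:2 + n_atoms], start=1):
        symbol, x, y, z = line.split()[:4]
        ET.SubElement(
            atom_array,
            _q("atom"),
            {"id": f"a{i}", "elementType": symbol, "x3": x, "y3": y, "z3": z},
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical registry: file extension -> converter function.
# Real converters for Gaussian or NWChem logs would be registered here.
CONVERTERS = {".xyz": xyz_to_cml}


def convert_tree(src_dir, out_dir):
    """Walk src_dir, convert every recognised file, mirror the layout in out_dir."""
    converted = []
    for dirpath, _dirnames, filenames in os.walk(src_dir):
        for name in sorted(filenames):
            stem, ext = os.path.splitext(name)
            converter = CONVERTERS.get(ext.lower())
            if converter is None:
                continue  # not a format we know how to convert
            with open(os.path.join(dirpath, name), encoding="utf-8") as fh:
                cml = converter(fh.read(), stem)
            rel = os.path.relpath(dirpath, src_dir)
            target_dir = os.path.normpath(os.path.join(out_dir, rel))
            os.makedirs(target_dir, exist_ok=True)
            target = os.path.join(target_dir, stem + ".cml")
            with open(target, "w", encoding="utf-8") as fh:
                fh.write(cml)
            converted.append(target)
    return converted
```

Calling `convert_tree("calculations", "cml_out")` (both directory names are placeholders) would return the paths of the CML files written. Keeping the converters in a registry keyed by file type reflects the modular approach: the same walk works for a desktop folder or a shared filestore.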

Architecture

Status and Future plans
Status:
- CML already supports a wide range of chemical data; currently working to extend it to e.g. basis sets.
- Converters to (a subset of) CML exist for Gaussian, NWChem, GAMESS-US and GAMESS-UK.
- Files can be uploaded, and the tools to index and search data are in place.
Future plans:
- Grow the community and continue developing the tools and data model to support a wider range of codes/data.
- Develop interfaces with codes such as Avogadro and work closely with related tools such as Open Babel.
- Ensure the tools are as user-friendly as possible.
- Work with journal publishers to integrate the tools into the publishing workflow.
- Help specific communities develop databases of calculations of interest to them.