InChI Open Education Resource

Slides:



Advertisements
Similar presentations
 To publish information for global distribution, one needs a universally understood language, a kind of publishing mother tongue that all computers may.
Advertisements

INTRODUCTION T.Najah Al_Subaie Kingdom of Saudi Arabia Prince Norah bint Abdul Rahman University College of Computer Since and Information System CS240.
MIS 2211 The Internet from a Technology Perspective A network of networks Comprised of hundreds of thousands of networks (nodes) throughout the world Very.
The Internet & The World Wide Web Notes
Introducing HTML & XHTML:. Goals  Understand hyperlinking  Understand how tags are formed and used.  Understand HTML as a markup language  Understand.
2 pt 3 pt 4 pt 5pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2pt 3 pt 4pt 5 pt 1pt 2pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4pt 5 pt 1pt Networking Can you find it? Apps.
Introduction to Computational Thinking Vicky Chen.
DATA COMMUNICATION DONE BY: ALVIN SAMPATH CARLVIN SAMPATH.
Introduction to SPSS Edward A. Greenberg, PhD
Introduction to XML. XML - Connectivity is Key Need for customized page layout – e.g. filter to display only recent data Downloadable product comparisons.
Open Your Mind to Open Source MPDO’s & EOPR’s Centre for IT & eGovernance AMR-APARD Hyderabad Welcome!
Standards for Digital Data Representation 1) The IUPAC/NIST Chemical Identifier 2) IUPAC Terminology NSF Workshop Constructing a Kinetics Database NIST,
C OMPUTING E SSENTIALS Timothy J. O’Leary Linda I. O’Leary Presentations by: Fred Bounds.
UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.
8 th Semester, Batch 2009 Department Of Computer Science SSUET.
Especially created for ASL faculty. By Gloria Barron.
Glencoe Introduction to Web Design Chapter 4 XHTML Basics 1 Review Do you remember the vocabulary terms from this chapter? Use the following slides to.
Department of Software & Media Technology
The purpose of a CPU is to process data Custom written software is created for a user to meet exact purpose Off the shelf software is developed by a software.
Introduction to Algorithm. What is Algorithm? an algorithm is any well-defined computational procedure that takes some value, or set of values, as input.
Ontologies and Linked Data (Introductory Lecture) Piotr Lapo, General Library Expert Nazarbayev University Library
Physics 114: Lecture 1 Overview of Class Intro to MATLAB
Cloud-Computing Cloud Web-Blog Software Application Download Software.
Understanding Web Server Programming
4.01 How Web Pages Work.
Wright State University
Development Environment
Organic Chemistry Lesson 5.
Chapter 25 Domain Name System.
Finite State Machines Dr K R Bond 2009
11 October Building a Web Site.
Software Specification Tools
An Overview of Data-PASS Shared Catalog
Lesson 1 An Introduction
Introduction To Web Design
SAM GIRLSCOLLEGE, BHOPAL
Basic XHTML Tables XHTML tables—a frequently used feature that organizes data into rows and columns. Tables are defined with the table element. Table.
Isomerism SCH4U Spring 2012.
Introduction With TimeCard users can tag SharePoint events with information that converts them into time sheets. This way they can report.
Educational Computing
Chapt 21 Hydrocarbons [Selected]
PLUG & PLAY GUIDELINES #eposters eposterboards.com What to bring
Introducing HTML & XHTML:
System And Application Software
Exploring Microsoft Excel
Daylight and Discovery
Mobilizing EPA’s CompTox Chemistry Dashboard Data on Mobile Devices
New Features Update Web of Knowledge : Discovery Starts Here
ELECTRONIC MAIL SECURITY
Managing a Web Server and Files
Chapter 3 Hardware and software 1.
ELECTRONIC MAIL SECURITY
Citation Map Visualizing citation data in the Web of Science
Information Technology Ms. Abeer Helwa
Computer Science A Level
Structural / Functional Site Diagramming
Organic Chemistry Lesson 3.
Chapter 3 Hardware and software 1.
PLUG & PLAY GUIDELINES #eposters eposterboards.com What to bring
SOFTWARE TECHNOLOGIES
Educational Computing
The ultimate in data organization
Multimedia Web Site Design
Identify internal hardware devices (e. g
CAII 4.01 Web Page Design Terms List 2.
Intro Project Introduction to HTML.
Distance Learning . Blended Learning
COMPUTER ORGANIZATION
ICS103 Programming in C 1: Overview of Computers And Programming
Describing a crystal to a computer: How to represent and predict material structure with machine learning Keith T Butler.
Presentation transcript:

InChI Open Education Resource Robert E. Belford1*, Ehren C. Bucholtz2, Andrew P. Cornell1, Jordi Cuadros3, Vincent Scalfani4 & Martin A. Walker5 1University of Arkansas at Little Rock, 2St.Louis College of Pharmacy, 3IQS Univ. Ramon Llull, 4University of Alabama, 5SUNY-Potsdam **rebelford@ualr.edu (author for correspondence) Chemical Identifiers and 21st Century Nomenclature InChIKey & OSR InChI OER Optical Structure Recognition (OSR) converts graphical representations of molecules to line notation Goal: Use OSR to grade student drawn structures on quizzes Using SMILES for development work as it is human readable SMILES presents database issues for alkenes Switch to InChIKey, a 27 character hashed version of full InChI Allows for grading on both connectivity and stereochemistry InChIKey is canonical by design with very low probability of two molecules having same InChIKey https://www.inchi-trust.org/oer/ Chemical structures can be described as machine readable alphanumeric strings Labels provide convenient means of comparing and distinguishing chemicals Enables communication across databases, software agents and human beings Molecules are assigned a unique text label identifier: Must be unambiguous and always refer to same substance Registry-lookup Identifiers (e.g. CASRN and PubChem CID) Act as pointers for databases which contain the structural information Structure Based Identifiers (e.g. nomenclature, SMILES and InChI) Software algorithms can convert the identifier to structure (InChI OER) Draw Canonicalize Usable output ID Molecule 167 CCCC(C)O OSR  CCCC(O)C Compare An Open Education Resource devoted to the use of InChI in the chemical sciences Repository of two types of content On-site open access materials to download and reuse as you see fit Off-site links to publications related to InChI (may or may not be open access) Structure Based Identifiers cis-2-butene IAQRGUVFOMOMEM-ARJAWSKDSA-N C/C=C\C (isomeric SMILES) CC=CC (canonical SMILES) trans-2-butene IAQRGUVFOMOMEM-ONEGZZNKSA-N C/C=C/C (isomeric SMILES) CC=CC(canonical SMILES) Structure identification is based on graph theory, where atoms are nodes, and bonds are edges in connection tables Software converts connection table file to line notations Closed Standard Identifiers (e.g. SMILES) Use proprietary algorithms to create a line notation May not be canonical and creates problems in communication between databases Open Standard Identifiers (e.g. InChI) Non-proprietary Functional Group (line notation) Validation set size Percent correct computer generated expert drawn alkenes (SMILES) 80 65 alkenes (InChIKey) 97 Filters by multiple tags, which reduce display to only content with chosen tags, and the other tags of that content Taxonomic Tag Structure starts with categories IUPAC Name-to-OPSIN-to-PubChem Step 1: Spreadsheet uses OPSIN to convert IUPAC name to InChIKey What is: 1-[4-(2-methoxyethyl)phenoxy]-3-(propan-2-ylamino)propan-2-ol Off Site Content provide link with brief overview of the content On Site Content Provides ability to download file and gain instructional information All content has information box Author/copyright/DOI and other information The InChI Standard Structure based approach- anyone can produce InChI from structural formula Strict uniqueness and canonical Non-proprietary, open source, and free access Open access to source code Hierarchical approach with levels of granularity InChI OER Material for IUPAC Name-to-OPSIN-to-PubChem Step 2: Spreadsheet concatenates InChIKey to PubChem URL (4 YouTubes on InChI) (IUPAC Name2PubChem) The compound is Metoprolol: PubChem has a wealth of information on it “… we cannot improve the language of any science without at the same time improving the science itself; neither can we, on the other hand, improve a science, without improving the language or nomenclature which belongs to it” Lavoisier's Preface to Traité Elémentaire de Chimie translation by Robert Kerr (Edinburgh, 1790) Please use QR Codes to link to web pages and watch YouTube videos on your cell phone This work has been supported by IUPAC (project 2018-012-3-024 and an InChI Trust Fellowship