Welcoming Remarks – and a Very Brief History of U.S. Govt. Chemical Databases and Open Chemistry
Marc C. Nicklaus
Computer-Aided Drug Design Group
Chemical Biology Laboratory, NCI-Frederick, NIH, DHHS
Categories of “Open Chemistry”
1. Databases and websites: e.g., NCI Open Database, BindingDB, PubChem, ChemSpider, ChEMBL, etc.
2. Software tools: e.g., JME, RDKit, CDK, OpenBabel, KNIME, etc.
3. Methods, standards, and definitions: e.g., InChI, CML, ToxML, etc.
4. Publications: e.g., Internet Journal of Chemistry, Chemistry Central Journal, Journal of Cheminformatics, etc.
Early Efforts (before ~2000) at NIH, NIST
NLM, NIH
– “computerized databases in toxicology [were made] available to the public since 1967” (Wexler, Toxicology 157 (2001) 3–10)
– Toxnet: developed 1985; on the Web 1998
– PubMed: Jan. 1996 to present. “Precursors” (fee-based):
– Chemline (Elhill text based): 1974 to 1997
– ChemID (Elhill text based): 1990 to 2000
– ChemIDplus: 1998 to present
NCI, NIH
– DTP release of Open NCI Database (127k structures): Nov
– NCI Database Browser (246k), public beta (CCL): May 28, 1998 (1st version by W.-D. Ihlenfeldt at CCC Erlangen, Oct. 1997)
NIST
– NIST Chemistry Webbook: Aug. 1996
Early Efforts (before ~2000), non-database
1994: First chemistry web sites appear (Henry Rzepa at Imperial College, Steven Bachrach at Northern Illinois). First Electronic Computational Chemistry Conference (ECCC), which continues for another 9 years (Bachrach).
1995: Chemical MIME type proposed (Rzepa, Peter Murray-Rust). First Electronic Conference on Heterocyclic Chemistry (Rzepa).
1996: “The Internet: A Guide for Chemists” published by ACS Publications (Bachrach, editor).
1998: “Internet Journal of Chemistry”, Bachrach, editor in chief. MDL releases Chime. JME released (Peter Ertl). 1st ChemInt conference in Irvine, CA (Bachrach, Rzepa, Steve Heller).
1999: CML proposed (Rzepa, Murray-Rust). 2nd ChemInt conference, Georgetown University.
2000: 3rd ChemInt conference, Georgetown University.
(Timeline provided courtesy of Steven Bachrach)
Previous Meetings on U.S. Govt. Chemical Databases
Date                     Number of participants
August 11, 2000 (*)      23
December 12, 2000        ~25
July 21-22,
July 12-13,              (+ 5 panel members)
(*) “Informal Meeting on Public Govt. Chemical Databases”
Databases, Tools, Projects to date
JChemPaint – ~1999
Chemistry Development Kit – 27–29 September 2000
BindingDB – ~2001
OpenBabel – 2001
NIAID ChemDB HIV/AIDS Database – 2002
Ligand Depot – November 2003 (first protein structure with a ligand in PDB: released 1976)
ZINC database – 2004
PDBbind database
PubChem – 2004
sc-PDB
Binding MOAD – 2004
DrugBank
Blue Obelisk – 2005
Chemical Structure Lookup Service (CSLS) – September 2006
ChemSpider – March 2007
ChEMBL – 2008/2009
Chemical Identifier Resolver (CIR) – July 2009
Players missing (at this meeting)
...among others:
– Google
– U.S. PTO
– U.S. DoD groups, such as Walter Reed (malaria database)
Acknowledgments
These Slides: Steven Bachrach, George “Mike” Hazard, Dan Zaharevitz
Meeting Organization: Julia Lam
The CADD Group Team: Igor Filippov, Megan Peach, Markus Sitzmann