Rapid Sample Identification Through NMR
Reinhard Dunkel* and Xinzi Wu
ScienceSoft LLC, Sandy, UT

The acquisition of NMR data can be largely automated, but processing and interpretation of the acquired data are still done manually. Our goal is to automate the processing of acquired spectra, to develop dereplication software that identifies the best matching structures for an unknown sample, and to support structure elucidation. When a likely structure for a sample is known, a rating of the extent to which the acquired data agree with or deviate from the assumed structure is desirable.

FindIt: Dereplication of an Unknown Sample

A traditional approach to NMR structure identification is to search for observed carbon and proton shifts in a database of compound spectra. Two freely accessible databases for this purpose are NMRShiftDB and SDBS, with spectral data on about 10,000 molecules. The FindIt module of the NMRanalyst software matches both the proton and carbon shifts of the compound under consideration to half a million unique National Institutes of Health PubChem structures. We are currently evaluating the freely accessible ZINC collection of about two million commercially available compound structures as a potential FindIt structure database.

The initial step is the analysis of the 1D spectra of the unknown by NMRanalyst. FindIt does not use the traditional spectrum database search approach. Instead, it matches the observed chemical shifts to the predicted carbon and proton shifts of each structure in its database. Shift predictions are based on HOSE codes introduced by Bremser and on additivity rules introduced by Pretsch. The HOSE codes used are generated from assigned NMRShiftDB structures.

Closely related to FindIt is the VerifyIt module. For a specified structure, VerifyIt provides a detailed rating of the agreement with the experimental data.

Acquisition of a 1D carbon spectrum is often time consuming. FindIt can use protonated carbon shifts instead.
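The idea of matching observed shifts against predicted shifts can be pictured with a small Python sketch. This is not FindIt's actual rating, which the poster does not specify. It assumes a toy score (the mean absolute deviation between sorted observed and sorted predicted shifts), and the candidate names and shift values are hypothetical placeholders, not real predictions.

```python
def mean_shift_deviation(observed, predicted):
    """Mean absolute deviation in ppm after pairing the two sorted shift lists.

    Returns None when the peak counts differ; this toy version does not
    handle missing or overlapping resonances.
    """
    if len(observed) != len(predicted):
        return None
    pairs = zip(sorted(observed), sorted(predicted))
    return sum(abs(o - p) for o, p in pairs) / len(observed)


def rank_candidates(observed, database):
    """Return candidate names ordered from best to worst agreement."""
    scored = []
    for name, predicted in database.items():
        deviation = mean_shift_deviation(observed, predicted)
        if deviation is not None:
            scored.append((deviation, name))
    return [name for _, name in sorted(scored)]


# Hypothetical observed carbon shifts (ppm) and predicted shifts per candidate.
observed_shifts = [20.0, 55.5, 128.0]
toy_database = {
    "candidate_A": [21.0, 55.0, 127.0],
    "candidate_B": [30.0, 60.0, 140.0],
    "candidate_C": [20.0, 55.0],  # wrong peak count, skipped by the toy score
}

ranking = rank_candidates(observed_shifts, toy_database)
print(ranking)  # ['candidate_A', 'candidate_B']
```

A real rating would also have to weigh proton shifts, tolerate missing peaks, and account for prediction uncertainty. The sketch only shows the ranking principle.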
For the 200 µg quinine sample in EXAMPLE 1, HSQC-derived carbon frequencies are used for the FindIt matching. Other options are a DEPT-135 or DEPT-45 spectrum. The following table shows the sample compounds tested by FindIt. A booklet of the identified top ten structures for each of the test compounds is attached to this poster. Identifying the best matches among the half a million PubChem structures typically takes one minute on a modern PC. In all but one case, the correct structure is identified as the best match.

[Figure: EXAMPLE 2: Top 10 FindIt Quinine Structures From 1D Proton & gHSQC Spectra, with ratings]

NMRanalyst: Automated NMR Data Analysis

Cryo, cold, and micro probes have increased the signal-to-noise ratio of spectra. Pulsed field gradients and new pulse sequences aim to reduce spectral imperfections. But signal-to-noise challenges, incompletely suppressed resonances, phasing, and other spectral imperfections remain to be addressed. The NMRanalyst software (previous versions were marketed by Varian as FRED™) uses spin system modeling. The software generates a clean list of numerical resonances and spin system descriptions from the raw data. Further processing programs can be built on this solid foundation.

NMRanalyst loads Bruker or Varian acquisition parameters as a description of the acquired data. It Fourier transforms the data when needed. It automatically corrects the baseline of 1D spectra (for distortions of up to the first 10 FID points) and phase corrects 1D through 3D spectra. Then it models 1D areas, 2D regions, or 3D volumes with the spin system model appropriate for each spectrum type. For the best numerical description, all acquired spectral phase components are analyzed simultaneously. An NMR resonance is modeled as relaxing exponentially (Lorentzian line) and as limited by the acquisition time (sinc lineshape convolution). We routinely use zero-filling for under-digitized data.
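The stated resonance model can be sketched in Python. An exponentially decaying signal that is observed only for the acquisition time has a closed-form spectrum, which is the Lorentzian line convolved with a sinc-type function. The function name and parameters below are illustrative assumptions and not part of NMRanalyst.

```python
import cmath
import math


def truncated_lorentzian(offset_hz, t2_s, acq_time_s):
    """Complex lineshape of a decaying resonance observed for acq_time_s.

    Evaluates the integral of exp(-t / T2) * exp(i * delta * t) over
    0 <= t <= acq_time_s, where delta = 2 * pi * offset_hz is the angular
    frequency offset from the resonance.
    """
    rate = 1.0 / t2_s                  # relaxation rate in 1/s
    delta = 2.0 * math.pi * offset_hz  # angular offset in rad/s
    z = complex(rate, -delta)
    return (1.0 - cmath.exp(-z * acq_time_s)) / z


# With a long acquisition the real part approaches the pure Lorentzian
# rate / (rate**2 + delta**2); a short acquisition adds sinc-like wiggles.
long_acquisition = truncated_lorentzian(1.0, 0.5, 50.0)
short_acquisition = truncated_lorentzian(1.0, 0.5, 0.1)
print(long_acquisition.real, short_acquisition.real)
```

Fitting this truncated form directly to the spectrum is one way to account for the finite acquisition time in the model itself, without apodizing the data first.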
However, we found that line broadening and linear prediction degrade the resulting numerical descriptions, and we discontinued their use.

EXAMPLE 3: Top 10 FindIt C16H12N2 Structures From 1D Proton & Carbon Spectra

C16H12N2 is the only FindIt test compound for which the correct structure is identified as the third best match (instead of the top match). However, when its molecular formula is specified, FindIt identifies it as the most likely structure.

[Figure panel: Top 10 FindIt Structures Without Molecular Formula, with ratings]
[Figure panel: Top 10 FindIt Structures With Molecular Formula, with ratings]

FindIt test compounds (table columns: Data Set; 1H & 13C; 1H & protonated 13C):
1-Indanone, 2-Ethyl-1-indanone, Benzothiazole, Brucine, C11H17F3O5S, C12H15NO2, C12H7N3O2, C16H11ClO2, C16H12N2, C16H14N2O3, C17H25NO, C18H23BrO4, C20H17NO4, C20H24O7, C28H45NO8, C35H40N2O4, C38H55NO10, C7H13NO5, C7H14FNO, C8H10Cl2O2, Clobenzorex, Cortisone, Dihydrotestosterone, Fexofenadine HCl, Gibberellic Acid, Isoindole, Isoquinoline, Lasalocid Na Salt, Menthol, Piperazine, Prednisone, Pyrrole, Quinine, Strychnine, Strychnine (1 mg), Sucrose, Taxol, Verbenol

[Figure: Molecular Structures of FindIt Test Compounds]

EXAMPLE 1: Quinine 1D Proton, gHSQC, gHMBC, N15-gHMBC, & DQF-COSY Spectra

Spectrometer: AV 400 MHz
Probe: 1 mm MicroProbe
Solvent: CDCl3
Sample Concentration: ca. 200 µg
Provided by: Dr. Till Kühn, Bruker Switzerland

[Figure panels: 1D 1H, gHMBC, gHSQC, N15_gHMBC, DQF-COSY]

AssembleIt: Structure Elucidation

The FindIt dereplication identifies the best matching structures from half a million known structures. We suggest using FindIt first to check whether an unknown sample matches one of the FindIt database structures. AssembleIt is our generative structure elucidator for determining new structures. It requires only NMR-derived correlations. In EXAMPLE 2, FindIt identifies the correct quinine structure. Based on the EXAMPLE 1 spectra, AssembleIt can elucidate the quinine structure. For the AssembleIt application, the coupling constants of the DQF-COSY correlations derived by NMRanalyst are very valuable.
By ignoring all couplings below 3 Hz, long-range couplings can be excluded. Based on the HSQC correlations, the geminal couplings can be identified and excluded. So only vicinal couplings remain, corresponding to bonds between two protonated carbon atoms. For the HMBC correlations, two- and three-bond correlations cannot be distinguished. With all these ambiguities, many possible structures result. Our NMRgraph software can place heteroatoms that are not observed by NMR, such as the hydroxy group on carbon ppm and the methoxy group on carbon ppm, by finding the best explanation of the observed carbon shifts. The shown correct quinine structure is identified as the one with the best agreement between the observed and predicted carbon shifts.

An atom label indicates the type of nucleus for a non-carbon atom or the observed shift in ppm for a carbon atom. The “?” correlation label identifies an unobserved but derived bond. Gray bonds to NMR-unobservable heteroatoms are added to minimize the disagreement between observed and predicted carbon shifts. Other bond labels show the proton frequency over which a bond was detected.

NMRanalyst 3.2 with FindIt, VerifyIt, and AssembleIt as described in this poster is available for MS Windows (98 SE, ME, 2000, and XP), RH Linux (8.0, 9, Enterprise 3), and Sun SPARC workstations running Solaris (8 or 9). For commercial availability, please ask or see

Acknowledgements

The University of Mainz, Varian Inc., and Bruker BioSpin contributed the NMR data sets. This work was made possible by NIH SBIR Phase II 5 R44 MH funding.