Graduate Research Symposium 2014, William G. Lowrie Dept. of Chemical and Biomolecular Engineering: Evaluating the potential toxicity of chemical compounds

Presentation transcript:

Graduate Research Symposium 2014
William G. Lowrie Dept. of Chemical and Biomolecular Engineering

STRUCTURE-BASED IN SILICO MODELING OF CHEMICALLY-INDUCED TOXICITY
Mehta, Darshan (1); Rathman, James F. (1,2); Yang, Chihae (2)
(1) Department of Chemical and Biomolecular Engineering, The Ohio State University
(2) Altamira LLC and FDA CFSAN MD

A. INTRODUCTION
Evaluating the potential toxicity of chemical compounds is an important step in the development of new products. Current methods for assessing toxicity rely largely on experimental techniques that are time-consuming and resource-intensive. Predictive computational models (referred to as "in silico" models) are needed to prioritize experimental tests.
Goal: To develop a novel in silico tool for classifying compounds of unknown toxicity using annotated linear structural fragments.

B. LINEAR FRAGMENTS AND CHEMICAL ANNOTATIONS
A unique method of generating structural descriptors is proposed. These descriptors are linear subgraphs (fragments) that are extracted dynamically from a database of chemical structures.
Annotations are features/attributes assigned to each atom type. Annotation options:
- Atom identity (AI)
- Number of heavy atom connections (nC)
- Number of attached hydrogen atoms (nH)
Annotation scheme: any possible combination of annotation options.
[Figure: Demonstration of extracting linear subgraphs from m-ethyl phenol. Panels: Graph of nodes & edges; Linear paths from node.]
[Figure: Generation of linear fragments using different annotation schemes.]

C. AMES MUTAGENICITY DATASET
The Ames test detects whether a test compound induces frame-shift mutations by exposing strains of Salmonella typhimurium to it.
Ames positive → mutagenic; Ames negative → non-mutagenic.
The benchmark dataset with pre-defined cross-validation splits compiled by Hansen et al. [1] is used to evaluate performance.
Total compounds = 6512; Ames POS = 3503; Ames NEG = 3009 (total minus Ames POS).
[Table: Characteristics of Ames Benchmark dataset. Columns: Training Set 1; Test Set 1.]

D. CLASSIFICATION STRATEGY
1. Go through all compounds in the training set (POS and NEG separately) and count the one-step connections between the different possible atom states.
2. Compute the corresponding connection probabilities and calculate the likelihood of the fragments in each test compound. For example, the likelihood of the fragment ['C30', 'C21', 'C30', 'C22'] is:
   Likelihood = p(C30-C21) * p(C21-C30) * p(C30-C22) = p(C30-C21)^2 * p(C30-C22)
   Log-likelihood = 2 * log(p(C30-C21)) + log(p(C30-C22))
   The second equality holds because a connection is counted regardless of direction, so p(C21-C30) = p(C30-C21).
3. Calculate the difference in log-likelihood under the POS and NEG models:
   Diff = (log-likelihood)_POS - (log-likelihood)_NEG
   If Diff > 0, classify the compound as Ames POS.
   If Diff < 0, classify the compound as Ames NEG.
Performance parameters:
   Sensitivity = Pr(Y_pred = 1 | Y = 1) (true positive rate)
   Specificity = Pr(Y_pred = 0 | Y = 0) (true negative rate)

E. PRELIMINARY RESULTS
Comparison of performance with other non-parametric methods (sensitivity and specificity, averaged over 5-fold cross-validation splits). Methods compared: Linear fragments, Pipeline Pilot, DEREK, MultiCASE.
Performance of the linear fragments method is obtained using the {AI, nC, nH} annotation scheme.
[Figure labels: Training Set; Test Set; One-step connection counts; One-step connection probability ratios; Total linear fragments generated; Distribution of fragment lengths; Fragments of length 2.]

Reference
1. Katja Hansen, Sebastian Mika, Timon Schroeter, Andreas Sutter, Antonius ter Laak, Thomas Steger-Hartmann, Nikolaus Heinrich, and Klaus-Robert Müller. Benchmark data set for in silico prediction of Ames mutagenicity. Journal of Chemical Information and Modeling, 49(9):2077-81, September 2009.
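
The classification strategy described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the (labels, bonds) representation of a compound, the add-alpha smoothing for connections never seen in training, the summing of log-likelihoods over all fragments of a compound, and the assignment of ties (Diff = 0) to NEG are all hypothetical choices, since the poster does not specify them. The atom states in the toy data are made up and are not taken from the benchmark dataset.

```python
import math
from collections import Counter

def connection(a, b):
    """Unordered one-step connection, so C30-C21 and C21-C30 share one key."""
    return tuple(sorted((a, b)))

def linear_fragments(labels, bonds, length):
    """Yield every simple path of `length` atoms as a list of atom states.

    labels[i] is the annotated state of atom i; bonds lists (i, j) index pairs.
    Each path is yielded once per direction (assumed harmless: both class
    models see the same duplicates, so the sign of Diff is unchanged).
    """
    neighbours = {i: [] for i in range(len(labels))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    def extend(path):
        if len(path) == length:
            yield [labels[k] for k in path]
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:  # keep the path simple (no revisited atoms)
                yield from extend(path + [nxt])

    for start in range(len(labels)):
        yield from extend([start])

def count_connections(compounds):
    """Count one-step connections over a list of (labels, bonds) compounds."""
    counts = Counter()
    for labels, bonds in compounds:
        for i, j in bonds:
            counts[connection(labels[i], labels[j])] += 1
    return counts

def make_log_prob(counts, n_types, alpha=1.0):
    """Return log p(connection) with add-alpha smoothing (an assumption)."""
    total = sum(counts.values()) + alpha * n_types
    return lambda key: math.log((counts.get(key, 0) + alpha) / total)

def fragment_log_likelihood(fragment, log_p):
    """Sum log p over consecutive pairs of states in the fragment."""
    return sum(log_p(connection(a, b)) for a, b in zip(fragment, fragment[1:]))

def classify(compound, log_p_pos, log_p_neg, length=2):
    """Return 1 (Ames POS) if Diff > 0, else 0 (Ames NEG; ties assumed NEG)."""
    labels, bonds = compound
    diff = sum(
        fragment_log_likelihood(f, log_p_pos) - fragment_log_likelihood(f, log_p_neg)
        for f in linear_fragments(labels, bonds, length)
    )
    return 1 if diff > 0 else 0

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = Pr(pred=1 | true=1); specificity = Pr(pred=0 | true=0)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / y_true.count(1), tn / y_true.count(0)

# Toy data with made-up atom states (illustrative only).
pos_train = [(["N30", "O10", "O10"], [(0, 1), (0, 2)])]
neg_train = [(["C30", "C21", "C22"], [(0, 1), (0, 2)])]
pos_counts, neg_counts = count_connections(pos_train), count_connections(neg_train)
n_types = len(set(pos_counts) | set(neg_counts))
log_p_pos = make_log_prob(pos_counts, n_types)
log_p_neg = make_log_prob(neg_counts, n_types)

test_set = [(["N30", "O10"], [(0, 1)]), (["C30", "C21"], [(0, 1)])]
predictions = [classify(c, log_p_pos, log_p_neg) for c in test_set]
print(predictions)                                    # [1, 0]
print(sensitivity_specificity([1, 0], predictions))   # (1.0, 1.0)
```

Treating a connection as an unordered pair mirrors the poster's worked example, where p(C21-C30) is folded into p(C30-C21). Working in log space turns the product of connection probabilities into a sum, which avoids numerical underflow for long fragments.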