Intelligent Database Systems Lab N.Y.U.S.T. I. M. An integrated scheme for feature selection and parameter setting in the support vector machine modeling.

Presentation transcript:

An integrated scheme for feature selection and parameter setting in the support vector machine modeling and its application to the prediction of pharmacokinetic properties of drugs
Presenter: Wu, Jia-Hao
Authors: Sheng-Yong Yang, Qi Huang, Lin-Li Li, Chang-Ying Ma, Hui Zhang, Ru Bai, Qi-Zhi Teng, Ming-Li Xiang, Yu-Quan Wei
AIM (2009)
Intelligent Database Systems Lab, N.Y.U.S.T. I. M., National Yunlin University of Science and Technology

Outline
• Motivation
• Objective
• Methodology
• Experiments
• Conclusion
• Personal Comments

Motivation
Many drug candidates fail in clinical trials because of unfavorable absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. Applying computational tools to predict the ADMET properties of chemical compounds in the early design stages is therefore important.
ADMET: Absorption, Distribution, Metabolism, Excretion, Toxicity.

Objective
SVM has recently been evaluated for predicting the ADMET properties of new drugs, but two problems remain in SVM modeling:
• Feature selection.
• Parameter setting.
The authors propose an integrated scheme that addresses both problems:
• Feature selection: genetic algorithm (GA).
• Parameter setting: conjugate gradient (CG) method.
The results of the GA-CG SVM scheme are compared with those of previous SVM studies.

Objective (Cont.)
The GA-CG SVM scheme is used to build four classification models of ADMET-related properties:
• Identification of P-glycoprotein substrates and nonsubstrates (P-gp).
• Prediction of human intestinal absorption (HIA).
• Prediction of compounds inducing torsades de pointes (TdP).
• Prediction of blood-brain barrier penetration (BBB).

Methodology: SVM, linearly separable cases
Optimization problem
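The formula shown on this slide is not preserved in the transcript. For reference, the standard textbook hard-margin SVM problem for training pairs (x_i, y_i) with labels y_i in {-1, +1} is given below; the symbols w, b, and n are the usual textbook names and may differ from the slide's notation.

```latex
\begin{aligned}
\min_{w,\,b}\quad & \frac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to}\quad & y_i\,\bigl(w^{\top} x_i + b\bigr) \ge 1, \qquad i = 1, \dots, n .
\end{aligned}
```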

Methodology: SVM, linearly non-separable cases
Optimization problem
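The formula shown on this slide is not preserved in the transcript. For reference, the standard textbook soft-margin SVM problem introduces slack variables ξ_i and the penalty parameter C; the notation below is the usual textbook one and may differ from the slide.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\,\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad & y_i\,\bigl(w^{\top} x_i + b\bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

A larger C penalizes margin violations more heavily, which is why C appears later as one of the two parameters to tune.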

Methodology: nonlinear → linear
Nonlinearly separable cases can be transformed into linear cases by projecting the input variables into a new high-dimensional feature space using a kernel function K(x_i, x_j):
• Polynomial kernel.
• Radial basis function (RBF) kernel.
• Sigmoid kernel.
Two parameters must be set: the penalty parameter C and the kernel parameter γ.
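The three kernel families named on this slide can be sketched as follows. This is a minimal illustration using the common textbook forms; the parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions for the example, not values taken from the paper.

```python
import math

def dot(x_i, x_j):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x_i, x_j))

def polynomial_kernel(x_i, x_j, gamma=1.0, coef0=1.0, degree=3):
    """K(x_i, x_j) = (gamma * <x_i, x_j> + coef0) ** degree"""
    return (gamma * dot(x_i, x_j) + coef0) ** degree

def rbf_kernel(x_i, x_j, gamma=0.01):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x_i, x_j, gamma=0.01, coef0=0.0):
    """K(x_i, x_j) = tanh(gamma * <x_i, x_j> + coef0)"""
    return math.tanh(gamma * dot(x_i, x_j) + coef0)

# Two illustrative 2-dimensional points (hypothetical data).
x = [1.0, 2.0]
z = [2.0, 0.0]
k_poly = polynomial_kernel(x, z)          # (1 * 2 + 1) ** 3
k_rbf = rbf_kernel(x, z, gamma=0.5)       # exp(-0.5 * 5)
k_sig = sigmoid_kernel(x, z, gamma=0.5)   # tanh(0.5 * 2)
```

With the RBF kernel, γ controls how quickly similarity decays with distance, so it interacts with C in determining model accuracy.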

Methodology: conjugate gradient
Two parameters must be predetermined when using SVM, and different pairs of (C, γ) give different levels of accuracy. The problem therefore reduces to finding an optimal pair (C, γ) that minimizes an objective function (the formula on the slide is not preserved in the transcript).

Methodology: conjugate gradient
This minimization problem can be solved with the conjugate gradient method. (example)
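The slide's own example is not preserved, so the following is only a sketch of how a nonlinear conjugate gradient search over two parameters might look. Several things here are assumptions: `toy_error` is a made-up smooth function standing in for whatever error estimate the authors minimize, the search runs over (log2 C, log2 γ), the Polak-Ribière update with restarts is one common CG variant, and the gradient is approximated by finite differences.

```python
import math

def numeric_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

def backtracking_step(f, x, d, g, step=1.0, shrink=0.5, c1=1e-4):
    """Armijo backtracking: shrink the step until f decreases enough along d."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(g, d))
    while step > 1e-12:
        x_new = [xi + step * di for xi, di in zip(x, d)]
        if f(x_new) <= fx + c1 * step * slope:
            return x_new
        step *= shrink
    return x  # no acceptable step found; stay in place

def conjugate_gradient(f, x0, tol=1e-6, max_iter=200):
    """Nonlinear CG (Polak-Ribiere+ with restart) minimizing f from x0."""
    x = list(x0)
    g = numeric_grad(f, x)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        x = backtracking_step(f, x, d, g)
        g_new = numeric_grad(f, x)
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g))
                   / sum(go * go for go in g))
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        # Restart with steepest descent if d is not a descent direction.
        if sum(gn * di for gn, di in zip(g_new, d)) >= 0:
            d = [-gn for gn in g_new]
        g = g_new
    return x

def toy_error(p):
    """Hypothetical stand-in for an SVM error estimate; minimum at (5, -7)."""
    u, v = p  # u = log2(C), v = log2(gamma)
    return (u - 5.0) ** 2 + 2.0 * (v + 7.0) ** 2 + 0.5 * (u - 5.0) * (v + 7.0)

# Start from C = 256 and gamma = 0.01, expressed in log2 space.
start = [math.log2(256), math.log2(0.01)]
best = conjugate_gradient(toy_error, start)
best_C, best_gamma = 2.0 ** best[0], 2.0 ** best[1]
```

In a real setting each evaluation of the objective would involve training and validating an SVM, so the number of function evaluations matters far more than it does for this toy function.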

Methodology: conjugate gradient (Cont.)

Methodology: feature selection
The following descriptors were removed:
• Descriptors with too many zero values (> 90%).
• Descriptors with very small standard deviation values (< 0.5%).
• Descriptors that are highly correlated with others (correlation coefficients > 90%).
The initial C and γ were set to 256 and 0.01, respectively.
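The three removal rules above can be sketched as a simple pre-filter. This is an illustration under stated assumptions: the slide does not say how the 0.5% standard deviation threshold is measured, so it is read here as an absolute standard deviation below 0.005, and the correlation rule is read as an absolute Pearson coefficient above 0.90 against any descriptor already kept. The function name and the sample data are hypothetical.

```python
import math

def column(data, j):
    """Extract column j from a list of rows."""
    return [row[j] for row in data]

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def prefilter_descriptors(data, zero_frac=0.90, min_std=0.005, max_corr=0.90):
    """Return the indices of descriptor columns that pass all three rules."""
    n_rows, n_cols = len(data), len(data[0])
    kept = []
    for j in range(n_cols):
        col = column(data, j)
        # Rule 1: too many zero values.
        if sum(1 for v in col if v == 0) / n_rows > zero_frac:
            continue
        # Rule 2: very small standard deviation (assumed absolute threshold).
        mean = sum(col) / n_rows
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n_rows)
        if std < min_std:
            continue
        # Rule 3: highly correlated with a descriptor that was already kept.
        if any(abs(pearson(col, column(data, k))) > max_corr for k in kept):
            continue
        kept.append(j)
    return kept

# Hypothetical data: 20 compounds, 5 descriptors.
# col 0: varied; col 1: 95% zeros; col 2: constant;
# col 3: linear function of col 0; col 4: alternating sign.
data = [[float(i), 1.0 if i == 0 else 0.0, 1.0, 2.0 * i + 1.0, (-1.0) ** i]
        for i in range(20)]
kept = prefilter_descriptors(data)
```

Because the correlation rule keeps the first descriptor of a correlated pair, the result depends on column order; a genetic algorithm would then search over the surviving descriptors.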

Experiments

Experiments

Conclusion
Parameter optimization with CG further improved the prediction accuracy of the SVM model. The results indicate that considering both feature selection and parameter optimization in SVM modeling helps to develop better prediction models of ADMET-related properties.

Comments
Advantage
• A good integrated scheme for SVM.
Drawback
• The paper contains some specialized proper nouns.
Application
• The prediction of pharmacokinetic properties of drugs.