Correlate Phosphorylation Sites to Kinases by Conditional Random Fields --- CS 104 Project Lu He, Tuobin Wang.

Background and Problem
Problem: Given a group of kinases and a group of phosphorylated proteins, the task is to correlate each phosphorylated protein with its corresponding kinases.

Challenges and Methods
Challenges: Dependencies among the amino acids surrounding the phosphorylation site {Y, S, T}.
Method: Conditional Random Fields vs. SVM and HMM.
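The dependency challenge above can be made concrete with a small sketch of how the residues around each candidate site might be collected before being handed to a sequence model. This is only an illustration, not the method used in the project: the function name `site_windows`, the window half-width, and the `-` padding character are all assumptions introduced here.

```python
def site_windows(sequence, half_width=4, pad="-"):
    """Yield (position, residue, window) for every S, T, or Y in a protein sequence.

    The window holds `half_width` residues on each side of the candidate site.
    Positions that run past either end of the sequence are filled with `pad`.
    (half_width and pad are illustrative choices, not values from the slides.)
    """
    padded = pad * half_width + sequence + pad * half_width
    for i, residue in enumerate(sequence):
        if residue in "STY":
            # Index i in the original sequence sits at i + half_width in `padded`,
            # so the slice below is centered on the candidate site.
            window = padded[i : i + 2 * half_width + 1]
            yield i, residue, window


if __name__ == "__main__":
    for position, residue, window in site_windows("MKSAY", half_width=2):
        print(position, residue, window)
```

Each window keeps the neighboring residues in order, which is the information a model such as a CRF or HMM would need in order to capture dependencies between adjacent positions.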

Data Set and Evaluation
Phospho.ELM website: in vivo protein-phosphorylation sites that are linked to at least one kinase.
Evaluation:
1. 5-fold cross validation
2. TP, TN, FP, FN
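The evaluation plan above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's code: the helper names `k_fold_indices` and `confusion_counts`, the fixed shuffle seed, and the binary 0/1 label encoding are all hypothetical.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.

    Indices are shuffled once with a fixed seed (an illustrative choice),
    then dealt into k folds; each fold serves as the test set exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, and FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, predicted in zip(y_true, y_pred):
        if truth and predicted:
            counts["TP"] += 1
        elif not truth and not predicted:
            counts["TN"] += 1
        elif predicted:
            counts["FP"] += 1  # predicted positive, actually negative
        else:
            counts["FN"] += 1  # predicted negative, actually positive
    return counts


if __name__ == "__main__":
    for train, test in k_fold_indices(10, k=5):
        print(len(train), len(test))
    print(confusion_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In a full experiment, a model would be trained on each `train` split and its predictions on the matching `test` split passed to `confusion_counts`, with the four counts summed or averaged over the five folds.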

Timeline
Feb 8: Project proposal.
Feb 9 - Feb 14: Collect data and learn the relevant background.
Feb 15 - Feb 22: Coding and solving problems encountered along the way.
Feb 23 - Mar 2: Finish coding and experiments.
Mar 3 - Mar 7: Analyze and discuss the experimental results; finish the final write-up and poster.

Thanks!