Using Prior Knowledge to Improve Scoring in High-Throughput Top-Down Proteomics Experiments Rich LeDuc Le-Shin Wu.

Presentation transcript:

Using Prior Knowledge to Improve Scoring in High-Throughput Top-Down Proteomics Experiments Rich LeDuc Le-Shin Wu

The “Scoring” Problem Proteoforms are hypotheses about what was in MS. The model “knows” the process. Output is a ranked list of hypotheses. Science builds on prior knowledge. [Diagram labels: Competing Hypotheses; Process Model; Ranked list of hypotheses, with measure of confidence, under a given model.]

Meng-Kelleher p-score ‘P score’ = $P_{f,n} = \dfrac{(xf)^{n}\, e^{-xf}}{n!}$, where f is the number of observed fragment ions, n is the number of matches, and x is the probability that a single fragment ion matches by chance, which depends on the mass accuracy $M_a$. F. Meng, B. Cargile, L. Miller, J. Johnson, and N. Kelleher, Nat. Biotechnol., 2001, 19,
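The p-score has the form of a Poisson probability with mean xf, so it can be evaluated directly. The following is a minimal sketch, not the ProSight implementation: the function name `poisson_p_score` and the choice to pass x in as a ready-made number (rather than deriving it from the mass accuracy) are assumptions made for illustration.

```python
import math


def poisson_p_score(x: float, f: int, n: int) -> float:
    """Poisson probability of exactly n matches when the mean is x * f.

    x: assumed per-fragment probability of a chance match (hypothetical input)
    f: number of fragment ions
    n: number of matches
    """
    if x < 0 or f < 0 or n < 0:
        raise ValueError("x, f and n must be non-negative")
    mean = x * f
    if mean == 0:
        # With no expected chance matches, only n == 0 has non-zero probability.
        return 1.0 if n == 0 else 0.0
    # Work in log space so that large n does not overflow n!.
    log_p = n * math.log(mean) - mean - math.lgamma(n + 1)
    return math.exp(log_p)


if __name__ == "__main__":
    # Illustrative numbers only: 100 fragment ions, 1% chance-match rate.
    for matches in (0, 1, 5, 10):
        print(matches, poisson_p_score(0.01, 100, matches))
```

Smaller values mean the observed number of matches is less likely to have arisen by chance.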

Specific Example

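Bayesian Approach $P(H \mid D) = \dfrac{P(D \mid H)\, P(H)}{P(D)}$, where $P(H)$ is the prior probability of the proteoform, $P(D \mid H)$ is the likelihood of the proteoform given the observed data, $P(D)$ is the probability of the data, and $P(H \mid D)$ is the posterior probability of the proteoform after making the observations.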
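Bayes' theorem applied to a set of competing proteoform hypotheses can be sketched in a few lines. This is an illustration only: the function name `posterior`, the example numbers, and the assumption that the candidate list is exhaustive (so that the probability of the data is the sum of prior times likelihood over the candidates) are not taken from the slides.

```python
from typing import List, Sequence


def posterior(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Posterior probability of each candidate via Bayes' theorem.

    Assumes the candidates are the only possible explanations, so the
    probability of the data is sum(prior_i * likelihood_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # probability of the data under this candidate set
    if evidence <= 0:
        raise ValueError("at least one candidate needs a non-zero prior and likelihood")
    return [j / evidence for j in joint]


if __name__ == "__main__":
    # Hypothetical values: three candidate proteoforms with unequal priors.
    priors = [0.5, 0.3, 0.2]
    likelihoods = [0.1, 0.4, 0.4]
    print(posterior(priors, likelihoods))
```

Unequal priors are where prior knowledge enters: two candidates with the same likelihood end up with different posteriors.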

The Scoring Model From Bayes Theorem we have: From independence we can: Which gives our final scoring function:
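The equations on this slide did not survive extraction. The block below is one plausible reading, not the authors' exact formulas. It assumes that the data D consist of an observed precursor mass M and observed fragment masses m_1, ..., m_J, and that these observations are conditionally independent given the proteoform H; all of these symbols are introduced here for illustration.

```latex
% Bayes' theorem for a candidate proteoform H and observed data D:
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}

% Assumed independence of the MS1 and MS2 observations given H:
P(D \mid H) = P(M \mid H) \prod_{j=1}^{J} P(m_j \mid H)

% Resulting score, normalised over all candidates H':
P(H \mid D) =
  \frac{P(H)\, P(M \mid H) \prod_{j=1}^{J} P(m_j \mid H)}
       {\sum_{H'} P(H')\, P(M \mid H') \prod_{j=1}^{J} P(m_j \mid H')}
```

Under this reading, the MS1 and MS2 generative models each supply one factor of the likelihood.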

MS1 Generative Model Given a certain theoretical proteoform, what is the probability of seeing the observed precursor mass? Likelihood Fun Facts: Area does not equal one. Need some level for “wrong precursor mass”.
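The two "fun facts" can be illustrated with a toy likelihood function. The slides do not give the functional form, so everything here is an assumption: a Gaussian-shaped peak centred on the theoretical mass, a constant floor standing in for the "wrong precursor mass" level, and the hypothetical parameters `sigma_da` and `floor`.

```python
import math


def ms1_likelihood(observed_mass: float,
                   theoretical_mass: float,
                   sigma_da: float = 0.5,
                   floor: float = 1e-3) -> float:
    """Toy likelihood of an observed precursor mass given a theoretical one.

    sigma_da: assumed width of the peak, in daltons (hypothetical)
    floor:    assumed level returned when the precursor mass is simply wrong
    The result is a likelihood, not a density: it peaks at 1 and its area
    over all masses is not one.
    """
    if sigma_da <= 0 or not 0.0 <= floor <= 1.0:
        raise ValueError("sigma_da must be positive and floor must be in [0, 1]")
    delta = observed_mass - theoretical_mass
    peak = math.exp(-0.5 * (delta / sigma_da) ** 2)
    # Blend the peak with the floor so the value never drops to zero.
    return floor + (1.0 - floor) * peak


if __name__ == "__main__":
    print(ms1_likelihood(10000.0, 10000.0))  # exact match
    print(ms1_likelihood(10000.4, 10000.0))  # small mass error
    print(ms1_likelihood(10100.0, 10000.0))  # wrong precursor mass
```

Without the floor, a proteoform whose precursor mass is far off would get a likelihood of zero, and no amount of fragment evidence could recover it.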

MS2 Generative Model [Figure: probability versus fragment mass, with the mass axis running from 0 to I; labels: $w_i$, $m_i$, Noise = k.]

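Lambda Scores Assume that, prior to scoring, each sequence had an equal probability of being the correct sequence. This means that if we are considering k sequences, then our prior probability is just $1/k$. So the ratio of the posterior over the prior is $\lambda = \dfrac{P(H \mid D)}{1/k} = k\, P(H \mid D)$, where $P(H \mid D)$ is the posterior probability of the sequence given the observed data.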
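With a uniform prior of 1/k, the posterior-to-prior ratio follows directly from the likelihoods. The sketch below is illustrative: the function name `lambda_scores` and the example likelihoods are hypothetical, and the k sequences are assumed to be the only candidates.

```python
from typing import List, Sequence


def lambda_scores(likelihoods: Sequence[float]) -> List[float]:
    """Ratio of posterior to prior for k candidates with equal priors 1/k."""
    k = len(likelihoods)
    if k == 0:
        raise ValueError("need at least one candidate sequence")
    prior = 1.0 / k
    evidence = sum(prior * l for l in likelihoods)
    if evidence <= 0:
        raise ValueError("at least one likelihood must be positive")
    posteriors = [prior * l / evidence for l in likelihoods]
    # Dividing by the prior is the same as multiplying the posterior by k.
    return [p / prior for p in posteriors]


if __name__ == "__main__":
    # Hypothetical likelihoods for three candidate sequences.
    print(lambda_scores([0.1, 0.4, 0.5]))
```

A ratio above 1 means the data raised the probability of that sequence relative to its prior; a ratio below 1 means the data lowered it. The ratios always sum to k.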

Lambda Spread The lambda score spreads hits with the same number of matching fragment ions.

Room for Improvement One set of real observations scored against 890,000 random “theoretical” proteoforms. [Figure: panels “Initial Version” and “I to Max of all proteoforms”; x-axis: Theoretical Mass.]

Scoring Models Compared Ahlf, D.R., Compton, P.D., Tran, J.C., Early, B.P., Thomas, P.M., Kelleher, N.L. “Evaluation of the Compact High-Field Orbitrap for Top-Down Proteomics of Human Cells”, J. Proteome Res., 2012, 11, PMCID: PMC

Future Directions Add oxidation for MS1. Improve modeling of various processes. Incorporate into a search engine.

Conclusions Include prior knowledge: Science builds on itself. There is a system that gives a framework for including prior knowledge in models. This particular implementation is better than older scoring systems, and it can improve!

Acknowledgements and Questions Kelleher group for providing the data. All my many colleagues whom I have worked with on this project over the years. Of course all the related funding agencies, but specifically NSF ABI

Acknowledgements & disclaimer This material is based upon work supported by the National Science Foundation under Grants No. ABI. This work was supported in part by the Lilly Endowment, Inc. and the Indiana University Pervasive Technology Institute. Any opinions presented here are those of the presenter(s) and do not necessarily represent the opinions of the National Science Foundation or any other funding agencies.

License terms Please cite as: LeDuc, R.D., Using Prior Knowledge to Improve Scoring in High-Throughput Top-Down Proteomics Experiments, presented at ASMS 2013 Minneapolis MN June. Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse. Except where otherwise noted, contents of this presentation are copyright 2013 by the Trustees of Indiana University. This document is released under the Creative Commons Attribution 3.0 Unported license. This license includes the following terms: You are free to share (to copy, distribute and transmit the work) and to remix (to adapt the work) under the following conditions: attribution (you must attribute the work in the manner specified by the author or licensor, but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.