Automated Model-Building with TEXTAL Thomas R. Ioerger Department of Computer Science Texas A&M University.



Automated model-building program
Can we automate the kind of visual processing of patterns that crystallographers use?
–Intelligent methods to interpret density, despite noise
–Exploit knowledge about typical protein structure
Focus on medium-resolution maps
–optimized for 2.8 Å (actually, A is fine)
–typical for MAD data (useful for high-throughput)
–other programs exist for higher-res data (ARP/wARP)

Overview of TEXTAL
Electron density map (not structure factors) → TEXTAL → Protein model (may need refinement)

Main Stages of TEXTAL
electron density map → CAPRA → Cα chains → LOOKUP → model (initial coordinates) → Post-processing routines → model (final coordinates)
–LOOKUP: build-in side-chain and main-chain atoms locally around each Cα
–Post-processing routines, example: real-space refinement
Reciprocal-space refinement/DM
Human Crystallographer (editing)

CAPRA: C-Alpha Pattern-Recognition Algorithm
tracing
linking
Neural network: estimates which pseudo-atoms are closest to true Cα's

Example of Cα-chains fit by CAPRA
Rat α2 urinary protein (P. Adams)
data: 2.5 Å MR
map generated at 2.8 Å
% built: 84%
# chains: 2
lengths: 47, 88
RMSD: 0.82 Å
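The RMSD figure quoted for this example is a root-mean-square deviation between built and reference Cα positions: RMSD = sqrt((1/N) Σ |p_i − q_i|²). The sketch below shows that arithmetic only. It assumes the built and reference atoms have already been paired one-to-one; how the pairing was done for the slide is not stated, and the function name `rmsd` is illustrative.

```python
import math

def rmsd(predicted, reference):
    """Root-mean-square deviation between paired 3D coordinates (same units in, same units out)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between each predicted atom and its paired reference atom.
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

# Toy coordinates (hypothetical, not taken from the slide): every atom is off by 1 unit in z.
built = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_positions = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(built, true_positions))
```

Because the deviation is squared before averaging, a few badly placed atoms raise the RMSD more than many slightly misplaced ones.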

Stage 2: LOOKUP
LOOKUP is based on Pattern Recognition
–Given a local (5 Å spherical) region of density, have we seen a pattern like this before (in another map)?
–If so, use similar atomic coordinates.
Use a database of maps with known structures
–200 proteins from PDB-Select (non-redundant)
–back-transformed (calculated) maps at 2.8 Å (no noise)
–regions centered on 50,000 Cα's
Use feature extraction to match regions efficiently
–features (e.g. moments) represent local density patterns
–features must be rotation-invariant (independent of 3D orientation)
–use density correlation for more precise evaluation
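The two-step matching described on this slide (a cheap feature comparison to shortlist candidates, then density correlation to pick among them) can be sketched as follows. This is a minimal illustration, not TEXTAL's code: the Euclidean feature distance, the Pearson correlation, the dictionary layout, and the shortlist size `k` are all assumptions, and the sketch assumes the two density regions are already sampled in the same orientation, whereas the real process also searches for the optimal rotation.

```python
import math

def feature_distance(f1, f2):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def density_correlation(d1, d2):
    """Pearson correlation between two equally sampled density regions."""
    n = len(d1)
    m1, m2 = sum(d1) / n, sum(d2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(d1, d2))
    s1 = math.sqrt(sum((a - m1) ** 2 for a in d1))
    s2 = math.sqrt(sum((b - m2) ** 2 for b in d2))
    if s1 == 0 or s2 == 0:
        return 0.0  # flat density carries no pattern to correlate
    return cov / (s1 * s2)

def lookup(query, database, k=2):
    """Stage 1: keep the k entries with the closest features.
    Stage 2: return the shortlisted entry with the highest density correlation."""
    shortlist = sorted(
        database,
        key=lambda e: feature_distance(query["features"], e["features"]),
    )[:k]
    return max(shortlist, key=lambda e: density_correlation(query["density"], e["density"]))

# Hypothetical toy data: two features and four density samples per region.
query = {"features": [1.0, 2.0], "density": [1.0, 2.0, 3.0, 4.0]}
database = [
    {"name": "A", "features": [1.1, 2.0], "density": [4.0, 3.0, 2.0, 1.0]},
    {"name": "B", "features": [1.0, 2.3], "density": [2.0, 4.0, 6.0, 8.5]},
    {"name": "C", "features": [5.0, 5.0], "density": [1.0, 2.0, 3.0, 4.0]},
]
best = lookup(query, database)
print(best["name"])
```

Entry A has the closest features but anti-correlated density, so the second stage rejects it in favor of B. Entry C never reaches the correlation step because the feature filter discards it first, which is the point of the design: the expensive comparison runs on only a handful of candidates.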

Examples of Numeric Density Features
Distance from center-of-sphere to center-of-mass
Moments of inertia - relative dispersion along orthogonal axes
Geometric features like "Spoke angles"
Local variance and other statistics
TEXTAL uses 19 distinct numeric features to represent the pattern of density in a region, each calculated over 4 different radii, for a total of 76 features.
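To make the idea of rotation-invariant features computed over several radii concrete, here is a small sketch covering three of the simpler quantities: distance from the sphere center to the center of mass, mean density, and local variance. The radii (3, 4, 5 and 6 Å), the point-list representation of the map, and all function names are assumptions for illustration; the slide does not give them, and moments of inertia and spoke angles are left out.

```python
import math

def region_features(points, center, radius):
    """points: list of ((x, y, z), density) samples.
    Returns (center-of-mass distance, mean density, density variance) inside the sphere."""
    inside = [(p, d) for p, d in points if math.dist(p, center) <= radius]
    total = sum(d for _, d in inside)
    if not inside or total == 0:
        return (0.0, 0.0, 0.0)
    # Density-weighted center of mass of the samples inside the sphere.
    com = tuple(sum(p[i] * d for p, d in inside) / total for i in range(3))
    mean = total / len(inside)
    variance = sum((d - mean) ** 2 for _, d in inside) / len(inside)
    return (math.dist(com, center), mean, variance)

def feature_vector(points, center, radii=(3.0, 4.0, 5.0, 6.0)):
    """Concatenate the per-radius features into one vector (3 features x 4 radii here)."""
    vector = []
    for r in radii:
        vector.extend(region_features(points, center, r))
    return vector

def rotate_z90(p):
    """Rotate a point by 90 degrees about the z axis."""
    x, y, z = p
    return (-y, x, z)

# Hypothetical toy region centered at the origin.
center = (0.0, 0.0, 0.0)
points = [
    ((1.0, 0.0, 0.0), 2.0),
    ((0.0, 2.0, 0.0), 1.0),
    ((0.0, 0.0, 3.5), 0.5),
    ((-1.0, -1.0, 0.0), 1.5),
]
original = feature_vector(points, center)
rotated = feature_vector([(rotate_z90(p), d) for p, d in points], center)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, rotated)))
```

Rotating the region about its center leaves every value unchanged because each feature depends only on distances from the center and on the density values themselves. That property is what lets a region be compared against database regions without knowing its 3D orientation in advance.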


The LOOKUP Process
Region in map to be interpreted
Database of known maps
Find optimal rotation

Stage 3: Post-Processing

Interfaces for Using TEXTAL
Stand-alone commands and scripts
–capra-scale prot.xplor prot-scaled.xplor
–neotex.sh myprotein > textal.log
–lots of intermediate files and logs…
WINTEX: Tcl/Tk interface
–creates jobs in sub-directories
Public Release: July 2004
Integrated into Phenix
–Python module
–model-building tasks in GUI

Gallery of Examples

Conclusions
Pattern recognition is a successful technique for macromolecular model-building
Future directions:
–building ligands, co-factors, etc.
–recognizing disulfide bridges
–phase improvement (iterating with refinement)
–loop-building
–further integration with Phenix
–Intelligent Agent-based methods for guiding/automating model-building
–interactive graphics for specialized needs (e.g. fixing chains, editing identities)

Acknowledgements
Funding:
–National Institutes of Health
People:
–James C. Sacchettini
–Kevin Childs, Kreshna Gopal, Lalji Kanbi, Erik McKee, Reetal Pai, Tod Romo
Our association with the PHENIX group:
–Paul Adams (Lawrence Berkeley National Lab)
–Randy Read (Cambridge University)
–Tom Terwilliger (Los Alamos National Lab)