Motif Space Database Design, Kiranjit Sidhu

Presentation transcript:

Motif Space Database Design Kiranjit Sidhu

2 Outline
- Schema Design
- Content of Database
- Functionality
- Future Plans

3 Sample PDB File
- Each PDB file is represented as a text file (~60K lines)
- This is inefficient for pattern matching
- A relational database is required for the most efficient solution
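To make the text-file point concrete, here is a minimal Python sketch of reading one ATOM record from a PDB file. It assumes the standard fixed-width column layout of the PDB format. The `Atom` type, the function names, and the sample values are illustrative and do not come from the slides.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM record (0-based slices of the PDB columns)."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


def make_sample_line() -> str:
    """Build an illustrative ATOM line with exact column widths (made-up values)."""
    return "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:>8.3f}{:>8.3f}{:>8.3f}".format(
        "ATOM", 1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614
    )


atom = parse_atom_line(make_sample_line())
print(atom.chain_id, atom.res_name, atom.x)
```

Every pattern-matching question asked of the raw files has to repeat this kind of line-by-line scan, which is the cost the slide attributes to the text representation.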

4 Structure of Database
- DB divided into two major components: Protein Data and Motif (Occurrence) Data
- Protein Data
  - Obtained from PDB (Protein Data Bank) files
  - Derived data
- Motif Data
  - Obtained from Luke’s FFSM technique
  - Derived data

5 Schema Design

6 Schema Design - Protein

7 Schema Design - Motif

8 Tools Used
- Obtaining data: Perl scripts
- Database: SQL Server 2000 and SQL Server 2005
  - T-SQL (bulk import of data)

9 Obtaining Data
PDB File -> [Extract] -> CSV File -> [Import] -> Temp Tables (T-SQL) -> [Convert and Derive, T-SQL Procedures] -> Final DB
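The Extract, Import, and Convert and Derive steps on this slide can be sketched end to end. This is a minimal Python sketch, not the author's implementation: the slides name Perl scripts and T-SQL on SQL Server, whereas this uses Python's `csv` module and an in-memory SQLite database as a stand-in. The table names `staging_atom`, `atom`, and `chain_summary` and the column set are hypothetical.

```python
import csv
import io
import sqlite3


def extract_to_csv(rows, out):
    """Extract step: write parsed rows to a CSV file-like object."""
    writer = csv.writer(out)
    writer.writerow(["pdb_id", "chain_id", "res_seq", "x", "y", "z"])
    writer.writerows(rows)


def import_csv(conn, src):
    """Import step: load the CSV into an all-text staging (temp) table."""
    conn.execute(
        "CREATE TABLE staging_atom "
        "(pdb_id TEXT, chain_id TEXT, res_seq TEXT, x TEXT, y TEXT, z TEXT)"
    )
    reader = csv.reader(src)
    next(reader)  # skip the header row
    conn.executemany("INSERT INTO staging_atom VALUES (?, ?, ?, ?, ?, ?)", reader)


def convert_and_derive(conn):
    """Convert and derive step: type the columns, then compute derived data."""
    conn.execute(
        "CREATE TABLE atom AS "
        "SELECT pdb_id, chain_id, CAST(res_seq AS INTEGER) AS res_seq, "
        "CAST(x AS REAL) AS x, CAST(y AS REAL) AS y, CAST(z AS REAL) AS z "
        "FROM staging_atom"
    )
    # Hypothetical derived table: number of rows per PDB/chain combination.
    conn.execute(
        "CREATE TABLE chain_summary AS "
        "SELECT pdb_id, chain_id, COUNT(*) AS n_atoms "
        "FROM atom GROUP BY pdb_id, chain_id"
    )
    conn.execute("DROP TABLE staging_atom")


def run_pipeline(rows):
    """Run all three steps against an in-memory database and return it."""
    buffer = io.StringIO()
    extract_to_csv(rows, buffer)
    buffer.seek(0)
    conn = sqlite3.connect(":memory:")
    import_csv(conn, buffer)
    convert_and_derive(conn)
    return conn
```

Loading into an untyped staging table first keeps the import step simple and fast, and leaves type conversion and derivation to set-based SQL inside the database.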

10 Uploading Protein Data
- Input dataset: ~70,000 PDB/chain combinations
- Entries in tables: e.g. approx. 800 million rows in the proteinchaindistance table
- Initial version imported 10 PDB files in 1 day
- Current version: under 3 minutes
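The slides do not define what a row of the proteinchaindistance table holds. As an assumption for illustration only, the sketch below treats each row as the Euclidean distance between one pair of points (for example residues) within a chain. The function name is hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return (i, j, distance) for every unordered pair of 3D points."""
    return [
        (i, j, math.dist(coords[i], coords[j]))
        for i, j in combinations(range(len(coords)), 2)
    ]


# Three made-up points: n points yield n * (n - 1) / 2 rows.
for i, j, d in pairwise_distances([(0, 0, 0), (3, 4, 0), (0, 0, 12)]):
    print(i, j, d)
```

Under this assumption the row count grows quadratically with chain length, which would be consistent with the figures above: 800 million rows over about 70,000 PDB/chain combinations is roughly 11,000 rows per chain on average.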

11 Current Functionality
- Protein (PDB) data has been completely uploaded into both:
  - Production database (MotifSpace)
  - Development database (MotifSpaceDev)
- Protein structure can be visualized using data from the database (data available)
- Data can be obtained from the server using SOAP or web services
- Basic queries, such as:
  - Which PDBs does a specific motif occur in?
  - Histograms to compute statistics
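The two basic queries listed above might look like the following. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for SQL Server. The actual MotifSpace schema is not shown in the transcript, so the `motif_occurrence` table, its columns, and the PDB identifiers are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny hypothetical motif-occurrence table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE motif_occurrence "
        "(motif_id INTEGER, pdb_id TEXT, chain_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO motif_occurrence VALUES (?, ?, ?)",
        [(1, "1ABC", "A"), (1, "1ABC", "B"), (1, "2XYZ", "A"), (2, "2XYZ", "A")],
    )
    return conn


def pdbs_containing(conn, motif_id):
    """Which distinct PDBs does the given motif occur in?"""
    rows = conn.execute(
        "SELECT DISTINCT pdb_id FROM motif_occurrence "
        "WHERE motif_id = ? ORDER BY pdb_id",
        (motif_id,),
    )
    return [row[0] for row in rows]


def pdb_count_per_motif(conn):
    """Histogram-style statistic: number of distinct PDBs per motif."""
    rows = conn.execute(
        "SELECT motif_id, COUNT(DISTINCT pdb_id) FROM motif_occurrence "
        "GROUP BY motif_id ORDER BY motif_id"
    )
    return dict(rows.fetchall())


conn = build_demo_db()
print(pdbs_containing(conn, 1))
print(pdb_count_per_motif(conn))
```

Both questions reduce to a single indexed lookup or a `GROUP BY`, which is the kind of work the relational design is meant to make cheap compared with scanning PDB text files.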

12 Demo