Motif Space Database Design (Kiranjit Sidhu)


1 Motif Space Database Design Kiranjit Sidhu

2 Outline
- Schema Design
- Content of Database
- Functionality
- Future Plans

3 Sample PDB File
- Each PDB file is represented as a text file (~60K lines)
- Inefficient for pattern matching
- Relational database required for the most efficient solution
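To illustrate why a flat PDB text file is awkward to search, the sketch below pulls the fixed-column `ATOM` records out of a file's lines and turns them into typed rows that could later be loaded into a table. The column positions follow the published PDB format; the `AtomRow` type and the function name are hypothetical and are not taken from the slides.

```python
from typing import Iterable, List, NamedTuple


class AtomRow(NamedTuple):
    """One ATOM record, reduced to the fields used here (hypothetical type)."""
    serial: int
    atom_name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_records(lines: Iterable[str]) -> List[AtomRow]:
    """Extract ATOM records from the lines of a PDB file.

    PDB is a fixed-column format, so fields are sliced by position
    (0-based slices of the 1-based columns in the format description).
    """
    rows = []
    for line in lines:
        if line[:6] != "ATOM  ":
            continue  # skip HEADER, REMARK, HETATM, and all other record types
        rows.append(AtomRow(
            serial=int(line[6:11]),          # columns 7-11
            atom_name=line[12:16].strip(),   # columns 13-16
            res_name=line[17:20].strip(),    # columns 18-20
            chain_id=line[21],               # column 22
            res_seq=int(line[22:26]),        # columns 23-26
            x=float(line[30:38]),            # columns 31-38
            y=float(line[38:46]),            # columns 39-46
            z=float(line[46:54]),            # columns 47-54
        ))
    return rows
```

Every question about the structure requires a pass like this over the whole file, which is the inefficiency the slide points to; once the rows sit in indexed tables, the same question becomes a query.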

4 Structure of Database
- DB divided into two major components:
  - Protein Data
  - Motif (Occurrence) Data
- Protein Data
  - Obtained from PDB (Protein Data Bank) files
  - Derived data
- Motif Data
  - Obtained from Luke's FFSM technique
  - Derived data

5 Schema Design

6 Schema Design - Protein

7 Schema Design - Motif

8 Tools Used
- Obtaining data: Perl scripts
- Database: SQL Server 2000 and SQL Server 2005
  - T-SQL (bulk data import)

9 Obtaining Data
PDB File -> (Extract) -> CSV File -> (Import) -> Temp Tables (T-SQL) -> (Convert and Derive, T-SQL Procedures) -> Final DB
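The extract, import, and convert stages above can be sketched in a few lines. This is only an illustration of the staging-table pattern: it uses Python's built-in `sqlite3` as a stand-in for SQL Server and T-SQL, and the table names (`staging_atom`, `atom`), the column layout, and the function name are assumptions, not the schema from the slides.

```python
import csv
import io
import sqlite3


def load_via_staging(conn: sqlite3.Connection, csv_text: str) -> int:
    """Import CSV rows into a temp table, then convert them into the final table.

    Returns the number of rows moved into the final table.
    """
    cur = conn.cursor()
    # Staging ("temp") table: every column is plain text, exactly as extracted.
    cur.execute(
        "CREATE TEMP TABLE staging_atom "
        "(pdb_id TEXT, chain_id TEXT, res_seq TEXT, x TEXT, y TEXT, z TEXT)"
    )
    # Final table: typed columns (hypothetical layout).
    cur.execute(
        "CREATE TABLE IF NOT EXISTS atom ("
        "pdb_id TEXT NOT NULL, chain_id TEXT NOT NULL, res_seq INTEGER NOT NULL, "
        "x REAL NOT NULL, y REAL NOT NULL, z REAL NOT NULL)"
    )
    # Import step: bulk-load the CSV file produced by the extract step.
    reader = csv.reader(io.StringIO(csv_text))
    cur.executemany(
        "INSERT INTO staging_atom VALUES (?, ?, ?, ?, ?, ?)",
        (row for row in reader if row),  # ignore blank lines
    )
    moved = cur.execute("SELECT COUNT(*) FROM staging_atom").fetchone()[0]
    # Convert step: cast text to proper types and normalise the PDB id.
    cur.execute(
        "INSERT INTO atom "
        "SELECT upper(pdb_id), chain_id, CAST(res_seq AS INTEGER), "
        "CAST(x AS REAL), CAST(y AS REAL), CAST(z AS REAL) "
        "FROM staging_atom"
    )
    cur.execute("DROP TABLE staging_atom")
    conn.commit()
    return moved
```

Loading untyped text first and converting inside the database keeps the bulk import simple and moves validation and derivation into set-based SQL, which is the usual motivation for a staging table.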

10 Uploading Protein Data
- Input dataset: ~70,000 PDB/chain combinations
- Entries in tables: e.g. approx. 800 million rows in the proteinchaindistance table
- Initial version imported 10 PDB files in 1 day
- Current version: under 3 minutes
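As a rough check on the scale, 800 million rows spread over about 70,000 PDB/chain combinations is on the order of 11,400 rows per combination. The slides give only the table name, so the sketch below assumes that proteinchaindistance holds pairwise distances between positions within a chain; the function name, the row layout, and the optional `cutoff` parameter are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def pairwise_distance_rows(
    coords: Sequence[Point], cutoff: Optional[float] = None
) -> List[Tuple[int, int, float]]:
    """Return (i, j, distance) rows for every pair i < j of positions in a chain.

    Without a cutoff, n positions yield n * (n - 1) / 2 rows, which is why a
    derived distance table grows much faster than the raw coordinate data.
    """
    rows = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        d = math.dist(a, b)  # Euclidean distance (Python 3.8+)
        if cutoff is None or d <= cutoff:
            rows.append((i, j, d))
    return rows
```

Under this assumption, a distance cutoff is one way to keep the row count manageable, since most pairs in a large chain are far apart.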

11 Current Functionality
- Protein (PDB) data has been completely uploaded into both:
  - Production database (MotifSpace)
  - Development database (MotifSpaceDev)
- Visualize protein structure using data from the database (data available)
- Data can be obtained from the server using SOAP or web services
- Basic queries, such as: which different PDBs does a specific motif occur in?
- Histograms to compute statistics
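The two query types mentioned above can be sketched as follows. The example again uses `sqlite3` in place of SQL Server, and the `motif_occurrence` table, its columns, and the demo rows are invented for illustration; the real MotifSpace schema is not shown in this transcript.

```python
import sqlite3
from typing import Dict, List


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical motif occurrence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE motif_occurrence ("
        "motif_id INTEGER NOT NULL, pdb_id TEXT NOT NULL, chain_id TEXT NOT NULL)"
    )
    # Made-up demo rows: motif 1 occurs three times, motif 2 once.
    conn.executemany(
        "INSERT INTO motif_occurrence VALUES (?, ?, ?)",
        [(1, "1ABC", "A"), (1, "1ABC", "B"), (1, "2XYZ", "A"), (2, "2XYZ", "A")],
    )
    return conn


def pdbs_containing_motif(conn: sqlite3.Connection, motif_id: int) -> List[str]:
    """Which different PDBs does a specific motif occur in?"""
    rows = conn.execute(
        "SELECT DISTINCT pdb_id FROM motif_occurrence "
        "WHERE motif_id = ? ORDER BY pdb_id",
        (motif_id,),
    )
    return [r[0] for r in rows]


def occurrence_histogram(conn: sqlite3.Connection) -> Dict[int, int]:
    """Count occurrences per motif, the raw data for a histogram."""
    rows = conn.execute(
        "SELECT motif_id, COUNT(*) FROM motif_occurrence "
        "GROUP BY motif_id ORDER BY motif_id"
    )
    return dict(rows.fetchall())
```

With the demo rows, `pdbs_containing_motif(conn, 1)` returns the two distinct PDB ids, even though motif 1 occurs in two chains of the same structure.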

12 Demo

