Published by Theresa George. Modified over 9 years ago.
1
Structure Representation and Coordinates Format
Lecture 3, Structural Bioinformatics
Dr. Avraham Samson
81-871
2
The PDB Format
A full description is here.
It was designed around an 80-column punched card!
It was designed to be human readable.
It is used by almost every piece of software that deals with structural data.
3
The PDB Format - Records
Every PDB file may be broken into a number of lines terminated by an end-of-line indicator. Each line in the PDB entry file consists of 80 columns. The last character in each PDB entry should be an end-of-line indicator.
Each line in the PDB file is self-identifying. The first six columns of every line contain a record name, left-justified and blank-filled. This must be an exact match to one of the stated record names.
The PDB file may also be viewed as a collection of record types. Each record type consists of one or more lines. Each record type is further divided into fields.
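The self-identifying property described above can be sketched in a few lines of Python. This is only an illustration, not the lecture's code: the helper names `record_name` and `group_by_record` are made up here, and the sample lines are invented and shortened (real lines are padded to 80 columns).

```python
def record_name(line):
    """Return the record name stored in columns 1-6, without the padding blanks."""
    return line[:6].strip()


def group_by_record(lines):
    """Map each record name to the list of lines that carry it."""
    groups = {}
    for line in lines:
        groups.setdefault(record_name(line), []).append(line.rstrip("\n"))
    return groups


# Invented, shortened sample lines (assumption: not taken from a real entry).
sample = [
    "HEADER    TOXIN",
    "REMARK   4 EXAMPLE TEXT",
    "ATOM      1  N   VAL A  11",
    "ATOM      2  CA  VAL A  11",
    "END",
]
groups = group_by_record(sample)
print(sorted(groups))
```

Because the record name always sits in the same six columns, no tokenizing is needed to decide what kind of line is being read.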
4
The PDB Format – An Example – The Header
5
The PDB Format – An Example – The Atomic Coordinates
6
The Description – Atom Records
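The slide's field table did not survive in this transcript, so the following is a minimal sketch of fixed-column ATOM handling. It assumes the standard wwPDB column ranges (serial 7-11, atom name 13-16, residue name 18-20, chain 22, residue number 23-26, x/y/z 31-54, occupancy 55-60, B factor 61-66, element 77-78). The function names are hypothetical, and the sample values are borrowed from the mmCIF extract later in the lecture.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z, occ, b, element):
    """Build an ATOM line with every field placed in its fixed column range."""
    return (
        f"{'ATOM':<6}{serial:>5} {name:<4} {res_name:>3} {chain:1}{res_seq:>4}{'':4}"
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}{'':10}{element:>2}"
    )


def parse_atom_line(line):
    """Slice an ATOM line by column position (0-based slices of 1-based columns)."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


# The atom name " N" starts with a blank because one-letter elements begin in column 14.
line = format_atom_line(1, " N", "VAL", "A", 11, 25.360, 30.691, 11.795, 1.00, 17.93, "N")
atom = parse_atom_line(line)
print(line)
print(atom["name"], atom["x"], atom["y"], atom["z"])
```

Note that the parser slices by position rather than splitting on whitespace: adjacent fields in a PDB line may touch each other with no blank in between, so a whitespace split is not reliable.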
7
What is Wrong with this Approach?
The description and the data are separate.
Parsing is a nightmare – the most complex piece of code we have in our research laboratory probably remains the PDB parser.
There are no relationships between items of data.
Some data just cannot be parsed.
The fixed column format cannot represent some of today’s structures …
8
Structures are Spread Over Multiple Files – Most Users are Not Aware of this
9
PDB Format - Important Components of the Data are Lost to All But Humans
REMARK   3 REFINEMENT. BY THE RESTRAINED LEAST-SQUARES PROCEDURE OF
REMARK   3 J. KONNERT AND W. HENDRICKSON (PROGRAM *PROLSQ*). THE R
REMARK   3 VALUE IS 0.168 FOR 2680 REFLECTIONS WITH I GREATER THAN
REMARK   3 2.0*SIGMA(I) REPRESENTING 74 PER CENT OF THE TOTAL
REMARK   3 AVAILABLE DATA IN THE RESOLUTION RANGE 10.0 TO 2.0
REMARK   3 ANGSTROMS.
REMARK   4 THE ERABUTOXIN A (EA) CRYSTAL STRUCTURE IS ISOMORPHOUS WITH
REMARK   4 THE KNOWN STRUCTURE OF ERABUTOXIN B (PROTEIN DATA BANK
REMARK   4 ENTRIES *2EBX*, *3EBX*). EA DIFFERS FROM EB BY A SINGLE
REMARK   4 SUBSTITUTION - EA ASN 26 FOR EB HIS 26. THE EA STARTING
REMARK   4 MODEL WAS OBTAINED FROM A MOLECULAR REPLACEMENT STUDY IN
REMARK   4 WHICH COORDINATES FOR 309 OF THE 475 ATOMS IN THE EB
REMARK   4 STRUCTURE (*2EBX*) WERE USED.
10
mmCIF Was Developed to Address these Problems
Methods in Enzymology 1997, 277, 571-590
11
mmCIF – Scope of the Initial Effort
All PDB data should be captured.
Describe a paper’s material and methods section.
Describe the biologically active molecule.
Fully describe secondary structure, but not tertiary or quaternary.
Describe details of chemistry (inc. 2D).
Meaningful 3D views.
12
mmCIF - Extract from a Data File
loop_
_atom_site.group_PDB
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.label_alt_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
_atom_site.occupancy
_atom_site.B_iso_or_equiv
_atom_site.footnote_id
_atom_site.entity_id
_atom_site.entity_seq_num
_atom_site.id
ATOM N N  VAL A 11 . 25.360 30.691 11.795 1.00 17.93 . 1 11 1
ATOM C CA VAL A 11 . 25.970 31.965 12.332 1.00 17.75 . 1 11 2
ATOM C C  VAL A 11 . 25.569 32.010 13.881 1.00 17.83 . 1 11 3
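Because the item names travel with the data, a loop like this one can be read without any column table. The sketch below is a deliberately simplified reader, not a full mmCIF parser: it assumes whitespace-separated values with no quoted strings or multi-line text fields, and that the text holds exactly one loop. The name `parse_loop` is made up for illustration.

```python
def parse_loop(text):
    """Parse one whitespace-delimited mmCIF loop_ block into a list of dicts."""
    tokens = text.split()
    if not tokens or tokens[0] != "loop_":
        raise ValueError("expected the text to start with loop_")
    # Item names come first; every one starts with an underscore.
    i = 1
    names = []
    while i < len(tokens) and tokens[i].startswith("_"):
        names.append(tokens[i])
        i += 1
    values = tokens[i:]
    if len(values) % len(names) != 0:
        raise ValueError("number of values is not a multiple of the number of items")
    # Values are read in order, one row per len(names) values.
    return [
        dict(zip(names, values[j:j + len(names)]))
        for j in range(0, len(values), len(names))
    ]


extract = """
loop_
_atom_site.group_PDB _atom_site.type_symbol _atom_site.label_atom_id
_atom_site.label_comp_id _atom_site.label_asym_id _atom_site.label_seq_id
_atom_site.label_alt_id _atom_site.Cartn_x _atom_site.Cartn_y _atom_site.Cartn_z
_atom_site.occupancy _atom_site.B_iso_or_equiv _atom_site.footnote_id
_atom_site.entity_id _atom_site.entity_seq_num _atom_site.id
ATOM N N  VAL A 11 . 25.360 30.691 11.795 1.00 17.93 . 1 11 1
ATOM C CA VAL A 11 . 25.970 31.965 12.332 1.00 17.75 . 1 11 2
ATOM C C  VAL A 11 . 25.569 32.010 13.881 1.00 17.83 . 1 11 3
"""
rows = parse_loop(extract)
for row in rows:
    print(row["_atom_site.label_atom_id"], float(row["_atom_site.Cartn_x"]))
```

A lone `.` is a placeholder for a value that does not apply, which is why it is kept as a string here rather than converted.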
13
Summary
mmCIF has provided the PDB with a robust data representation, which serves as the conceptual and physical schema upon which the current RCSB, PDBe and PDBj are built.
This work predated XML and XML Schema but embodies the important concepts inherent in these descriptions.
mmCIF was later converted exactly into XML; the XML form is now used more than mmCIF, but much less than the old PDB format.
Unlike the PDB format, mmCIF is not tied to fixed columns, so it can represent structures that the PDB format cannot.
14
Other representations
SMILES
http://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system
15
Other representations
16
Representing Positions
Cartesian coordinates (x, y, z) are an easy and natural means of representing a position in 3D space.
There are many other alternatives, such as polar notation (r, θ, φ), and you can invent others if you want to.
17
Other representations - Cartesian coordinates vs. polar coordinates
18
The center of the graph is called the pole. Angles are measured from the positive x axis. Points are represented by a radius and an angle, (r, θ).
To plot a point, first find the angle, then move out along the terminal side by the radius r.
19
Let's generalize this to find formulas for converting from rectangular to polar coordinates.
[Figure: right triangle with the point (x, y), hypotenuse r, and legs x and y]
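The slide's formulas are not preserved in this transcript. The standard result from the right triangle is r = sqrt(x² + y²) and tan θ = y/x. A minimal Python sketch follows; the function name `to_polar` is made up here, and `atan2` is used in place of a plain arctangent of y/x so that the angle lands in the correct quadrant and x = 0 causes no division error.

```python
import math


def to_polar(x, y):
    """Convert rectangular (x, y) to polar (r, theta), with theta in radians."""
    r = math.hypot(x, y)      # sqrt(x**2 + y**2)
    theta = math.atan2(y, x)  # angle measured from the positive x axis
    return r, theta


r, theta = to_polar(3.0, 4.0)
print(r, math.degrees(theta))
```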
20
Let's generalize the conversion from polar to rectangular coordinates.
[Figure: right triangle with hypotenuse r and legs x and y]
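The formulas themselves are missing from this transcript. From the same right triangle, the standard relations are x = r cos θ and y = r sin θ. A short sketch, with `to_cartesian` as an illustrative name and θ assumed to be in radians:

```python
import math


def to_cartesian(r, theta):
    """Convert polar (r, theta in radians) to rectangular (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


x, y = to_cartesian(5.0, math.radians(30.0))
print(round(x, 3), round(y, 3))
```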
21
How would you calculate distance?
How would you calculate centroid?
How would you calculate dihedral angle?
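One possible answer to these three questions, working directly on Cartesian (x, y, z) tuples, is sketched below. The function names are illustrative, and the lecture may have intended a different route. The dihedral uses the common atan2 form built from the three bond vectors between four consecutive points, and returns degrees.

```python
import math


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(a, b):
    """Euclidean distance between two points."""
    d = sub(a, b)
    return math.sqrt(dot(d, d))


def centroid(points):
    """Unweighted mean position of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 axis."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


print(distance((0, 0, 0), (1, 2, 2)))
print(centroid([(0, 0, 0), (2, 0, 0), (1, 3, 0)]))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

The centroid here weights every point equally; a center of mass would weight each atom by its mass instead.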