Motivational Lecture: UNIX and computer-aided design of new medicines. Alexey Onufriev
Why bother? Example: rational drug design. Drug target, e.g. a viral endonuclease (cuts DNA, RNA). If you block the enzyme's function, you kill the virus.
Molecular surface of the acetylcholinesterase molecule (structure by Sussman et al.), color-coded by electrostatic potential. Challenge: find a small, non-toxic molecule that will selectively fit right into the cavity. And only this cavity, no other one!
Example of successful computer-aided (rational) drug design: one of the drugs that stopped the AIDS epidemic in the US (part of the antiretroviral cocktail). The drug blocks the function of a key viral protein. To design the drug, one needs a precise 3D structure of that protein.
Biological function = f(3D molecular structure). Workhorses of biological function = proteins. Protein structures = Protein Data Bank.
Protein Data Bank ~ 30,000 structures
A typical PDB entry (header): myoglobin

HEADER OXYGEN TRANSPORT 13-DEC M
TITLE SPERM WHALE MYOGLOBIN F46V N-BUTYL ISOCYANIDE AT PH 9.0
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: MYOGLOBIN;
COMPND 3 CHAIN: NULL;
COMPND 4 ENGINEERED: SYNTHETIC GENE;
COMPND 5 MUTATION: INS(M0), F46V, D122N
SOURCE MOL_ID: 1;
SOURCE 2 ORGANISM_SCIENTIFIC: PHYSETER CATODON;
SOURCE 3 ORGANISM_COMMON: SPERM WHALE;
SOURCE 4 TISSUE: SKELETAL MUSCLE;
SOURCE 5 CELLULAR_LOCATION: CYTOPLASM;
SOURCE 6 EXPRESSION_SYSTEM: ESCHERICHIA COLI;
SOURCE 7 EXPRESSION_SYSTEM_STRAIN: PHAGE RESISTANT
SOURCE 8 EXPRESSION_SYSTEM_CELLULAR_LOCATION:
SOURCE 9 EXPRESSION_SYSTEM_VECTOR_TYPE: PLASMID;
SOURCE 10 EXPRESSION_SYSTEM_PLASMID: PEMBL 19+
KEYWDS LIGAND BINDING, OXYGEN STORAGE, OXYGEN BINDING, HEME,
KEYWDS 2 OXYGEN TRANSPORT
EXPDTA X-RAY DIFFRACTION
AUTHOR R.D.SMITH,J.S.OLSON,G.N.PHILLIPS JUNIOR
PDB key part: atomic coordinates (x, y, z)

ATOM 1 N MET N
ATOM 2 CA MET C
ATOM 3 C MET C
ATOM 4 O MET O
ATOM 5 CB MET C
ATOM 6 CG MET C
ATOM 7 SD MET S
ATOM 8 CE MET C
ATOM 9 N VAL N
ATOM 10 CA VAL C
ATOM 11 C VAL C
ATOM 12 O VAL O
ATOM 13 CB VAL C
ATOM 14 CG1 VAL C
ATOM 15 CG2 VAL C
ATOM 16 N LEU N
ATOM 17 CA LEU C
ATOM 18 C LEU C
ATOM 19 O LEU O
ATOM 20 CB LEU C
ATOM 21 CG LEU C
ATOM 22 CD1 LEU C

What's missing? Problem: no hydrogens!
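The "no hydrogens" observation can be checked mechanically. The sketch below is only an illustration: it assumes the abbreviated, whitespace-separated records as they appear above (record type, serial number, atom name, residue name, element). Real PDB files use fixed-width columns and would need column-based parsing instead. The function names `count_elements` and `has_hydrogens` are hypothetical.

```python
from collections import Counter

# First residue (MET) copied from the records above. Fields are:
# record type, serial number, atom name, residue name, element symbol.
# Assumption: whitespace-separated fields; real PDB files are fixed-width
# and also carry x, y, z coordinates.
SLIDE_RECORDS = """\
ATOM 1 N MET N
ATOM 2 CA MET C
ATOM 3 C MET C
ATOM 4 O MET O
ATOM 5 CB MET C
ATOM 6 CG MET C
ATOM 7 SD MET S
ATOM 8 CE MET C
"""


def count_elements(text):
    """Count element symbols (the last field) over all ATOM records."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "ATOM":
            counts[fields[-1]] += 1
    return counts


def has_hydrogens(text):
    """Return True if at least one ATOM record has element H."""
    return count_elements(text)["H"] > 0


if __name__ == "__main__":
    print(dict(count_elements(SLIDE_RECORDS)))
    print("hydrogens present:", has_hydrogens(SLIDE_RECORDS))
```

For the methionine shown, the tally contains only N, C, O and S: every heavy atom is there, and not a single H.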
So, why are PDB files missing hydrogens? Experiment: H atoms are so small that X-ray diffraction does not see them.
Approach: use a combination of physics, chemistry and computer science to create a web server that adds hydrogens to a 3D protein structure, based on a set of algorithms. Challenge: many heterogeneous steps, performed by many different pieces of scientific software that have to talk to each other. Solution: use UNIX to orchestrate the process. Why UNIX? Because UNIX applications talk to each other.
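What "talking to each other" means in practice: one program's output becomes the next program's input through a pipe. A minimal sketch, assuming a UNIX-like system with the standard `grep` and `wc` utilities on the PATH. The function name `count_atom_records` is hypothetical and is not part of the web server described here.

```python
import subprocess


def count_atom_records(pdb_text):
    """Count ATOM lines by piping the text through `grep '^ATOM' | wc -l`.

    Two independent programs cooperate here: grep filters the lines,
    wc counts them. The shell connects them with a pipe.
    """
    result = subprocess.run(
        "grep '^ATOM' | wc -l",
        shell=True,           # let /bin/sh set up the pipe
        input=pdb_text,       # fed to grep's standard input
        capture_output=True,
        text=True,
    )
    # wc may pad its output with spaces on some systems, so strip first.
    return int(result.stdout.strip())


if __name__ == "__main__":
    sample = "HEADER demo\nATOM 1 N MET N\nATOM 2 CA MET C\nEND\n"
    print(count_atom_records(sample))
```

Neither `grep` nor `wc` knows anything about proteins or about each other; the plain-text stream between them is the whole interface, and the same idea chains larger scientific codes.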
Placing hydrogens, flow chart:
PDB entry without H
Initial (dumb) placing of hydrogens (easy)
PDB pre-processing: clean-up, building missing parts, etc. Use AWK, PERL, BASH scripts.
Smart placing of H: involves trying all different positions and minimizing the total energy. Use various scientific codes developed for UNIX platforms.
"Completed" PDB entry + pKs
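The "try all positions, keep the lowest energy" idea can be illustrated with a toy exhaustive search. This is a sketch of the general technique only, not the algorithm used by the actual software: the site names, the candidate placements and all energy values below are made-up placeholders in arbitrary units.

```python
from itertools import product

# Hypothetical toy system: two sites, each with two candidate hydrogen
# placements. All numbers are invented for illustration.
SELF_ENERGY = {
    "site_A": {"up": 0.0, "down": 1.0},
    "site_B": {"left": 0.5, "right": 0.0},
}

# Extra energy when two particular placements occur together
# (for example, two hydrogens pointing at each other).
PAIR_ENERGY = {
    (("site_A", "up"), ("site_B", "right")): 2.0,
}


def total_energy(assignment):
    """Sum self energies plus pairwise terms for one full assignment."""
    energy = sum(SELF_ENERGY[site][state] for site, state in assignment.items())
    items = sorted(assignment.items())
    for i, first in enumerate(items):
        for second in items[i + 1:]:
            energy += PAIR_ENERGY.get((first, second), 0.0)
    return energy


def best_assignment():
    """Enumerate every combination of placements and keep the minimum."""
    sites = sorted(SELF_ENERGY)
    best = None
    for states in product(*(SELF_ENERGY[site] for site in sites)):
        assignment = dict(zip(sites, states))
        energy = total_energy(assignment)
        if best is None or energy < best[0]:
            best = (energy, assignment)
    return best


if __name__ == "__main__":
    energy, assignment = best_assignment()
    print(energy, assignment)
```

Each site's cheapest placement taken on its own ("up" and "right") is not the best combination here, because the pair term penalizes it. That is why the sites have to be considered together. The number of combinations is the product of the number of placements per site, so brute force like this only works for small systems.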
Conclusions: UNIX tools have been used to build a useful web-based tool for drug design and molecular modeling. Virtually impossible to duplicate under a WINDOWS environment, due to the incompatibility of the various proprietary codes that one would have to use if one tried (well, no one did...)