Presentation is loading. Please wait.

Presentation is loading. Please wait.

Logistics of tutorial All scripts are pre-written & all data is pre-processed Data can be re-processed without deleting old files first In the tutorial.

Similar presentations


Presentation on theme: "Logistics of tutorial All scripts are pre-written & all data is pre-processed Data can be re-processed without deleting old files first In the tutorial."— Presentation transcript:

1 Logistics of tutorial All scripts are pre-written & all data is pre-processed Data can be re-processed without deleting old files first In the tutorial all commands are highlighted in Courier bold font with an underline and begin with prompt$ prompt$ pwd Tutorial assumes a basic knowledge of terminal commands, NMRPipe, and NMRDraw Processed data is saved as 3D NMRPipe data and 2D projections. It is helpful to examine both and to view horizontal and vertical slices through peaks. Tutorial uses planes 79 and 167 for comparison for HNCACB spectrum You can work along or in pairs GET DATA

2 Goals of tutorial Introduce NMRbox platform
Showcase NMRbox with NUS tools Not many users have the multitude of spectrum reconstruction tools in a single location Demonstrate potential Now that the platform is maturing we can focus attention to enhancing tools, providing training materials, etc. For this tutorial – Build a simple system to process a data set with multiple NUS tools with minimal user input Hopefully encourage more users to try non-uniform sampling Provide information about sampling schedules Show the output of various NUS tools

3 NUS Processing Tools Tools for processing a NUS HNCACB experiment with the tutorial directory and a non-rigorous processing time estimate for reference. Tool Data Directory after Processing Time (seconds) dft-nmrpipe 23 dft-rnmrtk rnmrtk 44 MaxEnt (RNMRTK) maxent LONER (RNMRTK) loner hmsIST hmsist 207 / 750 NMRPipe IST nmrpipe-ist 1254 NESTA-NMR nestanmr 51 SMILE smile qMDD (MDD) mddnmr mddnmr.proc 60-120 qMDD (CS) CAMERA (MaxEnt) camera 226 nmr_wash (SCRUB) scrub 581 nmr_wash (CLEAN) clean

4 HNCACB 12 kDa four helix bundle
Experimental setup grpdly = x 360 = t1 t2 t3 Nucleus N CACB HN Echo-Antiecho no Reference 47.742 4.773 Phases 0, 0 90, 0 ZF size 256 1024 FT options alt, neg alt Extract Region 9.52 – 5.9 ppm Sampling 800 points (hyper-complex) (21%) 1024 (complex) Max increment 50 76

5 HNCACB Sample Schedule (nuslist)
prompt$ cd hncacb_nus prompt$ more nuslist prompt$ cat nuslist | sort –n -k2 -k1 | more Note: Data is hyper-complex in all three dimensions. Note: Sample schedule (zero-indexed). Processing programs (one-indexed) Sample schedule Pts (C, Total) FID # (2n+1), (2n+2) Plane (2n+1, 2n+2) t1 t2 (sorted) x (t3)(HN) y (t1)(N) z (t2)(C) 1024, 2048 1, 2 2 5, 6 5 11, 12 12 25, 26 15 31, 32 20 73 41, 42 147, 148 31 63, 64 36 73, 74 74 (missing) 149, 150 49 75 99, 100 151, 152 Total number of nmrPipe planes

6 HNCACB Sample Schedule (nuslist)
Sample schedule with x axis t1 and y axis t2. Red dots represent data collected. The max value in t1 is 49 (50) and the max value in t2 is 75 (76). Total coverage is 21%

7 HNCACB Sample Schedule Point Spread Function
By applying a FT of the sample schedule with a 1 for values in the sample schedule and 0 for missing values you obtain a point spread function (PSF). The PSF is convolved with all signals in the spectrum leading to “sampling noise” in the spectrum.

8 General Workflow Create fid.com conversion script
Generate a conversion script to convert to multiple formats Generate a processing configuration file (autonus.cfg) Process data with a nu-dft Process data NUS tool Analyze the spectrum

9 nuDFT File conversion (fid.com)
prompt$ cd hncacb_nus prompt$ bruker Press “Read Parameters” Press “Save Script” Press “Quit” prompt$ more fid.com

10 Generate processing configuration file (autonus.cfg)
prompt$ autonus setup This generates a text file called autonus.cfg prompt$ more autonus.cfg The text file contains some basic input questions and sections for processing along the acquisition dimension, t1, t2, as well as sections for some of the NUS tools. Initially one can guess as to some of the values, but some will need to be determined as we process the data Utilizing the nu-dft is a quick way to set processing parameters and to ensure the data conversion worked properly For the workshop the values should be set appropriately for the HNCACB spectrum without any editing necessary

11 Create conversion scripts
prompt$ autonus convert prompt$ ls -ltr fid.com (dft-nmrpipe, nmrpipe-ist, smile, nestanmr, scrub, clean) convert_rnmrtk.com (dft-rnmrtk, maxent, loner) convert_hmsist.com (hmsist, camera) convert_mdd.com (mdd) convert.com (All) prompt$ more convert.com All the conversion scripts concatenated. The autonus tool is not robust right now, but works well for the HNCACB spectrum. prompt$ ./convert.com This will run all the conversion scripts All data saved in fid NOTE: The nmrpipe and rnmrtk formats are fully expanded with zeros in place of any missing FIDs Allow a nu-dft of the spectrum

12 nuDFT (nmrpipe) prompt$ cd hncacb_nus prompt$ autonus dft-nmrpipe
prompt$ ls -ltr prompt$ more dft-nmrpipe.com Script broken into 5 sections Acquisition dimension for expanded data (x saved back to x [xyz]) Intermediate files will be used by: dft-nmrpipe, nmrpipe-ist, clean, scrub, nestanmr Acquisition dimension for expanded data (x saved to z [yzx]) smile Acquisition dimension for compressed data (x saved to z [yzx]) hmsist, camera Process t1 dimension for nu-dft (y saved back to y) Process t2 dimension for nu-dft (z saved to y)

13 nuDFT (nmrpipe) prompt$ ./dft-nmrpipe.com prompt$ ./ls -ltr
nu-dft data saved in dft-nmrpipe directory. Intermediate t3 processed data saved in xyz and yzx folders for later use prompt$ nmrDraw –in dft-nmrpipe/dft%03d.ft3

14 Rowland NMR Toolkit (RNMRTK) Overview
Originally developed as a platform for developing new NMR data processing methods Now widely used as a general processing platform Provides Rich set of apodization (window) functions DFT processing General processing tools (phasing, reversing, truncating, etc.) Robust, efficient Linear Prediction (LP) extrapolation MaxEnt reconstruction (deconvolution, nonuniform sampling) universal data reader exports to nmrPipe, XEASY, Felix Generates synthetic data, noise (testing/error analysis) Based on a program “VNMR” developed by Jeff Hoch in the late 1970’s Redesigned in 1985 and then in the early 1990s Free for Academic use and included in NMRbox l1-norm real (LONER) created and will be bundled in NMRbox very soon! Included as of yesterday!!

15 RNMRTK – Suite of Programs
section – Manages the shared memory sections rnmrtk – The main processing program in the Toolkit Loading, apodization, DFT, phasing, saving, etc. flip – Forward linear prediction msa / msa2d / msa3d – Maximum entropy reconstruction inject – Add synthetic peaks to time domain data select1d / select2d – Throws away all data points whose index is not specified in a sample schedule seepln / contour – graphical display for 1D and 2D data Many others …

16 RNMRTK – Shared memory section
With a suite of programs for performing tasks – How do you get data output from one command to the next command without needing to write out intermediate files? Pipes (nmrPipe strategy) Shared memory (RNMRTK strategy) A shared memory section is a chunk of persistent memory where NMR data can be stored and used by different programs. If not deleted the shared memory section is persistent across logins (best practice is to delete shared memory section in processing script). section RNMRTK program to manage (create / delete) shared memory sections Let’s create a shared memory section prompt$ section -c The size of the shared memory section is the product of the arguments multiplied by 4 (32 bit data) plus 512 extra bytes for a header. Shared memory sections must be large enough to handle the data at its largest size. prompt$ section -d The maximum size of shared memory sections that can be created are defined by kernel values shmmax and shmall. NMRbox takes care of these advanced settings.

17 RNMRTK – rnmrtk rnmrtk Command for performing many of the traditional processing techniques Can be run: one command at a time in interactive mode scripted Row (column) oriented. Many commands need to have the dimension set.

18 RNMRTK – Issuing commands
Arguments are entered as different classes: floats (defined as a number with a decimal) setpar SF integers (defined as a number without a decimal) zerofill 2048 string (defined as a character string without a period) sinebell square 70.0 filename (defined as a character string with a decimal point) loadvnmr ./fid load hsqc.sec The order of arguments entered is important, but only within a given class of argument sstdc COS sstdc COS 20 rnmrtk is case insensitive (except for filenames) DIM command sets the dimension which is necessary for many commands rnmrtk << EOF LOAD test.sec DIM t1 FFT EOF LOAD command does not need dimension set FFT command must have dimension set

19 RNMRTK – Issuing commands
From the command line: bash% rnmrtk loadvnmr ./fid bash% rnmrtk setpar SF bash% * DIM command not persistent In interactive mode bash% rnmrtk rnmrtk% loadvnmr ./fid rnmrtk% DIM t2 rnmrtk% FFT rnmrtk% exit As a script bash% processing_script.com Typical script #! bin/bash section –c rnmrtk << EOF loadvnmr ./fid setpar PPM dim t2 zerofill 1024 fft 0.5 phase realpart EOF dim t1 zerofill 128 save test.sec Here doc Cannot have spaces at the beginning of lines

20 RNMRTK – Load data Varian rnmrtk loadvnmr ./fid Bruker
Need to create a parameter file called ser.par Example: Dom T1 T2 T3 N 64 C C C sw sf ppm Quad STATES STATES STATES Format Little-endian Int Layout t2:280 t1:128 t3:1024 Order of dimensions in memory (after loading) Note spacing between value and C # of padding bytes at beginning of fid # of padding bytes at end of fid # of header bytes Endian is defined by the computer architecture Layout defines order of dimensions in input file

21 RNMRTK – msa2d Reads parameters from a file and dimensions are specified on invocation line Ex: msa2d t1 t2 msa2d.param Constant Aim parameter file DEBUG 1 NLOOPS 400 DEF 0.1 AIM 2.5 SCALEFIRST 0.5 SCHED ./sample_schedule NUSE NOUT PHASE LW JVALUE Constant LAMBDA parameter file DEBUG 1 NLOOPS 400 DEF 0.1 LAMBDA 1.0 SCALEFIRST 0.5 SCHED ./sample_schedule NUSE NOUT PHASE LW JVALUE Max loops, convergence may be quicker NUS Uniform Data will be extrapolated For 3D NUS datasets the acquisition dimension is processed with conventional FT and then the indirect dimensions are processed with msa2d

22 RNMRTK – msa2d Monitoring convergence Very important!
Ensure that the value of test is less than 0.1 unless fewer than NLOOPS required Without deconvolution convergence typically occurs is around 40 iterations if proper def, aim, lambda values were use. High numbers of loops suggest msa2d is modeling noise With deconvolution the number of loops may be significantly higher.

23 nuDFT (rnmrtk) prompt$ cd hncacb_nus prompt$ autonus dft-rnmrtk
prompt$ more dft-rnmrtk.com Script processes data in all three dimensions with nu-DFT Two intermediate files are generated noisecalc.sec Used by noisecalc to estimate values for DEF and AIM for MaxEnt reconstruction f3_proc.sec Used by MaxEnt and LONER as starting point for reconstruction in the indirect dimensions prompt$ nmrDraw –in rnmrtk/dft-rnmrtk_f3f2.ft3 If everything worked the data should look virtually identical to the nmrPipe nuDFT earlier Intermediate files ready for automatic MaxEnt reconstruction

24 Auto MaxEnt Reconstruction
The maximum entropy algorithm in RNMRTK has three adjustable parameters; def, aim, and lambda. Data can be processed with constant aim mode or constant lambda mode. Workflow Choose Def and Aim Process a representative plane in Constant Aim mode Look up the converged Lambda value Process the whole dataset in Constant Lambda mode Choosing def and aim aim should be a small integer multiple of the rms noise in the data compute the rms for a blank region of a 1D (typically use fid with weakest signal) def should be a value smaller than that of the smallest expected peak too small and the baseline noise distribution becomes “spikey” too large and the noise level will be too high noisecalc – A tool developed by Mehdi Mobli to compute reasonable values of def and aim given a blank region of a 1D spectrum autonus – A tool that will perform all these steps in an automated fashion

25 Auto MaxEnt Reconstruction
prompt$ cd hncacb_nus prompt$ autonus maxent Script performs the Workflow steps of the previous slide automatically Reads in t3 processed data Analyzes spectrum for noise and estimates DEF and AIM Processes whole spectrum in constant AIM mode Examines the log file and averages Lambda Processes whole spectrum in constant Lambda mode Saves data and outputs some information Maximum number of loops Values between 20 and 80 are typical Reports DEF, AIM, and LAMBDA prompt$ ls -ltr maxent.com (Processing script) msa2d_param (MaxEnt paramter file) msa2d.txt (MaxEnt reconstruction log)

26 Auto MaxEnt Reconstruction
prompt$ more msa2d_param DEF determined from noisecalc multiplied by “def multiplication factor” LAMBDA determined from trial run in constant AIM mode Final data size extrapolated to 256x256

27 Auto MaxEnt Reconstruction
prompt$ nmrDraw -in maxent/maxent_f3f2.ft3 Examine the data as for nuDFT [plane 167, HN ~8.5 ppm and plane 79, HN ~9.3 ppm] Planes 167 (left) and 79 (right)

28 Automatic LONER Reconstruction
The l1-norm real algorithm in RNMRTK has one adjustable parameters; aim Processing Workflow Read in t3 processed data Analyzes spectrum for noise and estimates AIM Reconstruct indirect dimensions with LONER Saves data and outputs some information Maximum number of loops Reports AIM prompt$ more loner_param prompt$ nmrDraw –in loner/loner_f3f2.ft3

29 IST, SMILE, NESTA-NMR, CAMERA Workflows
Convert time domain data Process the acquisition dimension with NMRPipe Reconstruct missing time domain data Process two indirect dimensions with NMRPipe

30 hmsIST prompt$ autonus hmsist prompt$ ls –ltr hmsist.com (IST script)
phf2pipe.com (Script to rearrange data) run-hmsist.com (Script to run hmsist.com and parallalize calculations) hmsist-ft23.com (Script to process t1/t2 dimensions after hmsIST_ hmsist-all.com (Script to run them all) prompt$ more hmsist.com prompt$ hmsIST -help Ideally -xN and -yN would be 256, but calculation time is too long prompt$ ./hmisist-all.com

31 hmsIST prompt$ nmrDraw –in hmsist/hmsist%03d.ft3
Planes 167 (left) and 79 (right) Phase distortions?

32 Use the specStat.com command to measure noise in FT’d spectrum
NMRPipe IST prompt$ autonus nmrpipe-ist prompt$ more nmrpipe-ist.com Workflow Process expanded data with FT in all dimensions to fine tune parameters Use the specStat.com command to measure noise in FT’d spectrum Process data with ist3D.com with the measured noise from specStat.com as the criteria for convergence (istMaxRes) Notes ist3D.com processes the acquisition dimension with FT, performs an IST calculation to fill in missing points in t1/t2, and then processes the t1/t2 planes with FT all internally. Arguments can be passed to change default values for phases, FT arguments, apodization, etc. In the latest version -istMaxRes can be set to Auto w/o running specStat.com

33 NMRPipe IST prompt$ ./nmrpipe-ist.com
prompt$ nmrDraw -in nmrpipe-ist/ist%03d.ft3 Planes 167 (left) and 79 (right) No phase distortions

34 NESTA-NMR prompt$ cd hncacb_nus prompt$ autonus nestanmr
prompt$ ls –ltr prompt$ more nestanmr.com prompt$ ./nestanmr.com prompt$ ./nestanmr-ft23.com Examine the data as for nuDFT [plane 167, HN ~8.5 ppm and plane 79, HN ~9.3 ppm]

35 NESTA-NMR prompt$ nmrDraw -in nestanmr/nestanmr%03d.ft3
Planes 167 (left) and 79 (right) Significant distortions exist when extrapolating points.

36 SMILE prompt$ cd hncacb_nus prompt$ autonus smile prompt$ ls –ltr
prompt$ more smile.com prompt$ ./smile.com

37 SMILE prompt$ nmrDraw –in smile/smile%03d.ft3
Planes 167 (left) and 79 (right)

38 MDD & MDD-Compressed Sensing
prompt$ cd hncacb_nus prompt$ gedit MDD-protocol.txt & The MDD-protocol.txt file has instructions for using MDD prompt$ qMDD & After MDD opens choose “Open spectrum” and hit “Choose” If the hncacb_nus.proc directory already exists you can leave its contents or rebuild the directory from scratch. If will be faster to leave it. Once the parameters have been edited the data can be processed by checking the Zero checkbox (nuDFT), MDD checkbox (MDD), or CD checkbox (MDD-CS) and then hitting the Run button Note that the proc.sh file can be edited to change the proc_out parameter to save the data to a different location to not overwrite processed data. prompt$ cd ../hncacb_nus.proc This is the location of the data after processing The file hncacb_nus.ft3 is the processed data. You can rename it to a different filename as outlined in the MDD-protocol.txt file. I leave it to you to open the projections and 3D data sets to compare results, but I show my results here.

39 MDD MDD with NCOMP=25, RMDD=yyn, NOISE=0.7, and NITER=50
Planes 167 (left) and 79 (right)

40 MDD-Compressed Sensing
MDD-CS with CS_niter=10, CS_alg=IRLS, CS_lambda=1, and CS_norm=1 Planes 167 (left) and 79 (right)

41 MaxEnt with CAMERA prompt$ cd hncacb_nus prompt$ more camera.com
prompt$ ./camera.com

42 MaxEnt with CAMERA prompt$ nmrDraw –in camera/camera%03d.ft3
Planes 167 (left) and 79 (right)

43 SCRUB and CLEAN prompt$ cd hncacb_nus prompt$ autonus scrub
The SCRUB and CLEAN algorithms installed in NMRbox take data that has been processed in all three dimensions with conventional FT methods and “scrubs” or “cleans” the sampling noise from the spectrum. prompt$ more scrub.com scrub and clean read and write NMRPipe formatted data scrub and clean introduce a one point shift in the data fixed by a CS command

44 SCRUB and CLEAN Note: scrub.com and clean.com take around 10 mins each to process prompt$ ./scrub.com prompt$ nmrDraw -in scrub/scrub%03d.ft3 Planes 167 (left) and 79 (right)

45 SCRUB and CLEAN Clean can be run identically to scrub
prompt$ nmrDraw -in clean/clean%03d.ft3 Planes 167 (left) and 79 (right)

46 Comparing results The best way to compare results is by examining planes from a 3D head to head and examining the noise in 2D projections as we have been doing throughout the demo. Quality can be examined by comparing: How much “sampling noise” was de-convolved The resolution of the indirect dimensions The presence of any artifacts, such as phase twists that are present If the noise in the 2D planes of the 3D and 2D projections is Gaussian We can also compare the results as a strip plot. prompt$ cd hncacb_nus prompt$ more scroll.com prompt$ ./scroll.com 4 Feel free to change the percent contour level What does scroll.com do? It takes the nmrpipe-ist 2D projection and picks the peaks as a 1H-15N HSQC and then builds strips from each of the spectrum stacked next to each other. The contour levels are determined by the value entered on the command line as an argument. NOTE: The RNMRTK program outputs NMRPipe files as a single file and the scroll.com script needs the spectrum in the NMRPipe 2D plane mode, thus the MaxEnt reconstrution with RNMRTK is not part of the strip plot.

47 HNCA of MSG 82 kDa (2H/13CA/15N)
Experimental setup grpdly = x 360 = t1 t2 t3 Nucleus N CA HN Echo-Antiecho yes no Reference 56.618 4.657 Phases 0(90)(68)(-22), 0 0, 0 71.6(161.6), 0 ZF size 64 1024 2048 FT options neg alt Extract Region 10.2 – 6.2 ppm Sampling 1600 points (hyper-complex) (8.4%) 1024 (complex) Max increment 32 594

48 NUS Processing Tools for HNHA
Data can be processed in the same manner as HNCACB with a few notes. A few of the scripts were tweaked from how autonus created them All the changes dealt with sensitivity enhancement for compressed data used for hmsist and camera which the autonus tool does not handle properly yet. You can just view, edit, and run the scripts in-place for hmsist and camera LONER will not work with the dataset as there is a phase correction in t1 that is not allowed. To speed up calculations only a 2 ppm region along f3 was kept as the ROI

49 Uniform HNCO (ni=128, ni2=128) prompt$ cd hnco-large.fid
The HNCO was run with t1 and t2 collected out to 128 points. The process.com script processes the data with a sample schedule entered as an argument with RNMRTK’s maximum entropy reconstruction and performs a nuDFT. Any point not in the sample schedule will be ignored during processing. The only criteria is that the sample schedule not exceed 128 in either dimension and that the schedule is 1 indexed. The output files are: BaseOutputName_nudft.ft3 (3D) & BaseOutputName_nudft_proj.ft2 (Projection) BaseOutputName_msa2d.ft3 (3D) & BaseOutputName_msa2d_proj.ft2 (Projection) prompt$ more rnmrtk.com Script to process HNCO with FT in all dimensions with RNMRTK prompt$ more nmrpipe.com Script to process HNCO with FT in all dimensions with NMRPipe prompt$ ./rnmrtk.com Create a sample schedule making sure that increment values do not exceed 128 in either of the two dimensions prompt$ ./msa2d.com SampleSchedule.scd BaseOutputName After running the data will be saved with the filename Base_output_name.ft3


Download ppt "Logistics of tutorial All scripts are pre-written & all data is pre-processed Data can be re-processed without deleting old files first In the tutorial."

Similar presentations


Ads by Google