1
Introduction to NMRbox
Project & Platform
Mark Maciejewski, UConn Health
“Think inside the box”

Thanks to the organizers for giving me a chance to speak today. The title is “Towards reproducible computation for NMR with NMRbox”. The talk focuses primarily on a new virtual machine we are building, pre-configured with software used in all aspects of NMR data processing and analysis.
2
Outline
Lecture
  Motivation for the project
  NMRbox platform
  Benefits to users and developers
  Usage
Hands on
  Account management
  Connect to NMRbox VM
  Set the display resolution
  Software inventory
  File transfers

The NMRbox project will deliver:
  NMRbox VM: a virtual machine pre-configured with a range of software used in biological NMR
  Access to significant computational resources through individual VMs and computational clusters
  Advanced training material, which in many cases will be integrated into the VM
  BMRB integration
  Annotation of workflows
  Interoperability between different NMR software packages
In addition, Bayesian tools will be incorporated into some existing NMR software packages, and an API will be developed so that developers can incorporate Bayesian inference into their own packages.
3
Motivation: Abundance of software
Hundreds of packages are cited in BioMagResBank (BMRB) depositions, J. Biomol. NMR, and other journals.

The figure shows a weighted “word cloud” based on how often each software package appears in BMRB depositions, which cite ~120 packages. As part of the NMRbox project we have identified over 200 packages by looking through J. Biomol. NMR, the BMRB, and simple web searches.
4
Motivation: Fragmentation
Operating systems
Programming languages
Libraries (e.g., BLAS)

Another motivation is the fragmentation of platforms: several different operating systems, programming languages, and libraries are in use. This places an enormous burden on software developers and on end users attempting to install software, and it is a major obstacle for non-computational experts.
5
Motivation: Challenges to persistency
Platforms become obsolete
Developers graduate
Grants end
Software time bombs

Many NMR software packages lack persistency for a variety of reasons. Platforms become obsolete. Graduate students or other developers move on and leave the lab, which leaves many programs “as-is” and makes them difficult to keep running on newer platforms. Grants end. Finally, there are software time bombs: to push users to keep their software up to date, and so avoid issues as platforms and operating systems evolve, developers sometimes put time bombs in their software. While this helps to a degree for actively developed packages, it means that old versions of the software are not persistent, and it can cause problems if the developer ends support.
6
Motivation: Meta-software packages
SHIFTX2, Sparky, Rosetta, MODELLER, NMRPipe, Python scripts

Another motivation for NMRbox is the growing number of meta-packages, such as Compass, a new software package from Chad Rienstra’s lab that attempts to predict the structure of a protein from solid-state spectra. Compass itself is developed as Python scripts, but its workflow relies on NMRPipe, Sparky, Rosetta, SHIFTX2, and MODELLER. The end user therefore needs to install not only Compass but also the programs it depends on, and then configure Compass to match how those packages were installed. Compass will undoubtedly rely on particular versions of these ancillary programs, which can cause further issues for end users. Combined, these issues make it very difficult for non-experts to use NMR software and add to an already high activation barrier for a researcher diving into NMR.

Experimental protein structure verification by scoring with a single, unassigned NMR spectrum. Courtney, Rienstra, et al., Structure, 2015.
7
Motivation: Computational reproducibility
A computational study is reproducible when it provides the “complete software environment needed to reproduce the figures” - D. Donoho, Stanford
Obstacles
  Missing primary empirical data
  Missing meta-data
  Missing software (scripts, programs)
  Non-persistence of software
  Manual interventions

Read from slide through obstacles
8
Challenges
  Abundance of software (discovery)
  Fragmentation of OSs, programming languages, and libraries
  Persistency of software
  Complexity of design and installation
  Reproducibility of results
Question: How do we address these challenges?
Answer: NMRbox VM
9
NMRbox Project Deliverables – primary tools
Platform
  NMRbox VM: a virtual machine pre-configured with a wide range of software used in biological NMR
  Significant computational resources
Data
  BMRB integration & richer depositions
  Metadata management and workflow annotation
Analytics
  Bayesian tools to enhance data analysis and interpretation
  API for developers to incorporate Bayesian inference
10
Deliverables – community services
Training and Dissemination
  Workshops, tutorials, and guides
  User and developer support
Driving Biological Projects (DBPs)
  Test beds for NMRbox technology development
  What limits your progress?
Collaboration and Service (C&S)
  Apply technologies to challenging biomedical research problems
11
NMRbox VM
Acquisition
  Agnostic: install all software available
Persistent
  Archive all versions
Access
  Platform-as-a-Service (PaaS)
  Standalone / downloadable VM
Guest OS
  Xubuntu LTS

Read the slide. Note on “agnostic”: we are trying to provide VMs with a wide range of software. We will work hard to enhance the workflows of the most-used software, but at the same time give everyone access to their “favorite” software. Some developers have recently begun releasing their software as a VM for easier installation, NMRPipe being one example. The drawback is that users end up with a separate VM for each package; we hope to bring everything under a single umbrella.
12
NMRbox VM Content – Software packages
100+ packages installed (see
  Spectral reconstruction
  Spectral visualization
  Automated assignment
  Structure determination
  Molecular visualization
  Validation
  Chemical shift prediction
  Dynamics
  Residual dipolar coupling
  Meta packages
  General purpose
  Instrument manufacturers
13
NMRbox VM Content – Productivity Tools
  over a dozen editors
  scientific python packages
  R and R tools
  office tools
  drawing tools
  Octave
  shells
  browsers
  Dropbox
  virtual keyboard (helpful with VNC)
14
NMRbox VM Features added in Release 3
GPUs to support 3D drawing
  PyMOL, VMD, Chimera, and others
GPUs to support CUDA processing
  NAMD, others coming soon
Commercial software
  dataChord spectrum Analyst, dataChord spectrum Miner, MestReNova
Matlab compiled binaries
  ALATIS, GUARDD, TITAN
See Release notes at -
15
Virtual Machine Terminology
VM = A software-based emulation of a guest computer backed by the physical resources of a host computer, managed by a hypervisor.
Access
  Local installation (standalone or downloadable)
  Connect to server (PaaS = Platform-as-a-Service)
Advantages
  Over-subscribe the host computer
  Snapshot the VM and restore to any point
  Run multiple OSs on a single computer
  “Spin up” VMs in minutes
  Dynamically load balance VMs across multiple hosts
  No performance penalties on modern computers
16
Standalone NMRbox VM
Diagram labels: host computer, hypervisor, NMRbox (guest), shared folder, OS / NMR software, user accounts

Just to get a feel for how an end user would interact with a downloadable VM, here is a short animation. The user would download a hypervisor software package such as VirtualBox, then download NMRbox, start the hypervisor, and import the NMRbox VM. Essentially, the user would have a fully functional OS pre-configured with a wide variety of software used in NMR data processing and analysis.

The user would then need to get their data into the VM, which is a bit trickier with a local VM. The data can reside in a virtual disk (however, the host OS sees this as a single flat file, which can be dangerous). Shared folders work great, but configuring the hypervisor to access them can be tricky at times. USB drives or file servers are the best option but require additional hardware. These issues are resolved with the PaaS version of the VM.
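For users who prefer a terminal, the download-and-import steps described above might look roughly like the following with VirtualBox's VBoxManage tool. This is only a sketch: the appliance file name, VM name, and shared-folder path are hypothetical placeholders rather than official NMRbox names, and by default the script prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: import a downloaded VM appliance into VirtualBox, attach a shared
# folder for data, and start the VM.
# OVA_FILE, VM_NAME, and SHARE_PATH are hypothetical placeholders.
OVA_FILE="${OVA_FILE:-nmrbox.ova}"
VM_NAME="${VM_NAME:-nmrbox}"
SHARE_PATH="${SHARE_PATH:-$HOME/nmr-data}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

import_vm() {
    # Import the appliance under a chosen VM name
    run VBoxManage import "$OVA_FILE" --vsys 0 --vmname "$VM_NAME"
    # Expose a host directory to the guest as a shared folder named "data"
    run VBoxManage sharedfolder add "$VM_NAME" --name data --hostpath "$SHARE_PATH" --automount
    # Boot the guest
    run VBoxManage startvm "$VM_NAME"
}
```

Calling `import_vm` with the defaults prints the three commands for review; setting `DRY_RUN=0` runs them, which requires VirtualBox to be installed on the host.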
17
PaaS NMRbox VM
Diagram labels: Remote Users; Authentication Server; VM host server (NMRbox VM - 1 and NMRbox VM - 2, each with CPU, RAM, NIC); High Performance Storage (user home folders, NMR software, OS files); Cloud Storage (backups, user data)

In the PaaS version each user will have their own NMRbox VM spun up on our servers. They will access the VM with a full GUI via RealVNC, or via ssh for advanced users. A key point is that user storage and authentication are separated from the VMs, which allows seamless migration as new versions of the NMRbox VM are released and makes it possible to go back to older versions if needed.
18
PaaS deployed with enterprise-class resources
100 Gb network
12 VM servers
  480 cores
  3.8 TB memory
  Redundant internal network
38 NVIDIA GPUs dramatically increasing graphics performance & CUDA processing
Network attached storage
  100s of TBs available to NMRbox
Ultra-reliable cloud storage in excess of PB

The NMRbox VM is being deployed at UConn Health on enterprise-level hardware. The research network has a 100 Gb connection to our ISP, which feeds into a 40 Gb network fabric connecting all the switches in the datacenter. VM hosts and compute clusters are connected via 10 Gb connections, with a separate dedicated 10 Gb connection to storage. The VM hosts will run the NMRbox VMs for individual users and developers. Users’ home folders and the files needed to run the VMs are on fast storage with performance similar to a local SSD. We also have access to cloud storage for backups and extra space for user data: the university has a 3 PB geo-dispersed storage system that continues to grow and is currently configured for 15 nines of reliability.

Users will connect via ssh or RealVNC. RealVNC offers several benefits:
  Full GUI
  Free and runs on all devices
  Everything is encrypted
  Built-in file transfer for those not comfortable with scp
  Maps your local printer
  Runs in daemon mode: users just connect and do nothing else
19
VM Requirements for Users
Server-based PaaS VM
  ssh or VNC (Windows, OSX, Linux, tablet, phone, 32/64-bit hardware, …)
  network connection
Standalone / downloadable VM
  64-bit hardware (Windows, OSX, Linux, …)
  any modern laptop or desktop
  Oracle VirtualBox
  VMware Workstation, Fusion, Player
20
Benefits
Users
  “Zero-configuration”
  Access
  Training
  Computational resources
  Discovery
  Persistence
  Reproducibility
  Cost
Developers
  Stable platform
  Discovery
  Usage metrics
  Persistence
  Community
  Developer tools
  Computational resources
Instructors
  Access to NMRbox VMs for courses and workshops
21
Practical aspects
Large VM model
  Many cores, high memory, and GPUs
  Multiple users per VM; each user has two VMs (username.nmrbox.org and username2.nmrbox.org)
  GPUs restrict VM management
Backups
  User home folders backed up daily
  Snapshots taken daily
  Backups / snapshots are for catastrophic failures
  Data deleted accidentally by users MAY be able to be restored, depending on timing (was the file captured, does it still exist?)
Downloadable version
  Downloadable version in final testing
22
Practical aspects
NMRbox versions and updates
Super major release (every 2 years)
  Update the version of the OS
  Ubuntu LTS (two-year release cycle, 5 years of updates)
Major releases (1 - 4 months)
  Update software versions & add new software
  Short PaaS outage (a few minutes)
Minor releases (as needed)
  Add software packages
  Packages added “live” to PaaS (no disruption to service)
Patches (as needed)
  Security patches and bug fixes (no disruption to service)
23
Practical aspects
Older versions
  All versions archived
  After a major release, the old version remains running with reduced resources (e.g. version2.nmrbox.org)
  Older versions available for download
24
Practical aspects
Large memory VM
  A large memory VM can be “spun up” for users if needed
Home folder and archive folder
  Each user has two home folders: /home/nmrbox/username and /nmr/archive/username
Support
  Google Group: we have started a Google Group. Search for NMRbox to join.
25
Practical aspects
Host workshops with NMRbox VMs
  The NMRbox team will “spin up” custom VMs to support other workshops
File permissions and access
  Home and archive folders are not accessible to others by default
  We will set up lab groups if desired
  /public folder for quick sharing
Contact us
  Suggestions for packages to include
  Suggestions about the package
  Issues with the NMRbox platform
Trash folder
  Trash folders are cleaned (files older than 30 days)
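To see which files a 30-day trash cleanup like the one mentioned above would affect, something along these lines could be used. It is a sketch only: the function name is hypothetical, the default trash location is the usual freedesktop.org path and is assumed rather than confirmed for NMRbox, and nothing is deleted.

```shell
#!/usr/bin/env bash
# Sketch: list files older than a given number of days without deleting them.
# The default directory is the standard freedesktop.org trash location;
# whether NMRbox cleans exactly this path is an assumption.
list_old_files() {
    local dir="${1:-$HOME/.local/share/Trash/files}"
    local days="${2:-30}"
    # -mtime +N matches files whose age in whole days is greater than N
    find "$dir" -type f -mtime "+$days" -print
}
```

Running `list_old_files` with no arguments checks the default trash path against a 30-day cutoff; a different directory and cutoff can be passed as the first and second arguments.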
26
NMRbox Usage 500+ worldwide users
27
NMRbox Usage
37/50 US states, ~80% of users from the US
28
NMRbox Usage 25 Countries
29
NMRbox Usage
package total_runs total_users
rnmrtk 41846863 69
nmrpipe 186
shiftx2-v110-linux 482070 3
amber16 215403 24
openbabel-2.4.1 105769 6
hmsIST 101733 44
nmr-scripts 66566 141
cns_solve_1.3 62756 67
mddnmr 51360 62
cns_solve_1.21 28272
nustool 19380 65
xplor-nih-2.43 12322 35
rosetta 7076 28
nmrfam-sparky 5285 97
namd_gpu 4556
namd_cpu 3121 9
shifts-5.1 2119 37
connjurst 2027 56
ensemble 1698
ccpnmr 1197
xplor-nih-2.45 1113 7
molmol 968 34
modelfree 873 16
NMRViewJ 621 57
aria2.3 614
vmd 611 61
NMRFxProcessor 486
Redcat 334 4
chimera 301 21
relax 291 27
pymol 262 54
redcraft 251 13
pymol 228 26
flexible-meccano 211 12
fmcgui2.5_linux 189 16
TENSORV2_PC9 167 24
cyana-3.97 166
glove 142 11
camera 111
cara 83
INCHI-1 78 14
nmr_wash linux 68 15
cpmg_fitd9 66 21
pales 63 7
ponderosa 60
nestanmr 57
TREND-1.0 52 8
tinker 48 6
ALATIS 43 17
GISSMO 41
rnmr 37
fastmodelfree 33 5
MestReNova 29 9
ssp 4
BMRB-CS-Rosetta-Submission
topspin 25
nessy
adapt_nmr_enhancer
azara-2.8
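One way to read these usage numbers is to divide total runs by total users. The sketch below does this with awk for three complete rows copied from the table; the function names are hypothetical, and rows with a missing value are skipped rather than guessed.

```shell
#!/usr/bin/env bash
# Sketch: average runs per user (rounded down) for rows that have both a
# total_runs and a total_users value. Header and incomplete rows are skipped.
runs_per_user() {
    awk 'NF == 3 && $3 + 0 > 0 { printf "%s %d\n", $1, int($2 / $3) }'
}

# Three complete rows taken from the usage table.
usage_rows() {
    cat <<'EOF'
rnmrtk 41846863 69
nmr-scripts 66566 141
nmrfam-sparky 5285 97
EOF
}
```

`usage_rows | runs_per_user` prints `rnmrtk 606476`, `nmr-scripts 472`, and `nmrfam-sparky 54`, one per line.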
30
Cite NMRbox
Very Important!! If you utilize NMRbox in your research, please cite and acknowledge us. Details at

NMRbox: A Resource for Biomolecular NMR Computation. Maciejewski, M.W., Schuyler, A.D., Gryk, M.R., Moraru, I.I., Romero, P.R., Ulrich, E.L., Eghbalnia, H.R., Livny, M., Delaglio, F., and Hoch, J.C., Biophys J., 112: , [PMID: , DOI: /j.bpj ]

"This study made use of NMRbox: National Center for Biomolecular NMR Data Processing and Analysis, a Biomedical Technology Research Resource (BTRR), which is supported by NIH grant P41GM (NIGMS)."
31
Logistics VNC client RealVNC Connect/Viewer (v6.x)
NMRbox account reset password if
32
RealVNC Benefits
  VNC server starts automatically when connecting
  Display port determined automatically
  Fully encrypted connection
  Single sign-on authentication
  Default local printer mapped to VNC server
  VNC sessions stay active when closed
  Free for users
33
Quick tour
  account management
  connect to VM
  set the display resolution
  software inventory
  file transfers
  support resources
34
Account management
35
Account management
36
Connect to NMRbox PaaS
Launch the RealVNC client
In the address bar enter …
  ordinarily: <username>.nmrbox.org
  today: umd1.nmrbox.org or umd2.nmrbox.org
  NOTE: Do not manually start a VNC server and do not enter a VNC port number
Enter your NMRbox username / password
  NOTE: the username is pre-filled with your local machine username, which may not be your NMRbox username
Connections are saved for easy access. If you sign up for a RealVNC account, the connections can be shared across your devices.
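The hostname pattern above also applies to ssh access, and the note about pre-filled usernames matters there too, since ssh falls back to the local username unless one is given. A minimal sketch, with hypothetical helper names and `alice` as a made-up account, that only prints the command rather than connecting:

```shell
#!/usr/bin/env bash
# Sketch: build the per-user NMRbox hostname from the <username>.nmrbox.org
# pattern shown on the slide.
nmrbox_host() {
    local user="$1"
    printf '%s.nmrbox.org\n' "$user"
}

# Print (do not run) an ssh command that names the NMRbox account explicitly,
# so the local machine username is not used by mistake.
nmrbox_ssh_cmd() {
    local user="$1"
    printf 'ssh %s@%s\n' "$user" "$(nmrbox_host "$user")"
}
```

For example, `nmrbox_ssh_cmd alice` prints `ssh alice@alice.nmrbox.org`, which can then be pasted into a terminal.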
37
Connect to NMRbox PaaS – Common issues
Entering an incorrect username / password
Getting your computer banned
  Eight unsuccessful RealVNC logins in a short period of time will ban the computer for a short while
  ssh logins can also be banned, but this is controlled by a different system
Firewalls
  NMRbox PaaS is open
  Some institutions block outgoing traffic
  Blocking generally occurs on guest WiFi, which sometimes only allows web traffic
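When a firewall is suspected, a quick check of whether an outgoing TCP connection succeeds can help separate network blocking from login problems, and it does not count as a login attempt. The sketch below uses bash's `/dev/tcp` feature; the function name is hypothetical. Port 22 is the ssh/sFTP port, while the VNC port used by NMRbox is not stated in the slides (5900 is only the common VNC default, so treat it as an assumption).

```shell
#!/usr/bin/env bash
# Sketch: report whether an outgoing TCP connection to host:port succeeds.
# Uses bash's /dev/tcp pseudo-device; the 5-second timeout avoids hanging
# on networks that silently drop blocked traffic.
check_port() {
    local host="$1" port="$2"
    if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "open"
    else
        # Could be a firewall, a closed port, or an unreachable host
        echo "unreachable"
    fi
}
```

A typical call would be `check_port <username>.nmrbox.org 22`. An "unreachable" result on guest WiFi that turns into "open" on another network points to the local firewall.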
38
Window size & screen resolution
Resolution changer
  launch menu (top left)
  nmrbox-util
  Resolution Changer (ALT: resolution-changer.py)
  select resolution
  Apply
39
Window size & screen resolution
RealVNC window
  Window can be resized
  Behavior is dictated by Settings → Options → Scaling parameter
Full screen mode
  Automatically scales both dimensions
  For high-resolution laptops, use a smaller resolution and scale up
Display can span two screens. Three steps:
  resolution_changer.py → double the horizontal width
  RealVNC Settings → Expert → UseAllMonitors = True
  RealVNC menu bar → Full screen mode
40
Window size & screen resolution
RealVNC window
  Closing the VNC window does not close the server. You remain logged into the NMRbox VM server.
What if the VNC display is misbehaving or you want to force a logout?
  Open a terminal
  > kill-vnc-server
  All VNC server sessions that are running under your username are shown
    There should only ever be one
    The current display is indicated
  Select the session to kill
  This will:
    log you out of the server
    kill any running jobs
    kill the VNC session
  Re-establish a connection from RealVNC
41
Software inventory
Inside NMRbox
  /usr/software
  /usr/software/bin
  launch menu
NMRbox.org website
  Software Registry:
  Release Notes:
42
File transfer
  RealVNC built-in file transfer
  Good old-fashioned scp / sftp
  Dropbox
  Globus
43
File transfer: RealVNC built-in file transfer
SEND files from local computer to NMRbox
  hover the mouse near the top center of the VNC window
  click on the “File transfer” icon
  follow the directions
FETCH files from NMRbox to local computer
  click the VNC icon in the Xubuntu status bar (top right, next to username)
  select the icon in the top right with three horizontal lines
  select the “File transfer …” menu item
44
File transfer: scp / sFTP
Secure Copy (scp)
  from a terminal, change to the appropriate directory
  scp -r directory <username>.nmrbox.org:~/<path>
  scp file <username>.nmrbox.org:~/<path>
Secure FTP (sFTP)
  Open your favorite sFTP client, such as FileZilla
  Enter login information
    Host: <NMRbox username>.nmrbox.org
    Username: <NMRbox username>
    Password: <NMRbox password>
    Port: 22
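The scp commands above copy from the local computer to NMRbox and rely on the local username matching the NMRbox one. As a sketch under those caveats, the helpers below (hypothetical names, with `alice` as a made-up account) print scp commands for both directions with the NMRbox account named explicitly. Nothing is transferred; the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: print scp commands with the NMRbox account given explicitly.

# local computer -> NMRbox
scp_send_cmd() {
    local user="$1" src="$2" dest="$3"
    printf 'scp -r %s %s@%s.nmrbox.org:%s\n' "$src" "$user" "$user" "$dest"
}

# NMRbox -> local computer
scp_fetch_cmd() {
    local user="$1" src="$2" dest="$3"
    printf 'scp -r %s@%s.nmrbox.org:%s %s\n' "$user" "$user" "$src" "$dest"
}
```

For example, `scp_fetch_cmd alice '~/projects/results' .` prints `scp -r alice@alice.nmrbox.org:~/projects/results .`, which copies a remote directory into the current local directory when run.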
45
File transfer: scp / sFTP
accept the host key
drag and drop files
46
File transfer: Dropbox
Inside NMRbox, open a terminal
  > dropbox start -i
  This will launch an installer
  Enter your Dropbox username / password, or create an account
Useful commands
  > dropbox help
  > dropbox status
  > dropbox stop
  > dropbox start
Once connected, the Dropbox icon will appear in the upper right-hand corner of the RealVNC window
47
File transfer: Globus
What is it?
  manage file transfers between any of your “endpoints”
  personal “endpoints”: endpoints on your personal computers running “Globus Connect Personal”
  institutional “endpoints”: NMRbox is a registered institutional “endpoint”
How do I get it?
  homepage:
  look in the footer for “Globus Connect Personal” (not “Server”)
  download for Mac / Linux / Windows
  follow the directions
  NOTE: Globus Connect Personal is installed on your personal computer, not NMRbox
48
File transfer: Globus
How do I authenticate?
  Goto and select “Log In” button
49
File transfer: Globus
How do I start the file transfer window?
  Method 1: From the Globus icon (located in system tray) → Control Globus Personal Connect → Transfer Files
  Method 2: Goto
50
File transfer: Globus
Select endpoints and transfer…
51
Support resources
  NMRbox documentation
  NMRbox FAQ
  Tutorials
    NMRbox VM: /tutorial
  Google Group
    Search for NMRbox