Applied CyberInfrastructure Concepts, ISTA 420/520, Fall. Nirav Merchant (Bio Computing & iPlant Collaborative), Eric Lyons (Plant Sciences & iPlant Collaborative), University of Arizona. Or: Will Computers Crash Genomics? (Science, Vol 331, Feb 2011)
Tasks for today: Managing your VM. Add user, permission, security considerations etc. Understanding where the files are. Terminal, editors etc. Shell and scripting. Start building your “Data Science ToolBox”.
Step #1 for Big Data Toolkit Command line competency
Permissions: Why do you need them? What is an ACL (Access Control List)? The UNIX model of permissions (next slides are from Greg Wilson at carpentry.org). Path statement and finding things.
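A quick way to see the "path statement" at work is sketched below: it prints each directory on the search path and asks the shell where it would find a command. The command ls is only an example chosen here, not something the slides prescribe.

```shell
# Print each directory on the search path, one per line.
echo "$PATH" | tr ':' '\n'

# Ask the shell which file it would run for a command (ls is only an example).
ls_location=$(command -v ls)
echo "ls is found at: $ls_location"
```

The shell searches the PATH directories in the order printed, so an earlier directory wins when two of them hold a command with the same name.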
User: has a unique user name and user ID. User name is text: "imhotep", "larry", "vlad", … User ID is numeric (easier for the computer to store).
Group: has a unique group name and group ID. A user can belong to zero or more groups. The list is usually stored in /etc/group.
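The user and group facts above can be checked from the command line. This is a minimal sketch using the standard id utility; the variable names are only illustrative.

```shell
# Current user's name (text) and user ID (numeric).
user_name=$(id -un)
user_id=$(id -u)
echo "user: $user_name (uid $user_id)"

# Names of all groups the current user belongs to.
group_names=$(id -Gn)
echo "groups: $group_names"

# Peek at the group list; each line is name:password:GID:members.
if [ -r /etc/group ]; then
    head -n 3 /etc/group
fi
```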
user, group, all. All: everyone else.
Each file has user and group IDs.
          user  group  all
read       ✔     ✔     ✗
write      ✔     ✗     ✗
execute    ✗     ✗     ✗
File's owner can read and write it. Others in group can read. That's all.
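The permission grid above can be reproduced on a real file. The sketch below works on a throwaway file created with mktemp, so nothing real is changed; the variable names are illustrative.

```shell
# Work on a throwaway file so nothing real is changed.
demo_file=$(mktemp)

# Owner: read + write; group: read; everyone else: nothing.
chmod u=rw,g=r,o= "$demo_file"

# The first 10 characters of `ls -l` are the file type plus the 3x3 grid.
perms=$(ls -l "$demo_file" | cut -c1-10)
echo "$perms"    # -rw-r-----

# The same grid in octal: 6 = read+write, 4 = read, 0 = nothing.
chmod 640 "$demo_file"

rm -f "$demo_file"
```

Read the nine permission characters in groups of three: user, then group, then all. A dash means the permission is off.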
Where are my files? Understanding layout of data –Home –Root –Tmp. Permissions. Storage space and planning for it. Managing runaway items (more in next class).
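A minimal sketch for finding home, root and tmp and for checking storage space. It assumes the temporary directory is /tmp unless TMPDIR says otherwise, which may not hold on every system.

```shell
# Home directory of the current user, the root of the file system, and scratch space.
echo "home: $HOME"
echo "root: /"
echo "tmp:  ${TMPDIR:-/tmp}"

# Free and used space on the root file system, in human-readable units.
df -h /

# Total size of one directory tree (errors from unreadable subdirectories are hidden).
tmp_usage=$(du -sh "${TMPDIR:-/tmp}" 2>/dev/null | cut -f1)
echo "tmp uses: $tmp_usage"
```

Running df before a large download and du on a growing directory is a cheap way to plan for space before it runs out.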
Security considerations: Update your OS (how can you do that?). Why you should NEVER run as root (how do I add a user?). Passwords and keys (and dual factor). Ssh foo.
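The sketch below checks whether you are working as root. The commented lines hint at answers to the questions above, but they assume a Debian/Ubuntu-style VM (the slides do not say which distribution is used), and the user name alice is made up.

```shell
# True when the current user is root (user ID 0).
running_as_root() {
    [ "$(id -u)" -eq 0 ]
}

if running_as_root; then
    echo "Warning: you are root; switch to a regular user for daily work."
else
    echo "OK: running as a regular user (uid $(id -u))."
fi

# The lines below need root privileges, so they are left commented out.
# They assume a Debian/Ubuntu-style system; other distributions use other tools.
#   sudo apt-get update && sudo apt-get upgrade    # update the OS
#   sudo adduser alice                             # add a user (alice is a made-up name)
#   ssh-keygen -t ed25519                          # create an SSH key pair for key-based login
```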
What is a Shell? A shell is –Command interpreter that turns the text you type (at the command line) into actions –User interface: takes commands from the user. Programming: a shell can do –Customization of a Unix session –Scripting –Many, many automation steps
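To make "scripting" and "automation steps" concrete, here is a minimal sketch that runs the same command over several files. The scratch directory and the sample file names are made up for the illustration.

```shell
# Make a scratch directory with a few sample files (names are made up).
work_dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$work_dir/sample1.txt"
printf 'd\ne\n'    > "$work_dir/sample2.txt"

# The automation step: run the same command on every matching file.
total=0
for f in "$work_dir"/*.txt; do
    lines=$(wc -l < "$f")
    echo "$(basename "$f"): $lines lines"
    total=$((total + lines))
done
echo "total: $total lines"

rm -rf "$work_dir"
```

The same loop works unchanged for two files or two thousand, which is the point of scripting a step rather than typing it.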
Customization of a Session: Each shell supports some customization. –User prompt –Where to find mail –Shortcuts (alias). The customization takes place in startup files –Startup files are read by the shell when it starts up –Startup files differ from shell to shell
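A sketch of the three customizations listed above, written in Bourne-style syntax as it might appear in a startup file. The prompt escapes are bash-specific, and the mailbox path is an assumption that varies between systems.

```shell
# Lines like these would normally live in a startup file such as ~/.bashrc.

# Shortcut (alias): type ll instead of ls -l.
alias ll='ls -l'

# User prompt: show user@host:current-directory (bash escape sequences).
PS1='\u@\h:\w\$ '

# Where to find mail: MAIL names the mailbox file the shell checks
# (this path is an assumption; it varies between systems).
MAIL="/var/mail/$(id -un)"

# List the alias to confirm it is defined.
alias ll
```

Typed at the prompt, these settings last only for the current session; placed in a startup file, they apply every time the shell starts.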
Popular Shells: sh (Bourne Shell); ksh (Korn Shell); csh, tcsh (C Shell) (for this course); bash (Bourne-Again Shell)
Flavors of Unix Shells: Two main flavors of Unix shells –Bourne (or standard shell): sh, ksh, bash, zsh; fast; $ for command prompt –C shell: csh, tcsh; better for user customization and scripting; %, > for command prompt. To check shell: –% echo $SHELL (SHELL is a predefined variable that holds your login shell). To switch shell: –% exec shellname (e.g., % exec bash)
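One caveat when checking the shell: SHELL reports the login shell, which can differ from the shell you are currently typing in (for example after exec bash). The sketch below prints both. It assumes a ps that accepts the POSIX -p and -o options and falls back to "unknown" otherwise.

```shell
# SHELL holds your login shell, which may differ from the shell you are typing in.
login_shell=${SHELL:-unknown}
echo "login shell: $login_shell"

# $$ is the process ID of the current shell; ps reports the program behind it.
current_shell=$(ps -p $$ -o comm= 2>/dev/null) || current_shell=unknown
echo "current shell: $current_shell"

# List the shells installed on this system (the file is absent on some systems).
if [ -r /etc/shells ]; then
    cat /etc/shells
fi
```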
Startup files and why you should care. bash: /etc/profile (out-of-the-box login shell settings); /etc/bash.bashrc (out-of-the-box non-login settings); /etc/bash.bashrc.local (global non-login settings); ~/.bash_profile (login shell user customization); ~/.bashrc (non-login shell user customization); ~/.bash_logout (user exits from interactive login shell)
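Not every system ships all of these files. As a small sketch, the loop below reports which of the six bash startup files listed above exist on the machine where it runs.

```shell
# Report which of the bash startup files named above exist on this machine.
found=0
for f in /etc/profile /etc/bash.bashrc /etc/bash.bashrc.local \
         "$HOME/.bash_profile" "$HOME/.bashrc" "$HOME/.bash_logout"; do
    if [ -f "$f" ]; then
        echo "present: $f"
        found=$((found + 1))
    else
        echo "missing: $f"
    fi
done
echo "$found of 6 startup files found"
```

Knowing which files are present tells you where a surprising alias or PATH entry may have come from.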
Some Special Keys. How do you invoke tcsh? Ctrl-U = Delete everything on the command line; Ctrl-A = Move cursor to the front; Ctrl-E = Move cursor to the end; Ctrl-P = Set the current command line to the previous command; Ctrl-N = Set the current command line to the next command; TAB = Filename completion
Preview pieces of toolbox: We will work through Step 5 and go straight to commands
Next class: Preparing to play with your data set –Can you download a piece of it? Learn about space and process management. Introduction to shell scripting and automation. Start building your Big Data command line tool kit.