Applied CyberInfrastructure Concepts
ISTA 420/520, Fall
or: Will Computers Crash Genomics? (Science Vol 331, Feb 2011)

Nirav Merchant, Bio Computing & iPlant Collaborative
Eric Lyons, Plant Sciences & iPlant Collaborative
University of Arizona

Tasks for today
– Log into shell.u.arizona.edu (ssh); also learn how to transfer files the old-school way
– The shell: what is it good for?
– Navigating in the shell
– Working with GNU core utils
– Data analysis on the command line
– Building your Big Data tool kit

Linux fundamentals
– ssh to shell.u.arizona.edu
– Directory structure
– Permissions
– Listing
– Redirection and pipes: >, <, |
– Use of ' and " (quoting) and ; (command separator)

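
The operators on this slide can be tried safely in a scratch directory. The following is a minimal sketch, not a prescribed exercise: the file name fruit.txt and its contents are made up for illustration.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# ';' separates two commands on one line.
# '>' writes a command's standard output to a file (overwriting it).
printf 'pear\napple\nfig\n' > fruit.txt; ls -l fruit.txt

# '<' feeds a file to a command's standard input.
sorted=$(sort < fruit.txt)

# '|' connects the output of one command to the input of the next.
count=$(cat fruit.txt | wc -l | tr -d ' ')

# Single quotes keep text literal; double quotes allow $variable expansion.
name=fig
literal='$name'
expanded="$name"
echo "first sorted line: $(echo "$sorted" | head -n 1)"
echo "lines: $count, literal: $literal, expanded: $expanded"

# Permissions: remove write permission for everyone, then read the mode string.
chmod a-w fruit.txt
mode=$(ls -l fruit.txt | cut -c1-10)
echo "mode: $mode"
```

The mode string printed at the end is the same ten-character field that `ls -l` shows for every file: one character for the file type, then read/write/execute flags for user, group, and others.
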
ssh keys and managing them
– Quick intro to keys (public, private)
– Where will you use these keys?
– Let's create keys to allow easier login to shell.u
– Linux/Mac users, use the tutorial: s/how-to-set-up-ssh-keys--2
– Windows users, visit the PuTTY tutorials:
  – ow-to-create-ssh-keys-with-putty-to-connect-to-a-vps
  – ow-to-use-pageant-to-streamline-ssh-key-authentication-with-putty

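
As a rough sketch of what key creation looks like on Linux or Mac, the commands below generate a key pair into a scratch directory so that any existing keys in ~/.ssh are left alone. The file name id_demo, the comment string, and the NETID placeholder are assumptions for illustration; the linked tutorials remain the reference for the full procedure.

```shell
# Generate an Ed25519 key pair in a scratch directory (hypothetical path).
# -N '' sets an empty passphrase, acceptable only for a demo;
# protect real keys with a passphrase.
keydir=$(mktemp -d)
ssh-keygen -t ed25519 -N '' -C 'ista-demo-key' -f "$keydir/id_demo" -q

# Two files result: the private key stays on your machine,
# the .pub file is the part you hand to servers.
ls -l "$keydir"
pubkey_type=$(cut -d ' ' -f 1 "$keydir/id_demo.pub")
echo "public key type: $pubkey_type"

# Installing the public key on the server needs network access and your
# password, so it is shown here as a comment only (NETID is a placeholder):
#   ssh-copy-id -i "$keydir/id_demo.pub" NETID@shell.u.arizona.edu
```

Once the public key is in the server's authorized_keys file, ssh can authenticate with the key pair instead of prompting for the account password.
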
Process management
– Use of &, bg, fg
– kill
– nice, renice
– Detaching, and why you need tmux (or screen)

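
The commands on this slide can be exercised with a harmless placeholder job. This sketch uses `sleep 300` as a stand-in for a long-running analysis; the nice values and the tmux session name are arbitrary choices, not course requirements.

```shell
# '&' starts a command in the background; $! is the PID of that job.
sleep 300 &
pid=$!
jobs                      # list this shell's background jobs

# In an interactive shell: 'fg %1' brings job 1 to the foreground,
# Ctrl-Z suspends it, and 'bg %1' resumes it in the background.

# nice sets the priority at launch (higher value = lower priority).
# Running 'nice' with no command prints the current niceness.
launched=$(nice -n 5 nice)
echo "niceness inside 'nice -n 5': $launched"

# renice changes the priority of a process that is already running.
renice -n 10 -p "$pid" >/dev/null

# kill sends SIGTERM by default; wait then collects the exit status.
kill "$pid"
status=0
wait "$pid" 2>/dev/null || status=$?
echo "exit status after kill: $status"

# tmux keeps a session alive after you disconnect (needs a terminal):
#   tmux new -s work        start a session named "work"
#   Ctrl-b d                detach, leaving everything running
#   tmux attach -t work     reattach later, even from a new ssh login
```

A background job started with `&` still belongs to the login shell and is typically lost when the ssh connection drops, which is the reason for running long jobs inside tmux or screen.
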
GNU Core utils
ml_node/index.html#toc_Introduction
Commands we will work with:
– cat: Concatenate and write files
– tac: Concatenate and write files in reverse
– nl: Number lines and write files
– head: Output the first part of files
– tail: Output the last part of files
– split: Split a file into pieces
– csplit: Split a file into context-determined pieces
– wc: Print newline, word, and byte counts
– sum: Print checksum and block counts
– cksum: Print CRC checksum and byte counts
– md5sum: Print or check MD5 digests

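
A quick way to get a feel for these utilities is to run each one on a tiny file. The sketch below assumes GNU coreutils (as on shell.u or most Linux systems; macOS ships different versions of some of these tools) and uses a made-up file, numbers.txt, holding the integers 1 to 10.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 10 > numbers.txt                    # ten lines: 1..10

cat numbers.txt numbers.txt > twice.txt   # concatenate two copies
first=$(head -n 1 numbers.txt)            # first line
last=$(tail -n 1 numbers.txt)             # last line
reversed_first=$(tac numbers.txt | head -n 1)   # first line of the reversed file
nl numbers.txt | head -n 3                # show line numbers

split -l 4 numbers.txt part_              # pieces of at most 4 lines: part_aa, part_ab, ...
pieces=$(ls part_* | wc -l | tr -d ' ')

csplit -s numbers.txt 6                   # split before line 6: xx00 and xx01
head_lines=$(wc -l < xx00 | tr -d ' ')

lines=$(wc -l < twice.txt | tr -d ' ')    # count lines
sum numbers.txt                           # checksum and block count
cksum numbers.txt                         # CRC checksum and byte count
md5sum numbers.txt > numbers.md5          # record the MD5 digest
md5sum -c numbers.md5                     # verify the file against the digest

echo "first=$first last=$last reversed_first=$reversed_first"
echo "pieces=$pieces head_lines=$head_lines lines=$lines"
```

Recording a digest with md5sum and checking it later with `md5sum -c` is a common way to confirm that a large data file survived a transfer intact.
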
Hands on
analysis-with-the-unix-shell/
– How are you going to get the data from git?
– What is missing in this data set? (How to fix it?)
– Do you have access to gnuplot?
– Make the plot described in this exercise
  – Save the plot as PDF output
  – How are you going to view the PDF?
  – Run this without interactive prompts, i.e. straight from the command line

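
For the last step, one possible approach is to put the gnuplot commands in a script file and pass that file to gnuplot, which then runs without any prompt. This is only a sketch: the data file below is a made-up stand-in for the exercise data, and it assumes a gnuplot build that includes the pdfcairo terminal.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical two-column data (x, y); substitute the exercise's data file.
printf '1 1\n2 4\n3 9\n4 16\n5 25\n' > data.txt

# A gnuplot script: choose a PDF terminal, name the output file, then plot.
cat > plot.gp <<'EOF'
set terminal pdfcairo
set output 'plot.pdf'
set xlabel 'x'
set ylabel 'y'
plot 'data.txt' using 1:2 with linespoints title 'sample'
EOF

# Passing the script as an argument runs gnuplot with no interactive prompt.
if command -v gnuplot >/dev/null 2>&1; then
    gnuplot plot.gp || echo "gnuplot failed; this build may lack the pdfcairo terminal"
    ls -l plot.pdf 2>/dev/null || true
else
    echo "gnuplot is not installed on this machine"
fi
points=$(wc -l < data.txt | tr -d ' ')
echo "data points: $points"
```

On a remote shell without a display, one option for viewing the result is to copy plot.pdf to your own machine with scp and open it there.
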
Preview pieces of toolbox
– We will work through Step 5 and go straight to commands
– We will work with csvkit today
  – Download the sample data set from city country
  – Install pip and then install csvkit
  – Explore the multiple commands

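
To preview what exploring csvkit might look like, the sketch below runs a few of its commands on a tiny CSV file. The file cities.csv and its three rows are invented for illustration and are not the course data set; the csvkit commands are skipped with a message if csvkit is not installed.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data; replace with the downloaded data set.
cat > cities.csv <<'EOF'
city,country,continent
Tucson,USA,North America
Lyon,France,Europe
Pune,India,Asia
EOF

# Install once with: pip install csvkit
# (may need --user or a virtual environment on a shared machine)
if command -v csvcut >/dev/null 2>&1; then
    csvcut -n cities.csv                      # list column names with their index
    csvcut -c city,country cities.csv         # keep only two columns
    csvgrep -c country -m France cities.csv   # rows whose country matches "France"
    csvsort -c city cities.csv | csvlook      # sort by city, render as a text table
    csvstat cities.csv                        # per-column summary statistics
else
    echo "csvkit not found; install it with: pip install csvkit"
fi

rows=$(tail -n +2 cities.csv | wc -l | tr -d ' ')
echo "data rows: $rows"
```

Each csvkit command reads CSV and most of them write CSV, so they chain together with pipes in the same way as the core utilities.
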
Next class
– Please practice your command line skills
– Get a GitHub account