Published by Marcia McGee. Modified over 6 years ago.
1
Command Line
The favored operating system flavor in computational biology is UNIX/Linux. The command line is similar to DOS. Some of the frequently used commands are:
    pwd
    ls
    ls -l
    chmod
    chmod a+x blastall.sh
    chmod 755 *.sh
    cd
    cd ..
    cd $HOME
    passwd
    ps
    ps aux
    rm
    more
    cat
    vi (text editor)
    ssh
    sftp
For Windows, an "ok" ssh program is PuTTY. UConn also has a site license for the ssh program from ssh.com.
2
UNIX: Basic UNIX commands
ls, cd, chmod, cp, rm, mkdir, more (or less), vi, ps, kill -9, man
A brief listing is here.
chmod is a particular pain. Under UNIX every file has an owner, and the owner, the owner's group, and everyone else each have permissions to read, write, and/or execute the file (or they don't). To see which permissions are currently assigned to your files, type ls -l at the command prompt.
chmod a+x *.pl gives everyone execute permission for all files that end with .pl; the * is a wildcard. (Warning: don't ever use rm in conjunction with *.)
For more on chmod, type "man chmod".
(In the OS X GUI you can control-click on a file and change permissions in the info box.) Most ssh clients (Fugu and SSH) allow you to use a GUI to change file permissions (in Fugu, ctrl-click).
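The permission changes described above can be tried safely in a scratch directory. This is a minimal sketch, assuming a POSIX shell with mktemp available; the file names demo_a.pl and demo_b.pl are hypothetical and only stand in for real scripts.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create two hypothetical Perl scripts that only the owner can read and write.
touch demo_a.pl demo_b.pl
chmod 600 demo_a.pl demo_b.pl

# Give everyone (owner, group, others) execute permission on all .pl files.
chmod a+x *.pl
ls -l

# Numeric form: 7 = rwx for the owner, 5 = r-x for the group, 5 = r-x for others.
chmod 755 demo_a.pl
ls -l demo_a.pl
```

In the ls -l output, the first ten characters of each line show the file type followed by the read/write/execute flags for owner, group, and everyone else.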
3
Unix - command line interface
If you tried to execute a command and made a mistake (for example, you mistyped a file name), you can recall the previous command with the up arrow (down arrow for more recent commands).
If you are tired of typing long filenames, you can use the tab key to complete the line, provided there is only one way to complete it. E.g., cd /Desktop could be replaced by cd /D<tab>. If there are two or more choices you hear a boing; if you hit <tab> again, you get a list of choices.
If you want to become more familiar with the UNIX command line, Codecademy has a good introduction.
4
characters at the end of lines
File transfers from Windows to UNIX and back: end-of-line characters are a problem.
Under Windows, DO NOT use Notepad; it does not understand the UNIX newline symbol '\n'.
Best: write your programs under UNIX using vi or vim (or any other editor you are comfortable with).
2nd best: use a text editor like TextWrangler (a very nice and free program for Mac OS X). Like vi and vim, it provides context-dependent coloring.
3rd best: remove the end-of-line symbols in a UNIX editor, or use sed (Stream EDitor) after you have transferred the file:
    sed 's/.$//' name_of_WINDOWS_infile > name_of_UNIX_outfile
(This deletes the last character before the end of line ($) on every line. In a Windows file that character is the carriage return; do not run the command on a file that is already in UNIX format, or it will remove the last real character of each line.)
Some versions of Office allow you to save files as UNIX text files, but ...
A related problem is encountered by Mac users. Most text editors will use Mac carriage returns at the end of the line, and most UNIX programs will not be able to handle these. In a terminal window you could use the following command to convert your file:
    tr '\r' '\n' < name_of_the_Mac_file > name_of_the_unix_file
If you are working in a GUI environment, you could also use the convertNewLines.app program (install it in your Applications folder, then drag the file you want to convert onto the icon). The program is available here.
The EoL confusion is very inconvenient, but there really is no easy solution, tough luck; you had better know about this in case something goes wrong.
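To see what these conversions actually do, the following sketch builds two tiny sample files and converts them. It assumes a POSIX shell with mktemp, od, sed, tr, and cmp; the file names and the four-letter sequences are made up for illustration. The tr -d '\r' line is an alternative not mentioned on the slide: it deletes only carriage returns, so it is harmless on files that are already in UNIX format.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files: one with Windows (CR LF) and one with classic Mac (CR) line ends.
printf 'ATGC\r\nGGCC\r\n' > windows.txt
printf 'ATGC\rGGCC\r' > mac.txt

# od -c prints every byte, so \r and \n become visible.
od -c windows.txt

# The sed command from the slide: drop the last character (the CR) of every line.
sed 's/.$//' windows.txt > unix_from_sed.txt

# Alternative: delete only CR characters.
tr -d '\r' < windows.txt > unix_from_tr.txt

# Classic Mac file: turn every CR into a LF.
tr '\r' '\n' < mac.txt > unix_from_mac.txt

# Report whether all three results are byte-identical.
if cmp -s unix_from_sed.txt unix_from_tr.txt && cmp -s unix_from_tr.txt unix_from_mac.txt
then
    echo "conversions agree"
fi
```

Running od -c on a file before and after conversion is a quick way to check which line endings it really contains.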
5
Special characters:
    \n    # newline
    \t    # tab
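A small sketch of the two escapes in use, assuming a POSIX shell: printf interprets \t and \n in its format string, and cut splits on tabs by default. The column names and values are hypothetical.

```shell
# \t separates the columns and \n ends each line.
# cut -f 2 then picks the second tab-separated column.
second_column=$(printf 'gene\tlength\ngeneA\t1200\n' | cut -f 2)
printf '%s\n' "$second_column"
```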
6
To move files between local PC and server:
For Windows machines: install the ssh client from ftp://ftp.uconn.edu/restricted/ssh/
For Macintosh computers: install FileZilla (client!)
7
Example: SSH to bbcsrv3.biotech.uconn.edu
qlogin
formatdb -i p_abyssi.faa -o T -p T
blastall -i t_maritima.faa -d p_abyssi.faa -o blast.out -p blastp -e 10 -m 8 -a 2
./extract_lines.pl blast.out
    (Perl script that only retains the first hit and gets rid of comment lines)
sftp results
load into spreadsheet
sort data, do histogram …
The extract_lines.pl script is here (you can sftp it into your account; you'll need to chmod 755 extr*.pl afterwards).
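The filtering step can be approximated without the Perl script. The sketch below is not extract_lines.pl itself; it is an awk one-liner that does what the slide describes (drop comment lines, keep the first hit per query), assuming tab-separated tabular output with the query ID in column 1. The blast.out written here is a made-up stand-in, and its IDs and numbers are illustrative only, not real BLAST results.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up stand-in for blast.out: one comment line and three hits for two queries.
printf '# hypothetical comment line\n' > blast.out
printf 'q1\ts7\t55.0\t200\t90\t0\t1\t200\t5\t204\t1e-60\t220\n' >> blast.out
printf 'q1\ts9\t30.0\t180\t126\t0\t10\t189\t3\t182\t2e-10\t60\n' >> blast.out
printf 'q2\ts3\t41.5\t150\t88\t0\t1\t150\t1\t150\t4e-30\t120\n' >> blast.out

# Skip comment lines, then print a line only the first time its query ID is seen.
awk -F '\t' '!/^#/ && !seen[$1]++' blast.out > first_hits.txt

cat first_hits.txt
```

Because BLAST lists the hits for each query from best to worst, keeping the first line per query keeps the top-scoring hit.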
8
GenBank
Founded in 1982 at the Los Alamos National Laboratory
Initially managed at Stanford in conjunction with the BIOSCI/Bionet news groups
Transition to the NCBI on the East Coast
One precursor was Margaret Dayhoff's Atlas of Protein Sequence and Structure
In 1987 GenBank fit onto a few 360 KB floppy disks.
GenBank uses a flat file database format.
NCBI does not use a relational database (as in Oracle, PeopleSoft).
NCBI stores data in ASN.1 format, which allows it to hardwire crosslinks to other databases and makes retrieval of related information fast.
NCBI's sample record contains links to most of the fields used in the gbk flat file.
In the GenBank records at NCBI the links connect to the features (i.e., the PubMed record or the encoded protein sequence). Not easy to work with.
9
Dr. Margaret Belle (Oakley) Dayhoff March 11, 1925 – February 5, 1983
Among other things, we owe her the first nucleotide and protein data bank, the PAM substitution matrix, and the single-letter amino acid code. (Image from Wikipedia)
10
Atlas of Protein Sequences 1972 (cont)
The Atlas also contained RNA sequences and a PAM matrix for nucleotides.
11
Atlas of Protein Sequences 1972 (cont)
Contained phylogenetic reconstructions that went back in time to far before the Last Universal Common Ancestor (LUCA), aka the cenancestor, of all cellular organisms alive today.
tRNA phylogeny
12
PAM 250 log(odds) matrix
Dayhoff recoding
13
Selecting Scoring Matrices
Choose a matrix appropriate to the suspected degree of sequence identity between the query and its target sequences.
PAM: empirically derived for close relatives
BLOSUM: empirically derived for distant relatives
Kerfeld and Scott, PLoS Biology 2011