Published by Elwin Warren. Modified over 9 years ago.
1
Applied Bioinformatics
Course Overview & Introduction to Linux
Bing Zhang
Department of Biomedical Informatics, Vanderbilt University
bing.zhang@vanderbilt.edu
2
What is bioinformatics
[Diagram labels: Bioinformatics; Data, Hypotheses, Questions, Samples, Experiments; DNA, RNA, Protein, Metabolite, Phenotype; Sequence, Expression, Structure, Interaction; Storage/retrieval, Visualization, Computational methods, Statistical methods]
3
Why now?
[Diagram labels: Bioinformatics; Data, Hypotheses, Questions, Samples, Experiments; DNA, RNA, Protein, Metabolite, Phenotype; Sequence, Expression, Structure, Interaction; Storage/retrieval, Visualization, Computational methods, Statistical methods; informatics]
4
Roles for different investigators in bioinformatics
Algorithm developer: Statisticians, Mathematicians, Computer scientists
Tool developer: Bioinformaticians
Data provider/consumer: Biologists
Graph courtesy of http://www.incogen.com/
5
Comprehensive resource list (March 2015)
174 Resources, 623 Databases, 1548 Tools
http://bioinformatics.ca/links_directory/
6
Sequence and structure databases
GenBank: http://www.ncbi.nlm.nih.gov/genbank/
  Annotated collection of all publicly available DNA sequences
  126,551,501,141 bases in 135,440,924 sequences as of April 2011
UniProt: http://www.uniprot.org/
  Comprehensive resource for protein sequences and functional information
  534,242 reviewed entries as of January 2012
PDB: http://www.rcsb.org/
  3D structures of large biological molecules, including proteins and nucleic acids
  79,180 structures as of February 2012
Pfam: http://pfam.sanger.ac.uk/
  Collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs)
  13,672 families as of November 2011
7
Genome browsers
UCSC genome browser: http://genome.ucsc.edu/cgi-bin/hgGateway
Ensembl genome browser: http://www.ensembl.org/index.html
8
Gene-centric databases
Entrez Gene: http://www.ncbi.nlm.nih.gov/gene
  NCBI/NIH
  All completely sequenced genomes
  One gene per page
Ensembl BioMart: http://www.ensembl.org/biomart/martview
  EMBL-EBI and Sanger Institute
  Vertebrates and other selected eukaryotic species
  Batch information retrieval
9
Gene expression data
Gene Expression Omnibus (GEO): http://www.ncbi.nlm.nih.gov/geo/
ArrayExpress: http://www.ebi.ac.uk/arrayexpress/
10
Pathway and network resources
Gene Ontology (GO): http://www.geneontology.org/
Pathway databases
  KEGG: http://www.genome.jp/kegg/pathway.html
  Reactome: http://www.reactome.org/
  WikiPathways: http://www.wikipathways.org/
Protein-protein interaction databases
  DIP: http://dip.doe-mbi.ucla.edu/
  MINT: http://mint.bio.uniroma2.it/mint/
  BioGRID: http://www.thebiogrid.org/
  HPRD: http://www.hprd.org
Protein-DNA interaction database
  Transfac: http://www.gene-regulation.com
11
Course content and grades
12
Course materials and report submission
Lecture slides are available at https://sites.google.com/site/vanderbiltigp2014/bioregulation-ii/minimester-3/applied-bioinformatics
Project reports are due at 5 pm on their due dates (4/13, 4/22, 5/1). There will be a 10% per day deduction for late reports.
Report 1 should be sent to Dr. Zhang; Reports 2 and 3 should be sent to Dr. Liu (see email addresses below).
Instructor contact information
  Dr. Bing Zhang: bing.zhang@vanderbilt.edu
  Dr. Qi Liu: qi.liu@vanderbilt.edu
13
ACCRE
Advanced Computing Center for Research & Education: http://www.accre.vanderbilt.edu/
  The compute cluster currently consists of more than 500 Linux systems with quad- or hex-core processors
Linux system
  An operating system (OS) like Windows or Mac
  Portable, multi-tasking, multi-user OS
  High performance and free, making it ideal for high-performance computing clusters
14
Proper use of ACCRE
Information in the ACCRE cluster group igp300b_ab may not contain data, information, technology, images, or software that is controlled under Federal Export Administration Regulations (EAR), International Traffic in Arms Regulations (ITAR), Patient Health Information (PHI), or Research Health Information (RHI), nor is it considered proprietary.
15
Get an ACCRE account
http://www.accre.vanderbilt.edu/?page_id=617
Registration form
  Name, VUNetID, Department (VU), School (VU), Email, Phone, Position
  Group: IGP300b_ab (igp300b_ab)
  Primary research area: bioinformatics
  Primary application: Existing Application
  Primary application name: R
  Primary application type: Serial
  Expected typical number of processors: NA
  Expected typical number of concurrent running jobs: 1
  Linux experience:
  Expected compilers/languages: C, C++, R, perl, python
  Expected external libraries: NA
  BlueArc User: No
  Other useful information: NA
16
Logging onto the cluster and changing your password
Windows
  Application: Bitvise SSH (https://www.bitvise.com/ssh-client-download)
  Two steps: edit profile -> save profile
  Host: vmplogin.accre.vanderbilt.edu
  Username: your_user_name
Mac
  Use Spotlight to find the application: Terminal
  Command: ssh your_user_name@vmplogin.accre.vanderbilt.edu
Change password
  rsh auth
  passwd
Exit
  exit
17
Logging onto the cluster and changing your password (using Bitvise SSH in Windows)
18
Logging onto the cluster and changing your password (using Terminal in Mac)
You won’t see any response while typing your password, which is fine.
19
Hierarchical file system
[Directory tree diagram rooted at /. Directory labels: bin, usr, home, scratch, etc, tmp; igptest, annie, cody, bin, lib; bin, docs, src. File labels: chmod, cp, date, grep, mv, rm, vi; libc.so, libgpfs.so, libjpeg.so, libstdc++.so; diff, find, gcc, id, make, perl, ssh; prog1.c, prog2.f77, prog3.cpp; myprog.sh, dothis.pl, dothat.py]
Example paths: /home, /home/igptest, /home/igptest/src/prog3.cpp
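As a small illustration of how a path encodes its position in the hierarchy, the sketch below splits the example path from this slide into its parent directory and file name using the standard dirname and basename utilities. The variable names are made up for this example, and the path does not need to exist on your machine.

```shell
# Example path taken from the slide; it is only treated as a string here.
path="/home/igptest/src/prog3.cpp"

# dirname strips the last component, leaving the directory that contains the file.
parent=$(dirname "$path")

# basename keeps only the last component, i.e. the file name.
name=$(basename "$path")

echo "directory: $parent"
echo "file:      $name"
```

Applying dirname repeatedly walks up the tree one level at a time until it reaches /.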
20
Working with directories
pwd (print your present working directory)
ls (list directory contents)
mkdir (make a directory)
cd (change directory)
  .. (parent directory)
  . (current directory)
  ~ or no parameter (home directory)
rmdir (remove an empty directory)
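The commands above can be tried together in a short session. This is a minimal sketch that works inside a throwaway directory created with mktemp, so nothing in your real home directory is touched; the variable name sandbox is an assumption of this example.

```shell
# Create a scratch directory and move into it.
sandbox=$(mktemp -d)
cd "$sandbox"

pwd            # print the present working directory (the scratch directory)
mkdir test     # make a directory named "test"
ls             # list contents: shows "test"
cd test        # change into the new directory
cd ..          # ".." is the parent, so this returns to the scratch directory
cd .           # "." is the current directory, so this changes nothing
rmdir test     # remove the directory; rmdir only works on empty directories
ls             # nothing is listed now

# "cd ~" or a bare "cd" would take you back to your home directory.
```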
21
Absolute and relative paths
Absolute path
  A file or directory location in relation to the root of the file system; always begins with a /
Relative path
  A file or directory location in relation to where you currently are in the file system; does not begin with a /
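To see that both kinds of path can name the same location, the sketch below reaches one directory twice, once with a relative path and once with an absolute path. The directory layout is created in a scratch location and only imitates the igptest/src example; all variable names are assumptions of this example.

```shell
# Build a small directory tree in a scratch location.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/igptest/src"

# Relative path: "src" is resolved from wherever we currently are.
cd "$sandbox/igptest"
cd src
via_relative=$(pwd)

# Absolute path: starts with / and works from any current directory.
cd /
cd "$sandbox/igptest/src"
via_absolute=$(pwd)

echo "relative: $via_relative"
echo "absolute: $via_absolute"
```

Both lines print the same directory; the relative form only worked because the current directory was igptest at that moment.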
22
Working with files
more (display the contents of a file)
  space bar to show next page
  q to exit
cp (copy a file)
mv (rename/move a file)
rm (remove a file)
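A short session showing cp, mv, and rm on a small file might look like the following. It is a sketch run in a scratch directory; the file names are invented for this example, and cat stands in for more so that the script does not pause for paging.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a one-line file to work with.
echo "Hello World" > file1.txt

cp file1.txt file2.txt     # copy file1.txt to a new file named file2.txt
mv file2.txt renamed.txt   # rename (move) file2.txt to renamed.txt
cat renamed.txt            # show the contents; "more renamed.txt" would page them
rm file1.txt               # remove the original file
ls                         # only renamed.txt remains
```

Note that rm deletes immediately, with no recycle bin to restore from.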
23
Getting help
man (display manual pages for a command)
  man ls (display manual for the ls command)
  space bar to show next page
  q to exit
Options for ls
  ls -a (do not ignore entries starting with .)
  ls -l (use a long listing format)
  ls -al (use a long listing format and do not ignore entries starting with .)
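The effect of the -a option is easiest to see with a file whose name starts with a dot. The sketch below creates one visible and one hidden file in a scratch directory and compares the listings; the file and variable names are assumptions of this example.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# One ordinary file and one "dot file", which ls hides by default.
touch visible.txt .hidden

plain=$(ls)       # default listing: dot files are skipped
all=$(ls -a)      # -a also lists dot files, plus the entries . and ..

echo "ls:"
echo "$plain"
echo "ls -a:"
echo "$all"

ls -al            # long format with permissions, owner, size, and date
```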
24
Editing files with nano
cd ~ (change to home directory)
nano .bashrc (use nano to edit the file .bashrc, which contains commands that are executed when a new interactive bash shell starts)
Add "setpkgs -a R" to the end of the file (this lets you use the R environment for statistical computing, which is installed on the ACCRE system).
A quick tutorial: http://staffwww.fullcoll.edu/sedwards/Nano/IntroToNano.html
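The slide makes this change interactively in nano. As an alternative sketch, not the course's method, the same line can be appended from the command line. The example below works on a scratch file that stands in for ~/.bashrc (the variable rc_file is an assumption of this example) and adds the line only if it is not already there, so running it twice does not create a duplicate.

```shell
# Stand-in for ~/.bashrc; set rc_file="$HOME/.bashrc" to change the real file.
rc_file=$(mktemp)
line='setpkgs -a R'

# grep -qxF succeeds only if some line matches the text exactly;
# the echo runs only when that check fails.
grep -qxF -- "$line" "$rc_file" || echo "$line" >> "$rc_file"

# Running the same command again adds nothing.
grep -qxF -- "$line" "$rc_file" || echo "$line" >> "$rc_file"

tail -n 1 "$rc_file"   # show the last line of the file
```

The change takes effect in shells started after the edit, for example at your next login.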
25
Copying files to/from a local computer
Windows
  Application: Bitvise SSH (https://www.bitvise.com/ssh-client-download)
Mac
  Application: Cyberduck (https://it.vanderbilt.edu/software/downloads.php)
  Connect to: vmplogin.accre.vanderbilt.edu
  Username: your_user_name
  Don’t change other items
26
Copying files to/from a local computer (using Bitvise SFTP in Windows)
27
Copying files to/from a local computer (using Fugu in Mac)
28
Summary
Command              Meaning
rsh <host>           Remote shell
passwd               Modify a user’s password
exit                 Exit the shell
pwd                  Display the path of the current directory
ls                   List files and directories
ls -a                List all files and directories
ls -al               List all files and directories in a long listing format
mkdir <directory>    Make a directory
cd <directory>       Change to named directory
cd                   Change to home directory
cd ~                 Change to home directory
cd ..                Change to parent directory
rmdir <directory>    Remove a directory
more <file>          View the contents of a file
cp <file1> <file2>   Copy file1 and name the copied file file2
mv <file1> <file2>   Move or rename file1 to file2
rm <file>            Remove a file
man <command>        Display manual pages for a command
nano <file>          Use the nano text editor to view and edit a file
29
Exercise
Create a test directory named "test" under your home directory
Copy the file sample_file.txt from the directory /home/igptest to your test directory
Make a copy of the file named sample_file_1.txt
View and modify the file sample_file_1.txt using nano, correcting the typo (Warld -> World)
Copy the file to your desktop
Copy a file from your desktop to your test directory
Add "setpkgs -a R" to the end of your .bashrc file
Go through the required sections of the following tutorial before next class: http://ryanstutorials.net/linuxtutorial/
  Required sections: 1, 2, 3, 4, 5, 9, 11
  Optional sections: 8, 12
  Advanced sections: 6, 7, 10, 13