1 Course plan
The course homepage tells you everything; it is updated Monday/Tuesday.
After each lecture, any extra material (slides, example programs) will become available from the schedule page.
2 Today's plan
- Go through the lecture note
- New example: distributing a job over several machines
- Exercise session
3 A Unix Shell
A shell is a 'wrapper' around an operating system. It mediates access to the operating system's functionality: executing commands, interacting with files, and so on.
Write a shell in Python:
- get a command from the user
- execute shell commands
- pass on external commands to the operating system
4 A Unix Shell
How interaction with the shell might look:
[mailund@dhcp-11-23-11 processes]$ python shell.py
>>> echo "hello, world"
Running: echo "hello, world"
hello, world
>>> ls
Running: ls
process-communication.html shell.py
>>> ls *.html
Running: ls *.html
process-communication.html
>>> quit
[mailund@dhcp-11-23-11 processes]$
Here quit is a shell command; the others are external commands.
5 Executing external commands
Use the function system from the module os: system takes a command (a string), executes it, and returns the exit status (0 if everything went well).
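As a small illustration of the exit status, the sketch below wraps os.system in a helper. The helper name run is hypothetical, and the code is written for Python 3 rather than the Python version used in the course.

```python
import os

def run(cmd):
    """Run cmd through the system shell and report whether it succeeded."""
    status = os.system(cmd)
    # 0 means success. On Unix a nonzero value is an encoded wait status,
    # not the raw exit code, so we only compare against 0 here.
    return status == 0

print(run('echo "hello, world"'))  # the command's own output appears first
print(run("exit 3"))               # a command that ends with a nonzero status
```

Comparing against 0 is the portable check; decoding the exact exit code differs between platforms.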
6 Executing shell commands
Before calling os.system, check whether the given command is a shell command; only the other commands are passed on to the operating system.
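The check described here could look roughly like the following Python 3 sketch. It is not the shell.py from the lecture: the function name run_shell, the read_command parameter (added so the loop can be driven without a keyboard) and the returned list are all assumptions made for illustration.

```python
import os

def run_shell(read_command=input):
    """Read commands until 'quit'; return the external commands that were run."""
    executed = []
    while True:
        try:
            cmd = read_command(">>> ").strip()
        except EOFError:       # end of input also ends the shell
            break
        if not cmd:            # ignore empty lines
            continue
        if cmd == "quit":      # shell command: handled here, not by the OS
            break
        print("Running:", cmd, flush=True)
        os.system(cmd)         # external command: pass it on to the OS
        executed.append(cmd)
    return executed
```

Calling run_shell() with no arguments reads from the keyboard and behaves like the session shown earlier.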
7 Executing shell commands
Adding more shell commands. ESC[2J (the escape character followed by [2J, written "\033[2J" in Python) is the ANSI terminal escape sequence that clears the screen.
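One way to add shell commands without growing a chain of if statements is a lookup table from command name to handler. The sketch below assumes a clear command built on the escape sequence; the names SHELL_COMMANDS, shell_clear and execute are hypothetical, and the extra ESC[H (move the cursor to the top-left corner) is an addition not mentioned on the slide.

```python
import os
import sys

CLEAR_SCREEN = "\033[2J"  # ESC [ 2 J: erase the whole display
CURSOR_HOME = "\033[H"    # ESC [ H: move the cursor to the top-left corner

def shell_clear(out=sys.stdout):
    """Clear the terminal by writing ANSI escape sequences."""
    out.write(CLEAR_SCREEN + CURSOR_HOME)
    out.flush()

# Hypothetical table of shell commands; new ones are added as entries here.
SHELL_COMMANDS = {"clear": shell_clear}

def execute(cmd):
    """Run cmd as a shell command if it is one, else pass it to the OS."""
    handler = SHELL_COMMANDS.get(cmd)
    if handler is not None:
        handler()
        return "shell"
    os.system(cmd)
    return "external"
```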
8 Print command output in red
To clearly mark the output coming from external commands, change the font color before and after calling os.system.
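A minimal Python 3 sketch of the color switch, assuming an ANSI-capable terminal. The function name run_in_red is hypothetical; ESC[31m selects a red foreground and ESC[0m resets all attributes.

```python
import os
import sys

RED = "\033[31m"   # ANSI: switch the foreground color to red
RESET = "\033[0m"  # ANSI: reset all text attributes

def run_in_red(cmd):
    """Run an external command with its terminal output shown in red."""
    sys.stdout.write(RED)
    sys.stdout.flush()    # the color code must reach the terminal first
    status = os.system(cmd)
    sys.stdout.write(RESET)
    sys.stdout.flush()
    return status
```

The explicit flush matters because the external command writes to the terminal directly, bypassing Python's output buffer.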
9 Print command output in red Output:
10 Indent output
Now we need access to the output of the command. Use the function os.popen instead of os.system:
- It also runs an external command, but lets the user give input to, or read output from, the command through a file object:
f = os.popen(cmd, 'r') # for reading output
f = os.popen(cmd, 'w') # for giving input
11 os.popen
Use the object returned by os.popen as a file object. The exit status of the command is returned when closing the file (None if the command exited with status 0):
status = f.close()
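Reading the output through os.popen makes it possible to rewrite each line before printing it. The following Python 3 sketch indents the output; the function name run_indented and the four-space default are assumptions.

```python
import os

def run_indented(cmd, indent="    "):
    """Run cmd and return its output lines, each prefixed with indent."""
    f = os.popen(cmd, "r")   # file object connected to the command's output
    lines = [indent + line.rstrip() for line in f]
    status = f.close()       # None when the command exited with status 0
    return lines, status

lines, status = run_indented("echo hello")
for line in lines:
    print(line)
```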
12 Output
13 Redirecting output (Exercise PM.4)
Redirect the output of a command to a file using >.
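One possible approach to the redirection, offered as a sketch rather than as the intended solution to the exercise: split the command at the last > and write the captured output to the named file. The function name run_with_redirect is hypothetical, and the parsing deliberately ignores quoting, so a > inside a quoted string would be misread.

```python
import os

def run_with_redirect(cmd):
    """Run cmd; if it has the form 'command > file', write the output to file."""
    if ">" not in cmd:
        return os.system(cmd)
    # Assumption: the last '>' separates the command from the file name.
    command, filename = cmd.rsplit(">", 1)
    f = os.popen(command.strip(), "r")
    output = f.read()
    status = f.close() or 0  # close() returns None on success
    with open(filename.strip(), "w") as out:
        out.write(output)
    return status
```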
14 Test
15 Example program: job parallelization
Sequence analysis can often be parallelized:
1. Split the input Fasta file into N parts
2. Perform the task on each part on some machine
3. Collect the partial results into one
Examples: Blast, Megablast, clustering, repeat finding, ..
16 1: Split input Fasta file into N parts
1. Create a new directory with the suffix .dir
2. Split the file into N parts of approximately the same size
3. Place each part in its own subdirectory of the .dir directory (part 17 goes in part17/part17), and put its output file there too
Reason: the overall command names only one output file, so without separate subdirectories that file would be overwritten for each part.
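The splitting step can be sketched as follows in Python 3. This is not the splitfile.py shown on the next slides. It assumes that the new directory is named after the input file with .dir appended, that parts are numbered from 1, and that "approximately the same size" means a similar number of sequences, dealt out round-robin. The function names are hypothetical.

```python
import os

def read_fasta_records(path):
    """Return a list of records, each a string starting with a '>' header."""
    records = []
    with open(path) as f:
        for line in f:
            if line.startswith(">"):
                records.append(line)   # a header line starts a new record
            elif records:
                records[-1] += line    # sequence line: extend current record
    return records

def split_fasta(path, n):
    """Write n part files as <path>.dir/partK/partK and return their names."""
    records = read_fasta_records(path)
    top = path + ".dir"
    part_files = []
    for i in range(n):
        name = "part%d" % (i + 1)
        part_dir = os.path.join(top, name)
        os.makedirs(part_dir, exist_ok=True)
        part_file = os.path.join(part_dir, name)
        with open(part_file, "w") as out:
            out.writelines(records[i::n])  # every n-th record, starting at i
        part_files.append(part_file)
    return part_files
```

Round-robin keeps the number of sequences per part balanced; balancing by total sequence length would need a different assignment.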
17 splitfile.py (part 1 of 2)
18 splitfile.py (part 2 of 2)
19 2. Perform task on each part on some machine
While not all parts have been processed:
1. Read the list of available machines
2. Check which are idle
3. While there are idle machines:
   a) Assign a part to an idle machine and start a new process
4. Sleep 30 seconds
Regarding 3a):
- Use rsh to start a remote shell on the foreign machine
- Use os.system to run the command
- Create an empty file 6done when part 6 is done
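The loop above might be sketched as below in Python 3. This is not distribute_fasta_job.py. How idleness is detected is not stated on the slide, so is_idle is a caller-supplied function. The location of the marker files (directly in the working directory), the exact rsh command line and all function names are assumptions, and the command is assumed to contain no single quotes.

```python
import os
import time

def read_machines(path):
    """Return machine names from a file, skipping blank lines and # comments."""
    machines = []
    with open(path) as f:
        for line in f:
            name = line.strip()
            if name and not name.startswith("#"):
                machines.append(name)
    return machines

def rsh_command(machine, workdir, part, command):
    """Build a background rsh call that runs command in the part's directory
    and then creates the empty marker file '<part>done' (assumed location)."""
    remote = "cd %s/part%d && %s; touch %s/%ddone" % (
        workdir, part, command, workdir, part)
    return "rsh %s '%s' &" % (machine, remote)

def distribute(parts, workdir, command, machines_file, is_idle,
               start=os.system, pause=30):
    """Hand out parts to idle machines until every marker file exists."""
    pending = list(parts)

    def done(part):
        return os.path.exists(os.path.join(workdir, "%ddone" % part))

    while not all(done(p) for p in parts):
        # The machine file is re-read each round, so it can be edited
        # while the job is running.
        idle = [m for m in read_machines(machines_file) if is_idle(m)]
        while idle and pending:
            start(rsh_command(idle.pop(0), workdir, pending.pop(0), command))
        time.sleep(pause)
```

The start and pause parameters exist only so the loop can be exercised without real remote machines; a part whose job fails never gets a marker file, so a real script would need some handling for that case.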
20 distribute_fasta_job.py (part 1 of 4): usage and timestamp methods
21 distribute_fasta_job.py (part 2 of 4): read options, initialize
22 distribute_fasta_job.py (part 3 of 4): various helper functions
23 distribute_fasta_job.py (part 4 of 4): main loop
24 Test: blastn of yeti.fasta against nessie.fasta
Command:
python distribute_fasta_job.py -f /web/chili/public_html/TBiB2006/yeti.fasta -n 30 -c '/users/chili/BiRC/MolBio/Blast-2.2.6/blastall -p blastn -e 0.000000001 -d /web/chili/public_html/TBiB2006/nessie.fasta -i /web/chili/public_html/TBiB2006/yeti.fasta -o yeti-nessie.b'
machines.txt:
threonine
methionine
crick
watson
pauling
#leucine
25 Output
pauling and watson commented out of the machine list
26 Output, continued
pauling and watson put back in the machine list
27 3. Collect partial results into one
cat yeti.fasta.dir/part*/yeti-nessie.b > yeti-nessie.b
Note: all output files have the same name, so creating one subdirectory for each part was necessary.
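The same collection step can be done from Python 3 if it should be part of a script instead of a separate shell command. The sketch below uses the glob module; the function name collect is hypothetical.

```python
import glob

def collect(pattern, combined):
    """Concatenate all files matching pattern into combined, like cat with >."""
    names = sorted(glob.glob(pattern))  # fixed, name-based order
    with open(combined, "w") as out:
        for name in names:
            with open(name) as part:
                out.write(part.read())
    return names

# Usage corresponding to the cat command above:
# collect("yeti.fasta.dir/part*/yeti-nessie.b", "yeti-nessie.b")
```

The parts are concatenated in name order rather than numeric order, so if the order of the partial results matters, the file names need sorting by part number instead.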
28 .. on to the exercises
Sample files for the clustalw exercise are in ~chili/public_html/TBiB2006/:
- sequence file: demo.fasta
- DNA distance matrix: multalinDNA.clustal (use option -dnamatrix= )