
GCG vs EMBOSS Gary Williams

Which is better, GCG or EMBOSS?
- You must decide for yourselves
- You may find other packages that do what you want
- Use the tools that do the job
- This is a comparison of GCG and EMBOSS to help you decide

Interfaces
- Web
  - W2H available for both
  - EMBOSS W2H still has rough edges
  - PISE
  - Others under development
- X-Windows
  - GCG: Seqlab
  - EMBOSS: SPIN (+ others coming)
- Telnet/xterm/character-based
  - emnu

Command line is very similar
- The UNIX command-line interfaces of GCG and EMBOSS are very similar.
- You type the name of the program.
- You can add any options you want to the command line.
- Press the RETURN key.
- Any mandatory information that was not on the command line will be prompted for.

GCG command-line
    % name -other=thing
    This is the name program that reads a sequence and writes out something.
    NAME what sequence ? embl:hsfau1
    Begin (* 1 *) ?
    End (* 2016 *) ?
    Reverse (* No *) ?
    What should I call the output (* hsfau.name *) ?

EMBOSS command-line
    % name -other thing
    Reads in sequences and writes a thing
    Input sequence(s): embl:hsfau1
    Output data [hsfau1.name]:
- Use ‘-ask’ to make EMBOSS programs prompt for the start and end of sequences

Some common options
- Running in scripts: don’t prompt, just fail if the command line is insufficient
  - GCG: -default
  - EMBOSS: -auto
- Help on options
  - GCG: -check
  - EMBOSS: -help or -help -verbose
- Boolean options (Yes/No, True/False)
  - GCG: -thing, -nothing
  - EMBOSS: -thing, -nothing, -thing=T, -thing=F, -thing=1, -thing=0, -thing=Y, -thing=N
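The boolean spellings listed for EMBOSS can be illustrated with a small Python sketch. This is not EMBOSS code: the function name `parse_bool_qualifier` and the `known` set of declared option names are hypothetical, and the sketch accepts only the exact value letters shown on the slide.

```python
from typing import Set, Tuple

# Value spellings taken from the slide: T/F, 1/0, Y/N.
TRUE_VALUES = {"T", "1", "Y"}
FALSE_VALUES = {"F", "0", "N"}


def parse_bool_qualifier(arg: str, known: Set[str]) -> Tuple[str, bool]:
    """Interpret one boolean-style argument such as '-thing' or '-thing=N'.

    `known` holds the declared boolean option names (hypothetical here),
    so that '-nothing' is read as 'thing = False' only when 'thing' exists.
    """
    if not arg.startswith("-"):
        raise ValueError(f"not a qualifier: {arg!r}")
    body = arg[1:]

    # Explicit form: -thing=T, -thing=0, ...
    if "=" in body:
        name, value = body.split("=", 1)
        if name not in known:
            raise ValueError(f"unknown option: {name!r}")
        if value in TRUE_VALUES:
            return name, True
        if value in FALSE_VALUES:
            return name, False
        raise ValueError(f"not a boolean value: {value!r}")

    # Bare form: -thing switches on, -nothing switches off.
    if body in known:
        return body, True
    if body.startswith("no") and body[2:] in known:
        return body[2:], False
    raise ValueError(f"unknown option: {body!r}")
```

Checking the name against a declared set avoids misreading an option whose own name happens to begin with "no".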

Sequence options in EMBOSS
"-sequence" related qualifiers
    -sbegin    integer  first base used
    -send      integer  last base used, def=seq length
    -sreverse  bool     reverse (if DNA)
    -sask      bool     ask for begin/end/reverse
    -slower    bool     make lower case
    -supper    bool     make upper case
    -sformat   string   input sequence format
    -ufo       string   UFO features

Sequence options in EMBOSS
"-outseq" related qualifiers
    -osformat  string  output sequence format
    -ossingle  bool    separate file for each entry

EMBOSS general options
    -debug    bool  write debug output to program.dbg
    -auto     bool  turn off prompts
    -stdout   bool  write standard output
    -filter   bool  read standard input, write standard output
    -options  bool  prompt for required and optional values
    -verbose  bool  report some/full command line options
    -help     bool  report command line options

Data files
- GCG uses ‘..’ to divide comments from data
- EMBOSS does not use ‘..’
- In general, EMBOSS uses ‘#’ to mark a comment line
- Use ‘embossdata’ to extract and check on data files
- As in GCG, data files copied into the current or home directory are used in preference to the originals

List files (files of file names)
- Similar to GCG list files, but no ‘..’
- Comment lines start with ‘#’
- Can contain the names of other list files:
    # This is my list file
    embl:hsfau
    embl:ggg*
    myfile.seq:clone10
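How such a list file might be expanded can be sketched in Python. Several things here are assumptions rather than facts from the slides: the function name `expand_list_file` is made up, list files are held in a dictionary instead of being read from disk, and a leading `@` is used to mark a nested list file (the slides say nesting is possible but do not show the syntax).

```python
from typing import Dict, List, Tuple


def expand_list_file(name: str, files: Dict[str, str],
                     _active: Tuple[str, ...] = ()) -> List[str]:
    """Flatten a list file into the sequence addresses it names.

    `files` maps a list-file name to its text. Lines starting with '#'
    are comments; a leading '@' (an assumption of this sketch) marks
    a nested list file.
    """
    if name in _active:
        raise ValueError(f"list file {name!r} includes itself")

    entries: List[str] = []
    for raw in files[name].splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("@"):
            # Recurse into the nested list file, tracking the chain
            # of open list files to catch circular references.
            entries.extend(expand_list_file(line[1:], files, _active + (name,)))
        else:
            entries.append(line)
    return entries
```

Wildcard entries such as `embl:ggg*` are passed through unchanged; resolving them against a database is outside the scope of the sketch.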

File formats
- GCG
  - only GCG format, MSF and RSF
- EMBOSS
  - many formats
  - automatically recognised
  - can specify using ‘::’ or ‘-osf’
  - e.g. clustal::globin.aln -osf gcg
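The `format::file` notation can be pictured as an optional prefix on the sequence address. A minimal Python sketch, with the hypothetical helper name `split_format`, separates the two parts; it does not check that the format name is one EMBOSS knows.

```python
from typing import Optional, Tuple


def split_format(address: str) -> Tuple[Optional[str], str]:
    """Split 'format::rest' into (format, rest).

    If no '::' is present the format is None, which corresponds to
    leaving the format to be recognised automatically.
    """
    fmt, sep, rest = address.partition("::")
    if sep:
        return fmt, rest
    return None, address
```

For the slide's example, `split_format("clustal::globin.aln")` gives the format `clustal` and the file `globin.aln`, while a plain address such as `embl:hsfau1` comes back with no format.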

One file, many sequences
- GCG
  - Only one sequence per GCG file
- EMBOSS
  - One or more sequences per file
  - Default is to write all sequences to one file
  - -ossingle will change to writing many files
  - GCG, Staden and plain format files can only hold one sequence per file

Features
- GCG
  - No concept of feature tables
- EMBOSS
  - Many programs now write out results as GFF
  - Soon, all programs that find things will write the results as GFF
  - GFF will become another sequence format
  - Programs to manipulate and display sets of features are planned
  - cf. showfeat, coderet, maskfeat, diffseq

Databases
- EMBOSS is poor at grouping many databases under one name
- E.g. it needs a way of referring to ‘embl’ and ‘emblnew’ as one database
- This will be done, but currently a list file containing the following seems best:
    embl:*
    emblnew:*

Command line wildcards
- GCG:
  - embl:* is no problem
- EMBOSS:
  - embl:* makes the UNIX shell complain that it can’t find the files
  - the solution is to quote it:
      "embl:*"
  - or to escape the wildcard:
      embl:\*
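When the command line is assembled by a script rather than typed, the quoting can be applied automatically. The sketch below uses Python's standard `shlex.quote`, which targets POSIX shells; `shell_command` is a hypothetical helper and `name` is the placeholder program from the earlier slides.

```python
import shlex


def shell_command(program: str, *args: str) -> str:
    """Join a program and its arguments into one shell-safe command line.

    Arguments containing shell metacharacters such as '*' are wrapped in
    single quotes, so the shell passes them to the program unexpanded.
    """
    return " ".join(shlex.quote(part) for part in (program, *args))


print(shell_command("name", "embl:*", "-auto"))
# prints: name 'embl:*' -auto
```

An alternative that sidesteps quoting altogether is to pass the arguments as a list to `subprocess.run` without `shell=True`, since no shell then sees the wildcard.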

HELP
- GCG:
  - genman, genhelp
- EMBOSS:
  - tfm

What program does what?
- See David Martin’s list of equivalences:
- NB this doesn’t list EMBOSS programs with no equivalent in GCG!

What EMBOSS does NOT do
- The major deficiencies in the EMBOSS package are:
- BLAST, FASTA, ASSEMBLY
- You should use the publicly available software:
  - Blast: NCBI, HGMP, many other sites
  - Fasta: HGMP
  - Assembly: Staden package

What EMBOSS does do
- Giving ‘stdout’ as the output file name makes output go to the screen.
- Much effort is put into removing arbitrary limits.
  - E.g. max. sequence length: 2Gb
  - Many programs limited only by available memory
- Source code is available for inspection, for changing, and for writing your own programs
- EMBOSS is FREE!
  - GNU General Public Licence (GPL)
  - Open Source Software

THE END