Published by Roberta Curtis. Modified over 9 years ago.
2
MERG Contents
1. Bioportal
   A) Registration.
   B) Managing projects, files, and jobs.
   C) Submitting / checking jobs.
2. AIR (Appender, Identifier, and Remover)
3. PhyloSity
3
If you do not have a Bioportal account:
Step 1: Feide login using your UIO email address (instant access / registration).
Step 2: Send an email to bioportal-drift@usit.uio.no with the subject "bpcourse access".
------------------------------------------------------------------
If you already have a Bioportal account: proceed with Step 2 only.
4
Bioportal
Bioportal is a web-based bioinformatics service at the University of Oslo (http://www.bioportal.uio.no/).
590 CPUs in total are connected to the Bioportal.
Additional access to over 4000 CPUs on the TITAN cluster.
Continual software upgrades.
Total number of jobs in 2009: 17 515 (>1 500 000 CPU hours).
To date: 15 332 jobs already.
5
Applications available on Bioportal
Phylogenetic analysis: MrBayes, PAUP, PhyML, Phylobayes, Garli, PAML, Modeltest/ProtTest, RAxML, PHASE, POY, Treefinder, BEAST, AIR
Bioinformatics applications: Blast, MAFFT, PhyloSity, Newbler, Pfam, PhredPhrap, AUTODOCK4, Adscreening, Preassemble, Transeq
Population genetics: FAMHAP, LAMARC, NPMLE, STRUCTURE, PHASE, UNPHASED, SIMWALK2, PSCL
Chemistry / Statistical applications: DALTON, DIRAC, GAUSSIAN, Meltprofile
6
1. Create/manage project
2. Upload files
3. Select files and application
4. Check status of submitted jobs
8
AIR (Appender, Identifier, and Remover)
Appender: merges several single-gene alignments into one multi-gene alignment.
Identifier: identifies fast-evolving sites.
Remover: removes fast-evolving sites.
9
File 1
>Human
atgcatgcatgcatgc
>Rat
atgcatgcatgcatgc
>Cow
atgcatgcatgcatgc
>Horse
atgcatgcatgcatgc
>Dog
atgcatgcatgcatgc

File 2
>Human
ATGCATGCATGCATGC
>Rat
ATGCATGCATGCATGC
>Cow
ATGCATGCATGCATGC
>Horse
ATGCATGCATGCATGC
>Dog
ATGCATGCATGCATGC

File n
>Human
atgcatgcatgcatgc
>Rat
atgcatgcatgcatgc
>Cow
atgcatgcatgcatgc
>Dog
atgcatgcatgcatgc

Merged alignment:
>Human
atgcatgcatgcatgc--ATGCATGCATGCATGC--atgcatgcatgcatgc
>Rat
atgcatgcatgcatgc--ATGCATGCATGCATGC--atgcatgcatgcatgc
>Cow
atgcatgcatgcatgc--ATGCATGCATGCATGC--atgcatgcatgcatgc
>Horse
atgcatgcatgcatgc--ATGCATGCATGCATGC--??????????????
>Dog
atgcatgcatgcatgc--ATGCATGCATGCATGC--atgcatgcatgcatgc
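The merge shown on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the AIR-Appender code: the function names `parse_fasta` and `append_alignments` are hypothetical, the `--` separators drawn between genes on the slide are not reproduced, and a taxon missing from one file is padded with `?` over the full width of that gene.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict {taxon name: sequence}, keeping input order."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def append_alignments(alignments, missing="?"):
    """Concatenate per-gene alignments; pad taxa absent from a gene with `missing`."""
    # Collect every taxon seen in any file, in order of first appearance.
    taxa = []
    for aln in alignments:
        for name in aln:
            if name not in taxa:
                taxa.append(name)

    merged = {name: "" for name in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for name in taxa:
            merged[name] += aln.get(name, missing * width)
    return merged


# Toy input: Horse is present in the first gene only.
file_1 = parse_fasta(">Human\natgc\n>Horse\natgc\n")
file_n = parse_fasta(">Human\nATGC\n")
merged = append_alignments([file_1, file_n])
```

With this toy input, `merged["Horse"]` ends in four `?` characters, mirroring the Horse row of the merged alignment above.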
10
Site rate categories: slowest = 1, fastest = 8.
PAML (by Ziheng Yang): http://abacus.gene.ucl.ac.uk/software/
11
AIR-Remover: web interface on Bioportal
Generates a ready-to-use alignment file after removing the fast-evolving sites.
Generates an output file with statistics about the sites in the rates files.
Generates a colored alignment file displaying the removed sites in red.
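As a rough illustration of what removing fast-evolving sites involves, the sketch below drops alignment columns whose rate category is at or above a cutoff. It assumes the per-site categories (1 = slowest, 8 = fastest) are already available as a list of integers; the function name `remove_fast_sites`, the `remove_from` parameter, and the input layout are hypothetical and do not describe the actual AIR-Remover file formats.

```python
def remove_fast_sites(alignment, site_categories, remove_from=8):
    """Drop columns whose rate category is >= remove_from.

    alignment:       dict {taxon: aligned sequence}
    site_categories: one integer category per alignment column
    Returns (trimmed alignment, list of removed column indices).
    """
    width = len(site_categories)
    for name, seq in alignment.items():
        if len(seq) != width:
            raise ValueError(f"{name}: length {len(seq)} does not match {width} categories")

    keep = [i for i, cat in enumerate(site_categories) if cat < remove_from]
    removed = [i for i, cat in enumerate(site_categories) if cat >= remove_from]
    trimmed = {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
    return trimmed, removed


# Toy example: remove the two fastest categories (7 and 8).
alignment = {"Human": "ACGTAC", "Rat": "ACGAAC"}
categories = [1, 2, 8, 7, 1, 3]
trimmed, removed = remove_fast_sites(alignment, categories, remove_from=7)
```

The list of removed column indices is what a tool would need in order to report statistics or highlight the removed sites in a colored alignment.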
12
135 genes, 65 taxa
13
PhyloSity - An online pipeline for processing and clustering 454 amplicon reads into OTUs, followed by taxonomic annotation
14
Test Dataset download link: https://www.bioportal.uio.no/onlinemat/online_material.php
15
The pipeline includes three steps:
1. Filtering low-quality sequence reads, followed by trimming the undesired segments.
2. Clustering sequence reads into Operational Taxonomic Units (OTUs).
3. Taxonomic annotation of sequences / OTUs using BLAST.
17
Input files
1. Raw or processed 454 sequence data (.zip)
18
>FVCIMFP01AS1H7 length=251
AACAACGC……………………..
>FVCIMFP01ATTT6 length=201
AACAACGC……………………..
>FVCIMFP01APYQN length=227
TCACTCGC……………………..
>FVCIMFP01AS1R7 length=281
AACAACGC……………………..
>FVCIMFP01ATTS6 length=281
AACAACGC……………………..
>FVCIMFP01APYAN length=247
TCACTCGC……………………..
>FVCIMFP01AS15I length=281
AACAACGC……………………..
>FVCIMFP01ATTG2 length=281
AACAACGC……………………..
>FVCIMFP01APHGR length=247
TCACTCGC……………………..

>S1|FVCIMFP01AS1H7 length=251
AACAACGC……………………..
>S1|FVCIMFP01ATTT6 length=201
AACAACGC……………………..
>S1|FVCIMFP01APYQN length=227
TCACTCGC……………………..
>S2|FVCIMFP01AS1R7 length=281
AACAACGC……………………..
>S2|FVCIMFP01ATTS6 length=281
AACAACGC……………………..
>S2|FVCIMFP01APYAN length=247
TCACTCGC……………………..
>S3|FVCIMFP01AS15I length=281
AACAACGC……………………..
>S3|FVCIMFP01ATTG2 length=281
AACAACGC……………………..
>S3|FVCIMFP01APHGR length=247
TCACTCGC……………………..

2. METADATA list (.txt)
19
3. TPA file (.txt)
AACAAC
AACCGA
GGCTAC
TTCTCG
GCTGCGTTCTTCATCGATGC
CCTTGTTACGACTTTTACTTCC
CTGATGGCGCGAGGGAGGC
20
Filtering and Trimming
T - Sequences with incorrect tags
P - Sequences with non-matching primers
C - Sequences with non-compatible tags
N - Sequences with Ns
L - Sequences with length < user-specified value (e.g. 150)
H - Collapse homopolymers
D - Identical sequences
G - Trim tags and/or primers from the sequences
A - Trim adaptor sequences
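A few of these checks (N, L, D, and H) are simple enough to sketch in Python. This is a minimal sketch under stated assumptions, not the PhyloSity implementation: the function names are hypothetical, duplicates are simply rejected here (the pipeline may instead keep a count), and "collapse homopolymers" is read as reducing each run of identical bases to a single base.

```python
from itertools import groupby


def collapse_homopolymers(seq):
    """Reduce every run of identical bases to one base (assumed meaning of H)."""
    return "".join(base for base, _run in groupby(seq))


def filter_reads(reads, min_length=150):
    """Apply the N, L and D checks to (name, sequence) pairs.

    Returns (accepted, rejected), where rejected holds (name, code) pairs
    using the one-letter codes from the slide.
    """
    accepted, rejected = [], []
    seen = set()
    for name, seq in reads:
        seq = seq.upper()
        if "N" in seq:
            rejected.append((name, "N"))      # ambiguous base present
        elif len(seq) < min_length:
            rejected.append((name, "L"))      # shorter than the cutoff
        elif seq in seen:
            rejected.append((name, "D"))      # identical to an earlier read
        else:
            seen.add(seq)
            accepted.append((name, seq))
    return accepted, rejected


# Toy reads with a short cutoff so the example stays readable.
reads = [("r1", "ACGTACGT"), ("r2", "ACGNACGT"), ("r3", "ACG"), ("r4", "acgtacgt")]
accepted, rejected = filter_reads(reads, min_length=5)
```

The tag, primer, and adaptor checks (T, P, C, G, A) depend on the contents of the TPA file and are left out of this sketch.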
21
>S3|FVCIMFP10F76XJ|308
Accepted / Rejected
>S3|FVCIMFP10F76XJ|308|T-TTCTCG
>S3|FVCIMFP10F76XJ|308|T-TTCTCG|FPY
>S3|FVCIMFP10F76XJ|308|T-TTCTCG|FPY|RPY
>S3|FVCIMFP10F76XJ|308|T-TTCTCG|FPY|RPY|rTY-CGAGAA
>S3|FVCIMFP10F76XJ|308|T-TTCTCG|FPY|RPY|rTY-CGAGAA|AR
>S3|FVCIMFP10F76XJ|308|T-TTCTCG|FPY|RPY|rTY-CGAGAA|AR|GB|L=287|D_2|HP_1_287:286
Trim tags, Length, Duplicates, Homopolymers
22
Clustering - BLASTCLUST
Clustering by a single-linkage method.
The program begins with pairwise matches and places a sequence in a cluster if the sequence matches at least one sequence already in the cluster.
BLASTCLUST uses the megablast algorithm for DNA sequences and blastp for protein sequences.
The longest sequence is the representative sequence of each cluster.
ftp://ftp.ncbi.nih.gov/blast/executables/release/2.2.24/
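The single-linkage rule can be illustrated with a small union-find sketch. It is not BLASTCLUST: the pairwise score below is a naive position-by-position identity standing in for megablast/blastp matches, and the names `identity` and `single_linkage` as well as the threshold value are assumptions made for the example.

```python
def identity(a, b):
    """Naive stand-in similarity: fraction of matching positions over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / longest


def single_linkage(seqs, threshold=0.9, similarity=identity):
    """Group sequences so that each one matches at least one other member of its cluster."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if similarity(seqs[i], seqs[j]) >= threshold:
                parent[find(j)] = find(i)   # one match is enough to merge clusters

    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)
    # Longest member first, so clusters[k][0] is the representative.
    return [sorted(members, key=len, reverse=True) for members in clusters.values()]


clusters = single_linkage(["AAAA", "AAAT", "AATT", "GGGG"], threshold=0.75)
```

In this toy run, `AAAA` and `AATT` are only 50% identical to each other, yet both end up in the same cluster because each matches `AAAT`. This chaining effect is characteristic of single-linkage clustering.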
23
Clustering - CD-HIT
Fast greedy incremental clustering process.
Sequences are first sorted in order of decreasing length.
The longest one becomes the representative of the first cluster.
Then, each remaining sequence is compared to the representatives of existing clusters.
24
Clustering - CD-HIT
If the similarity of a sequence with any representative is above a given threshold, the sequence is grouped into that cluster.
Otherwise, a new cluster is defined with that sequence as its representative.
Download link: http://www.bioinformatics.org/cd-hit/
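The greedy incremental procedure described on these two slides can be sketched as follows. This is a toy illustration rather than CD-HIT itself: the similarity function is a naive position-by-position identity, the comparison is "at or above" the threshold, and the names `identity` and `greedy_incremental` are hypothetical.

```python
def identity(a, b):
    """Naive stand-in similarity: fraction of matching positions over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / longest


def greedy_incremental(seqs, threshold=0.9, similarity=identity):
    """Sort by decreasing length, then assign each sequence to the first
    representative it matches, or start a new cluster with it."""
    clusters = []  # each cluster: {"representative": str, "members": [str, ...]}
    for seq in sorted(seqs, key=len, reverse=True):
        for cluster in clusters:
            if similarity(seq, cluster["representative"]) >= threshold:
                cluster["members"].append(seq)
                break
        else:
            # No representative was similar enough: seq founds a new cluster.
            clusters.append({"representative": seq, "members": [seq]})
    return clusters


clusters = greedy_incremental(["AAAA", "AAAT", "AATT", "GGGG"], threshold=0.75)
representatives = [c["representative"] for c in clusters]
```

Because each sequence is compared only to cluster representatives, `AATT` starts its own cluster here even though it is 75% identical to `AAAT`, which is a member (not the representative) of the first cluster. A single-linkage method would have merged them.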
25
Blast search
BLASTN
Blast search parameters
NCBI-nr / custom database
Adding a custom database is not automated, but one can be made available within the pipeline on Bioportal.
Blast parsing options (Overlapping% & Identity%)
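Parsing hits by overlap and identity might look like the sketch below. It assumes BLAST tabular output with the standard twelve columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and defines overlap as the aligned query span divided by the query length. Both the input format and that definition are assumptions, since the slide does not say how PhyloSity computes "Overlapping%"; the function name, thresholds, and sample hits are made up for illustration.

```python
def filter_hits(tabular_text, query_lengths, min_identity=97.0, min_overlap=90.0):
    """Keep hits whose percent identity and query overlap both reach the cutoffs.

    tabular_text:  tab-separated BLAST hits, one per line (12 standard columns)
    query_lengths: dict {query id: full query length}; needed because the
                   standard columns do not include the query length
    """
    kept = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        # Assumed definition: share of the query covered by the alignment.
        overlap = 100.0 * (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if pident >= min_identity and overlap >= min_overlap:
            kept.append((qseqid, sseqid, pident, round(overlap, 1)))
    return kept


# Made-up hits for three OTU representatives.
hits = "\n".join([
    "OTU1\tsubjA\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t350",
    "OTU2\tsubjB\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-40\t180",
    "OTU3\tsubjC\t90.0\t220\t22\t0\t1\t220\t1\t220\t1e-60\t200",
])
kept = filter_hits(hits, {"OTU1": 200, "OTU2": 250, "OTU3": 220})
```

In this example only the OTU1 hit survives: OTU2 covers just 40% of its query and OTU3 falls below the identity cutoff.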
26
Test Dataset download link: https://www.bioportal.uio.no/onlinemat/online_material.php