Making the Tree of Life Accessible for Research This is a 20-minute overview with links to screencasts and demos, providing an introduction to the project.

Presentation transcript:

Making the Tree of Life Accessible for Research
This is a 20-minute overview with links to screencasts and demos, providing an introduction to the project and to the upcoming 2nd hackathon (Jan 28 to Feb 1, 2013, Tucson, AZ). A project of the NESCent HIP (hackathons, interoperability, phylogenies) working group.

Latest version of this file: (ppt) or (PDF)

RE-USE OF TREES
Producers, Consumers (re-users), Repositories
“Most attempts at re-use seem to end in disappointment” [1]
[1] Stoltzfus et al., 2012, “Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis”,

USE CASE: LEAF VEIN EVOLUTION
R.L. Walls with Linnaeus
Input list from Walls, 2011:
  aextoxicaceae/aextoxicon/aextoxicon_puntatum
  anacardiaceae/anacardium/anacardium_excelum
  anacardiaceae/rhus/rhus_glabra
  annonaceae/dugetia/dugetia_furfuraceae
  ...
Phylomatic
APG framework with 1566 taxa
98-species tree of Walls, 2011 ?
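The input list on this slide follows a family/genus/genus_species pattern in lowercase. As a minimal sketch, assuming each taxon is available as a family name plus a binomial, such lines could be generated as follows. The function name `phylomatic_entry` and the sample pairs are illustrative only, not part of Phylomatic itself.

```python
def phylomatic_entry(family: str, binomial: str) -> str:
    """Format one taxon as family/genus/genus_species, all lowercase."""
    # Only the first two words of the name (genus, epithet) are used.
    genus, epithet = binomial.strip().split()[:2]
    return f"{family}/{genus}/{genus}_{epithet}".lower()


# Sample (family, binomial) pairs taken from the list on the slide.
taxa = [
    ("Anacardiaceae", "Rhus glabra"),
    ("Aextoxicaceae", "Aextoxicon puntatum"),
]
entries = [phylomatic_entry(family, name) for family, name in taxa]
print("\n".join(entries))
```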

THE “TREE OF LIFE” =
• Some big trees*
  - 4,500 mammal species
  - 55,473 angiosperm species
  - 1,827 angiosperm taxa
  - 800 fish families
  - 16,000 taxa in ToLWeb
  - 73,060 eukaryotic species
  - 400,000 prokaryotic 16S rDNAs
  - 250,000 species NCBI taxonomy
• And other trees not listed
* Proper phylogenies as well as phylogeny-based taxonomic hierarchies

ARCHITECTURE OVERVIEW
[Diagram. Recoverable labels:]
• User, Controller, Phylotastic
• Operations: Rectify names (TNRS), Find matching trees, Graft missing taxa, Prune extra taxa, Translate formats, Get branch lengths
• Data: NameBanks, Source trees, Calibrations
• Example table: Species1, Species2, Species3 by condition1, condition2
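To make the "prune extra taxa" operation concrete, here is a minimal sketch under simplifying assumptions: the tree is held as nested Python tuples rather than parsed from a file, branch lengths are ignored, and the five-species topology is a toy example, not one of the project's source trees. The names `prune` and `to_newick` are hypothetical and do not refer to any Phylotastic component.

```python
from typing import Optional, Union

Tree = Union[str, tuple]  # a leaf name, or a tuple of child subtrees


def prune(tree: Tree, keep: set) -> Optional[Tree]:
    """Return the subtree induced by the leaves in `keep`, or None if none remain."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [p for p in (prune(child, keep) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # collapse an internal node left with a single child
    return tuple(kept)


def to_newick(tree: Tree) -> str:
    """Serialize a nested-tuple tree as Newick text (no trailing semicolon)."""
    if isinstance(tree, str):
        return tree.replace(" ", "_")
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Toy source tree (illustrative topology only).
source = (
    (("Felis silvestris", "Panthera leo"), "Canis lupus"),
    ("Cavia porcellus", "Mus musculus"),
)
wanted = {"Felis silvestris", "Canis lupus", "Cavia porcellus"}
pruned = prune(source, wanted)
print(to_newick(pruned) + ";")
```

Collapsing single-child nodes keeps the pruned tree free of redundant internal nodes. A real pruner would also need to sum branch lengths along collapsed paths.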

PHYLOTASTIC
• Phy·lo·tas·tic /fī lō ˈtăs tĭk/
  1. Adjective: providing computable, convenient and credible access to expert knowledge of the phylogeny of species
  2. Noun: an open-source project of HIP* to prototype and disseminate a distributed, web-services-based phylotastic system
• Synonyms: ToL-o-matic
• Web home:
* Hackathons, Interoperability, Phylogenies, a NESCent working group

HACKATHON #1, JUNE 4 TO NESCENT
• Teams:
  - TNRS - taxonomic name resolution
  - TreeStore - triple store with REST API
  - Architecture - controllers, interfaces, pruners
  - Branch lengths - scaling trees using chronograms
  - Shiny - other demos and cool front-end stuff
• 30 participants
• high diversity
• 2 remote sites

PHYLOTASTIC.ORG
It’s all open source
Screencasts & live demonstrations

SCREENCAST: SCRIPTABLE PRUNER, WEB FORM
• YouTube video (3 min)
• Web form invokes URL API, like this:
  wg.nescent.org/script/phylotastic.cgi?species=Felis+silvestris%2C+Canis+lupus%2C+Cavia+porcellus&tree=mammals&format=newick
• So, you can run it with curl
• Or with a simple Perl script:

    #!/usr/bin/perl -w
    # Base URL of the pruner CGI script (the value is missing from this transcript)
    my $base = "";
    # Tree name and comma-separated species list; reading them from the
    # command line is an assumption, as the assignment is missing here
    my ( $tree, $taxa ) = @ARGV;
    # Encode spaces and underscores as "+" and commas as "%2C"
    $taxa =~ s/[ _]/+/g;
    $taxa =~ s/,/%2C/g;
    # Fetch the pruned tree in Newick format, save it, and open it
    system( "curl \"$base?species=$taxa&tree=$tree&format=newick\" > out.tre; open out.tre" );
    exit;

Rutger Vos
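The same query string can be built with a standard URL-encoding routine instead of hand-written substitutions. The sketch below uses Python's `urllib.parse.urlencode`, which encodes spaces as `+` and commas as `%2C`, matching the example URL on the slide. `BASE_URL` is a placeholder (the full host name is truncated in this transcript), and `pruner_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder host: the real base URL is truncated in the transcript.
BASE_URL = "http://example.org/script/phylotastic.cgi"


def pruner_url(species, tree="mammals", fmt="newick", base=BASE_URL):
    """Build a pruner request URL from a list of species names."""
    query = urlencode(
        {"species": ", ".join(species), "tree": tree, "format": fmt}
    )
    return f"{base}?{query}"


url = pruner_url(["Felis silvestris", "Canis lupus", "Cavia porcellus"])
print(url)
```

The resulting URL could then be fetched with curl, as the slide suggests, or with any HTTP client.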

SCREENCAST: MESQUITE-O-TASTIC
• YouTube screencast (3 min)
• Installable Mesquite module is here:
Peter Midford (NESCent), Arlin Stoltzfus (NIST)

RECONCILIOTASTIC
• Reconcile-tree problem
  - Very common use-case
  - Inputs are gene tree, species tree
  - Gene tree: easy to get
  - Species tree: hard to get
• Approach (see the Reconciliotastic demo)
  - Load gene tree (with NCBI identifiers embedded in labels)
  - Compute species list
    - Extract identifiers from labels
    - Map IDs to species sources via NCBI web service
  - Get species tree phylotastically
  - Reconcile gene tree and species tree using Zmasek’s SDI library
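The "compute species list" step above can be sketched roughly as follows. Two things here are assumptions rather than details from the slides: the label format (a gene name followed by an underscore and a numeric NCBI taxonomy identifier), and the use of a small in-memory dictionary as a stand-in for the NCBI web service lookup. The names `taxon_ids` and `species_list` are illustrative.

```python
import re

# Assumed label format: <gene name>_<NCBI taxonomy id>, e.g. "geneA_9606".
ID_PATTERN = re.compile(r"_(\d+)$")


def taxon_ids(labels):
    """Extract unique taxonomy identifiers, in order of first appearance."""
    ids = []
    for label in labels:
        match = ID_PATTERN.search(label)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


def species_list(labels, lookup):
    """Map extracted identifiers to species names, skipping unknown ids."""
    return [lookup[i] for i in taxon_ids(labels) if i in lookup]


# Stand-in for the NCBI web service: a local id-to-name table.
lookup = {
    "9606": "Homo sapiens",
    "10090": "Mus musculus",
    "9615": "Canis lupus familiaris",
}
labels = ["geneA_9606", "geneB_9606", "geneA_10090", "geneA_9615"]
print(species_list(labels, lookup))
```

The resulting species list is what would be passed on to obtain a species tree for reconciliation.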

ROLE OF TNRS IN PHYLOTASTIC (BRIEF)
[Diagram based on Riek, 2011 (Mammalian Biology 76(1):3-11). Recoverable labels:]
• Auto-extract species names from text
• Manually key in species list from tree image
• Copy & paste species named in Table 1
• Phylotastic
• Species counts: 40 species; 36 species + 2 extras*; 36 species; 40 species; 33 species
• Times: 5 minutes; <1 minute; 12 minutes; hours or days
* named in text but not used in phylogenetic analysis

ROLE OF TNRS: MORE DETAIL
• Screencast (7 min)
  - Riek, 2011 (case study)
  - Cool demo: PDF → auto-extracted names → tree
• What Taxonomic Name Resolvers do
• What the phylotastic TNRS team did
• Using the Taxosaurus URL API
  phalophus+monticola
TNRS team: Naim Matasci (iPlant), Gaurav Vaidya (U. Colorado), Siavash Mirarab (UT Austin)

THE OTHER KIND OF DATING WITH FOSSILS
r8s, pathd8, Multidivtime
Calibrating a tree using fossil timepoints

DATELIFE
[Diagram. Recoverable labels:]
• PHP
• 11 studies, >4,000 trees, 6,973 taxa, 620,868 leaves
• DateLife engine (R, FastRWeb, Rserve)

CURRENT STATUS - WYSIWYG
There are some holes
We haven’t put the pieces together yet
The interfaces are unstable
Branches could shift without warning
You might crash

WHAT’S NEXT?
• Phylotastic hackathon #2 (Jan 2013, AZ)
  - Themes
    - Integration – get components to work together
    - Use-cases – give users what they want
    - More Shiny Stuff – make it look good
    - Your idea here
  - To apply
• More partners & sponsors

ACKNOWLEDGEMENTS
• Send feedback to Arlin Stoltzfus
HIP Leadership Team
Participants
Sponsors