1
Biopackages.net: Operating System Packages for Bioinformatics. Allen Day, 2005.05.17
2
What is a package? Software, config files, documentation, and/or data encapsulated in a single file, plus metadata describing: version, license, and package “category”; dependencies; what the package provides.
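The metadata fields listed above can be pictured as a small record. This is only a sketch: the class name, the field names `requires` and `provides`, and every example value are assumptions made for illustration, not the contents of a real Biopackages RPM.

```python
from dataclasses import dataclass, field


@dataclass
class PackageMetadata:
    """Metadata fields named on the slide; all values used below are illustrative."""
    name: str
    version: str
    license: str
    category: str
    requires: list = field(default_factory=list)  # dependencies
    provides: list = field(default_factory=list)  # what the package provides

    def satisfies(self, capability: str) -> bool:
        # A dependency is met if it names this package directly
        # or appears among the capabilities the package provides.
        return capability == self.name or capability in self.provides


# Hypothetical example values (assumed, not taken from an actual package).
bioperl = PackageMetadata(
    name="perl-bioperl",
    version="0.0.0",
    license="unknown",
    category="Libraries",
    requires=["perl"],
    provides=["perl(Bio::SeqIO)"],
)
```

A package manager would compare the `requires` entries of one package against the names and `provides` entries of others, which is what `satisfies` stands in for here.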
3
GMOD target audience: small MODs (model organism databases)
4
Package Dependency Graph (dependencies; what the package provides): chado, chado-Hsa, genome-Hsa-nib, ucsc-blat, genome-Hsa-annotation-affymetrix, genome-Hsa-annotation-gene, postgresql-AffxSeq, postgresql-server, perl-bioperl, obo-core, perl-go-perl
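A dependency graph like this one determines a valid installation order. The sketch below uses package names from the slide, but the edges between them are assumed purely for illustration, because the arrows of the original diagram are not recoverable from the text. It relies on `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Package names come from the slide; the edges are ASSUMED for illustration.
# Each key maps to the set of packages it requires.
requires = {
    "chado-Hsa": {"chado", "genome-Hsa-annotation-gene"},
    "chado": {"postgresql-server", "perl-bioperl", "perl-go-perl", "obo-core"},
    "genome-Hsa-annotation-gene": {"genome-Hsa-nib"},
    "perl-go-perl": {"perl-bioperl"},
}


def install_order(graph):
    """Return packages ordered so each one follows everything it requires."""
    return list(TopologicalSorter(graph).static_order())


order = install_order(requires)
print(order)
```

Several orders can be valid for the same graph, so the exact list printed may vary; the only guarantee is that a package never appears before its requirements.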
5
Dependencies: build dependency; installation dependency
6
What is a Package Manager? Tools to manage installation, upgrade, and uninstallation of packages: verify package integrity (checksums); maintain system integrity (transactional, allow rollbacks); dependency checking (dependency graph recursion); allow software customization (patches).
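Two of these duties, checksum verification and all-or-nothing (transactional) installation, can be sketched in a few lines. This is a toy model, not how RPM or yum are implemented: the function names are hypothetical, SHA-256 is simply a convenient choice for the example, and the "system" is just a set of package names.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest of a package payload."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hex: str) -> bool:
    """True if the payload matches its recorded checksum."""
    return sha256_of(data) == expected_hex


def install_all(installed: set, packages: dict, checksums: dict) -> set:
    """All-or-nothing install.

    Works on a staged copy, so a failure part-way through leaves the
    caller's `installed` set untouched (a stand-in for a rollback).
    """
    staged = set(installed)
    for name, payload in packages.items():
        if not verify(payload, checksums[name]):
            raise ValueError(f"checksum mismatch for {name}")
        staged.add(name)
    return staged
```

Staging the change and committing only on success is the simplest way to get rollback behaviour; a real package manager also has to undo file-system changes and run uninstall hooks.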
7
Current Generation of PMs: RPM, Dpkg, Apt, Yum, Emerge, tgz/bz2, Windows Installer
8
Why bioinformatics packages? Consistency of installation process (bioinformatics package installs vary wildly and commonly lack documentation). Automatic dependency installation (Perl modules are especially bad: bioperl has 60+ modules in its dependency tree). Integrity/auditing of system state (know that an installed package works, which version it is, and how to replicate the system setup). Tighter integration with the operating system (daemons, config & log file locations, etc.).
9
What’s available? RPM packages only right now. Primary focus on Fedora Core 2; some RPMs also available for Fedora Core 3, Red Hat 9, and Cygwin.
10
What’s available? Three primary foci: applications, libraries, data sets
11
Applications: Gbrowse, Textpresso, BLAT daemon, NCBI Toolkit (BLAST, etc.), HMMer
12
What’s available? Libraries: Bioperl, R & Bioconductor, Squid, EMBOSS
13
What’s available? Data sets: genome & protein sequence; sequence features; ontologies. All installed using a common directory structure.
14
What’s available? UCSC tools (utilities, BLAT system service, CGI scripts); Bioperl; R / Bioconductor; GMOD apps (Gbrowse, Textpresso, …); data packages: genome sequence (fa, nib, blastdb) and genome features (Affy probeset alignments, mRNA, etc.)
15
GMOD Components Available: chado-Hsa, gbrowse, textpresso, gmod-web-Hsa, turnkey, chado, das2-Hsa, apollo-Hsa, cmap-Hsa, ucsc-BLAT, genome-Hsa-nib. Replace ‘Hsa’ with the code for your organism; packages are currently built for ‘Cel’, ‘Hsa’, and ‘Sce’.
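The organism-code naming convention can be illustrated with a small helper. The function name is hypothetical, and whether every resulting name exists as a published package is an assumption; only the three organism codes and the ‘Hsa’ package names come from the slide.

```python
ORGANISMS = ("Cel", "Hsa", "Sce")  # organism codes listed on the slide


def for_organism(package: str, organism: str) -> str:
    """Swap the 'Hsa' component of a package name for another organism code."""
    if organism not in ORGANISMS:
        raise ValueError(f"no packages built for organism code {organism!r}")
    # Replace only a whole hyphen-separated component, never a substring.
    return "-".join(
        organism if part == "Hsa" else part for part in package.split("-")
    )


print(for_organism("genome-Hsa-nib", "Sce"))  # genome-Sce-nib
print(for_organism("chado-Hsa", "Cel"))       # chado-Cel
```

Names without an organism component, such as `chado`, pass through unchanged.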
16
More details… chado, chado-Hsa, genome-Hsa-nib, ucsc-blat, perl-go-perl, genome-Hsa-annotation-affymetrix, genome-Hsa-annotation-gene, postgresql-AffxSeq, postgresql-server, perl-bioperl, ……………
17
Gene Expression Components: chado-Hsa; Bioconductor; R; Quant/Norm Pipeline; chado-GEC; DAS/2 for Genotyping, GeneChip
18
Resources: http://www.biopackages.net (~1000 RPMs for Fedora Core 2 and 3, available via yum; see the site for a configuration example)
19
TODO: Support more architectures (build for Cygwin & OS X; RPM has been ported to both). Automate the package build process (build farm of multiple architectures, controllable via a scheduler (GridEngine)). Automate (if possible) inclusion of new software / data releases.
20
TODO: Build community interest and involvement. Keep adding more packages! Keep existing packages current!
21
Acknowledgements: Patrick Alger, Jared Fox, Brian O’Connor, Todd Harris, Lincoln Stein, Stanley Nelson
22
Anatomy of a specfile. Metadata: Name, Depends (written as Requires / BuildRequires tags in an RPM spec), Provides, Changelog. Build & install script hooks: %prep, %build, %install, %post, %preun.
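To make the two halves of a specfile concrete, the sketch below splits a spec into its metadata tags and its script sections. Everything in the sample text is hypothetical (package name, version, commands, packager), it is deliberately incomplete (a buildable spec needs further tags and sections), and the parser is a simplification that only recognises the section names listed here. Note that RPM spells the dependency tag `Requires`.

```python
# Hypothetical, incomplete spec text used only to exercise the parser.
SPEC = """\
Name: example-tool
Version: 1.0
Requires: perl-bioperl
Provides: example-tool-api

%prep
%setup -q

%build
make

%install
make install DESTDIR=%{buildroot}

%post
/sbin/ldconfig

%changelog
* Tue May 17 2005 Example Packager <packager@example.org> 1.0-1
- Initial build
"""

# Section headers this sketch recognises; other %-prefixed lines
# (such as the %setup macro) are treated as section body text.
SECTIONS = {"%prep", "%build", "%install", "%post", "%preun", "%changelog"}


def parse_spec(text):
    """Split spec text into ({tag: [values]}, {section: [lines]})."""
    tags, sections = {}, {}
    current = None
    for line in text.splitlines():
        words = line.split()
        if words and words[0] in SECTIONS:
            current = words[0]
            sections[current] = []
        elif current is None:
            # Still in the metadata preamble: "Tag: value" lines.
            if ":" in line:
                key, value = line.split(":", 1)
                tags.setdefault(key.strip(), []).append(value.strip())
        elif line.strip():
            sections[current].append(line)
    return tags, sections


tags, sections = parse_spec(SPEC)
print(tags["Requires"], list(sections))
```

The preamble carries what the package is and what it needs; the `%`-sections carry what to run at build time (`%prep`, `%build`, `%install`) and on the target machine (`%post`, `%preun`).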