CoLIMS progress. Computational Omics and Systems Biology (CompOmics) Group, Niels Hulstaert
outline: predecessor (ms-lims); database schema; architecture; status; in the pipeline; bumpy road; demo
ms-lims lifetime growth [chart; axis: millions of spectra]
ms-lims usage [diagram: MS or MS/MS analysis on Micromass Q-TOF I, Bruker Ultraflex, Bruker Esquire HCT, Agilent HPLC and Applied 4X00 yields spectra in formats A to D; spectra go into a MySQL DB; identification with Matrix Science Mascot; results interpretation by consumers 1 to 3]
time for an update: Mascot-centric; no MaxQuant support; database schema limitations; hard to maintain legacy code; memory issues; cyclic dependencies; minimalist GUI
ms-lims-X -> CoLIMS: take the good things (and start from scratch); rich client; straightforward installation; lightweight; PeptideShaker support; MaxQuant support; ProteomeXchange/PRIDE support; more mature database schema (unique protein sequences, unique modifications)
database schema
metadata
search input
identification results
quantification
user management
architecture
[architecture diagram] database server: colims DB; storage task server: ActiveMQ, storage engine (colims-core, colims-repository, colims-distributed, colims-model); in-house client: colims-client, colims-distributed, colims-core, colims-model, colims-repository
JMS and JMX: Java technologies; widely used and proven to be stable components in distributed architectures; loose coupling of clients and storage engine; sequential storing: unique protein and modification tables; transactional and retry mechanism
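The sequential storing and retry points above can be sketched in a few lines. This is a minimal illustration, not the CoLIMS implementation: a plain `BlockingQueue` stands in for the ActiveMQ queue, and `StorageTask`, `drain` and `flaky` are hypothetical names invented for this sketch. A single consumer handles one task at a time, so two tasks never race to insert the same protein or modification row, and a failed task is retried a bounded number of times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: one consumer drains a queue of storage tasks in order. */
class SequentialStorageSketch {

    /** A storage task; store() returns normally on success and throws on failure. */
    interface StorageTask {
        String name();
        void store() throws Exception;
    }

    /** Handles tasks one at a time, retrying each failed task up to maxAttempts times. */
    static List<String> drain(BlockingQueue<StorageTask> queue, int maxAttempts) {
        List<String> log = new ArrayList<>();
        StorageTask task;
        while ((task = queue.poll()) != null) {
            boolean stored = false;
            for (int attempt = 1; attempt <= maxAttempts && !stored; attempt++) {
                try {
                    // Assumption: a real engine would wrap this call in one DB transaction.
                    task.store();
                    stored = true;
                    log.add(task.name() + ":stored@" + attempt);
                } catch (Exception e) {
                    // Assumption: the transaction is rolled back here before the next attempt.
                }
            }
            if (!stored) {
                log.add(task.name() + ":failed");
            }
        }
        return log;
    }

    /** Test helper: a task that fails a fixed number of times before succeeding. */
    static StorageTask flaky(String name, int failures) {
        int[] remaining = {failures};
        return new StorageTask() {
            public String name() { return name; }
            public void store() throws Exception {
                if (remaining[0]-- > 0) {
                    throw new Exception("transient failure");
                }
            }
        };
    }

    public static void main(String[] args) {
        BlockingQueue<StorageTask> queue = new LinkedBlockingQueue<>();
        queue.add(flaky("run-1", 0));
        queue.add(flaky("run-2", 1));
        queue.add(flaky("run-3", 5));
        System.out.println(drain(queue, 3));
    }
}
```

Clients only ever add tasks to the queue and never touch the database directly, which is the loose coupling the slide refers to.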
quantification status: in progress: MaxQuant import functionality, need for validator; in the pipeline: Mascot quant support; first: mzTab support; later: mzQuantML support
supported search engines: MaxQuant; in the pipeline: native Mascot support; PeptideShaker: MS-GF+, OMSSA, X!Tandem, MS Amanda and Mascot
ProteomeXchange export: PRIDE XML; mzIdentML. PeptideShaker imported data in ProteomeXchange/PRIDE: 93 submissions, comprising spectra; 50 submissions are public, containing spectra; spectra on average per PeptideShaker project
in the pipeline: PeptideShaker-like data viewer; data query tool; native ProteomeXchange/PRIDE export (mzML, mzIdentML, mzTab); built-in distributed search architecture and identification interpretation (SearchGUI/PeptideShaker); improve client – storage task server interaction; replace ms-lims and import existing data; web interface; third party access
design bumps: ActiveMQ instead of in-house solution; various database schema changes; auditing issues; unique protein accession -> unique sequence
adapting to PeptideShaker: fast release cycles; PSI-MOD -> UNIMOD modifications (multi search engines); protein inference strategy (protein tree)
adapting to MaxQuant: no access to used FASTA; spectral matching across searches; black box
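The move from unique accession to unique sequence can be illustrated with a small in-memory stand-in for the protein table. This is a sketch under assumptions, not the CoLIMS schema: the class and method names are hypothetical, and the accessions and sequences are toy values. The sequence is the key, and every accession seen for that sequence is attached to the same entry.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for a protein table keyed on sequence. */
class UniqueProteinSketch {

    // sequence -> all accessions seen for that sequence, in insertion order
    private final Map<String, Set<String>> accessionsBySequence = new LinkedHashMap<>();

    /** Attaches an accession to a sequence; returns true if the sequence was not stored yet. */
    boolean register(String accession, String sequence) {
        boolean isNew = !accessionsBySequence.containsKey(sequence);
        accessionsBySequence
                .computeIfAbsent(sequence, s -> new LinkedHashSet<>())
                .add(accession);
        return isNew;
    }

    /** Number of distinct protein sequences stored. */
    int proteinCount() {
        return accessionsBySequence.size();
    }

    /** Accessions recorded for a sequence (empty if the sequence is unknown). */
    Set<String> accessionsFor(String sequence) {
        return accessionsBySequence.getOrDefault(sequence, new LinkedHashSet<>());
    }

    public static void main(String[] args) {
        UniqueProteinSketch proteins = new UniqueProteinSketch();
        // Toy data: two accessions from different sources share one sequence.
        proteins.register("ACC_A", "MKTAYIAK");
        proteins.register("ACC_B", "MKTAYIAK");
        proteins.register("ACC_C", "MSDNELQR");
        System.out.println(proteins.proteinCount());
        System.out.println(proteins.accessionsFor("MKTAYIAK"));
    }
}
```

Keying on the sequence means the same protein imported under different accessions ends up as one entry rather than several.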
DEMO