OpenCRAVAT
Motivation Protected data: local install; compute limited, so use precomputed data. Modular analysis: get only the annotations you need; saves on install size and runtime. Portable, visual results.
Pipeline $ cravat input.vcf [Diagram labels: Input, Mapping, annotator_1, Output DB]
Managing Annotators Analysis Annotator 1 Annotator 2 Annotator 3 Server Annotator 1 Annotator 2 Annotator 3 User Machine Annotator 1 Annotator 2
Demo Install Run a simple job Web interface Install annotators Submit a job on web Use the web viewer Writing your own annotator
Installing pip package, two commands: pip install open-cravat, then cravat-admin install-base
PS C:\Users\kylem> pip install open-cravat
Collecting open-cravat
Requirement already satisfied: requests in c:\python36\lib\site-packages (from open-cravat) (2.19.0)
Requirement already satisfied: pyliftover in c:\python36\lib\site-packages (from open-cravat) (0.3)
Requirement already satisfied: requests-toolbelt in c:\python36\lib\site-packages (from open-cravat) (0.8.0)
Requirement already satisfied: markdown in c:\python36\lib\site-packages (from open-cravat) (2.6.11)
Requirement already satisfied: pyyaml in c:\python36\lib\site-packages (from open-cravat) (3.12)
Requirement already satisfied: websockets in c:\python36\lib\site-packages (from open-cravat) (6.0)
Requirement already satisfied: aiohttp in c:\python36\lib\site-packages (from open-cravat) (3.3.2)
Requirement already satisfied: certifi>=2017.4.17 in c:\python36\lib\site-packages (from requests->open-cravat) (2018.4.16)
Requirement already satisfied: idna<2.8,>=2.5 in c:\python36\lib\site-packages (from requests->open-cravat) (2.7)
Requirement already satisfied: urllib3<1.24,>=1.21.1 in c:\python36\lib\site-packages (from requests->open-cravat) (1.23)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\python36\lib\site-packages (from requests->open-cravat) (3.0.4)
Requirement already satisfied: attrs>=17.3.0 in c:\python36\lib\site-packages (from aiohttp->open-cravat) (18.1.0)
Requirement already satisfied: async-timeout<4.0,>=3.0 in c:\python36\lib\site-packages (from aiohttp->open-cravat) (3.0.0)
Requirement already satisfied: multidict<5.0,>=4.0 in c:\python36\lib\site-packages (from aiohttp->open-cravat) (4.3.1)
Requirement already satisfied: idna-ssl>=1.0 in c:\python36\lib\site-packages (from aiohttp->open-cravat) (1.0.1)
Requirement already satisfied: yarl<2.0,>=1.0 in c:\python36\lib\site-packages (from aiohttp->open-cravat) (1.2.5)
Installing collected packages: open-cravat
  The scripts cravat-admin.exe, cravat-filter.exe, cravat-report.exe, cravat-store.exe, cravat-test.exe, cravat-util.exe, cravat-view.exe, cravat.exe, cv.exe, cva.exe and wcravat.exe are installed in 'c:\python36\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed open-cravat-0.0.123
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
PS C:\Users\kylem>
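The pip warning above says the scripts directory is not on PATH, in which case the second install command would not be found. A minimal sketch for checking this from Python before continuing, using the standard-library `shutil.which`. The helper name `find_cli_tools` is hypothetical; the command names are taken from the warning text.

```python
import shutil

# A few of the command names listed in the pip warning above.
CRAVAT_COMMANDS = ["cravat", "cravat-admin", "cravat-view", "wcravat"]


def find_cli_tools(names):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_cli_tools(CRAVAT_COMMANDS).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If any entry comes back as not found, adding the scripts directory named in the warning to PATH is the likely fix.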
PS C:\Users\kylem> cravat-admin install-base
Installing: aggregator:1.0.0, cravat-converter:1.0.0, excelreporter:1.0.0, hg38:1.0.0, oldcravat-converter:1.0.0, tagsampler:1.0.0, textreporter:1.0.0, vcf-converter:1.0.0, vcfinfo:1.0.1, wgbase:1.0.0, wgcircossummary:1.0.0, wgcodingvsnoncodingsummary:1.0.0, wggosummary:1.0.0, wglollipop:1.0.0, wgncrna:1.0.0, wgndex:1.0.0, wgsosamplesummary:1.0.0, wgsosummary:1.0.0, wgtopgenessummary:1.0.0
Start install of aggregator:1.0.0
Downloading aggregator:1.0.0 code archive
[**************************************************] 8.4 kB / 8.4 kB (100%)
Extracting aggregator:1.0.0 code archive
Verifying aggregator:1.0.0 code integrity
Finished installation of aggregator:1.0.0
Start install of cravat-converter:1.0.0
Downloading cravat-converter:1.0.0 code archive
[**************************************************] 6.4 kB / 6.4 kB (100%)
Extracting cravat-converter:1.0.0 code archive
Verifying cravat-converter:1.0.0 code integrity
Finished installation of cravat-converter:1.0.0
Start install of excelreporter:1.0.0
Downloading excelreporter:1.0.0 code archive
[**************************************************] 460.4 kB / 460.4 kB (100%)
Extracting excelreporter:1.0.0 code archive
Verifying excelreporter:1.0.0 code integrity
Finished installation of excelreporter:1.0.0
…..
Changing modules directory I’m doing this to save time Useful if you want to use separate disk/location cravat-admin md [directory]
PS C:\Users\kylem> cravat-admin md
c:\python36\lib\site-packages\cravat\modules
PS C:\Users\kylem> cravat-admin md ~\a\ashg\mdirs\base
c:\Users\kyle\a\ashg\mdirs\base
Running a job Base install will map variants to genes, transcripts, and proteins Output goes in same directory as input
PS B:\insilico\ashg\jobs\example> cravat
PS B:\insilico\ashg\jobs\example> cravat .\example_input
Running converter...
Converter (converter)
converter finished in 0.11299991607666016
Running gene mapper...
UCSC hg38 Gene Mapper (hg38)
gene mapper finished in 0.5740008354187012
Running annotators...
anntator(s) finished in 0.0
Running aggregator...
Variants Genes Samples Tags
aggregator finished in 0.17648959159851074
Running post-aggregators...
Tag Sampler (tagsampler)
VCF Info (vcfinfo)
post-aggregator finished in 0.09199905395507812
Running reporter...
Excel Reporter (excelreporter)
reporter finished in 0.3959202766418457
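Each pipeline stage in the run above reports "<stage> finished in <seconds>". A minimal sketch for pulling those timings out of captured console output, assuming each message sits on its own line as it would in a real terminal. The function name `stage_timings` is hypothetical, and the sample text is an abbreviated copy of the output above.

```python
import re

# Matches lines such as "converter finished in 0.11299991607666016".
FINISHED = re.compile(r"^(?P<stage>.+?) finished in (?P<seconds>\d+(?:\.\d+)?)$")


def stage_timings(log_text):
    """Return {stage name: seconds} for every 'finished in' line, in log order."""
    timings = {}
    for line in log_text.splitlines():
        match = FINISHED.match(line.strip())
        if match:
            timings[match.group("stage")] = float(match.group("seconds"))
    return timings


SAMPLE = """\
Running converter...
converter finished in 0.11299991607666016
Running gene mapper...
gene mapper finished in 0.5740008354187012
Running reporter...
reporter finished in 0.3959202766418457
"""

if __name__ == "__main__":
    timings = stage_timings(SAMPLE)
    slowest = max(timings, key=timings.get)
    print(f"slowest stage: {slowest} ({timings[slowest]:.3f} s)")
```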
Viewing Output Excel by default Can add text output with command line flag Text output good for pipelines
PS B:\insilico\ashg\jobs\example> cravat example_input -t text
PS B:\insilico\ashg\jobs\example> ls
Directory: B:\insilico\ashg\jobs\example
Mode     LastWriteTime        Length  Name
----     -------------        ------  ----
-a----   10/1/2018 10:00 AM     9036  example_input
-a----   10/1/2018 10:05 AM      634  example_input.aggregator.gene.log
-a----   10/1/2018 10:05 AM      211  example_input.aggregator.mapping.log
-a----   10/1/2018 10:05 AM      423  example_input.aggregator.sample.log
-a----   10/1/2018 10:05 AM      846  example_input.aggregator.variant.log
-a----   10/1/2018 10:05 AM        0  example_input.converter.err
-a----   10/1/2018 10:05 AM      555  example_input.converter.log
-a----   10/1/2018 10:05 AM     5704  example_input.crg
-a----   10/1/2018 10:05 AM     3668  example_input.crm
-a----   10/1/2018 10:05 AM     2980  example_input.crs
-a----   10/1/2018 10:05 AM    33550  example_input.crt
-a----   10/1/2018 10:05 AM     9132  example_input.crv
-a----   10/1/2018 10:05 AM    88630  example_input.crx
-a----   10/1/2018 10:05 AM        0  example_input.map.err
-a----   10/1/2018 10:05 AM     1568  example_input.map.log
-a----   10/1/2018 10:05 AM   221184  example_input.sqlite
-a----   10/1/2018 10:05 AM       22  example_input.status.json
-a----   10/1/2018 10:05 AM      214  example_input.tagsampler.log
-a----   10/1/2018 10:05 AM       68  example_input.vcfinfo.log
-a----   10/1/2018 10:01 AM    56793  example_input.xlsx
Variants Genes
PS B:\insilico\ashg\jobs\example> cravat example_input -t text
PS B:\insilico\ashg\jobs\example> ls
Directory: B:\insilico\ashg\jobs\example
Mode     LastWriteTime        Length  Name
----     -------------        ------  ----
-a----   10/1/2018 10:00 AM     9036  example_input
-a----   10/1/2018 10:05 AM      634  example_input.aggregator.gene.log
-a----   10/1/2018 10:05 AM      211  example_input.aggregator.mapping.log
-a----   10/1/2018 10:05 AM      423  example_input.aggregator.sample.log
-a----   10/1/2018 10:05 AM      846  example_input.aggregator.variant.log
-a----   10/1/2018 10:05 AM        0  example_input.converter.err
-a----   10/1/2018 10:05 AM      555  example_input.converter.log
-a----   10/1/2018 10:05 AM     5704  example_input.crg
-a----   10/1/2018 10:05 AM     3668  example_input.crm
-a----   10/1/2018 10:05 AM     2980  example_input.crs
-a----   10/1/2018 10:05 AM    33550  example_input.crt
-a----   10/1/2018 10:05 AM     9132  example_input.crv
-a----   10/1/2018 10:05 AM    88630  example_input.crx
-a----   10/1/2018 10:05 AM        0  example_input.map.err
-a----   10/1/2018 10:05 AM     1568  example_input.map.log
-a----   10/1/2018 10:05 AM   221184  example_input.sqlite
-a----   10/1/2018 10:05 AM       22  example_input.status.json
-a----   10/1/2018 10:05 AM      214  example_input.tagsampler.log
-a----   10/1/2018 10:05 AM    92284  example_input.tsv
-a----   10/1/2018 10:05 AM       68  example_input.vcfinfo.log
-a----   10/1/2018 10:01 AM    56793  example_input.xlsx
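Since the text output is meant for pipelines, here is a minimal sketch of reading a tab-separated report with the standard-library `csv` module. The file layout is an assumption: comment lines are taken to start with `#`, and the first remaining line is taken to be the column header. The column names and the helper name `read_report` are hypothetical, so check them against the actual example_input.tsv.

```python
import csv
import io


def read_report(handle, comment_prefix="#"):
    """Parse a tab-separated report into a list of dicts, skipping comment and blank lines."""
    data_lines = (
        line for line in handle
        if line.strip() and not line.startswith(comment_prefix)
    )
    return list(csv.DictReader(data_lines, delimiter="\t"))


# Hypothetical stand-in for a report file; the real columns may differ.
SAMPLE = (
    "#Hypothetical comment line\n"
    "UID\tChrom\tPosition\n"
    "1\tchr1\t100\n"
    "2\tchrX\t200\n"
)

if __name__ == "__main__":
    for row in read_report(io.StringIO(SAMPLE)):
        print(row["UID"], row["Chrom"], row["Position"])
```

For a real file, pass `open("example_input.tsv", newline="")` in place of the `io.StringIO` object.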
Web Interface GUI interface Install modules Submit jobs View running/completed jobs Interactively explore results Runs local server, connect to it in browser Start from command line, working on other ways to start Beta
PS B:\insilico\jobs\example> wcravat
======== Running on http://0.0.0.0:8060 ========
(Press CTRL+C to quit)
Installing annotators Move to store tab Click on logos to see details Free form description Attribution Install the following, they’re small Clinvar MuPIT COSMIC COSMIC Gene
Submitting jobs Select a file, or enter text Choose assembly, annotators, reports Attach a note (helps you remember what a job is)
Interactive Viewer Similar design to current web CRAVAT Variant, Gene tabs Interactive widgets Resize, reorder, hide
Advanced Viewer Features Filters Affect all tabs and widgets Can be saved Portable Sharing result databases
~/open-cravat-jobs ~/open-cravat-jobs/{jobId}/input.sqlite
PS B:\insilico\temp> ls
Directory: B:\insilico\temp
Mode     LastWriteTime        Length    Name
----     -------------        ------    ----
-a----   10/1/2018 12:49 PM   80625664  input.sqlite
PS B:\insilico\temp> cravat-view input.sqlite
======== Running on http://0.0.0.0:8060 ========
(Press CTRL+C to quit)
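Because a result is a single SQLite file, it can also be inspected without the viewer. A minimal sketch using the standard-library `sqlite3` module to list the tables in a database and count their rows. It makes no assumption about the OpenCRAVAT schema; the table names in the demo are hypothetical stand-ins built in memory, and `table_row_counts` is a hypothetical helper name.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table name: row count} for every table in the connected database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    counts = {}
    for name in names:
        quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier safely
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


if __name__ == "__main__":
    # In-memory stand-in; for a real result use sqlite3.connect("input.sqlite").
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE variant (uid INTEGER, chrom TEXT)")  # hypothetical table
    conn.execute("CREATE TABLE gene (hugo TEXT)")  # hypothetical table
    conn.execute("INSERT INTO variant VALUES (1, 'chrX')")
    print(table_row_counts(conn))
    conn.close()
```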
Writing an annotator View your own data side-by-side with other sources Share your data with other researchers
PS B:\insilico> cravat-admin new-annotator ashg_example
Annotator ashg_example created at B:\insilico\mdir\annotators\ashg_example
PS B:\insilico> cd B:\insilico\mdir\annotators\ashg_example
PS B:\insilico\mdir\annotators\ashg_example> ls
Directory: B:\insilico\mdir\annotators\ashg_example
Mode     LastWriteTime        Length  Name
----     -------------        ------  ----
d-----   10/1/2018 12:57 PM           data
-a----   10/1/2018  9:51 AM      178  ashg_example.md
-a----   10/1/2018  9:51 AM     2326  ashg_example.py
-a----   10/1/2018  9:51 AM     1460  ashg_example.yml
ashg_example.yml ashg_example.py { 'uid': 363, 'chrom': 'chrX', 'pos': 134377618, 'ref_base': 'A', 'alt_base': 'T' }
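The dictionary above is the per-variant input an annotator receives. As a rough, standalone illustration of the idea (not the generated ashg_example.py, whose class and method signatures should be taken from the template itself), an annotator looks up the variant in its own data and returns the values for its output columns. The lookup table, the `score` column, and the function name `annotate_variant` are all hypothetical.

```python
# Hypothetical annotation data keyed by (chrom, pos, ref_base, alt_base).
SCORES = {
    ("chrX", 134377618, "A", "T"): 0.87,
}


def annotate_variant(input_data, scores):
    """Return a dict of output column values for one variant, or None if there is no match."""
    key = (
        input_data["chrom"],
        input_data["pos"],
        input_data["ref_base"],
        input_data["alt_base"],
    )
    score = scores.get(key)
    if score is None:
        return None
    return {"score": score}


if __name__ == "__main__":
    variant = {
        "uid": 363,
        "chrom": "chrX",
        "pos": 134377618,
        "ref_base": "A",
        "alt_base": "T",
    }
    print(annotate_variant(variant, SCORES))
```

In a real annotator the lookup would more likely read from a file in the module's data directory than from an in-memory dict.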
Wrap up Two commands to install One command to run a job Fully local Modular GUI or command line control Multiple output formats Interactive viewer Simple to extend
Questions? https://github.com/KarchinLab/open-cravat https://github.com/KarchinLab/open-cravat/wiki