Published by Natalie Caitlin Holt. Modified over 7 years ago.
1
TIES Cancer Research Network Y4 Face to Face Meeting U24 CA 180921
November 4th 2016 University of Pittsburgh, Pittsburgh, PA
2
Welcome Meeting Goals, Summary of Y3 accomplishments and Y4 plans
3
Agenda
08:00 – 08:30 Goals for the meeting; Overall summary of Y3 accomplishments
08:30 – 11:00 Session 1 – TCRN Partner Sites Reports (Feldman, Bollag, Gaudioso, Reber, Schoenfeld)
11:00 – 11:15 Break
11:15 – 11:45 Session 2 – ITCR Grant Renewal Planning (Group)
11:45 – 12:30 Session 3 – Integrating TIES and TCRN with Cancer Registry (All)
12:30 – 01:30 Session 4 – Working Lunch: Scaling TCRN (Jacobson, Chavan)
01:30 – 02:00 Session 5 – Updates on Y3 releases, Work underway (Chavan)
02:00 – 02:15
02:15 – 03:45 Session 6 – Year 4 Development Plans (Tseytlin, Chavan and Jacobson)
4
Conflict of Interest
Jacobson, Tseytlin – Shareholder, Consultant to Nexi, Inc.
Chavan – Shareholder, Nexi, Inc.
5
Goals for the Meeting
Review and discuss Y3 work
  Where are we now and what is our long-term goal
  Progress towards dissemination (users, institutions)
  Pilot projects and scientific efforts resulting from project
Discuss Y4 work and agree on basics of project plan
  We will refine these further and present something more formal soon after meeting
Contribute to TIES development path to meet needs of institutions and investigators
Planning for Y4 manuscripts
Planning for Grant Renewal
6
U24 Specific Aims
Specific Aim 1. Enhance the informatics technology to support inter-institutional “trust”, paraffin registry development, tissue microarray (TMA) development, and nondestructive tissue use.
Specific Aim 2. Establish the TIES Cancer Research Network (TCRN) with four founding member institutions. Develop governance, network agreements, and policies for operating the TCRN.
Specific Aim 3. Recruit and support pilot scientific collaborations across the network.
Specific Aim 4. Disseminate the software and measure its impact.
7
Major accomplishments in Year 3
Cancer Research manuscript
First paper resulting directly from scientific effort using TCRN was accepted to The Breast Journal (Khoury, RPCI)
Addition of TJU to network
Further dissemination at all sites
Three major releases (5.4, 5.5, 5.6)
Manual annotation, beginnings of an analytic framework
Dual license model, with spinout Nexi, Inc.
8
Overall Progress in Y3
Complete
  Software releases (3): Auditing reports, LDAP, Manual Annotation
  All data loaded and coded at every site
  QA ongoing
  Pilot Projects
Pending
  A little behind on pilot projects and dissemination
  Optional additional de-identifier
9
Overview of Y1 – Y3 Accomplishments
All sites have working systems, actively coding new documents
All sites have approved IRB protocols
All sites have signed Network Agreement
All sites have set up appropriate regulatory controls
All sites have portals with forms for account request
All sites disseminating to users
Policies and recommendations have all been approved
  de-identification QA, approval bodies, governance, verifying eligibility, study registration, auditing of users, incident reporting, joining of new members
Adoption and Deployment Blueprints available
Increased downloads of TIES
Foundation for better social media and outreach (e.g. Insightly)
Successful pilot projects
10
TIES Downloads ITCR
11
TIES 5.4, 5.5, and 5.6 releases
v5.4 – 11/11/2015
  Improved query builder to support structured data search.
  Query Activity Log report generation and auditing functions.
v5.5 – 03/23/2016
  Add LDAP support to TIES
  Adding features to Regulatory Administrator audit
v5.6 – 09/19/2016
  Add and export manual annotations to case sets
  Better and more versatile export capabilities
13
All TCGA clinical data in TIES Demo
TCGA in TIES DEMO:
14
Annotation workflows
TIES NLP Engine
  RESETTER: Cleans document, deletes existing annotations
  TOKENIZER: Tokenizes words, punctuation, numbers and spaces
  ConTEXT: Detects Negation, Temporality, Degree of Certainty
  ANNOTATION TRANSFORMER: Organizes output annotations
TIES Datastore
TIES Search and Cohort Development
TIES Manual Annotation
Specialized IE Engines & Other Data consumers
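The first two stages of the NLP engine, the resetter and the tokenizer, can be illustrated with a minimal Python sketch. This is not the TIES implementation: the `Document` container, the function names and the regular expression are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document container; not the TIES data model."""
    text: str
    annotations: list = field(default_factory=list)


# One alternative per token class named on the slide.
TOKEN_PATTERN = re.compile(
    r"(?P<number>\d+(?:\.\d+)?)"   # integers and simple decimals
    r"|(?P<word>[^\W\d_]+)"        # runs of letters
    r"|(?P<space>\s+)"             # runs of whitespace
    r"|(?P<punctuation>.)"         # any other single character
)


def reset(doc):
    """Resetter stage: drop annotations left over from an earlier run."""
    doc.annotations.clear()
    return doc


def tokenize(doc):
    """Tokenizer stage: record (type, start, end, text) for every token."""
    for match in TOKEN_PATTERN.finditer(doc.text):
        doc.annotations.append(
            (match.lastgroup, match.start(), match.end(), match.group())
        )
    return doc


doc = Document("BI-RADS 4. Biopsy", annotations=[("stale", 0, 1, "B")])
tokenize(reset(doc))
token_types = [kind for kind, _, _, _ in doc.annotations]
```

Because every character falls into exactly one token, joining the token texts reproduces the original report, which keeps character offsets usable by later stages.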
15
Nexi, Inc.
University of Pittsburgh exclusively licensed TIES and NOBLE coder software to Nexi in June 2016
License explicitly enables continued open source software for NFP, bidirectional flow of code
Founders: Ed Engler (CEO), Rebecca Jacobson (Chair SAB), Girish Chavan, Eugene Tseytlin
Nexi will
  license code to commercial entities
  develop new functionality that will not be owned by Pitt, and will customize to meet client needs
  offer support packages for sites that want to deploy TIES
  create other networks
Currently developing customers
Ed Engler, Nexi CEO
16
Y4 will be another pivotal year of this grant…
New TCRN members
Multiple new pilot projects with process adjustments; Open TCRN to investigators at all sites
Major new functionality
Y2: Engaging Researchers
Y3: Growing the network
17
Plans for Y4
Work to complete
  Help with LIMS Integration
  Cancer Registry Integration
  New NobleCoder Integration
  Image Annotation Tools
  Analytic framework
  Optional additional de-identifier
Other goals
  Publications from users
  Dissemination at all sites
  Additional adoptions at other institutions
18
Session 1 TCRN Partner Sites Reports
19
Session 2 ITCR Grant Renewal Planning
20
U24 Grant Funding
Current funding period ends on 7/31/18
ITCR grants are reviewed in two cycles. Due dates are June 14th 2017, November 20th 2017, June 14th 2018
Our best chance to avoid a funding gap is probably June 14th, 2017
  Peer review: December 2017
  Council review: January 2018
  Funding announcement: earliest possible, Summer 2018
21
ITCR Possible Mechanisms
Enhancement and Dissemination (U24)
PAR : Advanced Development of Informatics Technologies for Cancer Research and Management
This FOA supports the advanced development and enhancement of emerging informatics technologies to improve the acquisition, management, analysis, and dissemination of data and knowledge in support of cancer research.
Sustainment (U24)
PAR : Sustained Support for Informatics Resources for Cancer Research and Management
This FOA supports the continued development and sustainment of high-value informatics research resources to serve current and emerging needs across the cancer research continuum.
22
Sustained Support PAR 15-331
Purpose
  Advanced Development PAR: emerging informatics technology, defined as one that has passed the initial prototyping and pilot development stage but has not been widely adopted in the cancer research field.
  Sustained Support PAR: improved user experience and availability of existing, widely-adopted informatics tools and resources. The proposed sustainment plan must provide clear justifications for why the research resource should be maintained and how it has benefited and will continue to benefit the cancer research field.
Budget
  Advanced Development PAR: The amount of requested budget may not exceed $600,000 Direct Costs (excluding consortium F&A costs) per year.
  Sustained Support PAR: Application budgets are not limited but need to reflect the actual needs of the proposed project.
  Applicants requesting $500,000 or more in direct costs in any year (excluding consortium F&A) must contact a Scientific/Research Contact at least 6 weeks before submitting the application and follow the Policy on the Acceptance for Review of Unsolicited Applications that Request $500,000 or More in Direct Costs as described in the SF424 (R&R) Application Guide.
23
Which mechanism?
Maintaining both the software and the network takes resources.
Adding new nodes takes resources at the local sites.
There are still many enhancements we want to make to support the community.
Potential for including development in Sustained Support mechanism
24
What do we need to do now to increase chance of success?
Expand number of sites using software locally
Expand number of users who are part of TCRN
Expand number of users at local sites
Increase number of studies using the network
Multicenter publications that could not be done without the network
Integration with other data sources
Show that a national network is possible. How?
Focus more on specific type of network…Pathomics? Rad-Path?
25
Timeline and collaboration
By end of December:
  Establish PIs, select mechanism strategy
  Determine sites to be added and approach them
  Position TCRN within the larger NCI landscape
  Position TCRN within the larger Cancer Research landscape
  Define major (new) goals
January – March:
  Specific Aims and Executive Summary
  Outline approach, deliverables, collaborators scope of work
March – June:
  Grant writing
  Letters of Support
  Budgets, Budget justifications, and Biosketches
26
Session 3 Integrating TIES and TCRN with Cancer Registry
27
Adding Cancer Registry Data to TIES
Identified as a high-value development target from last year's F2F meeting
We have secured additional funding from our Institution for Precision Medicine to make this happen here
Senior Developer Mike Davis will be leading the effort. Cancer Registry staff participating. Hiring two new staff members.
Project kickoff scheduled for Nov 8th
Starting with Breast Cancer first
Work that we do here can immediately be leveraged by all of you to similarly add CR data to your TIES instances
29
Your Use Cases
How do you envision investigators at your institution using combinations of report data and Cancer Registry data through TIES?
Can you provide an example of one or more queries (real or imagined) in which a user would select a cohort using Cancer Registry AND text data?
Assuming that we make some subset of Cancer Registry data available, should this de-identified data be available for download by researchers?
What interval should be used to update CR data? Would your institution be able to run the required scripts at some regular interval?
Do you anticipate any regulatory challenges? Political challenges? Are there obvious solutions to those challenges?
30
Data Elements
Demographics: Race, Gender, Diagnosis, Smoking, Alcohol
Primary: Primary Site, Histology, Grade, Path TNM, Clinical TNM, Prognostic Factors (including site specific)
Treatment: Surgery, Chemotherapy, BRM, Hormonal, Immunotherapy, Rad Onc
Outcome: Vital Status, Cancer Status, Recurrence, Cause of Death
31
Regulatory Issues
Modify IRB protocol at each site (Pitt has already done this)
Discuss Network Agreement with Pitt lawyers – I do not think it will need to be changed
Policies and processes
  New process for assuring that no PHI is accidentally added to TIES
    Dates and doctor names in text
    Improper mapping of fields
Discussion with regulatory experts; how much information can we provide on sequences
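One way to picture the proposed check for dates and doctor names in text is a pre-load scan of free-text registry fields. The sketch below is only an illustration of that idea under stated assumptions: the field names, the two regular expressions and the `flag_possible_phi` helper are hypothetical, and a pattern scan of this kind would supplement, not replace, a real de-identification review.

```python
import re

# Illustrative patterns only; real PHI screening needs far broader coverage.
DATE_LIKE = re.compile(
    r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"   # e.g. 11/04/2016
    r"|\b\d{4}-\d{2}-\d{2}\b"              # e.g. 2016-11-04
)
DOCTOR_LIKE = re.compile(r"\b(?:Dr|Doctor)\.?\s+[A-Z][a-z]+")


def flag_possible_phi(record):
    """Return {field name: [reasons]} for values that look like they hold PHI."""
    flagged = {}
    for name, value in record.items():
        text = str(value)
        reasons = []
        if DATE_LIKE.search(text):
            reasons.append("date")
        if DOCTOR_LIKE.search(text):
            reasons.append("doctor name")
        if reasons:
            flagged[name] = reasons
    return flagged


# Hypothetical registry record with one clean field and one suspicious field.
record = {
    "histology": "Ductal carcinoma",
    "comment": "Seen by Dr. Smith on 11/04/2016",
}
flags = flag_possible_phi(record)
```

A flagged field can also signal an improper field mapping, for example a free-text comment column loaded into a slot that was expected to hold a coded value.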
32
TCRN Adoption Process
Please plan to spend some of next year's budget to integrate your CR data
Develop buy-in, work with your Cancer Registries now
  Cancer Registry Directors, Cancer Center Directors, PM Directors
Seek modification to your IRB soon. Pitt will investigate changes needed
Provide flexibility to sites, we may not be adding the exact same fields
TCRN Policy and Process group can act as the first pass for fields to be added
  Value to your researchers
  High quality of your data
TCRN Exec Committee can set standard for MDS, help identify next cancers to be added
33
Session 4 Scaling TCRN to the Next Level
34
Where are the process pain points in TCRN right now?
Confusion in the steps of the approval process. Where do you see problems?
Problems in the auditing process?
Provisioning users? Any scale issues?
Adding additional data - Cancer Registry Data, Specimen data
Access to aggregate results requires same approval process
Need for IRB approvals at each site for tissue
No central site for curating TCRN information, processes, policies
TCRN members don’t know where to ask for help
No central site for getting technical help (other than developers)
Managing the onboarding process and administrative setup
Getting out to potential users
Lack of an OSS De-identifier
35
More pain points…
36
Social Media Strategy
Blog – catchy Google-searchable titles (eventually get guest posts)
Create SEO Keywords
Create XML sitemap
Use Google Analytics
Tweet regularly
Post on LinkedIn regularly
Send out press releases for new versions of TIES
37
Email Based Account Approval
Continue to use WordPress Account Request Forms for their flexibility in form flow design.
Reviewers record their decision through buttons/links in the account request.
Monitor account for new account requests and responses.
Create accounts in TIES based on account request. Account is in Pending Review status. User cannot yet access TIES.
Monitor reviewer decisions; once all reviewers approve, change account status to Approved.
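The status handling in this workflow can be sketched as a small state holder. This is an illustrative model only, not TIES code: the `AccountRequest` class, its method names and the reviewer identifiers are hypothetical, and the slide does not say what happens on a rejection, so the sketch simply leaves such an account in Pending Review.

```python
from dataclasses import dataclass, field

PENDING = "Pending Review"
APPROVED = "Approved"


@dataclass
class AccountRequest:
    """Hypothetical model of one account request and its reviewer decisions."""
    username: str
    reviewers: tuple
    decisions: dict = field(default_factory=dict)
    status: str = PENDING  # accounts are created in Pending Review

    def record_decision(self, reviewer, approved):
        """Store a reviewer's decision; approve once every reviewer has approved."""
        if reviewer not in self.reviewers:
            raise ValueError(f"{reviewer} is not a reviewer for this request")
        self.decisions[reviewer] = approved
        if all(self.decisions.get(name) is True for name in self.reviewers):
            self.status = APPROVED

    def can_access_ties(self):
        """Only approved accounts may log in."""
        return self.status == APPROVED


# Reviewer names are placeholders.
request = AccountRequest("jdoe", reviewers=("site_admin", "regulatory_admin"))
request.record_decision("site_admin", True)
access_after_first = request.can_access_ties()
request.record_decision("regulatory_admin", True)
access_after_all = request.can_access_ties()
```

Keeping the account in TIES from the start, but locked, means the monitoring step only has to flip a status rather than create the account at approval time.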
38
Session 5 TIES Releases
39
v5.4 – Released November 2015 Structured Data Search Auditing Support
40
v5.5 – Released March 2016
Live Results Chart
Export to Excel
De-identified ID Search
Pop-out Reports
41
v5.5 – Released March 2016 Easy to see Section Headers
Node Settings and LDAP Support
42
v5.6 – Released September 2016 Manual Annotation Improved Login Dialog
43
Manual Annotation Tool
Allows you to manually enter structured data associated with case sets.
Eliminates the need to store it in a separate spreadsheet as the expert reviews the reports.
Data organized by forms and fields.
Forms are study specific and can be shared with other study members or made public.
Fields can be of Text, Number, Boolean and Category data types.
Data is exported to Excel with each field stored in a separate column and a row for each report.
Access the tool under the My Case Sets tab. Click Annotate from the Available Tasks menu under the name of the Case Set.
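The form, field and export layout described here can be sketched in a few lines of Python. The form definition, field names and report identifier below are hypothetical, and the sketch writes CSV with the standard library as a stand-in for the Excel export the tool actually produces.

```python
import csv
import io

# Hypothetical study-specific form: field name -> data type.
# A tuple of allowed values stands for a Category field.
FORM_FIELDS = {
    "Comment": str,                               # Text
    "Tumor size (cm)": float,                     # Number
    "Margins positive": bool,                     # Boolean
    "Grade": ("Low", "Intermediate", "High"),     # Category
}


def validate(field_name, value):
    """Check a value against the declared type of its field."""
    kind = FORM_FIELDS[field_name]
    if isinstance(kind, tuple):
        ok = value in kind
    elif kind is float:
        ok = isinstance(value, (int, float)) and not isinstance(value, bool)
    else:
        ok = isinstance(value, kind)
    if not ok:
        raise ValueError(f"invalid value {value!r} for field {field_name!r}")
    return value


def export_rows(annotations):
    """One column per field, one row per report."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Report ID", *FORM_FIELDS])
    for report_id, values in annotations.items():
        writer.writerow(
            [report_id] + [validate(name, values[name]) for name in FORM_FIELDS]
        )
    return buffer.getvalue()


annotations = {
    "R001": {
        "Comment": "ADH with radial scar",
        "Tumor size (cm)": 1.2,
        "Margins positive": False,
        "Grade": "Low",
    }
}
exported = export_rows(annotations)
```

Validating at export time is a simplification; an interactive tool would more naturally validate each field as it is entered.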
44
Manual Annotation Tool
45
Session 6 Year 4 Development Plans
46
Annotation workflows
TIES NLP Engine
  RESETTER: Cleans document, deletes existing annotations
  TOKENIZER: Tokenizes words, punctuation, numbers and spaces
  ConTEXT: Detects Negation, Temporality, Degree of Certainty
  ANNOTATION TRANSFORMER: Organizes output annotations
TIES Datastore
TIES Search and Cohort Development
TIES Manual Annotation
Specialized IE Engines & Other Data consumers
47
Analytics Use Cases Whole Slide Image (WSI) nucleus segmentation algorithm morphex report feature extraction to correlate with above analysis BIRADS category extractor for radiology reports BIRADS pathology report classifier (coming soon)
48
TIES Analytical Server
Analytics Framework TIES Analytics App Store Case Set Docker Image TIES DATA CENTER TIES Analytical Server Docker Image TIES Data
49
Why Docker for analytics?
No complex installation of 3rd party software and dependencies. All code is self-contained inside an image
No need to mandate a single technology stack on algorithm developers
Lightweight compared to VM
Seems to be a de facto standard for distribution of complex software.
INPUT:
  /input/text – for text data
  /input/images – for WSI imaging data
  /input/data – for delimited structured data
  All filenames are constructed of <patient id>.<document id>.(txt|tsv|svs)
OUTPUT:
  /output – any output can be put here.
  Any delimited file (.tsv, .bsv, .csv) whose first column contains the filenames of input files will be imported back into TIES
  Any other output file will be zipped up and available for download to the user who launched the analysis.
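An analysis that follows this input/output convention could look roughly like the Python sketch below. It is a hedged illustration, not a TIES component: the word-count "analysis", the output filename `word_counts.tsv`, the header row and the assumption that patient and document ids contain no dots are all made up for the example. The demo runs against a temporary directory; inside a container the call would be `run_analysis(Path("/input"), Path("/output"))`.

```python
import csv
import re
import tempfile
from pathlib import Path

# <patient id>.<document id>.(txt|tsv|svs); ids are assumed to contain no dots.
FILENAME = re.compile(
    r"^(?P<patient>[^.]+)\.(?P<document>[^.]+)\.(?P<ext>txt|tsv|svs)$"
)


def run_analysis(input_dir, output_dir):
    """Write a TSV whose first column holds the input filenames."""
    output_dir.mkdir(parents=True, exist_ok=True)
    result = output_dir / "word_counts.tsv"
    with result.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # Header row is an assumption; the slide does not specify one.
        writer.writerow(["filename", "patient_id", "document_id", "word_count"])
        for path in sorted((input_dir / "text").glob("*.txt")):
            match = FILENAME.match(path.name)
            if match is None:
                continue  # skip files that do not follow the naming scheme
            words = len(path.read_text().split())
            writer.writerow(
                [path.name, match["patient"], match["document"], words]
            )
    return result


# Demo against a temporary directory tree that mimics /input and /output.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "input" / "text").mkdir(parents=True)
    (root / "input" / "text" / "P001.D042.txt").write_text("Negative mammogram")
    rows = run_analysis(root / "input", root / "output").read_text().splitlines()
```

Keying each output row on the input filename is what lets results be matched back to the originating patient and document on import.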
50
Correlating Radiology and Pathology information
Right Breast BIRADS 4 Suspicious abnormality. Biopsy is recommended
Breast, Right, excision: Atypical Ductal Hyperplasia and Radial Scar
?
51
BIRADS extraction
52
IMPRESSION: OVERALL BI-RADS Category 1. Negative mammogram.
1st: BI-RADS category annotation (CRF Model)
2nd: BI-RADS laterality classification (NB, SVM, DT)
Entity: Definition
  Left BI-RADS: BI-RADS category assigned to the left breast
  Right BI-RADS: BI-RADS category assigned to the right breast
  Bilateral BI-RADS: BI-RADS category assigned to both breasts
  Overall BI-RADS: Corresponds to the most abnormal BI‑RADS of the two breasts, based on the highest likelihood of malignancy. It is usually found after the detailed description of the BI‑RADS category for each breast. In some cases, it is the only BI‑RADS class in the report.
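To make the two-step task concrete, the sketch below uses a plain regular expression as a rule-based stand-in. It is deliberately not the CRF and NB/SVM/DT models named on the slide; the pattern, the `extract_birads` helper and the "unspecified" label are assumptions for illustration only.

```python
import re

# Rule-based stand-in: an optional laterality word directly before the
# BI-RADS mention, then the category digit (0-6).
BIRADS = re.compile(
    r"(?:(?P<side>left|right|bilateral|overall)\s+)?"
    r"BI-?RADS\s+(?:category\s+)?(?P<category>[0-6])\b",
    re.IGNORECASE,
)


def extract_birads(text):
    """Return (laterality, category) pairs found in a report snippet."""
    found = []
    for match in BIRADS.finditer(text):
        side = (match.group("side") or "unspecified").lower()
        found.append((side, int(match.group("category"))))
    return found


overall = extract_birads(
    "IMPRESSION: OVERALL BI-RADS Category 1. Negative mammogram."
)
right_breast = extract_birads("Right Breast BIRADS 4 Suspicious abnormality.")
```

In the second snippet the word "Breast" sits between the laterality and the category, so the simple rule finds the category but not the side. Cases like this suggest why laterality is treated as a separate classification step rather than read directly off the matched text.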
53
Corpus Development and Inter Annotator Agreement (IAA)
54
Gold Corpus Metrics
Corpus Split | Total Docs | Total Word Tokens | Total Number Tokens | Total Gold BIRADS Tokens | BI-RADS 0 | BI-RADS 1 | BI-RADS 2 | BI-RADS 3 | BI-RADS 4 | BI-RADS 5 | BI-RADS 6
Training | 368 | 105333 | 7588 | 608 | 83 | 79 | 129 | 100 | 90 | 77 | 50
Devel | 58 | 16311 | 1189 | 101 | 12 | 15 | 16 | 32 | 14 | 7 | 5
Test | 173 | 55711 | 4077 | 305 | 48 | 34 | 67 | 50 | 46 | 36 | 24
Total | 599 | 177355 | 12854 | 1014 | 143 | 128 | 212 | 182 | 150 | 120 | 79
55
BIRADS Token Annotation Results
Corpus Split Features Recall Precision F1 Accuracy Development Section, Token Type, Context Token 0.92 0.83 0.87 0.99 Section, Token Type, Context Token, Anchor, Time 0.93 0.95 0.96 250 0.97 Test 0.98
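The first Development row reports recall 0.92, precision 0.83 and F1 0.87. Assuming F1 here is the usual harmonic mean of precision and recall, those three figures are mutually consistent, as the short check below illustrates (the function name is illustrative).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the first Development row of the results.
development_f1 = round(f1_score(precision=0.83, recall=0.92), 2)
```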
56
BIRADS classification results
Recall Precision F1 Accuracy NB 0-6 0.83 0.84 0.91 3-5 0.88 0.87 0.93 SVM 0.89 0.90 0.94 0.95 PART 0.96 Bag-of-word (BoW) by line Total number of BI-RADS token annotations BI-RADS category BI-RADS sequence Imaging Study type Laterality Breast(s) studied Laterality word token counts
58
Pathology Report Classification
59
Pathology Report Classification
60
BIRADS Extraction
UIMA pipeline
  cTAKES to extract token level features
  CRF classifier from Mallet to identify BIRADS category number
  Weka PART classifier to classify BIRADS laterality
Docker image to wrap the pipeline
INPUT
  /input/text – location of text reports with filenames matching de-identified ids.
OUTPUT
  /output – location of output annotations as tab delimited files
61
Future Development Ideas
New Coding Pipeline
  Integrate NobleCoder v1.1.
  More accurate coding, faster coding.
  Uncertainty, polarity, experiencer and temporality annotations.
  Latest NCIM terminology with more fine tuned sources.
Cancer Registry data integration based management of account review and approvals.
Patient level search index and visualization
Manual Annotation Tool Enhancements
  Link report text annotations to data in form fields.
  Intelligent auto-highlighting and filling of form fields.
  Library of forms to choose from, making it easy to share and reuse previously created forms.
62
Feature AU PENN RPCI SB TJU