Using GATE to extract information from clinical records for research purposes Matthew Broadbent Clinical Informatics lead South London and Maudsley (SLAM)

Presentation transcript:

Using GATE to extract information from clinical records for research purposes Matthew Broadbent Clinical Informatics lead South London and Maudsley (SLAM) NHS Foundation Trust Specialist Biomedical Research Centre (BRC)

SLAM NHS Foundation Trust – the source data
Electronic Health Record: the Patient Journey System
Coverage: Lambeth, Southwark, Lewisham, Croydon
Local population: c. 1.1 million
Clinical area: specialist mental health
Active patients: c
Total inpatients: c
Total records: c
Active users: c. 5000

Aim: to access clinical data from local health records for research purposes
Value: central to academic and national government strategy
Accessing data from electronic medical records is one of the top 3 targets for research
Sir William Castell, Chairman, Wellcome Trust
South London and Maudsley Biomedical Research Centre

Aim: to access clinical data from local health records for research purposes
Value: central to academic and national government strategy
Major constraints:
security and confidentiality
structure and content of health records

CRIS Architecture (diagram): PJS; CRIS data structure: xml; FAST index; CRIS SQL; CRIS application

MMSE coverage (chart of cases and instances): MMSE (structured); MMSE entries in free text

Using free text
Starting estimate: 80% of value (reliable, complete data) lies in free text
Design: CRIS was specifically designed to enable efficient and effective access to free text
Issue: free text requires coding! Quantity of text is overwhelming (c instances)
Solution: GATE!

Method to date…
BRC researchers trained in GATE, including JAPE
Applications developed in collaboration with Sheffield (Angus, Adam, Mark)
BRC identifies need and assesses feasibility of using GATE
Small sample (e.g. 50 instances) manually annotated
Initial application rules drafted, e.g. features and gazetteer requirements and definitions
Prototype application developed
New corpus run through the prototype and manually corrected
Application v.2 created
These steps iterate until precision and recall have plateaued (c. 6 iterations)
The application rules are collaboratively reviewed and amended throughout the process to maximise performance
BRC Sheffield

Method to date…
BRC identifies need and assesses feasibility of using GATE
Small sample (e.g. 50 instances) manually coded
Initial application rules drafted, e.g. features and gazetteer requirements and definitions
Prototype application developed
New corpus run through the prototype and manually corrected
Application v.2 created
Application v.6 created
All CRIS free text docs run through the application (c. 11 million)
Results (relevant annotations/features) loaded back into source SQL database
BRC Sheffield

GATE MMSE application
Text: MMSE done on Monday, score 24/30
Annotations: Trigger, Date, Score

Using free text – GATE coding of MMSE scores / dates Text extract from CRIS: MMSE scored dropped from 17/30 in November 2005 to 10/30 in April 2006
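The extraction shown in the two MMSE examples can be approximated outside GATE. The sketch below is a plain regular-expression stand-in, not the JAPE grammar used in the actual application: it looks for the trigger "MMSE" and then collects scores written as numerator/denominator in the same sentence. The function name and both patterns are assumptions made for illustration, and date extraction is left out.

```python
import re

# Hypothetical patterns: a trigger word and a score written as N/D.
# This is a regex stand-in for illustration, not the GATE/JAPE rules.
MMSE_TRIGGER = re.compile(r"\bMMSE\b", re.IGNORECASE)
SCORE = re.compile(r"\b(\d{1,2})\s*/\s*(\d{1,2})\b")


def extract_mmse_scores(text):
    """Return (numerator, denominator) pairs from sentences that mention MMSE."""
    results = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not MMSE_TRIGGER.search(sentence):
            continue  # no trigger, so any N/D here is not treated as a score
        for match in SCORE.finditer(sentence):
            results.append((int(match.group(1)), int(match.group(2))))
    return results


print(extract_mmse_scores("MMSE done on Monday, score 24/30"))
print(extract_mmse_scores(
    "MMSE scored dropped from 17/30 in November 2005 to 10/30 in April 2006"
))
```

Requiring the trigger in the same sentence trades recall for precision, which matches the approach the later slides recommend.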

MMSE coverage (chart of cases and instances): MMSE (structured); MMSE entries in free text; MMSE raw score/date (GATE)

GATE accuracy – recall and precision (unseen data)
Columns: App, Iterations, Recall, Precision, Status
Smoking status: Operational
Diagnosis: Operational
MMSE: 6 iterations, Operational
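Recall and precision on unseen data can be computed by comparing the application's annotations with a manually corrected set. The sketch below assumes each annotation can be reduced to a hashable tuple (here a hypothetical document id plus score); the data values are invented purely to show the arithmetic and are not results from CRIS.

```python
def precision_recall(predicted, gold):
    """Compare predicted annotations with manually corrected (gold) ones.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold annotations
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented example annotations: (document id, numerator, denominator).
gold = {("doc1", 24, 30), ("doc2", 17, 30), ("doc2", 10, 30), ("doc3", 28, 30)}
predicted = {("doc1", 24, 30), ("doc2", 17, 30), ("doc4", 12, 30)}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```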

Learning from experience – maximising performance
Improving performance through improved methods:
1. Favouring precision over recall

Multiple references to diagnosis for BRCID

Learning from experience – maximising potential
Improving performance through improved methods:
1. Favouring precision over recall: write rules that favour precision
Keep it simple, e.g. a gazetteer list to identify patients that live alone:
lives alone
lives by him/her self
lives on his/her own
Columns: App, Iterations, Recall, Precision, Status
lives alone: Dev
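A gazetteer of this kind is essentially a fixed phrase list. The sketch below mimics it in Python rather than in GATE: the three entries from the slide are written out with "him/her" and "his/her" expanded by hand, which is an assumption about how the list would be stored, and the function name is hypothetical.

```python
import re

# The slide's three gazetteer entries, with him/her and his/her expanded
# by hand (an assumption about how the list would be written out).
PHRASES = [
    "lives alone",
    "lives by himself",
    "lives by herself",
    "lives on his own",
    "lives on her own",
]


def _phrase_pattern(phrase):
    # Allow any run of whitespace between the words of a phrase.
    return r"\s+".join(re.escape(word) for word in phrase.split())


LIVES_ALONE = re.compile(
    r"\b(?:" + "|".join(_phrase_pattern(p) for p in PHRASES) + r")\b",
    re.IGNORECASE,
)


def lives_alone_mentions(text):
    """Return every gazetteer phrase found in the text, as written."""
    return [match.group(0) for match in LIVES_ALONE.finditer(text)]


print(lives_alone_mentions("She lives on her own in a first-floor flat."))
print(lives_alone_mentions("He lives with his wife and daughter."))
```

A short exact-phrase list like this misses paraphrases such as "no one else at home", which lowers recall, but the phrases it does match are rarely wrong.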

Learning from experience – maximising potential
Improving performance through improved methods:
1. Better rules – favouring precision over recall
2. Post processing

Post-processing: MMSE annotation codes applied locally
Valid
The MMSE numerator was larger than 30
The MMSE numerator was larger than the denominator
The MMSE result date is 10 years before the document's creation date
The MMSE numerator was missing
The MMSE result occurs on the same day as a previous result
Missing Date Information
The MMSE result date is more than 31 days after the CRIS record date
The MMSE result date is within 31 days of a previous result (and the..... result was the same)
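The annotation codes listed above read as a set of validation checks applied to each extracted result. The following is a minimal sketch of how such checks might be coded, not the BRC's actual post-processing: the record layout, the code strings, the order in which checks are applied, the use of a single record date for both the document creation date and the CRIS record date, and the 3650-day approximation of ten years are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class MmseResult:
    numerator: Optional[int]
    denominator: Optional[int]
    result_date: Optional[date]
    record_date: date  # assumed to stand for both document and CRIS record date


def code_result(result: MmseResult, previous: List[MmseResult]) -> str:
    """Return one code per result; the precedence of checks is an assumption."""
    if result.numerator is None:
        return "numerator missing"
    if result.numerator > 30:
        return "numerator larger than 30"
    if result.denominator is not None and result.numerator > result.denominator:
        return "numerator larger than denominator"
    if result.result_date is None:
        return "missing date information"
    # Ten years approximated as 3650 days.
    if result.result_date < result.record_date - timedelta(days=3650):
        return "result date 10 years before record date"
    if result.result_date > result.record_date + timedelta(days=31):
        return "result date more than 31 days after record date"
    dated = [p for p in previous if p.result_date is not None]
    if any(p.result_date == result.result_date for p in dated):
        return "same day as a previous result"
    if any(
        abs((result.result_date - p.result_date).days) <= 31
        and p.numerator == result.numerator
        for p in dated
    ):
        return "within 31 days of a previous result with the same score"
    return "valid"


record = date(2006, 4, 5)
current = MmseResult(24, 30, date(2006, 4, 3), record)
print(code_result(current, []))
print(code_result(MmseResult(35, 30, date(2006, 4, 3), record), []))
```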

MMSE coverage (chart of cases and instances): MMSE (structured); Text instances with MMSE; MMSE raw score/date (GATE); MMSE valid score/date (GATE)

Post-processing: supportive features
Add features that support / improve post-processing
Enables:
testing of recall and precision for different annotation types
selection of appropriate annotations for different analyses
context to be taken into account in post-processing, e.g.
- for male patient with Alzheimer's; DoB 1934; no other education annotation
- for female patient with depression; DoB 1964; other annotation level = degree
e.g. education annotation = her father failed art A-level
Level: GCSE
Rule: Fail
Subject: her father
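Features such as Level, Rule and Subject let post-processing decide which annotations are usable for a given analysis. The sketch below filters education annotations down to those that describe something the patient attained. The feature names come from the slide; the values "patient" and "Pass", the dictionary layout and the function name are hypothetical.

```python
def patient_education(annotations):
    """Keep annotations about a qualification the patient attained.

    Drops annotations whose Subject is someone else (e.g. "her father")
    and annotations whose Rule is "Fail".
    """
    return [
        a for a in annotations
        if a.get("Subject") == "patient" and a.get("Rule") != "Fail"
    ]


annotations = [
    # Feature values as shown on the slide for "her father failed art A-level".
    {"Level": "GCSE", "Rule": "Fail", "Subject": "her father"},
    # Hypothetical annotation about the patient herself.
    {"Level": "degree", "Rule": "Pass", "Subject": "patient"},
]

print(patient_education(annotations))
```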

Learning from experience – maximising potential
Improving performance through improved methods:
1. Better rules – favouring precision over recall
2. Post processing – supported by appropriate rules and features
3. Better development methodology

Methods to date…
BRC identifies need and assesses feasibility of using GATE
Small sample (e.g. 50 instances) manually coded
Initial application rules drafted, e.g. features and gazetteer requirements and definitions
Prototype application developed
New corpus (e.g. 50 instances) run through the prototype and manually corrected
Application v.6 created
All CRIS free text docs run through the application (c. 11 million)
Results (relevant annotations/features) loaded back into source SQL database
BRC Sheffield
Occasional unexpected weirdness!

Post-processing: MMSE annotation codes applied locally
The MMSE numerator was larger than 30
The MMSE numerator was larger than the denominator
The MMSE result date is 10 years before the document's creation date
The MMSE numerator was missing
The MMSE result occurs on the same day as a previous result
Missing Date Information
The MMSE result date is more than 31 days after the CRIS record date
The MMSE result date is within 31 days of a previous result (and the..... result was the same)

Methods to date…
BRC identifies need and assesses feasibility of using GATE
Small sample (e.g. 50 instances) manually coded
Initial application rules drafted, e.g. features and gazetteer requirements and definitions
Prototype application developed
Application v.6 created
All CRIS free text docs run through the application (c. 11 million)
Results (relevant annotations/features) loaded back into source SQL database
BRC Sheffield

Learning from experience – maximising potential
Improving performance through improved methods:
1. Better rules – favouring precision over recall
2. Post processing – include rules and features to support
3. Better development methodology
Play to GATE's strengths (don't ask GATE to do what you can do better yourself)
Know your data!

GATE accuracy – recall and precision (unseen data)
Columns: App, Iterations, Recall, Precision, Status
MMSE: 6 iterations, Operational
Diagnosis: Operational
Smoking status: Operational
Medication: Development
Education level: Development
Left school age: Development
SSD Interventions: 3 iterations, 0.96, Development
Lives alone: Development

Using GATE data in real research How good is good enough?

Using GATE data in real research
1. Investigating relationships between cancer treatment and mental health disorders
Using data from GATE applications:
MMSE
Smoking: 4609 smoking status features for 1039 patients, from a total linked data set of c. 3500 cases
Diagnosis
Pilot for Department of Health Research Capability Programme, linking data from different clinical sources (CRIS and Thames Cancer Registry)

Using GATE data in real research
2. Investigating cost of care related to cognitive function in people with Alzheimer's
Using data from GATE applications:
MMSE
Diagnosis: 803 new cases of Alzheimer's identified from a combined total of 4900 cases
Education
Lives alone
Social care
Care home
Medication
Collaboration with pre-competitive pharma consortium