Towards Evidence-Based Discovery Catherine Blake School of Information and Library Science University of North Carolina at Chapel Hill



Motivation
Relentless increase in electronically available text
–Life Sciences: 17 millionth entry added in April ,200 journals indexed; 12,000 new articles each week!
–Chemistry: more than 110,000 articles in 1 year alone
Consequences:
–Hundreds of thousands of relevant articles
–Implicit connections between literature go unnoticed
Shift from Retrieval to Synthesis

Information Overload
“One of the diseases of this age is the multiplicity of books; they doth so overcharge the world that it is not able to digest the abundance of idle matter that is every day hatched and brought forth into the world” - Barnaby Rich, 1613

Evidence-Based Discovery
“If I have seen further than others, it is by standing upon the shoulders of giants.” - Sir Isaac Newton
“We can't solve problems using the same kind of thinking we used when we created them.” - Albert Einstein
Goal: Facilitate Discovery from Text
–Facilitate: to make easy or easier (1)
–Discovery: a productive insight (1)
(1) American Heritage Dictionary

[Overview diagram. Label text: Education Discovery Science Evidence-based Practice Natural Language Processing Human Discovery and Synthesis Human-assisted Discovery and Synthesis Heterogeneous Literature Core Chemistry Breast Cancer Genomics Synthesis and Discovery Work Practices News DocSouth]

Outline
Motivation
Case Studies
–METIS: Human synthesis; Natural language processing
–Claim Jumping through Scientific Literature
Next Steps
Summary

Systematic Review
Process
–Formulate the problem
–Locate and select studies
–Assess quality of studies
–Collect data
–Analyze and present results
–Interpret results
–Improve and update review
28 months from initial idea to publication
Increased demand due to evidence-based medicine

Manual Synthesis
Select, Extract, Analyze, Verify
Guesswork guided by scientifically trained intuition - Rescher (1978)

Context Information
Study Information
–e.g. date, location, ...
Population Information
–e.g. gender, age, ...
Risk Factor or Intervention
–e.g. duration of exposure, confounders
Disease
–e.g. stage, confounders
[Axis labels: Loosely coupled to review focus; Tightly coupled to review focus]

Collaborative Information Synthesis

Key: Estimate Missing Information
What are people with Breast Cancer exposed to?
What are people in a similar population exposed to?
Are these rates significantly different?
Studies with Breast Cancer patients: facts for each study
–number of patients
–age of patients
–geographic location
–risk-factor exposure
–…
Database of risk factors (BRFSS): codebook
–question asked
–age, gender
–% responses
T. Tengs & N. D. Osgood (2001) “The link between smoking and Impotence: Two Decades of Evidence”, Preventive Medicine, 32:447-52
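
The question "Are these rates significantly different?" can be illustrated with a two-proportion z-test. This is only a sketch under stated assumptions: the slides do not say which statistical test METIS applies, the function name is hypothetical, and the counts in the usage note are made-up illustration values, not data from the study.

```python
import math


def two_proportion_z_test(exposed_1, n_1, exposed_2, n_2):
    """Compare two exposure rates with a pooled two-proportion z-test.

    Assumption: this is a generic textbook test, not necessarily the
    one used by METIS. Returns (z statistic, two-sided p-value).
    """
    p1 = exposed_1 / n_1
    p2 = exposed_2 / n_2
    # Pooled exposure rate under the null hypothesis of equal rates.
    pooled = (exposed_1 + exposed_2) / (n_1 + n_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if se == 0:
        raise ValueError("rates are degenerate (all exposed or none exposed)")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, calling `two_proportion_z_test(30, 100, 20, 100)` compares a hypothetical 30% exposure rate among study patients with a hypothetical 20% rate in a reference population of the same size.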

More than Automated Meta-Analysis
[Diagram. Label text: Systematic Review External database Entire study Main topic Secondary Information Key Information Synthesis]
Traditional analysis
–same study design
–medicine = RCT
–epidemiology = cohort
Information Synthesis
–any study that includes required information
–augment missing information

[Overview diagram repeated. Highlighted: Natural Language Processing]

METIS Information Extractor
Semantic Grammar
Features: words, numbers, and semantic types in the Unified Medical Language System (UMLS)
Information extracted:
–risk factor exposure (tobacco and alcohol)
–gender
–age (min, max, mean)
–start and end dates
–number of subjects with medical condition
–geographical location
Example pattern: {term;’age’} {term:’of’} {number;10<n2<110}{term;’to’}{number;10<n2<110}
Example sentence: The age of breast cancer subjects ranged between 20 to 64 years old.
{semantic type: neoplastic process, or disease}
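
The age pattern on this slide can be approximated with a regular expression. The sketch below is an assumption-laden illustration, not the METIS grammar: it ignores UMLS semantic types, allows arbitrary words between "age of" and the numbers, and adds a `low <= high` check that the slide does not mention. The names `AGE_RANGE` and `extract_age_range` are hypothetical.

```python
import re

# Hypothetical approximation of the slide's pattern: the words "age of",
# then (within the same sentence) two numbers joined by "to".
AGE_RANGE = re.compile(
    r"\bage\s+of\b[^.]*?\b(\d{1,3})\s+to\s+(\d{1,3})\b",
    re.IGNORECASE,
)


def extract_age_range(sentence):
    """Return (min_age, max_age) if a plausible age range is found, else None."""
    for match in AGE_RANGE.finditer(sentence):
        low, high = int(match.group(1)), int(match.group(2))
        # The slide constrains each number to the open interval (10, 110).
        # Requiring low <= high is an extra assumption made here.
        if 10 < low < 110 and 10 < high < 110 and low <= high:
            return low, high
    return None
```

Applied to the slide's example sentence, `extract_age_range` picks out 20 and 64 as the minimum and maximum age; a range such as "5 to 9" is rejected by the numeric bounds.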

METIS Info Extractor – Evaluation
Diverse text corpus
–epidemiology, surgery, biology, ...
–cohort studies, case-control trials, ...
Evaluation
–Metrics (precision, recall)
–Annotators (developer, domain expert, expert annotator, novice)
–Primary topic (breast cancer, impotence)
–Secondary information (tobacco and alcohol consumption)
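
The two metrics named above can be sketched as follows, assuming each extracted fact and each annotator-supplied fact is represented as a hashable item such as a (field, value) pair. That representation and the function name are assumptions made for illustration; the slides do not describe how METIS output was matched against annotations.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall of extracted facts against a gold standard.

    Both arguments are iterables of hashable items (assumed here to be
    (field, value) pairs). Returns (precision, recall).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted facts that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard facts that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

With this definition, an extractor that returns three facts of which two appear in a three-fact gold standard scores two thirds on both metrics.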

METIS Info Extractor – Recall

METIS Info Extractor – Precision

METIS Verifier
Verify information extracted
[Diagram labels: Electronic version of article; Converted Article]

METIS Analyzer
Meta-Analysis
–Developed for agricultural application
–Requires empirical studies with a quantitative outcome
–Unit of study is an article, not a person
–Result: a unitless metric called an effect size
Two common meta-analysis techniques
–Fixed-effects model
–Random-effects model
Evaluation: Compared generated effect size with examples in textbooks and published articles
Result: Same effect size
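
The two techniques named on this slide can be sketched as follows. This is a minimal illustration, assuming inverse-variance weighting for the fixed-effects estimate and the DerSimonian-Laird estimator of between-study variance for the random-effects estimate. The slides do not say which estimators METIS uses, and the function names and inputs are hypothetical.

```python
def fixed_effect(effects, variances):
    """Inverse-variance pooled effect size and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled effect, variance, and tau^2."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled_fixed, _ = fixed_effect(effects, variances)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    # Between-study variance, truncated at zero; undefined for one study.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Re-weight each study by its within-study plus between-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return pooled, 1.0 / re_total, tau2
```

When the studies agree closely, the estimated between-study variance is zero and both models return the same pooled effect size; when they disagree, the random-effects model spreads weight more evenly across studies.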

Synthetic Estimate Evaluation
[Charts: Tobacco Consumption; Alcohol Consumption]

Outline
Motivation
Case Studies
–METIS
–Claim Jumping: Human discovery; Natural language processing; Human-assisted discovery and synthesis
Next Steps
Summary

[Overview diagram repeated. Highlighted: Human Discovery and Synthesis]

Human Discovery
Day-to-day activities of scientists reflect
–the complex socio-technical environments in which successful creativity tools will eventually be embedded
–the human cognitive processing surrounding creativity
Unit of analysis: a paper or grant proposal
How do chemists transform an idea into a publication?
How do chemists arrive at their research question?

Approach
Recruitment
–experienced scientists (7-45 yrs)
–local chemists and chemical engineers
–response rate 84% (21/25)
Semi-structured interviews
Critical incident technique
1. seminal paper in their field
2. recent paper authored by the participant
3. paper authored by the participant that they were particularly proud of

Interview Questions
Discovery questions
–What is your definition of discovery?
–What evidence convinced you that the paper addressed the initial research questions?
–What factors limited the adoption and deployment of the discovery?
–How did you arrive at the research question?
–What, if any, existing evidence prompted the study/experiment?
–Were there any alternative explanations?
Information usage questions
–Other than the scientific literature, what information resources do you draw from to aid in your research processes?
–How many articles did you read last month that related to each of those projects?
–Is that typical of how many articles you read in a month for research projects?
–Do you read articles for another purpose? If so, what?
–How many hours do you spend reading journal articles for research projects?
–Which journals do you typically read and draw from?
–How would you characterize the journals that you read: are they only within your domain, or do you read journals that would be considered non-traditional in your research?
–If you only have a few minutes to read an article, what parts would you read?
–What do you do with the article once you have read it?

Chemists and Chemical Engineers
Compared with other scientists, chemists and chemical engineers
–read more (Brown, 1999)
–have more personal subscriptions to journals (Noble & Coughlin, 1997)
–spend more time reading (Tenopir & King, 2003)
–visit the library more often (Brown, 1999)
Consequences
–information is disseminated quickly
–information has a relatively short lifespan

Human Discovery Findings
Discovery definition
–Novelty
–Balance theory and experimentation
–Build on existing ideas
–Practical application
–Simplicity
Hypothesis generation
–Discussion
–Previous experiments
–Combine expertise
–Read literature
Hypothesis validation
–Iterative
–Tightly coupled

[Overview diagram repeated. Highlighted: Natural Language Processing]

Causal Relationships
Newspaper genre
–Causal relationships (Khoo, Chan, & Niu, 1998)
Biomedical genre
–Causes and treats (Price & Delcambre, 2005)
–Causal knowledge (Khoo, Chan, & Niu, 2000)
Universal Grammar
–Causatives (Comrie, 1974, 1981)
–Action verbs (Thomson, 1987)

Claim Definition
“To assert in the face of possible contradiction”
Example sentence reporting a claim
–“This study showed that Tamoxifen reduces the breast cancer risk”
Example Claim Framework
–Tamoxifen: agent
–reduces: change
–[breast cancer risk]: object
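
The agent, change, and object facets of the example claim map naturally onto a small record type. The sketch below is one hypothetical representation, not the author's data model: the class name, field names, and the `as_triple` helper are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One claim with the three facets shown in the slide's example.

    Hypothetical structure: field names mirror the facet names, and
    `sentence` keeps the source sentence the claim was annotated in.
    """
    agent: str
    change: str
    obj: str  # "object" facet; named obj to avoid shadowing the builtin
    sentence: str = ""

    def as_triple(self):
        """Return the claim as an (agent, change, object) tuple."""
        return (self.agent, self.change, self.obj)


example = Claim(
    agent="Tamoxifen",
    change="reduces",
    obj="breast cancer risk",
    sentence="This study showed that Tamoxifen reduces the breast cancer risk",
)
```

Storing claims as immutable records makes it straightforward to collect them in sets or use them as dictionary keys when comparing annotations across articles.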

The Claim Framework
Goal
–go beyond genes and proteins
–differentiate between different levels of confidence in the claim
–consider claims made in the full text
Working hypothesis
–literature will report findings using constructs within the Claim Framework
–human annotators will agree on facets

Preliminary Results
29 articles from TREC Genomics
–Total number of sentences: 5535
–Sentences with >=1 claim: 1250 (22.6%)
–Total number of claims: 3228
–Average claims per sentence: 2.51
–Claims that did not fit in the Framework: 31
Per article
–Average number of sentences: 191
–Average number of sentences with >=1 claim: 43

Distribution of Claim Categories
[Table. Columns: Category, Total (%), Pilot (%), Main (%). Rows: Explicit, Implicit, Observation, Correlation, Comparison, Total.]

All Documents
[Table. Columns: Annotation, Total (%), Words (Avg). Rows: Agent, Agent Direction, Agent Modifier, Object, Object Direction, Object Modifier, Change, Change Direction, Change Modifier, Claim Basis, Claim Basis Dir, Claim Basis Mod, Total.]

Inter-Annotator Agreement
Information Facet    Kappa   Agreement
Agent                0.71    substantial
Object               0.77    substantial
Change               0.57    moderate
Change+ChangeDir     0.88    almost perfect
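
A kappa statistic of this kind can be computed as in the sketch below. It assumes Cohen's kappa for two annotators and the Landis and Koch descriptive bands, whose labels match the wording used on the slide; the slides do not state which kappa variant or how many annotators were used, and the function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Expected agreement if the annotators labelled independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch descriptive label."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"
```

Under these bands, the values reported on the slide (0.71, 0.77, 0.57, 0.88) receive the same labels the slide gives them.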

Location of Claims
[Table. Header text: Total, Sentences With, %, Section, Claim, Total, section, claim. Rows: Abstract, Introduction, Method, Result, Discussion, Total.]

[Overview diagram repeated. Highlighted: Human-assisted Discovery and Synthesis]

User Study
–Timothy S. Carey, MD, MPH: Sarah Graham Kenan Professor of Medicine; Director, Cecil G Sheps Center for Health Services Research
–Ila Cote, PhD, DABT: Acting Division Director, US Environmental Protection Agency National Center for Environmental Assessment
–Michael T Crimmins PhD.: Mary Ann Smith Distinguished Professor of Chemistry UNC and Department Chair, Department of Chemistry
–Paul Jones: Clinical Associate Professor, School of Information and Library Science; Director of ibiblio.org
–Rudy L Juliano PhD.: Boshamer Distinguished Professor of Pharmacology; Principal Investigator, Carolina Center of Cancer Nanotechnology Excellence
–Steven W. Matson Ph.D.: Professor and Chair, Department of Biology
–Robert C Millikan DVM PhD: Barbara Sorenson Hulka Distinguished Professor, Department of Epidemiology, School of Public Health
–Dr. Rosa Perelmuter, PhD: Director, Moore Undergraduate Research Apprentice Program; Professor of Spanish and Assistant Dean, Academic Advising Program
–Jan F. Prins PhD.: Professor of Computer Science and Chairman, Department of Computer Science
–Alexander Tropsha, Ph.D.: Professor and Chair; Director, Laboratory for Molecular Modeling
–Suzanne West, PhD: Researcher, Health, Social and Economics Research, RTI International

[Overview diagram repeated]

Closing Comments
Accelerate synthesis
–Breast cancer study without METIS would take >13 years
–Without synthetic estimate = systematic review
Accelerate discovery
–Connections between literature
–Speculative and orthogonal views
Human discovery and synthesis
–As important if not more so than automation
“Tap the vast reservoir of human knowledge” - Louis Round Wilson, 1929

Acknowledgements
METIS
Funded in part by
–California Breast Cancer Research program
–University of California, Irvine
Thanks to user groups
–Particularly to Dr. Adams and Dr. Tengs
Academic mentoring
–Primary Advisor: Dr. Wanda Pratt
–Medical Mentor: Dr. Catherine Carpenter
–Co-Advisors: Dr Dennis Kibler and Dr Michael Pazzani
–Committee Member: Dr Paul Dourish
Claim Jumping
Funded in part by
–Faculty fellowship from the Renaissance Computing Institute
–UNC Faculty Award
Thanks to collaborators
–Nassib Nassar and Mats Rynge (RENCI)
–Amol Bapat and Ryan Jones (SILS)
Chemists and Chemical Engineers Study
Funded in part by
–NSF Center for Environmentally Responsible Solvents and Processes

Questions and Comments Welcome Catherine Blake School of Information and Library Science University of North Carolina at Chapel Hill

Publication Bias
Studies that find a correlation between a risk factor and disease are more likely to be published (Easterbrook et al., 1991; Ingelfinger et al., 1994)
METIS provides a new way to explore this bias
Bias introduced by authors, editors, funding, ...