Leveraging Crowdsourcing to Help Classify Social Media Data for Medical and Patient Safety Insights

Alex Gartland1, Jeffery L. Painter, JD2, Tim A. Casperson3, Greg E. Powell, PharmD, MBA3
1University of North Carolina at Chapel Hill, Chapel Hill, NC; 2JiveCast, Raleigh, NC; 3GlaxoSmithKline, Research Triangle Park, NC

Introduction
The classification and curation of social media data for use in drug safety is both time consuming and costly. This research aims to demonstrate that crowdsourcing can classify social media data accurately and efficiently, potentially reducing the associated cost and allowing trained drug safety experts to focus their time and expertise on assessing the data.

Background
Pharmacovigilance, more commonly called drug safety, is primarily focused on identifying and evaluating safety signals (unwanted or unexpected effects from taking a medicine). The current gold-standard data source in the field is spontaneously reported adverse events collected in the FDA Adverse Event Reporting System (FAERS)1. The major drawbacks of spontaneous data are under-reporting and limited quality and generalizability. Over the last few years, pharmacovigilance efforts have evolved to make greater use of observational data (such as electronic health records and insurance claims), but these sources also have limitations. More recently, drug safety experts have begun to look at social media as a timelier source of geographically diverse pharmacovigilance information2.

Objective
This research evaluates whether crowdsourcing, via Amazon Mechanical Turk (MTurk), can be used as a platform to accurately and efficiently classify medically relevant concepts in social media posts.

Methods
A dataset of approximately 15,000 social media posts, previously curated by GlaxoSmithKline drug safety scientists and physicians, served as the reference dataset against which the crowdsourced classifications were compared. MTurk was chosen as the platform for crowdsourcing basic classification of the social media posts in this reference dataset.

Fig 1: Curation tool used by experts
Fig 2: Mechanical Turk user interface

The goals were to measure cost, accuracy, and time (Phase 1) and quality and time (Phase 2). Phase 1 consisted of three identical batches of 500 randomly selected posts, tested at three price points: $0.03 per post, $0.03 per post with a $1.00 bonus per 50 posts, and $0.05 per post. Each post was reviewed by three unique MTurk workers (Turkers), and their responses were compared to the reference dataset using a majority voting system. Phase 2 consisted of 5,000 randomly selected posts, each reviewed only once by a Turker and then compared to the reference dataset. To help ensure quality, we implemented measures to detect poor-quality work (e.g., use of bots) and applied an accept/reject system to the posts completed by Turkers.
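The Phase 1 scoring can be illustrated with a minimal sketch, shown below. It assumes a simple per-question majority vote over the three Turker responses followed by a comparison against the expert label; the data structures, labels, and function names are hypothetical and chosen only for illustration, not taken from the study.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label among the (three) Turker responses."""
    return Counter(labels).most_common(1)[0][0]

def percent_match(turker_answers, reference):
    """Share of (post, question) pairs where the majority-voted crowd label
    agrees with the expert reference label, expressed as a percentage.

    turker_answers: {post_id: {question: [label_w1, label_w2, label_w3]}}
    reference:      {post_id: {question: expert_label}}
    """
    matches = total = 0
    for post_id, questions in reference.items():
        for question, expert_label in questions.items():
            crowd_label = majority_vote(turker_answers[post_id][question])
            matches += (crowd_label == expert_label)
            total += 1
    return 100.0 * matches / total

# Purely illustrative example with two posts and one question ("Proto-AE"):
reference = {"post_1": {"Proto-AE": "yes"}, "post_2": {"Proto-AE": "no"}}
turker_answers = {
    "post_1": {"Proto-AE": ["yes", "yes", "no"]},  # majority "yes" -> match
    "post_2": {"Proto-AE": ["yes", "no", "yes"]},  # majority "yes" -> mismatch
}
print(percent_match(turker_answers, reference))  # 50.0
```

In Phase 2, where each post was reviewed by a single Turker, the same comparison applies with the single response used directly in place of the majority vote.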
Results
Phase 1 yielded the following overall percent agreement with the reference dataset: 92.8% for both the "$0.03" and the "$0.03 with a bonus" batches and 92.6% for the "$0.05" batch. Curation of each of the three Phase 1 batches took approximately 6 days. Phase 2 was curated in less than a day and a half and had an overall accuracy of 91.8%.

Phase 1 and 2 Summary Statistics

  Batch                                          Posts   Questions   Number Match   Percent Match   Curation Time (h)
  Phase 1
    $0.03 per post                               500     11,000      10,217         92.8%           147
    $0.03 per post + $1.00 bonus per 50 posts    500     11,000      10,216         92.8%           n/a
    $0.05 per post                               500     11,000      10,189         92.6%           146
  Phase 2
    $0.18 per post                               5,000   110,000     100,981        91.8%           33

Phase 2 Summary Statistics by Question (5,000 posts per question)

  Question                                   Number Match   Percent Match (%)
  Adverse Event Post Mentions
    Proto-AE*                                4,177          83.5
    Time to Onset                            4,823          96.5
    Outcome                                  4,542          90.8
  Poster Information
    Poster Type                              3,481          69.6
  Post Mentions
    Personally Identifiable Information      4,906          98.1
    Concomitant Medications                  4,461          89.2
    Occupation                               4,907          98.1
    Education                                4,963          99.3
    Smoking                                  4,874          97.5
    Alcohol Use                              4,973          99.5
    Illicit Drug Use                         4,949          99.0
    Pregnancy                                4,968          99.4
    Health Services Information              4,728          94.6
    Seeking Information                      4,711          94.2
    Drug Abuse                               4,891          97.8
    Product Complaint                        4,694          93.9
    Medical History                          4,800          96.0
  Product Information
    Route                                    4,181          83.6
    Formulation                              3,911          78.2
    Dosing                                   4,719          94.4
    Indication                               4,031          80.6
    Benefit Discussed                        4,291          85.8
  Total                                      100,981        91.8

* Proto-AE: for the purposes of this study, we did not attempt to identify a true adverse event, but looked for the qualifications that would be typical of an AE; we therefore refer to all potential AE posts as Proto-AEs.

Fig 3: Side-by-side box plots of accuracy in each test phase

Discussion
These results show that the average Turker can review and classify social media posts in preparation for a range of medical insights with over 90% accuracy compared to a trained drug safety expert. Overall, the cost of curation was relatively low ($0.03 to $0.18 per post), and the rate of review increased as the pay increased. At the $0.18-per-post rate, Turkers would have been able to review the entire reference dataset (~15,000 posts) in about 12 days, whereas the original review by internal staff took approximately 3 months to complete. Additional research is required to further assess the strengths and limitations of crowdsourcing for medical insights.
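As a rough sense check of the cost figures discussed above, the sketch below estimates total worker payments per batch. It assumes the listed reward was paid per assignment (i.e., to each worker for each post) and that the bonus was likewise paid per worker; Amazon's platform fees are ignored. These assumptions and the resulting dollar totals are illustrative only and are not reported in the poster.

```python
def batch_cost(posts, reward_per_post, workers_per_post, bonus_per_50_posts=0.0):
    """Estimated worker payments for one batch, under the assumptions above."""
    base = posts * reward_per_post * workers_per_post
    bonus = (posts // 50) * bonus_per_50_posts * workers_per_post
    return round(base + bonus, 2)

# Phase 1: 500 posts, three workers per post
print(batch_cost(500, 0.03, 3))                          # $0.03 batch          -> 45.0
print(batch_cost(500, 0.03, 3, bonus_per_50_posts=1.0))  # $0.03 + bonus batch  -> 75.0
print(batch_cost(500, 0.05, 3))                          # $0.05 batch          -> 75.0

# Phase 2: 5,000 posts, one worker per post
print(batch_cost(5000, 0.18, 1))                         # $0.18 batch          -> 900.0
```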
Conclusion
Crowdsourcing is a cost-effective and efficient method for classifying basic medical information contained in social media posts.

References and Acknowledgments
1. FDA. FDA Adverse Event Reporting System. 2016 (accessed Feb 8, 2016).
2. Powell GE, Seifert HA, Reblin T, Burstein PJ, Blowers J, Menius JA, Painter JL, Thomas M, Pierce CE, et al. Social Media Listening for Routine Post-Marketing Safety Surveillance. Drug Safety.

We also thank Arooj Akhtar and Richard Le for their work on this project, including assisting with study design and data analysis.