April 10-12, Chicago, IL Ensuring Compliance of Patient Data with Big Data and BI Ayad Shammout & Denny Lee.

Presentation transcript:

1 April 10-12, Chicago, IL Ensuring Compliance of Patient Data with Big Data and BI Ayad Shammout & Denny Lee

2 Please silence cell phones

3 Agenda
A Quick Big Data Primer
Healthcare and Big Data
Compliance and Auditing
SQL Compliance Project
Compliance and Auditing with Big Data and BI
Big Data: Unstructured Volumes of Data
Analytics: PowerPivot, Power View

4 What is Big Data?
Volume: Exceeds physical limits of vertical scalability
Velocity: Decision window small compared to data change rate
Variety: Many different formats make integration expensive
Variability: Many options or variable interpretations confound analysis

5 Data explosion
10x increase every five years
85% from new data types
Volume, Velocity, Variety
Hadoop, Cloud
"By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent." – Gartner, Mark Beyer, “Information Management in the 21st Century”

6

7 Big Data Business Value

8 Data

9 Hadoop: The most visible face of Big Data

10 HDInsight: Visit HadoopOnAzure.com

11 Healthcare and Big Data

12 Healthcare and IT
Often the laggard in technology
Yet application of IT to healthcare can radically change what we can do:
Genomic sequencing
Proteomic sequencing
Incidence prediction

13 Healthcare Big Data Example Scenarios
Clinical Trial Deviations: Viagra was originally developed to lower blood pressure and treat angina. Now it's used to help with newborn pulmonary hypertension and altitude sickness.
Incidence Prediction: Patients who missed 4 or more visits are twice as likely to have an asthmatic incident. A particular cardiac monitor sine wave points to a high likelihood of heart attack.
Campaigns: Social media and advertising campaigns to understand user behavior and sentiment.
Patient Satisfaction: Social media and advertising campaigns to understand user behavior and sentiment.

14 BIDMC Auditing Scenario
Auditing is a critical component of HIPAA in ensuring patient privacy:
1 billion+ rows of audit data
146 mission-critical clinical applications
Comprehensive audits yield 300-500k transactions/day
HIPAA requires an audit system with 20 years of data
Auditing Project:
Available to the community as part of the Compliance SDK
Updating for SQL Server 2012, HDInsight, Power View, and MobileBI*
"Creating an enterprise tool for consolidated storage, reporting and alerting of all application audit data - that's cool!" John Halamka’s Cool Technology of the Week (Wellsphere Top Health Blogger, Health Impact Award)
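To give a feel for the scale the audit figures imply, here is a rough back-of-the-envelope sketch. It assumes the 300-500k transactions/day rate stays constant over the full 20-year retention window and that each transaction produces one audit row; neither assumption is stated on the slide, and the function name is illustrative.

```python
DAYS_PER_YEAR = 365  # ignoring leap days for a rough estimate


def projected_rows(transactions_per_day: int, years: int) -> int:
    """Project total audit rows, assuming one row per transaction (an assumption)."""
    return transactions_per_day * DAYS_PER_YEAR * years


# Figures from the slide: 300-500k transactions/day, 20 years of retention.
low = projected_rows(300_000, 20)
high = projected_rows(500_000, 20)

print(f"Projected rows over 20 years: {low:,} to {high:,}")
```

Under those assumptions the retained audit data lands in the low billions of rows, which is in line with the "1 billion+ rows" already on hand.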

15 BIDMC Compliance Project
Diagram labels: SSIS; HDInsight Windows; HDInsight Azure; SQL Server 2008/2012; Audit Logs; ETL Logs to HDFS; Use Excel 2013 PowerPivot and Power View; SSAS (tabular)

16 Auditing Sensitive Information

17 Storage Infrastructure
Audit Logs
Transfer files to ASV via AzCopy, CloudExplorer, etc.

18 Storage Infrastructure
Hadoop on Azure: Compute Nodes (Medium VMs)
Azure Storage Vault (ASV): Azure Blob Storage
Azure Flat Network Storage

19 Storage Infrastructure
Hadoop on Azure: Compute Nodes (Medium VMs)
Azure Storage Vault (ASV): Azure Blob Storage
Azure Flat Network Storage
Stream data to compute; push data back to storage
map, sort, shuffle, reduce
http://dennyglee.com/2013/03/18/why-use-blob-storage-with-hdinsight-on-azure/

20 SSIS to HDInsight

21 SSIS Processing

22 SSAS Tabular of HoA Audit Data

23 Hadoop / Auditing: File sizes
Currently testing gz vs. raw
E.g. a 12 MB raw text file vs. a 633 KB gz file (~20x compression)
20x smaller size, roughly the same query time
Approximately the same map / reduce task utilization
File size is 250 MB-1 GB; the SSIS package takes care of the size
Future testing: avro, protobuf

Query                                     Duration (s)
select count(*) from sql_audit_asv_raw    56.066
select count(*) from sql_audit_asv_gz     58.994
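The gz vs. raw comparison can be reproduced in miniature. The sketch below is not the presenters' test harness: it uses Python's standard gzip module on a made-up, highly repetitive audit-log line (the sample line and helper name are assumptions), then recomputes the ratio and query-time difference from the slide's own figures.

```python
import gzip


def gzip_ratio(raw: bytes) -> float:
    """Return the raw size divided by the gzip-compressed size."""
    compressed = gzip.compress(raw)
    return len(raw) / len(compressed)


# Hypothetical audit-log line, repeated to mimic repetitive log text.
sample_line = b"2013-03-18 10:15:02\tapp01\tuser42\tselect col1, col2 from tableA\n"
sample = sample_line * 10_000

print(f"sample: {len(sample)} bytes raw, {gzip_ratio(sample):.1f}x smaller as gz")

# The slide's own figures: 12 MB raw vs. 633 KB gz (1 MB taken as 1024 KB).
slide_ratio = (12 * 1024) / 633
print(f"slide file ratio: {slide_ratio:.1f}x")

# Query durations from the slide: the gz table is only slightly slower.
slowdown = 58.994 / 56.066 - 1
print(f"gz query slowdown: {slowdown:.1%}")
```

Real audit logs will compress less dramatically than this synthetic sample, since every line here is identical; the slide's measured ratio of about 19x is the figure to trust.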

24 Hadoop / Auditing: Formats
For ease of processing, replace carriage returns within embedded SQL statements, e.g. a statement split across lines, such as
select col1, col2 <CR> from tableA
becomes
select col1, col2 from tableA
This allows you to create a Hive table using CR as the row delimiter (i.e. the data does not need things like SQL quoted identifiers)
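A minimal sketch of the carriage-return cleanup described on this slide follows. The presenters do this inside their SSIS package; the Python version here only illustrates the idea, and the function names and the tab field delimiter are assumptions rather than details from the deck.

```python
import re


def flatten_sql(statement: str) -> str:
    """Collapse CR/LF runs (and the whitespace around them) into single spaces."""
    return re.sub(r"\s*[\r\n]+\s*", " ", statement).strip()


def to_log_row(fields: list[str]) -> str:
    """Join fields into one tab-delimited row (tab delimiter is an assumption)."""
    return "\t".join(flatten_sql(field) for field in fields)


# A multi-line embedded SQL statement, as it might appear in an audit record.
embedded = "select col1, col2\r\nfrom tableA"
print(flatten_sql(embedded))

# Hypothetical audit record: after cleanup the row contains no line breaks,
# so each record occupies exactly one line of the file loaded into Hive.
row = to_log_row(["2013-03-18", "app01", "select col1,\n  col2\nfrom tableA"])
print(row)
```

With every record on a single line, the line break can serve as the row delimiter of the Hive table without any quoting scheme for embedded SQL text.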

25

26 BI Connectivity
SQOOP, HiveODBC, Templeton, CSV, etc.

27 Big Data … Excel-lerated!
2 servers, 3 months: 110 GB of binary files
SSIS extraction: 1.2 GB of text
gz: 120 MB
Hadoop to PowerPivot: 6 MB
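The size reductions on this slide are easier to compare as ratios. The short sketch below works them out from the slide's figures, treating 1 GB as 1024 MB; the stage labels are paraphrased from the slide.

```python
# (stage label, size in MB), taken from the slide; 1 GB treated as 1024 MB.
stages = [
    ("binary audit files", 110 * 1024),
    ("SSIS-extracted text", 1.2 * 1024),
    ("gz-compressed text", 120),
    ("PowerPivot workbook", 6),
]

# Reduction factor between each consecutive pair of stages.
for (prev_name, prev_mb), (name, mb) in zip(stages, stages[1:]):
    print(f"{prev_name} -> {name}: {prev_mb / mb:.1f}x smaller")

overall = stages[0][1] / stages[-1][1]
print(f"overall: {overall:,.0f}x smaller")
```

The largest single reduction, roughly 92x, comes from the SSIS extraction step; gz compression and the Hadoop-to-PowerPivot step contribute about 10x and 20x respectively.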

28 PowerPivot workbook of HoA Audit data

29 Power View of HoA Audit Data

30 Win a Microsoft Surface Pro! Complete an online SESSION EVALUATION to be entered into the draw. Draw closes April 12, 11:59pm CT. Winners will be announced on the PASS BA Conference website and on Twitter. Go to passbaconference.com/evals or follow the QR code link displayed on session signage throughout the conference venue. Your feedback is important and valuable. All feedback will be used to improve and select sessions for future events.

31 Thank you! Diamond Sponsor, Platinum Sponsor

