Moving Towards A Data Repository That Facilitates Data Analysis CHOP November 18, 2009 1.

Presentation transcript:

1 Moving Towards A Data Repository That Facilitates Data Analysis
CHOP, November 18, 2009

2 Relational Database Design

3 Normalization
Normalization: the process of efficiently organizing data in a database to reduce redundancy
Goal: consistency of data
– Store each piece of data once and only once!
– security
– disk space
– speed of queries
– efficiency of database updates
– data integrity
A normalized database contains no aggregations and no calculated fields

4 Data Anomalies

5 Unnormalized data set
Patient ID | Name | Address | DOB | Doc | Appt Date | Location | DX
111111 | Cindy Marselis | 2320 Edge Hill Road | 1/11/64 | Armstrong | 9/1/09 11:00 AM | Alter 2011 | Herniated Disc, Flu
111111 | Cindy Marselis | 9331 Rising Sun Avenue | 1/11/64 | Morningstar | 9/1/09 11:00 AM | Alter 2011 | Herniated Disc
111111 | Cindy Marselis | 2320 Edge Hill Road | 1/11/64 | Allen | 11/1/09 10:00 AM | Alter 2012 | Psoriasis
222222 | Kathryn Marselis | 2320 Edge Hill Road | 11/3/04 | Dershaw | 8/1/09 11:00 AM | Speakman 105 | Well baby check
111111 | Cindy Schwartz | 9331 Rising Sun Avenue | 1/11/64 | Armstrong | 8/11/09 3:00 PM | Alter 105 | Psoriasis, Herniated Disc
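A minimal sketch of how some of the unnormalized rows above might be split into separate tables so that each fact is stored once. The table and column names (patient, appointment, appointment_dx) are assumptions made for illustration and are not the schema shown on the slides; the dates are assumed to be in US month/day/year order, and only a subset of the rows is loaded.

```python
import sqlite3

# Hypothetical normalized schema; all names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    address    TEXT NOT NULL,
    dob        TEXT NOT NULL
);
CREATE TABLE appointment (
    appt_id    INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    doctor     TEXT NOT NULL,
    appt_time  TEXT NOT NULL,
    location   TEXT NOT NULL
);
CREATE TABLE appointment_dx (
    appt_id INTEGER NOT NULL REFERENCES appointment(appt_id),
    dx      TEXT NOT NULL,
    PRIMARY KEY (appt_id, dx)
);
""")

# Each patient is stored once, so name and address cannot disagree across visits.
conn.executemany("INSERT INTO patient VALUES (?, ?, ?, ?)", [
    (111111, "Cindy Marselis", "2320 Edge Hill Road", "1964-01-11"),
    (222222, "Kathryn Marselis", "2320 Edge Hill Road", "2004-11-03"),
])
conn.executemany("INSERT INTO appointment VALUES (?, ?, ?, ?, ?)", [
    (1, 111111, "Armstrong", "2009-09-01 11:00", "Alter 2011"),
    (2, 111111, "Allen", "2009-11-01 10:00", "Alter 2012"),
    (3, 222222, "Dershaw", "2009-08-01 11:00", "Speakman 105"),
])
# One row per diagnosis replaces the multi-valued DX column.
conn.executemany("INSERT INTO appointment_dx VALUES (?, ?)", [
    (1, "Herniated Disc"), (1, "Flu"), (2, "Psoriasis"), (3, "Well baby check"),
])

# An address change is a single-row update rather than an edit to every visit row.
conn.execute("UPDATE patient SET address = ? WHERE patient_id = ?",
             ("9331 Rising Sun Avenue", 111111))

rows = conn.execute("""
    SELECT p.name, p.address, a.doctor, d.dx
    FROM appointment a
    JOIN patient p ON p.patient_id = a.patient_id
    JOIN appointment_dx d ON d.appt_id = a.appt_id
    WHERE p.patient_id = 111111
    ORDER BY a.appt_id, d.dx
""").fetchall()
for row in rows:
    print(row)
```

Every joined row for patient 111111 now reports the same address, which is the update anomaly the unnormalized layout cannot rule out. A fuller design would likely also move doctors and locations into their own tables.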

6 Normalized db - before

7 Normalized db - after

8 Example of Appointment Entity Relationship Diagram

9 Structured, free text, unstructured text

10 Free text
Issues with string searches
– Must match exactly in case, punctuation, spelling, etc.
Use lookup tables where possible
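A small sketch of the string-search issue, assuming a handful of made-up diagnosis entries. The lookup table `DX_LOOKUP` and its codes are hypothetical and only illustrate the idea of storing a code instead of typed text.

```python
# Free-text diagnosis values as different users might type them
# (illustrative data, not taken from the slides).
free_text = ["Herniated Disc", "herniated disc", "Herniated  disc.", "Psoriasis"]

# An exact string search misses variants that differ in case, spacing, or punctuation.
exact_hits = [v for v in free_text if v == "Herniated Disc"]

# Hypothetical lookup table: users pick a code, and the label is stored once.
DX_LOOKUP = {1: "Herniated Disc", 2: "Psoriasis", 3: "Flu"}
coded = [1, 1, 1, 2]  # the same four entries captured as lookup codes

code_hits = [c for c in coded if c == 1]

print(len(exact_hits), "of 3 herniated disc entries found by exact text match")
print(len(code_hits), "of 3 found by lookup code for", DX_LOOKUP[1])
```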

11 Unstructured Text
Gartner: white-collar workers spend 30 to 40% of their time managing documents
Merrill Lynch: more than 85% of business information exists as unstructured data
– e-mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations and Web pages
In a relational db, unstructured data is data that can't be stored in rows and columns
– stored in a BLOB (binary large object)
– e-mail files, word-processing text documents, PowerPoint presentations, JPEG and GIF image files, and MPEG video files
Metadata (data about data) can be stored
http://www.information-management.com/issues/20030201/6287-1.html
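A minimal sketch of storing an unstructured document as a BLOB alongside queryable metadata, using Python's built-in sqlite3 module. The `document` table, its columns, and the sample note are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE document (
    doc_id     INTEGER PRIMARY KEY,
    file_name  TEXT NOT NULL,   -- metadata
    mime_type  TEXT NOT NULL,   -- metadata
    created_on TEXT NOT NULL,   -- metadata
    content    BLOB NOT NULL    -- the unstructured payload itself
)
""")

# Illustrative payload; a real system would read the bytes of a file.
payload = b"Patient reports lower back pain radiating to the left leg."
conn.execute(
    "INSERT INTO document (file_name, mime_type, created_on, content) "
    "VALUES (?, ?, ?, ?)",
    ("clinic_note.txt", "text/plain", "2009-09-01", payload),
)

# The metadata columns can be filtered directly; the BLOB is opaque to SQL
# apart from simple operations such as length().
row = conn.execute(
    "SELECT file_name, length(content), content FROM document WHERE mime_type = ?",
    ("text/plain",),
).fetchone()
print(row[0], row[1], "bytes")
```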

12 Approaches to structured and unstructured data
1. Single database: consolidates all structured and unstructured data together
– expensive to buy and maintain
– large volume of data can clog the database, making it slow and inefficient
2. Two databases: one for structured data and one for unstructured data
– avoids performance issues with structured data
– significant performance limitations for unstructured data

13 Approaches to structured and unstructured data
3. Unstructured data left on file servers, with a database that records links to the unstructured data files
– avoids issue with volumes of data
– fragile, as links break when files and folders are moved around
– must create a link every time a new document is created
4. Complex and expensive connectors tap into all databases and file servers, providing a unified view of the data
– expensive and complex, requiring purchase and maintenance of multiple databases and file servers, plus the added cost of all required connectors
5. Patents currently under development
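A rough sketch of approach 3, showing why stored links are fragile: the database keeps only a path, so a file that is moved or renamed leaves a dangling link. The `doc_link` table and the file names are hypothetical.

```python
import os
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_link (doc_id INTEGER PRIMARY KEY, path TEXT NOT NULL)")

with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "radiology_note.txt")
    with open(original, "w") as f:
        f.write("narrative text")

    # A link must be recorded every time a new document is created.
    conn.execute("INSERT INTO doc_link (path) VALUES (?)", (original,))

    # Someone renames the file on the file server; the database is not told.
    os.rename(original, os.path.join(folder, "renamed_note.txt"))

    # A periodic check can at least report which links no longer resolve.
    broken = [path for (path,) in conn.execute("SELECT path FROM doc_link")
              if not os.path.exists(path)]

print(len(broken), "broken link(s)")
```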

14 Certification Commission for Health Information Technology (CCHIT) EHR Construct
EMAR: Electronic Medication Administration Record
CPOE: Computerized Physician Order Entry
PFS: Physician Fee Schedule
OC/RR: Physician Order Communication/Results Retrieval
R-ADT: Registration Admission Discharge Transfer

15 Data Warehousing

16 Pressures Driving Need for Business Intelligence and Data Warehousing
External and internal forces require tactical and strategic decisions
Search for competitive advantage
Business environments are dynamic
Decision-making cycle time is reduced

17 Operational vs. Decision Support Data
Operational data
– Relational, normalized database
– Optimized to support transactions
– Real-time updates
DSS data
– Snapshot of operational data
– Summarized
– Large amounts of data
Data analyst viewpoint
– Timespan
– Granularity
– Dimensionality

18 Creating a Data Warehouse

19 Codd’s Key Data Warehouse Rules
Separated from operational environment
Integrated data
Historical data over a long time horizon
Snapshot data captured at a given time
Subject-oriented data
Mainly read-only data with periodic batch updates from operational sources; no online updates

20 Codd’s Key Data Warehouse Rules contd.
Contains different levels of data detail
– Current and old detail
– Lightly and highly summarized
Metadata (data about the data) are critical components
– Identify and define data elements
– Provide the source, transformation, integration, storage, usage, relationships, and history of data elements

21 Decision Support Systems (DSS)

22 DSS Components

23 Decomposition of DSS – Operational Data
o Tumor registry
o A/D/T
o Radiology narrative
o Pathology narrative
o Lab results
o Patient Accounting
o Charge Master

24 Decomposition of DSS – External Data
o Research spider
o Treatment guidelines
o Reimbursement schedules
o NCI/NIH protocols

25 Decomposition of DSS – ETL
Rationalize normal lab values
Transform gender codes and free text
Narrative dumps
Doctor cleansing
o Similar names
o Which practice gets credit?
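One possible way to flag similar doctor names for the cleansing step is sketched here with the standard-library `difflib.SequenceMatcher`. The name spellings, the `normalize` helper, and the 0.85 threshold are all assumptions for illustration; a real cleansing pass would need review by people who know the physicians, and it does not settle which practice gets credit.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def similar(a, b, threshold=0.85):
    """Treat two names as candidates for the same doctor above the threshold."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

# Illustrative spellings as they might arrive from different source systems.
names = ["Armstrong, J.", "ARMSTRONG J", "Armstrong, Jon", "Morningstar, A."]

# Greedy grouping: compare each name with the first member of each cluster.
clusters = []
for name in names:
    for cluster in clusters:
        if similar(name, cluster[0]):
            cluster.append(name)
            break
    else:
        clusters.append([name])

for cluster in clusters:
    print(cluster)
```

Clusters with more than one member are candidates for manual review rather than automatic merges.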

26 ETL – Extraction, Transformation, Load
Extract: data from source systems
Transform: cleanse data for consistency and output exceptions
o Apply business rules
o Select certain columns to load (not null records)
o Translate coded values (1, M, male = 0)
o Derive new calculated values (sale_amount = qty * unit_price)
o Join data from multiple sources (lookup, merge)
o Aggregate (rollup/summarize data – average LOS for each doctor by DRG)
o Transpose/pivot (turning columns into rows)
o Data validation
Load: data into repository
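A short sketch of a few of the transform steps listed above: skipping null records, translating coded values, validating data with an exceptions output, and aggregating average LOS for each doctor by DRG. The field names, the sample rows, the DRG labels, and the female code (1) are assumptions; only the mapping of 1, M, and male to 0 comes from the slide.

```python
from collections import defaultdict

# Illustrative extracted rows; field names and values are assumptions.
source_rows = [
    {"doctor": "Armstrong", "drg": "DRG-A", "gender": "M",      "los_days": 3},
    {"doctor": "Armstrong", "drg": "DRG-A", "gender": "female", "los_days": 5},
    {"doctor": "Allen",     "drg": "DRG-B", "gender": "1",      "los_days": 4},
    {"doctor": "Allen",     "drg": "DRG-B", "gender": "?",      "los_days": 6},
    {"doctor": None,        "drg": "DRG-B", "gender": "F",      "los_days": 2},
]

# 1, M, male -> 0 follows the slide; the female codes are assumed.
GENDER_CODES = {"1": 0, "m": 0, "male": 0, "2": 1, "f": 1, "female": 1}

def transform(rows):
    """Cleanse rows and return (clean_rows, exceptions)."""
    clean, exceptions = [], []
    for row in rows:
        if row["doctor"] is None:            # load only non-null records
            exceptions.append((row, "missing doctor"))
            continue
        code = GENDER_CODES.get(str(row["gender"]).strip().lower())
        if code is None:                     # data validation
            exceptions.append((row, "unknown gender code"))
            continue
        clean.append({**row, "gender": code})
    return clean, exceptions

def average_los(rows):
    """Aggregate: average length of stay for each (doctor, DRG) pair."""
    totals = defaultdict(lambda: [0, 0])     # key -> [sum of days, row count]
    for row in rows:
        bucket = totals[(row["doctor"], row["drg"])]
        bucket[0] += row["los_days"]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

clean_rows, exceptions = transform(source_rows)
summary = average_los(clean_rows)            # what would be loaded into the repository
print(summary)
print([reason for _, reason in exceptions])
```

Keeping the rejected rows with a reason, instead of silently dropping them, gives the team something concrete to review.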

27 ETL Best Practice
ETL: 60-80% of development effort
Create a multi-departmental team charged with reaching consensus on Transformation!
Review exceptions carefully
o Indicator of issues with operational db design
o Indicator of changes needed in transformation

28 Decomposition of DSS – Business Data
Business data – central repository
Includes metadata: source, format, timing of feeds
Characteristic | Factors
Integrated | Centralized; holds data retrieved from entire organization
Subject-Oriented | Optimized to give answers to diverse questions; used by all functional areas
Time Variant | Flow of data through time; projected data
Non-Volatile | Data never removed; always growing

29 Decomposition of DSS – Business Model Data
Comprehensive Cancer Center definition of a patient
o Must have seen a physician for suspected or confirmed benign or malignant condition
o What about patients seen for screening mammography?

30 Decomposition of DSS – End user query tool
Web-based or client-server?
OLAP – Online Analytical Processing
o Microsoft
o Business Objects (bought by SAP)
o MicroStrategy
o Cognos (bought by IBM)
o Oracle (includes Hyperion)

31 Decomposition of DSS – End-user tool
o Drill down functionality
o Roll up
o Charts – not data level
o Export features
http://demos.telerik.com/aspnet-ajax/chart/examples/functionality/drilldown/defaultcs.aspx

32 Design – Star Schema

33 Star Schema
Center fact table
o usually contains numeric information for summary reports
Dimension tables radiate from the fact table
Each dimension table is hierarchical
‘Rollup’ allows comparison of types of hospitals, disease categories, or even patient age bands
Creates a logical data cube, with dimensions identifying a set of numeric measurements within the cube

34 Star Schema
Data-modeling technique
Maps multidimensional decision support data into a relational database
Yields a model for multidimensional data analysis while preserving the relational structure of the operational DB
Four components:
– Facts
– Dimensions
– Attributes
– Attribute hierarchies
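A minimal star schema sketch in SQLite showing the four components: a fact table of numeric measures, dimension tables, their attributes, and an attribute hierarchy (year, quarter, month) used for a rollup. All table names, column names, specialties, and figures are assumptions for illustration, not the schema from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: descriptive attributes, each with a simple hierarchy.
CREATE TABLE dim_time (
    time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month INTEGER
);
CREATE TABLE dim_physician (
    physician_id INTEGER PRIMARY KEY, name TEXT, specialty TEXT, clinic TEXT
);
CREATE TABLE dim_diagnosis (
    dx_id INTEGER PRIMARY KEY, dx_name TEXT, dx_category TEXT
);
-- Fact table: numeric measures keyed by the dimensions.
CREATE TABLE fact_visit (
    time_id      INTEGER REFERENCES dim_time(time_id),
    physician_id INTEGER REFERENCES dim_physician(physician_id),
    dx_id        INTEGER REFERENCES dim_diagnosis(dx_id),
    visit_count  INTEGER,
    charges      REAL
);
INSERT INTO dim_time VALUES (1, 2009, 3, 8), (2, 2009, 3, 9), (3, 2009, 4, 11);
INSERT INTO dim_physician VALUES
    (1, 'Armstrong', 'Orthopedics', 'Alter'),
    (2, 'Allen', 'Dermatology', 'Alter');
INSERT INTO dim_diagnosis VALUES
    (1, 'Herniated Disc', 'Musculoskeletal'),
    (2, 'Psoriasis', 'Skin');
INSERT INTO fact_visit VALUES
    (1, 1, 2, 1, 120.0),
    (2, 1, 1, 1, 200.0),
    (3, 2, 2, 1, 150.0);
""")

# Roll up the facts along the time hierarchy from month to quarter.
by_quarter = conn.execute("""
    SELECT t.year, t.quarter, SUM(f.visit_count), SUM(f.charges)
    FROM fact_visit f
    JOIN dim_time t ON t.time_id = f.time_id
    GROUP BY t.year, t.quarter
    ORDER BY t.year, t.quarter
""").fetchall()
print(by_quarter)
```

Swapping `dim_time` for `dim_physician` or `dim_diagnosis` in the join gives the same measures summarized by specialty or diagnosis category.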

35 Simple Star Schema

36 Star Schema

37 Entity Relationship Diagram

38 Analysis

39 Online Analytical Processing (OLAP)
Advanced data analysis environment
Supports decision making, business modeling, and operations research activities
Characteristics of OLAP
– Use multidimensional data analysis techniques
– Provide advanced database support
– Provide easy-to-use end-user interfaces
– Support client/server architecture

40 Healthcare Cube – slice and dice view
Cube dimensions: Diagnosis, Time, Physician
Time hierarchy: Strategic Period, Year, Quarter, Month, Week, Day, Shift, Hour
Provider hierarchy: Clinic, Specialty, Group, Physician
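A toy sketch of slice, roll-up, and drill-down on a cube like the one above, in plain Python. The cell values are made up, and the hierarchies are shortened for brevity (Time to year, quarter, month; Provider to specialty, physician); `roll_up` and `slice_by_dx` are hypothetical helper names.

```python
from collections import defaultdict

# Each cell: (year, quarter, month), (specialty, physician), diagnosis, visits.
# Values are illustrative only.
cells = [
    ((2009, 3, 8),  ("Orthopedics", "Armstrong"), "Psoriasis",      1),
    ((2009, 3, 9),  ("Orthopedics", "Armstrong"), "Herniated Disc", 2),
    ((2009, 4, 11), ("Dermatology", "Allen"),     "Psoriasis",      3),
    ((2009, 4, 12), ("Dermatology", "Allen"),     "Psoriasis",      1),
]

def roll_up(cells, time_depth, provider_depth):
    """Sum visits after truncating each hierarchy to the requested depth."""
    totals = defaultdict(int)
    for time_key, provider_key, _dx, visits in cells:
        totals[(time_key[:time_depth], provider_key[:provider_depth])] += visits
    return dict(totals)

def slice_by_dx(cells, dx):
    """Slice: fix one member of the Diagnosis dimension."""
    return [c for c in cells if c[2] == dx]

# Roll up to year by specialty, then drill down to quarter by physician.
by_year_specialty = roll_up(cells, time_depth=1, provider_depth=1)
by_quarter_physician = roll_up(cells, time_depth=2, provider_depth=2)

# Slice on one diagnosis, then summarize by quarter across all providers.
psoriasis_by_quarter = roll_up(slice_by_dx(cells, "Psoriasis"), 2, 0)

print(by_year_specialty)
print(by_quarter_physician)
print(psoriasis_by_quarter)
```

An OLAP tool performs the same kind of regrouping interactively, typically against pre-aggregated data instead of a Python list.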

41 Dashboard

42 Scorecard including Key Performance Indicators (KPI)
Risk-adjusted mortality index
Risk-adjusted complications index
Risk-adjusted patient safety index
Severity-adjusted average length of stay
Expense per adjusted discharge, case mix- and wage-adjusted

