Presentation is loading. Please wait.

Presentation is loading. Please wait.

CMS SAS Users Group Conference Learn more about THE POWER TO KNOW®

Similar presentations


Presentation on theme: "CMS SAS Users Group Conference Learn more about THE POWER TO KNOW®"— Presentation transcript:

1 SAS Enterprise Business Intelligence (SAS/EBI) Information Maps Best Practices
CMS SAS Users Group Conference Learn more about THE POWER TO KNOW® October 17, 2011 Presented by: Letha Christian, Lead SAS/EBI Implementation Project Vivek Sethunatesan, Lead BI Architect Maricom Systems, Inc.

2 Agenda Objective SAS/EBI Project Implementation
Integrated Data Repository (IDR) and Challenges SAS/EBI Information Maps and Features Design Approach and Best Practices for Part D Information Map SAS Information Map Studio Verification of Information Map and Challenges for Development Best Practices for Information Map Development Questions

3 Objective This presentation provides an overview of Information Map development in support of the SAS/EBI Implementation project. The presentation will touch on the development challenges faced and overcome, and will also provide an overview of recommended best practices.

4 SAS/EBI Project Implementation
The SAS/EBI project has three major goals: Implement a hardware infrastructure that supports the SAS/EBI environments in the CMS Data Center. Implement the SAS/EBI Tool Suite. This provides additional tools for CMS users that will allow decision makers to spend less time looking for answers and more time driving strategic decisions. Implement SAS/EBI capabilities to provide query and reporting functionality for data that primarily resides in the IDR and uses a Teradata database. Connectivity to other data sources such as Oracle and DB2 are also available.

5 Integrated Data Repository (IDR)
Contains claims, provider, and beneficiary data from 2006 to the present in a highly normalized Teradata Data Warehouse. Contains millions of beneficiaries and billions of claims. Consists of detailed Part A, B, and D data, including every version of Claims Data. Consists of supporting reference data for Beneficiary, Provider, Drug, Contract, Geography, and Product (HCPCS, Diagnosis, Rev Center, IDE, and Procedure). Contains V2 views as the access layer for applications and users. Supports As-Is and As-Was reporting. Provides reporting capabilities based on National Provider Identifier (NPI) crosswalk data. Contains drug data from Medi-Span® and First DataBank.

6 Integrated Data Repository (IDR)

7 IDR Data Model for Medicare View

8 IDR Challenges for Business Reporting
Users must write their own queries to pull information from the IDR. Users need a strong understanding of the data sources, relationships, and join conditions. Possibility of missing joins, business rules, and inconsistent results. Increased level of effort. Complexity of software – not simple drag and drop. Performance degradation can result when pulling detailed level data that is not required. May require increased capacity requirements for the IDR.

9 SAS/EBI Information Maps
Provides (business) metadata about the data contained within a data warehouse (IDR). Provides information to the user about how data tables are related. Acts as a semantic layer built on top of the Medicare Virtual Data Mart (VDM). Ensures users are querying the IDR in a consistent and accurate manner. Created and managed by information architects/ development teams with input from information consumers. Information Maps change as the IDR changes. These changes require a structured approach for implementation.

10 SAS/EBI Information Maps
Generates join conditions, data summarizations, and query code against the IDR (using Information Map Studio). Provides security and guard rails. Allows targeted view of data. Allows use of data filters (pre-defined or custom). Allows for ‘in database processing’ and data summarization at the appropriate levels, based on the requirements (e.g., state-level summarized data). Reduces report building and analysis time. Supports a wide range of users. Must be used as data sources in SAS Web Report Studio. Integrates across the SAS Enterprise suite of products.

11 Information Map Design Approach
Data sources for Information Maps were pre- determined to be the IDR Medicare VDM. The initial approach was to design by functionality and subject area. The first map was developed for Part D subject area. Used SAS Information Map Studio to build and manage Information Maps. Validated access of Information Maps using business queries through SAS Web Report Studio and Enterprise Guide applications.

12 Information Map Development Best Practices
Developed customized labels that were understandable by business users. Established business rules that provided consistent delivery of information by using standard filters and calculations. Provided appropriate joins within the data sources so SAS could automatically generate the appropriate query code.

13 Part D Information Map Accesses Medicare Part D claims information only. Thirty-seven commonly used Part D data elements. Data elements use IDR naming conventions understood by business users. Data elements grouped by business area, such as Provider, Beneficiary, Geographic, and Drugs. Contains four pre-defined filters ̶ Final Action, Claim Type, Reconciliation Year, and Date Ranges ̶ which will be used to limit the data results. Use of these filters will be critical to overall performance due to the size of the IDR tables.

14 Part D Information Map Provides calculations for the following aggregates and metrics: Total Drug Cost Distinct Beneficiary Count Count of PDEs Claim Total Drug Costs Claim Line Amount Totals Contract PBP Amount Totals NDC Amount Totals Over 120 joins have been mapped to support the join relationships of tables in the IDR Part D Information Map. SAS will use these join relationships to help generate SQL to satisfy end-user requests.

15 Information Map Studio - Presentation
SAS Information Map Studio provides an easy-to-use interface that allows development teams to manage Information Maps.

16 Information Map Studio - Properties
The Properties tab enables developers to select attributes for each element within the Information Map. Developers can also describe and determine the use of each element.

17 Information Map Studio - Relationships
The Relationships tab enables developers to define the join criteria for each table included within the Information Map.

18 Validating the Part D Information Map
Ensured all joins defined in the Information Map are supported by the IDR relational data model. Solicited queries from business users from their previous reports. Used queries generated for the Part D Dashboard and validated the data results against the results using the Information Map. Documented the performance of the queries across environments executed to note the behavior of different filters and aggregations. Performed User Acceptance Testing with users familiar with Part D data and received sign-off .

19 Challenges of Information Map Development
Determining the Information Map data source. Use of detailed level data warehouse results in complexities in SQL generations and performance implications. Deriving requirements was challenging due to aggressive timelines and the availability of stakeholders. Obtaining valid/realistic SQL for testing. Limited query reports on Part D. Environment/performance issues. Data availability in the DEV, VAL, and PRD environments. Different data volumes across environments. Scope of model, 120+ joins to verify – detailed analysis.

20 SAS/EBI Best Practices and Considerations When Accessing BIG DATA

21 Typical Access Using SAS/EBI Tools
Data Mart Data Architects Information Delivery Portal End users SAS Enterprise Guide SAS Power Users SAS Web Report Studio SAS Standard Users/Power Users SAS Information Maps Tool Developers Ad hoc and extracts Ad hoc and canned reports

22 Information Maps Best Practices - Overview
Need to ensure that the time to return data does not suffer as the complexity and size of the data source grows. Predefined filters should be used to limit the data returned. New calculated columns bring consistency to reporting. Additional guidelines should be considered for larger data sources. Appropriate measures should be taken to thoroughly test Information Maps from a performance perspective.

23 Accessing Source Data Using SAS Information Maps

24 Factors that Influence Information Map Performance
BIG DATA Number of database calls when executing reports Data Source Design (Cube, VDMs, PDMs, SAS Datasets) Work done is SAS Server vs. Database SAS/EBI Data Access via Information Maps

25 Information Map Data Source Considerations
Data sources can serve one Information Map or support multiple Information Maps.

26 SAS OLAP Cubes SAS OLAP Cubes - An OLAP cube provides access to data that has been summarized into multi-dimensional views and hierarchies. Accessing data from an OLAP cube is typically fast because most resource-intensive calculations are performed as part of the generation of the summarized data.

27 Teradata Stored Procedures
Teradata Stored Procedures - In more complicated database structures, it may be necessary to create multiple sub-queries or other complex code. In this case, an optimally tuned Teradata stored procedure can be used. Once created, it can be accessed like any other table or a database view from the Information Map because the Libname Engine sees it like a single database table. Information Maps VDMs Teradata Stored Procedures 27

28 Information Map Data Source Considerations
De-normalized Views - In SAS/EBI, the Libname Engine decides how to request data from Teradata; that is, whether to send one query or send multiple queries and join the results on the SAS side. If there is only one table/view to query from, then the Libname Engine sends a single query to the database. PRVDR_DM_VIEW PRVDR_DM_VIEW is created by Joining all xref tables and PRVDR.

29 SAS Datasets SAS Datasets - SAS datasets are similar to Teradata tables but are stored in a SAS Server location. They can be used across multiple projects in Information Map Studio with similar properties as a Teradata table and can be created using Enterprise Guide. EG Process Flow SAS datasets Information Maps

30 Source Database Foundation
Physical Data Marts Physical Data Marts - Physical Data Marts are the best practice structures for supporting multiple users and applications with business-critical data. They generally take the form of Star Schema designs that aggregate detail data from the main tables and views. Source Database Foundation Information Maps

31 SAS Stored Process SAS Stored Process - SAS code or program flow that is registered in the Metadata server using the Management Console. Stored processes can have parameters for user input and produce various output such as datasets, reports, or charts. Once created, they can be called from the Enterprise Guide or Information Map.

32 Web Report Studio - Best Practices
The following WRS options can help the performance of a report. Report Layout Use Group Breaks to eliminate scrolling. WRS will only query and process one element of the Group Break field at a time, so performance is an added benefit. Cross-tabs are memory intensive; use list reports, if possible. List reports will paginate the output one page at a time, greatly reducing system overhead. Limit size to a maximum two pages of generated output. Use Group Breaks in conjunction with the Cross-tab, or in conjunction with a list report. If the total output (including all Group Breaks) is over 20 pages, consider scheduling the report. If the total output is over 100 pages, do not use WRS.

33 Web Report Studio - Best Practices
Scheduling If the report is to be used by separate groups (Bene state, Bene City, etc.) and does not require a prompted response from the user, you can schedule the WRS report and use the distribution option to deliver a PDF to a subscription channel. Prompts Do not store pick-list values that will change frequently within the Information Map. You will need to re-deploy the Information Map every time the list needs updating. Avoid using database-driven lists greater than Dynamic lists do not paginate; use a user-entered filter. Do not use the main/fact table as a source for a dynamic pick-list filter. The system needs to use a ‘select distinct’ and if it is doing that on a multi-million row table, users will wait too long for the pick-list to generate.

34 SAS BI Performance Tuning - Conclusion
SAS has many options for building robust BI applications. Thorough analysis must be conducted in order to determine which options should be leveraged. Data Sources Level of Development Level of Administration Reusability Teradata Stored Procedures Medium SAS Stored Process Low SAS Datasets SAS OLAP Cubes High IDR De-normalized Views IDR Physical Data Marts

35 Questions?


Download ppt "CMS SAS Users Group Conference Learn more about THE POWER TO KNOW®"

Similar presentations


Ads by Google