Copyright Statement
Copyright Mini Kanwal. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.
Data Administration In An Academic Environment
Mini Kanwal
Data Administrator
University Information Systems
Georgetown University
October, 2002
Data Administration In An Academic Environment
University data are institutional assets and are key to supporting the university’s fundamental instructional, research, and public service missions. In the past, information was maintained by different administrative areas and often stored in different systems. Because Georgetown’s management and staff are increasingly becoming information consumers, integrating information from these systems is very important. Staff, faculty, and students rely more than ever on university data to meet their work or study goals. Our poster outlines the challenges we face and the strategies we are using to educate, empower, and inform the university’s academic and non-academic departments, faculty, and staff about the value of the data: its integrity, quality, and security. It also describes how data administration and management are becoming part of a larger university-wide effort.
Data Administration
Definition: A new function for Georgetown, Data Administration applies formal guidelines and tools to manage the university’s information resources, thereby providing reliable, accurate, secure, and accessible data to meet the strategic and management needs of all campus users. Data Administration is part of the Enterprise Data Warehouse (EDW) project and has several components, including warehouse management software, database management systems, and access tools.
Data Administration serves three distinct groups:
Operational (detailed information to support line employees in their daily work)
Managerial (Directors, Managers: workload monitoring and survey response)
Executive (Provost, Vice President, and Deans: strategic forecasting)
[Figure: EDW Roles. Diagram labels: Network and Data Infrastructure; Enterprise Data Warehouse; Data Administration; PeopleSoft; Admissions; Student Information Systems; HR/Payroll; Academic Support; Facilities; PeopleSoft Financials; Support Services (Training, Standards, Desktop Support, Help Desk); Business Rules; Network Services (Voice, Data, Video); Applications Environment (Security, Messaging, Authentication, Imaging, EDI).]
Internal Issues
Large population of data consumers: faculty, staff, students
Multiple data sources / Multiple data owners
Don’t know where data is located
Don’t know what data is available for our area of interest
Can’t access the data
No data standardization policy
Data isn’t consistent or clean
Domain/range of data often not documented
Query and reporting tools are difficult to use
Overall Project Goals for Data Administration
Clarify the business rules, data definitions, and data uses
Improve data quality, including definition, accuracy, and timeliness
Improve the security of the data, including confidentiality and protection from loss
Improve ease of access, so that users can get at data and understand it better
Increase understanding of existing administrative data
Lay the foundation for greater ease of sharing data between systems
Create a data management structure
Reduce the redundancy of the data
Challenges Experienced
Magnitude of the task
Lack of communication/resources
Lack of understanding of the data administration function
Lack of awareness of the value of the data
Lack of urgency perceived within departments
“Meta what?”
Time conflicts
Setting priorities
Strategies Adopted
Educate the user community
    Web site created: www.georgetown.edu/uis/ia/dw/da
    Training sessions
    One-to-one sessions
Working groups created
    Data Policy Committee
    Data Element Workgroup
    DW/DA Steering group
Tools used
Incorporate data administration function in daily operations
Global metadata repository created
Evaluated metadata repository tools currently on the market
Global Metadata Repository
To support Georgetown’s move towards a true enterprise-wide business model, we have created a web-based metadata repository. The application is written in ColdFusion and retrieves data from the Oracle database. It is a data drill-down application that lets both front-end and back-end users create dynamic queries, and it provides a centralized source of information about Financial and Student administrative data at Georgetown.
We have built the data warehouse using a star schema design. A star schema contains one large table, called the fact table, placed in the center, with smaller tables, called dimension tables, joined to the fact table in a radial pattern.
The metadata repository contains metadata, that is, data about data. Users can view the metadata of stars, tables (fact and dimension), and the fields in each table. The metadata includes both business and technical metadata, such as physical names, logical names, datamart field names, and the business definition of each.
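To make the star schema and the field-level metadata concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an illustration only, not the Georgetown implementation (which the poster describes as ColdFusion on Oracle). All table names, column names, sample rows, and business definitions below (dim_student, dim_term, fact_enrollment, meta_field, and so on) are hypothetical.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory star schema plus a field-level metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: small, descriptive lookup tables (hypothetical).
        CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, school TEXT);
        CREATE TABLE dim_term    (term_key    INTEGER PRIMARY KEY, term_name TEXT);
        -- Fact table: the central table, keyed to each dimension.
        CREATE TABLE fact_enrollment (
            student_key  INTEGER REFERENCES dim_student(student_key),
            term_key     INTEGER REFERENCES dim_term(term_key),
            credit_hours REAL
        );
        -- Metadata: physical name, logical name, business definition per field.
        CREATE TABLE meta_field (
            table_name TEXT, physical_name TEXT,
            logical_name TEXT, business_definition TEXT
        );
    """)
    conn.executemany("INSERT INTO dim_student VALUES (?, ?)",
                     [(1, "College"), (2, "Business"), (3, "College")])
    conn.executemany("INSERT INTO dim_term VALUES (?, ?)",
                     [(10, "Fall 2002"), (11, "Spring 2003")])
    conn.executemany("INSERT INTO fact_enrollment VALUES (?, ?, ?)",
                     [(1, 10, 15.0), (2, 10, 12.0), (3, 10, 16.0), (1, 11, 14.0)])
    conn.executemany("INSERT INTO meta_field VALUES (?, ?, ?, ?)", [
        ("fact_enrollment", "credit_hours", "Credit Hours",
         "Credit hours a student is enrolled in for a term (illustrative)."),
        ("dim_student", "school", "School",
         "School in which the student is enrolled (illustrative)."),
    ])
    return conn


def credit_hours_by_school(conn: sqlite3.Connection, term_name: str) -> dict:
    """Join the fact table to its dimensions and total credit hours per school."""
    rows = conn.execute("""
        SELECT s.school, SUM(f.credit_hours)
        FROM fact_enrollment AS f
        JOIN dim_student AS s ON s.student_key = f.student_key
        JOIN dim_term    AS t ON t.term_key    = f.term_key
        WHERE t.term_name = ?
        GROUP BY s.school
        ORDER BY s.school
    """, (term_name,)).fetchall()
    return dict(rows)


def describe_field(conn: sqlite3.Connection, table_name: str, physical_name: str):
    """Return (logical_name, business_definition) for a field, or None if undocumented."""
    return conn.execute(
        "SELECT logical_name, business_definition FROM meta_field "
        "WHERE table_name = ? AND physical_name = ?",
        (table_name, physical_name),
    ).fetchone()
```

Calling credit_hours_by_school(build_demo_warehouse(), "Fall 2002") shows the typical star-schema query shape: filter on a dimension attribute, then aggregate the fact rows. describe_field plays the role of a repository lookup, returning the business-facing name and definition for a physical column.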
Student Records Data Warehouse Conceptual Model
Data Administration Tools Used
ERwin, the data modeling tool: used to build the conceptual schema, a unified and logically integrated view of the organization’s entire collection of data resources.
Informatica, the ETL tool: used to extract, transform, and load data from one environment to another.
Cognos, the BI tool: used to create queries and generate reports for users.
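The extract, transform, and load steps can be sketched in a few lines of plain Python. This is a generic illustration of the ETL pattern using only the standard library; it does not use or imitate the Informatica product, and the column names (student_id, school, credit_hours) and cleaning rules are assumptions made for the example.

```python
import csv
import io


def extract(csv_text: str) -> list:
    """Extract: read raw rows from a source system export (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows: list) -> list:
    """Transform: trim whitespace, standardize codes, convert types, drop bad rows."""
    cleaned = []
    for row in rows:
        student_id = row["student_id"].strip()
        if not student_id:
            # Skip rows with no identifier; they cannot be matched in the target.
            continue
        cleaned.append({
            "student_id": student_id,
            "school": row["school"].strip().upper(),   # one consistent code format
            "credit_hours": float(row["credit_hours"]),
        })
    return cleaned


def load(rows: list, target: list) -> int:
    """Load: append the cleaned rows to the target store and return the row count."""
    target.extend(rows)
    return len(rows)
```

A run chains the three steps, for example load(transform(extract(source_text)), warehouse_rows), where warehouse_rows is a list standing in for the target table.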
Metadata Repository Tool Evaluation
Need for Repository – Data dictionary
Metadata repository tools
Georgetown University’s requirements for the repository tool
Vendor responses
Evaluation of vendor products
Next steps
Data Administration Measurable Success Criteria
Improves infrastructure
Supports data integrity
Improves productivity
Improves efficiency
Increases/improves functionality
Enhances decision making process and management reporting
Improves end user tools
Supports University strategic objectives
Supports/contributes to increased standardization
ENTERPRISE DATA WAREHOUSE/DATA ADMINISTRATION PROJECT SUMMARY

Goal: Make Georgetown’s information accessible
Actions:
1) Create central repository of data that is easily accessible, understandable, navigable and available nearly 24x7
2) Build data warehouse infrastructure (server, database, developer tools, end user query tools)
3) Plan and prioritize data mart projects
Benefits:
1) Integrates data from disparate sources / systems
2) Empowers customers to obtain and analyze data
3) Reduces administrative, labor-intensive activities
4) Provides greatly enhanced access to data without taxing back end information systems
5) Directly involves customers in data mart design

Goal: Make Georgetown’s information consistent
Actions:
1) Establish data administration staff
2) Create Data Administration Advisory Group
3) Create University-wide data admin policies and procedures
4) Build a University-wide data dictionary
Benefits:
1) Business definitions and data elements are uniform across business applications
2) A common understanding of business rules and uses of data throughout the University is documented
3) Naming standards are developed

Goal: Provide an adaptive and resilient source of information
Actions:
1) Use distributed and incremental data mart design
2) Integrate multiple sources of data over time without disruption to existing services
3) Provide “time and date” source information
Benefits:
1) Historical data are more readily available
2) Data from outside of the University can be integrated with core business system information
3) New information becomes available to customers

Goal: Protect Georgetown’s information assets in a secure environment
Actions:
1) Create a solid security model that withstands the test of time and changes in University structure
2) Create automated authentication system
3) Create automated authorization system
4) Create automated audit trails
Benefits:
1) Audit requirements for authentication are satisfied
2) Audit requirements re: data access are satisfied
3) Ensures the right people have access to the right data
4) Manual security maintenance no longer necessary

Goal: Create the foundation for informed decision making
Actions:
1) Develop a data warehouse tool / content training program
2) Expand User Services support for Data Warehousing
3) Encourage working groups to meet and communicate often and share report formats, coding techniques and experience
4) Create Web-based online documentation
Benefits:
1) Customers can answer complex business questions
2) Business questions answered quickly and easily
3) Expertise to support true end user computing