Karen Ritter, IRP Director GTCC Guilford Technical Community College


1 Data Warehouse: What is it, do we really need it, and if so, how can we get it?
Karen Ritter, IRP Director GTCC Guilford Technical Community College 5/19/2019

2 What is it?
Technically speaking… A data warehouse is a central repository containing stable, accurate, consistent, clearly understood data that are needed for management information and decision making across the whole organization. Bore, snore… ZZZZZZZZZZZZZZZZZZZZZZ
Most data warehouses include a copy of the data in the organization’s operational systems. Typically, copies are taken at regular intervals to build up a historical database capable of revealing patterns and trends over time. A warehouse may also contain external data and other information previously kept by users in personal spreadsheets and databases.
The physical form in which these data are held in the warehouse is another major design consideration, but it has no real bearing on whether something can be considered a data warehouse. In principle, everything could be stored in flat files, but in practice most data warehouses use a relational database. This is usually more efficient; however, transferring knowledge of that database from developer to maintainer is difficult, particularly when employees leave positions and vacancies persist for long periods. For that reason, I recommend keeping the flat files even if you are loading the data into a relational database.
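One way to act on the advice to keep flat files alongside a relational database is sketched below. It is written in Python with the standard-library sqlite3 module, which is a stand-in and not a tool named in this presentation. The file name, table name, column names, and sample values are all hypothetical.

```python
import csv
import shutil
import sqlite3
import tempfile
from pathlib import Path


def archive_and_load(raw_file, archive_dir, db_path):
    """Keep the raw flat file, then load its rows into a relational table."""
    # 1. Archive the extract unchanged so it can be re-read without the database.
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(raw_file, archive_dir / raw_file.name)

    # 2. Load the same rows into SQLite (hypothetical table and columns).
    with open(raw_file, newline="") as handle:
        rows = [(r["student_id"], r["program"]) for r in csv.DictReader(handle)]
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS demographics (student_id TEXT, program TEXT)"
        )
        conn.executemany("INSERT INTO demographics VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)


def demo():
    """Run the sketch against a tiny made-up extract in a temporary folder."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        raw = base / "demographics_day01.csv"
        raw.write_text("student_id,program\n1001,A10100\n1002,A25120\n")
        loaded = archive_and_load(raw, base / "raw_archive", base / "warehouse.db")
        archived = (base / "raw_archive" / raw.name).exists()
        return loaded, archived
```

Because the archived copy is never modified, a new maintainer can rebuild the database from the flat files even if nobody remembers how the tables were originally designed.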

3 Visualization of data warehouse
Discuss different data sources and storing created tables… joins, merges, tabulates, etc.

4 Do we really need/want it?
The answer depends on your college’s data, reporting, assessment, and research needs… and on your ability to develop and maintain the data warehouse. Data warehouses are like dogs… they REQUIRE lots of ATTENTION and are a LONG-TERM commitment!!! A data warehouse needs attention on a regular basis, so it is important that this work is part of someone’s job description. If you don’t have personnel with the know-how and the time, there’s no point in starting. Also, in one-person shops, who is going to do the data extracts in the interim when someone leaves? Transfer of knowledge is required!!

5 Some advantages of having an internal data warehouse:
Fulfill data requests and complete reports much faster with fewer human resources
Easily find data entry errors in Colleague and/or processes
View trend data to see patterns over time
Make decisions based on these trends and patterns, rather than a hunch
The number one reason to have a data warehouse is that it just plain ol’ makes your job easier. For example, we can run our IPEDS Completions report from start to finish in less than five minutes. Prior to our data warehouse, it took approximately 40 hours to do that report manually. I think of it as… do you want to wash dishes by hand, or use an automatic dishwasher?

6 More advantages of having an internal data warehouse:
Perform statistical analysis (t-tests, ANOVAs, regression, chi-squares) for program/intervention evaluation
Longitudinally, you’ll be ready for OLAP, data mining, etc.

7 So, we’ve decided we want one... how do we get it?
Basic Steps
Extract desired data from source (Datatel) at regular intervals (Day 1, 8, 40, 56, 60, 72, and 80).
Clean data (Excel, Access, SAS, SPSS) and put it in desired format.
Build regular processes from data files (table joins, standard reports, etc.).
Only three basic steps, but many important design decisions to be made.
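The three steps above can be sketched in a few lines. This is a minimal illustration in Python, not the SAS-based process used at GTCC. The two in-memory "extracts", their field names, and their values are made up for the example.

```python
import csv
import io
from collections import Counter

# Made-up stand-ins for two flat-file extracts; all field names are hypothetical.
RAW_STUDENTS = "student_id,program\n 1001 ,a10100\n1002,A25120\n"
RAW_COURSES = "student_id,course,grade\n1001,ENG-111,A\n1001,MAT-171,b\n1002,ENG-111,W\n"


def extract(raw_text):
    """Step 1: read a flat-file extract into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Step 2: trim stray whitespace and standardize case in every field."""
    return [{key: value.strip().upper() for key, value in row.items()} for row in rows]


def build_report(students, courses):
    """Step 3: join courses to students and count enrollments per program."""
    program_by_id = {s["student_id"]: s["program"] for s in students}
    counts = Counter()
    for course in courses:
        program = program_by_id.get(course["student_id"])
        if program is not None:  # skip course rows with no matching student
            counts[program] += 1
    return dict(counts)


students = clean(extract(RAW_STUDENTS))
courses = clean(extract(RAW_COURSES))
report = build_report(students, courses)
```

Keeping each step as its own function means the same cleaning and reporting code can be rerun unchanged on every snapshot day.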

8 BUT, many important decisions to be made AND time-consuming development to undergo.
What tool do we use for extraction (what can we afford)?
What data to extract? What files are those data in anyway? How do we find out?
Develop queries to extract the data we want. Test them. Are we getting the right data? What to do with the errors? ERRORS WILL CONSUME MOST OF YOUR TIME!!!
What days during the term should we run extracts? End of term?
How many files should we store, what should be the layout of those files, do we need documentation of our files?
WHEW, now we have extracted what we want. What tool will we use to manipulate data?
What standard reports should we develop? What is the definition of Success rate anyway? Does anyone really know? What do the end users really want?
How to handle ad hoc requests? What to do when someone requests data that we haven’t extracted? Back to the drawing board!!!
Anyone still reading? I know this isn’t proper etiquette for PowerPoint, but my point is… developing a data warehouse involves a great many details that need to be thought through completely before you even begin!
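Since errors take most of the time, it can help to automate the basic checks on every extract. The sketch below is one possible approach in Python, not the checks GTCC actually runs. The function name, field names, and sample rows are hypothetical.

```python
def find_errors(rows, required_fields, key_field):
    """Return (row_number, message) pairs for basic problems in an extract."""
    errors = []
    seen_keys = set()
    for number, row in enumerate(rows, start=1):
        # Flag required fields that are absent or blank.
        for field in required_fields:
            if not (row.get(field) or "").strip():
                errors.append((number, f"missing {field}"))
        # Flag a key value that already appeared on an earlier row.
        key = (row.get(key_field) or "").strip()
        if key:
            if key in seen_keys:
                errors.append((number, f"duplicate {key_field} {key}"))
            seen_keys.add(key)
    return errors


# Made-up rows standing in for one extract; field names are hypothetical.
sample = [
    {"student_id": "1001", "program": "A10100"},
    {"student_id": "1002", "program": ""},
    {"student_id": "1001", "program": "A25120"},
]
problems = find_errors(sample, ["student_id", "program"], "student_id")
```

The list of problems can be sent back to whoever maintains the source system, so errors get fixed at the source instead of being patched in every extract.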

9 GTCC’s Current Curriculum Internal Data Warehouse
Five end-of-term files: SDF, CDF, SGF, GF, and FAF
Snapshots (a scaled-down version determined by institutional needs): Demographics and Courses files. Run queries on Day 1, 8, 40, 56, 60, 72 (see handout for more details)
We have a total of 18 queries (not including the FAF) that become these files. We store our data in SAS data sets, and we keep the raw files that came from Datatel.
Day 1 – before any activity; Day – Generic 10%; Day 40 – mid-term; Day 56 – LDTW; Day 60 – LDTW capture; Day 72 – transfer data is all in and active program codes have all been cleaned up. We copy some fields from Day 72 into our End-of-Term files.
Now you can see why this is a commitment, and why a gap in personnel could be damaging to your trend datasets. It is VERY important to have documentation written so that a lay person can run a black-box version until someone is on board who can take over the process.
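Part of that documentation can live in code. The sketch below records the snapshot schedule in one place so a stand-in can see which extracts are due. It is a Python illustration only. The file-naming pattern and term code are hypothetical, and pairing Day 8 with the "Generic 10%" label is an assumption, because the notes above do not give that day number.

```python
# Snapshot days listed on the slide. The labels paraphrase the speaker notes;
# pairing Day 8 with "generic 10%" is an assumption, because the notes omit
# that day number.
SNAPSHOT_DAYS = {
    1: "before any activity",
    8: "generic 10% (assumed pairing)",
    40: "mid-term",
    56: "LDTW",
    60: "LDTW capture",
    72: "transfer data in, program codes cleaned up",
}


def snapshot_name(term, file_kind, day):
    """Build a hypothetical file name such as '2019SP_courses_day08.csv'."""
    if day not in SNAPSHOT_DAYS:
        raise ValueError(f"Day {day} is not a scheduled snapshot day")
    return f"{term}_{file_kind}_day{day:02d}.csv"


def snapshots_due(days_elapsed):
    """List the snapshot days that should already have been run."""
    return [day for day in sorted(SNAPSHOT_DAYS) if day <= days_elapsed]
```

For example, calling snapshots_due with the number of days elapsed in the term gives a checklist of extracts that should already exist in the archive.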

10 Let’s see it in A-C-T-I-O-N!
This is how we decided to do it… using Query Builder to extract into flat files. Then we use SAS to clean the data and create usable SAS datasets. We then developed SAS programs to run our standard reports. Ad hocs are generated using SAS, Access, or Excel… depending on which will be easiest for us and/or the end user.

11 Where to Get More Information
Google or Dogpile “data warehouse” and you can learn more than you ever really wanted to know… OR for specifics on GTCC’s internal data warehousing project, contact me at Guilford Technical Community College.

