Download presentation
Presentation is loading. Please wait.
Published byBeverly Brooks Modified over 9 years ago
1
Student Centered ODS ETL Processing
2
Insert Search for rows not previously in the database within a snapshot type for a specific subject and year Search for rows not previously in the database within a snapshot type for a specific subject and year Check for duplicates Check for duplicates Identify test Identify test Lookup metadata Lookup metadata Create Unique Test Event identifiers Create Unique Test Event identifiers Load data Load data Copy Previous Year Test details Copy Previous Year Test details
3
≠ PADMIDSnapshotFgUpdtDtEndDt 20179458T1008F2008-11-21 09:21:36.117NULL Source Data SMFConfig_ID 134 IDUnqTstEvent_IDEffDtEndDt 567917438502 2008-11-21 00:00:00.000NULL Event SMFConfig_ID 134 The current date and time is 11/21/2008 10:15AM when load starts IDUnqTstEvent_IDEffDtEndDt 567917438502 2008-11-22 00:00:00.000NULL Event SMFConfig_ID 134 The current date and time is 11/21/2008 2:30PM when load starts How the effective date is determined before / after noon In some rare cases, a row is found to be inserted that has already been end dated. In that case it will be added to the database and be effective for one day. IDUnqTstEvent_IDEffDtEndDt 567917438502 2008-11-22 00:00:00.0002008-11-22 23:59:59.000 Event SMFConfig_ID 134
4
Update Search for changes to rows already in the ODS from Staging within a snapshot type for a specific subject and year that occurred since the last run of the update ETL Search for changes to rows already in the ODS from Staging within a snapshot type for a specific subject and year that occurred since the last run of the update ETL No need to check for duplicates No need to check for duplicates Identify test Identify test Lookup metadata Lookup metadata Load data Load data –Determine how to apply update Normally end date prior version of row and insert new row Normally end date prior version of row and insert new row On some occasions rows that had been previously end dated may be reintroduced to the ODS On some occasions rows that had been previously end dated may be reintroduced to the ODS Sometimes the only action is to end date the current row Sometimes the only action is to end date the current row Copy Previous Year Test details Copy Previous Year Test details
5
= PADMIDSnapshotFgUpdtDtEndDt 20179458T1008F2008-11-21 09:21:36.117NULL Source Data SMFConfig_ID 134 How the effective date is determined before / after noon IDUnqTstEvent_IDEffDt 5679174385022008-11-21 00:00:00.000 Event SMFConfig_ID 134 The current date and time is 12/15/2008 10:15AM when load starts In ODS EndDt 2008-12-14 23:59:59.000 6274234385022008-12-15 00:00:00.000134Change from StagingNULL FNm Bob Robert The current date and time is 12/15/2008 2:30PM when load starts IDUnqTstEvent_IDEffDt 5679174385022008-11-21 00:00:00.000 Event SMFConfig_ID 134In ODS EndDt 2008-12-15 23:59:59.000 6274234385022008-12-16 00:00:00.000134Change from StagingNULL FNm Bob Robert
6
Update Logic Read updated row from Staging Read updated row from Staging Read current record from ODS Read current record from ODS If row in ODS is end dated If row in ODS is end dated –Add new row to ODS If row in ODS is not end dated If row in ODS is not end dated –Compare columns Staging <> ODS –If the columns are different End date ODS row End date ODS row Add Staging row to ODS Add Staging row to ODS –If the columns are the same and the end date in the ODS row is not set and the end date from the Staging record is set End date ODS row End date ODS row
7
Delete (soft) For the subject / year table search for rows in the ODS that are no longer in Staging For the subject / year table search for rows in the ODS that are no longer in Staging The examinee table is not checked for end dated rows, only the subject / year table is used to determine if a delete is needed per Assessment The examinee table is not checked for end dated rows, only the subject / year table is used to determine if a delete is needed per Assessment A Cascaded update is performed end dating the entire test event and all related rows from the following tables: A Cascaded update is performed end dating the entire test event and all related rows from the following tables: –Event –EventInst –EventInd –BnchLvl –Score –RaterScore –PaperPencilData –CmptrBasedData –EventClsRm The end date is determined the same way as in the insert and update ETL The end date is determined the same way as in the insert and update ETL
8
The affects of source tables on Student Centered tables Not all tables are loaded for all tests Score, Benchmark Level and Rater Score are not loaded for Virtual tests Presently only Writing has Rater Scores (possibly ELPA in the future) Paper / Pencil tests load to Paper Pencil Data and Event Class Room Computer based tests load to Computer Based Data Writing has only total Benchmark Level scores while other subjects have Benchmark Level scores at the category (aka strand) level Some tables are only loaded if there are values present Only non-blank scores are loaded to the Benchmark Level table Only non-null scores are loaded to the Score and Rater Score tables Only non-null institution identifiers are loaded to the Event Institution table
9
Example of applying updates Row is inserted Update to Subject / Year table << Changes to an Institution and an indicator – end date current rows << Insert new current rows – null end date << Changes to student demographic data – end date current row << Insert new current row – null end date
10
Maintenance Occasionally there may be the need to make corrections or to reload portions of the ODS Occasionally there may be the need to make corrections or to reload portions of the ODS After a few years there may also be the need to remove some lower level of detail from the ODS After a few years there may also be the need to remove some lower level of detail from the ODS In any case as maintenance is needed communication will be made to inform clients of what changes are coming and some suggestions on how to deal with those changes In any case as maintenance is needed communication will be made to inform clients of what changes are coming and some suggestions on how to deal with those changes
11
How to store the Extracted ODS Data It is recommended that the extract layout be used as a guide for the staging database A data model is available in this format for your use The model is in power designer 12.5 and available in html format for review
12
Considerations for your local ODS The ID column from each table is sufficient for a primary key The ID column from each table is sufficient for a primary key Columns ending in _ID are foreign keys from other tables and should be indexed Columns ending in _ID are foreign keys from other tables and should be indexed The SMFConfig_ID can be used for vertical partitioning of the Core Content data The SMFConfig_ID can be used for vertical partitioning of the Core Content data Additional indexes are in the data model and you may want to tune and add more based on your needs Additional indexes are in the data model and you may want to tune and add more based on your needs
13
Loading data to your local ODS If you follow the suggested database design, the process of loading data is simple If you follow the suggested database design, the process of loading data is simple –On your schedule you will receive data Rows that match by ID are updated Rows that match by ID are updated Rows not found by matching on ID are inserted Rows not found by matching on ID are inserted The Security tables are replaced on each load in total The Security tables are replaced on each load in total That’s all there is to keeping your local ODS up to date. That’s all there is to keeping your local ODS up to date.
14
How to retrieve data from your local ODS Generally there are two types of queries Generally there are two types of queries –Current data –Data as of a date For current data select where the end date is null For current data select where the end date is null –select * from Event where UnqTstEvent_ID = 500 and Enddt is null For data as of a date a query such as this will work For data as of a date a query such as this will work –select * from Event where EffDt = ‘2008-11-19 10:57:00.000’) and UnqTstEvent_ID = 500
15
Formal and Inferred Registration System (FIRS) FIRS makes it possible for ODE to give the proper data for each of the students served by your clients FIRS makes it possible for ODE to give the proper data for each of the students served by your clients Spring and Fall Membership, all state assessments and any student transfers through OSTX since 2004-2005 are used by FIRS to form a chronology of which institutions a student was related to and when Spring and Fall Membership, all state assessments and any student transfers through OSTX since 2004-2005 are used by FIRS to form a chronology of which institutions a student was related to and when This information is used by the extract process to provide the most complete data possible This information is used by the extract process to provide the most complete data possible In the coming year the new Consolidated ADM data collection will begin providing information to this process as well In the coming year the new Consolidated ADM data collection will begin providing information to this process as well The tables FIRS and DistUnqTstEvent provided in your extract are taken from this system The tables FIRS and DistUnqTstEvent provided in your extract are taken from this system
16
Extract Process Regions must provide a list of districts in order to receive the extract Regions must provide a list of districts in order to receive the extract The ODE helpdesk will setup the relationships between the Region and their districts The ODE helpdesk will setup the relationships between the Region and their districts The first extract will be a full The first extract will be a full The next extract will change to an incremental automatically The next extract will change to an incremental automatically An incremental extract includes changes for continuing students plus full extracts for students new to the client districts An incremental extract includes changes for continuing students plus full extracts for students new to the client districts Each time an extract is performed the last extract date in the Region’s configuration is set Each time an extract is performed the last extract date in the Region’s configuration is set The next extract will contain all changes since the last extract date The next extract will contain all changes since the last extract date
17
Extract Process (continued) Data is extracted into the formats specified by the StudentCenteredExportFormat.xls Data is extracted into the formats specified by the StudentCenteredExportFormat.xls One file is produced per table described in the format One file is produced per table described in the format The files are in CSV format with text delimited by quotes The files are in CSV format with text delimited by quotes Files will only be produced if there are rows qualifying for the extract Files will only be produced if there are rows qualifying for the extract The file Manifest.txt contains a list of the files extracted with the count of rows and when it was produced The file Manifest.txt contains a list of the files extracted with the count of rows and when it was produced The CSV files and the Manifest.txt are compressed into a.zip file and placed on ODE’s secure FTP site for pickup by the Region The CSV files and the Manifest.txt are compressed into a.zip file and placed on ODE’s secure FTP site for pickup by the Region
18
File Transfer and Scheduling When the districts the Region serves are communicated to the ODE Helpdesk ODE will also make sure security is setup for connecting to the secure FTP server When the districts the Region serves are communicated to the ODE Helpdesk ODE will also make sure security is setup for connecting to the secure FTP server Instructions for connection will be provided Instructions for connection will be provided The ODE helpdesk can schedule which days the Region will receive files The ODE helpdesk can schedule which days the Region will receive files The same scheduling system is already in use for extracting data from Student Centered Staging The same scheduling system is already in use for extracting data from Student Centered Staging
19
Requested extracts At times it may be necessary for the Region to receive a full extract to repopulate your local ODS or the Region may have missed some extracts produced previously At times it may be necessary for the Region to receive a full extract to repopulate your local ODS or the Region may have missed some extracts produced previously –Full extracts can be scheduled by the ODE Helpdesk The number that can be done in one day is limited The number that can be done in one day is limited Full extracts may have to wait until the weekend Full extracts may have to wait until the weekend Full extracts will not be provided on a regular basis Full extracts will not be provided on a regular basis –For missed extracts, the ODE Helpdesk can set the date as of which the extract will pull information and provide a larger incremental extract The date as of which the extract will pull information is cleared after the run of the extract The date as of which the extract will pull information is cleared after the run of the extract Processing will return to normal automatically Processing will return to normal automatically
20
Limiting Access Each extract will provide a new copy of the security files (FIRS and DistUnqTstEvent) Each extract will provide a new copy of the security files (FIRS and DistUnqTstEvent) The FIRS file contains the district related to the student with the end date provided for review if needed The FIRS file contains the district related to the student with the end date provided for review if needed The DistUnqTstEvent relates the district to the specific test events available to that district The DistUnqTstEvent relates the district to the specific test events available to that district By simply joining through this table when providing access to your clients you can restrict access to just the information they are allowed to access By simply joining through this table when providing access to your clients you can restrict access to just the information they are allowed to access Declare @DistInstID int Set @DistInstID = 2082 select e.* from Event e from Event e join DistUnqTstEvent d join DistUnqTstEvent d on e.UnqTstEvent_ID = d.UnqTstEvent_ID on e.UnqTstEvent_ID = d.UnqTstEvent_ID and d.DistInstID = @DistInstID and d.DistInstID = @DistInstID
21
Data Model Follows the same definitions provided by the StudentCenteredExportFormat.xls spreadsheet Follows the same definitions provided by the StudentCenteredExportFormat.xls spreadsheet Organized around the document as well providing different color coding for the same major subjects as the spreadsheet: Organized around the document as well providing different color coding for the same major subjects as the spreadsheet: –Core Content Data –Score Data –File Processing Control Data –Metadata –Security – plus – –Possible changes The model show how the tables are related to each other and provides useful information about the data The model show how the tables are related to each other and provides useful information about the data The model is in Power Designer 12.5; an html version is available to review the model that contains table create statements as well The model is in Power Designer 12.5; an html version is available to review the model that contains table create statements as well
22
Pulling it all together The SMFConfig table is essentially a link to the subject and year that the data belongs with The SMFConfig table is essentially a link to the subject and year that the data belongs with Most of the metadata is contained in a Table of Tables / Master Codes scheme which houses virtual tables and related code values Most of the metadata is contained in a Table of Tables / Master Codes scheme which houses virtual tables and related code values Each row in the Table of Tables represents a virtual table Each row in the Table of Tables represents a virtual table Rows in the Master Codes table relate back to the Table of Tables for rows that represent the values stored in the virtual table of tables Rows in the Master Codes table relate back to the Table of Tables for rows that represent the values stored in the virtual table of tables The Ctgry table is used to indicate the score reporting category (aka strand) related to score data The Ctgry table is used to indicate the score reporting category (aka strand) related to score data The Ctgry table also contains entries for total scores The Ctgry table also contains entries for total scores Review samples Review samples
23
Student Centered ODS Thanks for coming!
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.