Taking Your Customers to the Cleaners: Historical Patron Data Cleanup and Routine Purge Preparation
Roy Zimmer, Western Michigan University


About 5 or 6 years ago…
  No more SSN: switch to using WIN
  Banner
  New campus ID cards
(WIN is our Western Identification Number.)

A few less years ago…
  Rewrote the patron update process to use Banner
  Started thinking about not being SSN-based

The WIN had become available in the data feeds for our patron update.
We needed to change the Institution ID:
  interim step: arbitrary 14 digits -> WIN
  final step: WIN -> BroncoNetID
Patron update was switched from being SSN-based to WIN-based.
(BroncoNetID is our single sign-on ID.)

Summer 2008: what we started with
We had data for about 74,000 patrons and about 183,000 barcodes (fewer than half of them active!).
There were several thousand duplicate records: one with the SSN and one with the WIN (in the SSAN field).
The older duplicate record typically had the charges, amounts owed, etc.

2008: August to October
Most of my time was spent on the cleanup…
(Picture: Dali)

August
Patron duplicate detector: LB4020
  foreign students
  various errors
Sample follows…

[Sample LB4020 output, used one day. The WINs and SSNs in the sample are not real.]

Our first run came up with 3489 duplicate patron records.
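The kind of check a duplicate detector like LB4020 performs can be sketched roughly as follows. This is a minimal illustration in Python, not the LB4020 code itself (which is a Perl script and is not shown in the talk). The matching rule (normalized patron name) and the field names `name` and `id_kind` are assumptions made for the example; matching on name alone would produce false positives in real data.

```python
from collections import defaultdict


def find_duplicates(patrons):
    """Return name groups that contain both an SSN-based and a WIN-based record.

    `patrons` is a list of dicts with hypothetical keys:
      name    -- patron name as stored
      id_kind -- "SSN" or "WIN", i.e. what the SSAN field holds
    """
    groups = defaultdict(list)
    for patron in patrons:
        # Normalize case and whitespace so "Doe,  Jane" matches "doe, jane".
        key = " ".join(patron["name"].lower().split())
        groups[key].append(patron)

    duplicates = {}
    for key, records in groups.items():
        kinds = {record["id_kind"] for record in records}
        # A duplicate pair here means one SSN-based and one WIN-based record.
        if {"SSN", "WIN"} <= kinds:
            duplicates[key] = records
    return duplicates
```

A report would then list each group with whatever columns are useful for review, such as the expiration date added to LB4020 later in the project.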

We created a program, call it LB4020fix, that used the LB4020 report as input to identify the patron records that we wanted to alter. These records needed to be extracted from Voyager for modification and re-import.

Voyager has a patron extract utility, but it doesn’t extract all relevant data for a patron. We had started using our own, patronsif.pl, years ago.

Voyager extract (Pptrnextr):
  Up to 3 patron-barcode + group combinations
  Similarly limited number of addresses
WMU extract (patronsif.pl):
  Unlimited patron-barcode + group combinations
  Unlimited number of addresses

For the patron cleanup we incorporated patronsif.pl into LB4020fix.
Patron notes field problem: a CR+LF is stored wherever the user pressed the RETURN key, which creates unwanted extra lines within a record. The drop_crlf utility replaces each CR+LF with two spaces (“space+space”).
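The CR+LF replacement described above can be sketched in a few lines. This is an illustrative Python equivalent, not the drop_crlf script itself. It works on bytes so that no newline translation interferes, and it swaps the two-byte CR+LF pair for two spaces, which leaves the length of each record unchanged.

```python
def drop_crlf(data: bytes) -> bytes:
    """Replace every CR+LF pair with two spaces, preserving overall length."""
    return data.replace(b"\r\n", b"  ")


def drop_crlf_in_file(path: str) -> None:
    """Rewrite the file at `path` in place with CR+LF pairs replaced."""
    with open(path, "rb") as handle:
        cleaned = drop_crlf(handle.read())
    with open(path, "wb") as handle:
        handle.write(cleaned)
```

Lone LF characters, such as the ones that end each record on a Unix system, are left untouched.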

The heart of the cleanup process
LB4020fix reads the duplicate report (LB4020) and extracts patron SIF format data for the duplicate records. It produces three files, loaded in this order:
  1. SIF-A: new WIN-based records. These have the current BroncoNetID; change expiredate to [date missing]. Load: update, key on SSN; then purge on expiredate [remove new records].
  2. SIF-B: old SSN-based records. Change InstitutionID to the current BroncoNetID. Load: update, key on SSN [prep old records to be “new”].
  3. SIF-C: new WIN-based records. These have the current update, expire, and purge dates and BroncoNetID. Load: update, key on InstID [unify old records with new data].
This clean-up process, with variations, was repeated many times. Details omitted here for the sake of brevity (and sanity).
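The three record sets described above can be sketched as plain data transformations. This is only an illustration in Python under several assumptions: records are dicts with hypothetical keys `ssan`, `institution_id`, and `expire_date`; each duplicate is an (old SSN-based, new WIN-based) pair; and `marker_expire` stands in for the expiration date used in SIF-A, whose actual value is not given in the talk. Writing real patron SIF files and running the Voyager updates and purge are not shown.

```python
def build_sif_sets(pairs, marker_expire):
    """Build the SIF-A, SIF-B, and SIF-C record lists from duplicate pairs.

    pairs         -- list of (old_record, new_record) dicts
    marker_expire -- hypothetical expiration date that marks SIF-A records
                     so a purge on expiredate removes them
    """
    sif_a, sif_b, sif_c = [], [], []
    for old, new in pairs:
        # SIF-A: the new record, re-dated so that the purge removes it.
        doomed = dict(new)
        doomed["expire_date"] = marker_expire
        sif_a.append(doomed)

        # SIF-B: the old record, given the current BroncoNetID as its
        # Institution ID so a later update can key on it.
        prepared = dict(old)
        prepared["institution_id"] = new["institution_id"]
        sif_b.append(prepared)

        # SIF-C: the new record's current data, applied last (keyed on
        # Institution ID) to the surviving old record.
        sif_c.append(dict(new))
    return sif_a, sif_b, sif_c
```

Copying each dict keeps the extracted originals intact, which helps when the process has to be repeated with variations.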

Several things went awry along the way. Not all records could be matched up with a WIN or SSN (as reported by LB4020), so those had to be handled by assigning temporary SSNs, WINs, and/or Institution IDs. At another point, the interim records used in the process weren’t deleted during a purge. Those had to be detected, reassigned an older expiration date ( ), and carefully purged before proceeding.

We now had 1081 duplicate patron records.

We added the expiration date to the duplicate detector, LB4020. Now we could see that all the SSN-based records were expired, or about to be.
At this time we discovered that new WIN-based records were coming in as duplicates of SSN-based records that were typically set to expire [date missing].
This had to change! And the semester was about to start…

Early September…
Yes, we did avert disaster. But we had more problems.
The duplicate detection report, which had grown to 60 pages, was now down to 1. The next day it had grown to 3 pages.
Some records did not have all fields populated on the LB4020 duplicate detector report, which caused problems. We also had to fix duplicate records where the SSAN field was null.

Mid September…
We removed several hundred obsolete records that had neither WIN nor SSN.
We discovered records that had no Institution ID: yet another problem.
We were now down to 1 SSN-based record. This person's assigned WIN was the same as the SSN. Not supposed to happen! We identified 15 more such instances and submitted them to I.T. for correction.

October…
We found some more SSN-based records (we don’t know why they still existed) and converted them to being WIN-based.
We flipped the “switch” so that we no longer get SSNs for our patron update.

Legacy data
We still had records from our NOTIS era (pre-Summer 1998). We purged them if they:
  did not have lifetime borrowing privileges
  did not have an SSN recorded
  did have an Institution ID
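The purge criteria above translate naturally into a filter. The sketch below is an illustration in Python, assuming that all three conditions must hold together and that records are dicts with hypothetical keys `create_date`, `lifetime_privileges`, `ssn`, and `institution_id`. The exact cutoff for "pre-Summer 1998" is not given in the talk, so it is passed in as a parameter.

```python
from datetime import date


def is_legacy_purge_candidate(record, cutoff):
    """True if a NOTIS-era record meets all three purge conditions."""
    return (
        record["create_date"] < cutoff          # created in the NOTIS era
        and not record["lifetime_privileges"]   # no lifetime borrowing privileges
        and not record["ssn"]                   # no SSN recorded
        and bool(record["institution_id"])      # has an Institution ID
    )


def select_legacy_purge(records, cutoff):
    """Return the records that qualify for the legacy purge."""
    return [r for r in records if is_legacy_purge_candidate(r, cutoff)]
```

Keeping the test in one small function makes it easy to review the selection before anything is actually purged.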

Trouble ahead…
Multiple active barcodes will NOT work with 3M SelfCheck!

3M SelfCheck requires exactly 1 active barcode per patron, and we had patrons with multiple active barcodes.
We wrote a program to whittle that down. We got them reduced to 300, but the next day the count was up to 1777!
This is under control now with patrononeactive.pl, which runs Monday through Friday and keeps only the most current active barcode for each patron.
We had forgotten about the patron records without an Institution ID. There were 882 of them. We fixed them.
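The rule of keeping only the most current active barcode can be sketched as below. This is an illustration in Python, not the patrononeactive.pl code. It assumes each barcode is a dict with hypothetical keys `barcode`, `status`, and `status_date` (an ISO date string), and that "most current" means the latest `status_date`; demoted barcodes get the status "other", as described in the resource notes for patrononeactive.pl.

```python
def keep_one_active(barcodes):
    """Return copies of a patron's barcodes with at most one left active."""
    active = [b for b in barcodes if b["status"] == "active"]
    if len(active) <= 1:
        return [dict(b) for b in barcodes]

    # ISO date strings (YYYY-MM-DD) sort chronologically as plain text.
    newest = max(active, key=lambda b: b["status_date"])

    result = []
    for b in barcodes:
        updated = dict(b)
        if b["status"] == "active" and b is not newest:
            updated["status"] = "other"
        result.append(updated)
    return result
```

Because new data keeps arriving, a check like this has to run on a schedule rather than once, which matches the jump from 300 to 1777 overnight.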

An eye towards the future…
We looked at records created before 2008 that had no SSN but did have an Institution ID.
We extracted these records and modified them:
  expiredate = createdate
  purgedate = expiredate + 4 years
We reimported these records. They should disappear with future annual patron purges.
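The two date assignments above amount to a little date arithmetic. The Python sketch below is an illustration only; the function names are made up for the example, and the handling of a February 29 create date (falling back to February 28 when the target year is not a leap year) is an assumption, since the talk does not say how that case was treated.

```python
from datetime import date


def add_years(day: date, years: int) -> date:
    """Return the same calendar day `years` later, using Feb 28 if needed."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        # Only Feb 29 can fail: the target year has no such day.
        return day.replace(year=day.year + years, day=28)


def retirement_dates(create_date: date):
    """Apply expiredate = createdate and purgedate = expiredate + 4 years."""
    expire_date = create_date
    purge_date = add_years(expire_date, 4)
    return expire_date, purge_date
```

For a record created on 2 September 2003, this gives an expire date of 2 September 2003 and a purge date of 2 September 2007, so the record falls out at the next annual purge.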

What we ended with
We still had 11,696 records with no SSN (nor WIN). We expect most of these to be routinely purged in the future, leaving us with 456.
When we started, we had about 250,000 patron records. We now have about 68,000.
Duplicate records are routinely dealt with.
We filter out all but the single most current active barcode for a patron.
We will have annual patron purges.

Worthwhile points…
Know what you’re starting with.
Keep your goal in mind.
Figure out a good solution.
Be flexible.
Be ready for mistakes.
Watch out for new/current data undoing your changes.
Know when you’re done.

Resources
  patronsif.pl
  drop_crlf
  lb4020.pl
  lb4020fix.pl
  patrononeactive.pl
  patrononeactive.ksh
Contact me if you would like to get any of the above.

Some details on the resources…
patronsif.pl: as listed, gets patron data and puts it in patron SIF format. Institution ID based. Gets all patron+barcode groupings. (Not site-specific.)
drop_crlf: shell script that contains this line:
    perl -pi -e 's/\r\n/  /g' $1
It replaces each CR+LF combination with two spaces. (This is useful any time you use patronsif.pl.)

lb4020.pl: detects duplicate patron records. Shows: name, expired (Y/N), SSAN, expire date, modify date, institution ID. WMU-specific: indicates whether the SSAN field holds an SSN or a WIN. Modification required for your institution.
lb4020fix.pl: control structure around patronsif.pl code that uses lb4020.pl output as the starting point for the fixing process. Creates one or more patron SIF files for fixing data. Use drop_crlf if necessary.

patrononeactive.pl: queries Voyager, checking patrons’ active barcodes. If more than one is found, changes all but the most recent active barcode to “other”. Check the code carefully, as it may need modification for your use. (Incorporates patronsif.pl code.)
patrononeactive.ksh: combines patrononeactive.pl and drop_crlf in a script suitable for cron use.

Thank you for listening.
Roy Zimmer
Picture © 2008 by Roy Zimmer