HESA Student Record 2007/08 Catherine Benfield Head of Operations Development Kyle Summers Head of Software Development
Session Outline Update on documentation –Catherine Validation –Kyle
Documentation Major update (Version 2.0) released before Christmas –*J specification –New guidance documents on Modules and Courses –Sample XML data –Responses to queries raised at seminars and to box
Updates Documentation now at Version 2.3 Further updates will provide more guidance and necessary changes
What else is HESA working on?
V0.1 validation rules
Specification of the HIN process
Standard groupings, populations and other derived fields
Internal database structure
Formalising arrangements with JCQ
Finalising specification for the Aggregate Overseas Record
Software support Working with software houses –Consultancy –User groups Exploring other support mechanisms with UCISA –Institutions with in-house systems
For the future Re-specification of: –PIs –TQI –NSS population –POPDLHE Check documentation design What else do you need from us?
Validation kits [Diagram: Student data, Validation kit, Validation result; Business Rules, Schema; View, Print, Save]
Types of checks Structural (schema or XSD) Business rules
Structural
Everything exists where expected
Fields contain valid codes
An Instance must appear within a Student
This field may contain one of the following valid values: xxx, yyy, zzz
This field must contain a valid date
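The three structural examples above can be sketched in a few lines of Python. This is only an illustration of the kind of check involved, not how the schema (XSD) itself is evaluated: the element names `Institution`, `Student`, `Instance`, `FIELD` and `COMDATE`, the ISO date format and the sample data are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample data; element names and layout are assumptions.
SAMPLE = """
<Institution>
  <Student>
    <Instance>
      <FIELD>xxx</FIELD>
      <COMDATE>2007-09-24</COMDATE>
    </Instance>
  </Student>
  <Instance>
    <FIELD>qqq</FIELD>
    <COMDATE>24/09/2007</COMDATE>
  </Instance>
</Institution>
"""

# Placeholder code list taken from the slide text.
VALID_CODES = {"xxx", "yyy", "zzz"}


def structural_errors(xml_text):
    """Return a list of structural error messages for the given XML text."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    errors = []
    for inst in root.iter("Instance"):
        parent = parent_of.get(inst)
        if parent is None or parent.tag != "Student":
            errors.append("An Instance must appear within a Student")
        code = inst.findtext("FIELD")
        if code is not None and code not in VALID_CODES:
            errors.append(f"FIELD must contain one of {sorted(VALID_CODES)}, found {code!r}")
        com = inst.findtext("COMDATE")
        if com is not None:
            try:
                date.fromisoformat(com)  # assumes YYYY-MM-DD dates
            except ValueError:
                errors.append(f"COMDATE must contain a valid date, found {com!r}")
    return errors


for message in structural_errors(SAMPLE):
    print(message)
```

In practice these checks come from validating the data against the schema; the sketch only shows what "exists where expected" and "contains a valid code" mean for a single record.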
Business rules
Fields contain relevant values
Code x can only be used if Instance.FIELD is coded y
Instance.FIELD must exist where institution is in Wales
Instance.COMDATE must be before Instance.ENDDATE
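As an illustration of a business rule that compares two fields within one entity, the date rule above might be sketched as follows. The element names, the ISO date format and the choice to skip the check when either date is absent are assumptions for this example, not part of the published rules.

```python
import xml.etree.ElementTree as ET
from datetime import date

MESSAGE = "Instance.COMDATE must be before Instance.ENDDATE"


def check_dates(instance):
    """Return a list holding the rule message if the rule fails, else an empty list."""
    com = instance.findtext("COMDATE")
    end = instance.findtext("ENDDATE")
    # Assumption: the rule only applies when both dates are present.
    if com is None or end is None:
        return []
    if date.fromisoformat(com) >= date.fromisoformat(end):
        return [MESSAGE]
    return []


# Hypothetical records for illustration only.
GOOD = ET.fromstring(
    "<Instance><COMDATE>2007-09-24</COMDATE><ENDDATE>2008-06-30</ENDDATE></Instance>"
)
BAD = ET.fromstring(
    "<Instance><COMDATE>2008-09-24</COMDATE><ENDDATE>2008-06-30</ENDDATE></Instance>"
)

print(check_dates(GOOD))
print(check_dates(BAD))
```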
Business rules What do they do? Compare fields within an entity Compare fields across entities
Business rules What don’t they do? Duplication checks across the entire return Checks for postcodes HIN checks
How do business rules work? Introducing Schematron A language for making assertions about patterns found in XML documents
Schematron
Uses XML to express rules
Compares XML elements
Course.TQSSEC must not be completed if Course.TTCID contains codes 0, 3, 4 or 5
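A Schematron rule pairs a context (which elements it applies to) with an assertion and a message. The sketch below mimics that shape in plain Python rather than in Schematron itself, so that it runs with the standard library alone. The `RULES` structure, the `apply_rules` helper and the sample data are hypothetical; only the rule text comes from the slide.

```python
import xml.etree.ElementTree as ET

# Each rule mirrors the Schematron idea of context + assertion + message.
RULES = [
    {
        "context": "Course",
        # The assertion holds unless TTCID is 0, 3, 4 or 5 AND TQSSEC is filled in.
        "test": lambda course: not (
            course.findtext("TTCID") in {"0", "3", "4", "5"}
            and (course.findtext("TQSSEC") or "").strip()
        ),
        "message": "Course.TQSSEC must not be completed if "
                   "Course.TTCID contains codes 0, 3, 4 or 5",
    },
]


def apply_rules(root, rules=RULES):
    """Return the message of every assertion that fails somewhere under root."""
    failures = []
    for rule in rules:
        for node in root.iter(rule["context"]):
            if not rule["test"](node):
                failures.append(rule["message"])
    return failures


# Hypothetical data: the first Course breaks the rule, the second does not.
SAMPLE = ET.fromstring("""
<Institution>
  <Course><TTCID>3</TTCID><TQSSEC>1</TQSSEC></Course>
  <Course><TTCID>1</TTCID><TQSSEC>1</TQSSEC></Course>
</Institution>
""")

for message in apply_rules(SAMPLE):
    print(message)
```

Keeping rules as data rather than hard-coded logic is what allows a rule set to be replaced without changing the program that applies it.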
Schematron
Rules used to produce XSLT files
<a href="#{generate-id(.)}" target="_self" title="Link to where this pattern was found...">
Course.TQSSEC must not be completed if Course.TTCID contains codes 0, 3, 4 or 5
Schematron XSLT files used to process student data Result is a list of errors formatted as … XML
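To show what "a list of errors formatted as XML" could look like to a consuming program, here is a minimal sketch that serialises messages into an XML report. The `ValidationResult` and `Error` element names are invented for the example; the actual output format of the XSLT files is not described here.

```python
import xml.etree.ElementTree as ET


def errors_to_xml(messages):
    """Serialise a list of error messages as a small XML report (hypothetical format)."""
    report = ET.Element("ValidationResult")
    for message in messages:
        ET.SubElement(report, "Error").text = message
    return ET.tostring(report, encoding="unicode")


report_text = errors_to_xml([
    "Instance.COMDATE must be before Instance.ENDDATE",
    "Course.TQSSEC must not be completed if Course.TTCID contains codes 0, 3, 4 or 5",
])
print(report_text)
```

Because the result is itself XML, another system can parse it with the same tools it uses for the student data.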
Validation kit New validation kit software –New validation techniques –Graphical interface –Check for latest rules
Graphical interface Hides the complexities Select data file and collection within the kit View results within the validation kit –Optionally save or print the results
Latest rules Each set of rules identified uniquely Automatic check for newer rules from the HESA website
Results List errors … or congratulations View erroneous data in context Save or print error list
Other software
How can other software systems use these validation checks?
Schema
–Ensures the data is in the right language
–Freely available on the HESA web site
Schematron
–Ensures the data makes sense
–The XSLT files available for download
Uses industry standard techniques
Progress Basic techniques are working Performance testing the options available Provide more information on the web site