Return from Anarchy: Migrating from SPSS to SIR. Jon Johnson, 11 May 2005 (www.cls.ioe.ac.uk)

1 Return from Anarchy
Migrating from SPSS to SIR
Jon Johnson, 11 May 2005
www.cls.ioe.ac.uk

2 Introduction
- CLS runs three of the four British Birth Cohort Studies
- Multi-disciplinary study of the life-course of three generations, born in 1958, 1970 and 2000
- Data collected in various ways: paper, CAPI (computer-assisted personal interviewing), administrative data
- Complex data: 100,000 variables, 18,000 participants per study

3 History
- Punch cards, different data centres, SIR, SPSS
- The data has been through the full range of data storage fashions
- Social science versus medical data access models
- Goal of increased accessibility and understanding of relationships within the data
- Development of social science metadata standards

4 Current Data Collection
- Data collection methods such as CAPI have both positive and negative sides
  - Data is pre-punched
  - Data is pre-checked
  - Data is less understandable
  - Data is more complicated
- Recent data supplied for one sweep had more than 100,000 variables

5 Taming data
- Datasets are routinely supplied in SPSS format
- SPSS is not an ideal environment in which to manage such data
- SIR is an ideal environment in which to manage this data

6 Data migration with minimum information loss
- SPSS Data List
  - Rarely used, high level of manual intervention
- Visual Basic (a.k.a. SaxBasic)
  - Platform dependent
  - Limited functionality, multi-step process
- ODBC
  - Flaky at best
- Reverse engineer SPSS file
  - SPSS Portable format: a stable, if poorly documented, format

7 Implementation
- PQL, Perl, Python?
- Stable across operating systems
- Good text manipulation
- Good XML support
- Case-based databases

8 How it works
- Parses the SPSS file
- Grabs variable names, value labels, data values, etc.
- Looks up a configuration file for BDI settings
- Checks whether it is also setting up the database or just adding a new record
- Does some conversions: time, date, scaled variables
- Analyses the data to grab the range of values
- Writes out a warning if there are more than 3 missing values or a range of missing values
- Writes out the schema
- python spss_parser.py -f -s -d
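Two of the steps on this slide, the date conversion and the range and missing-value analysis, can be sketched as follows. This is a minimal illustration, not the code in spss_parser.py: the function names, arguments and data layout are all assumptions. The one fixed fact it relies on is that SPSS stores dates and datetimes as seconds since midnight on 14 October 1582.

```python
from datetime import datetime, timedelta

# SPSS stores date and datetime values as seconds since midnight, 14 October 1582.
SPSS_EPOCH = datetime(1582, 10, 14)


def spss_to_datetime(seconds):
    """Convert an SPSS date/datetime value to a Python datetime."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


def analyse_variable(name, values, missing_values=(), missing_range=None):
    """Return (minimum, maximum, warnings) for one variable.

    Hypothetical layout, not taken from the original parser:
      missing_values -- discrete values declared as missing
      missing_range  -- (low, high) tuple if a range is declared missing, else None
    """
    warnings = []
    if len(missing_values) > 3:
        warnings.append(f"{name}: more than 3 missing values declared")
    if missing_range is not None:
        warnings.append(f"{name}: a range of missing values is declared")

    def is_missing(value):
        if value is None or value in missing_values:
            return True
        return (missing_range is not None
                and missing_range[0] <= value <= missing_range[1])

    # Only valid (non-missing) values contribute to the reported range.
    valid = [v for v in values if not is_missing(v)]
    low = min(valid) if valid else None
    high = max(valid) if valid else None
    return low, high, warnings
```

For example, `analyse_variable("age", [23, 34, -9, 47], missing_range=(-9, -1))` reports a range of 23 to 47 together with one warning about the declared missing range.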

9 Use
- Once in SIR, the data can be restructured
- Can be extended to datasets held in other statistical packages such as Stata or SAS, by converting them via StatTransfer to SPSS portable format and proceeding from there
- Also creates XML to add to a data store (superseded!)
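As an illustration of the XML output mentioned on this slide, variable metadata could be serialised with Python's standard xml.etree.ElementTree module roughly as below. The slides do not say which schema was used, so the element and attribute names (variable, label, values, value, code) and the function name are purely hypothetical.

```python
import xml.etree.ElementTree as ET


def variable_to_xml(name, label, value_labels):
    """Serialise one variable's metadata to an XML string.

    The element and attribute names are illustrative only; the schema
    actually used in the presentation is not documented in the slides.
    """
    var = ET.Element("variable", {"name": name})
    ET.SubElement(var, "label").text = label
    values = ET.SubElement(var, "values")
    # Emit value labels in code order so the output is deterministic.
    for code, text in sorted(value_labels.items()):
        ET.SubElement(values, "value", {"code": str(code)}).text = text
    return ET.tostring(var, encoding="unicode")


if __name__ == "__main__":
    print(variable_to_xml("sex", "Sex of cohort member", {1: "Male", 2: "Female"}))
```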

