Published by Brent Beasley. Modified over 9 years ago.
1
An R vs SAS Experiment
Megan Pope and Gareth Clews
Office for National Statistics
2
R at ONS
- Open source software in ONS
- Supporting the government IT strategy
- Development of training for GSS
- R Development Group
  i. Support use of R within ONS
  ii. Increase user base
  iii. Aim for incorporation in production systems
- Teaching R to a SAS audience
- Increasing usage
3
SAS at ONS
- Designated standard software
- Statistics Canada Generalised Estimation System (GES)
- Suite of SAS macros
- Calibration weights, domain estimates, variance estimates
4
ReGenesees
- Free R package
- R Evolved Generalised Software for Sampling Estimates and Errors in Surveys
- Developed by the Italian Statistics Office (Istat)
5
R vs SAS
- Comparative study of complex survey estimation software
- Quality Improvement Fund (QIF)
- SAS (GES) vs R (ReGenesees)
- Investigating open source in line with GSS strategy
6
Calibration
- Used if there is a relationship between auxiliary data and the response variable
- An estimation procedure which constrains sample-based estimates of auxiliary variables to known totals (or accurate estimates)
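The constraint described on this slide can be illustrated with a minimal base R sketch. It is not taken from the presentation and does not use GES or ReGenesees: the sample, the design weights and the "known" totals are all invented, and linear (chi-square distance) calibration is assumed as one common choice of method.

```r
# Hypothetical toy sample: every number here is invented for illustration.
d <- c(10, 10, 20, 20, 15)                          # design weights
X <- cbind(intercept = 1, size = c(2, 4, 3, 5, 6))  # auxiliary variables
totals <- c(intercept = 80, size = 330)             # assumed known population totals

# Linear calibration: find weights w = d * (1 + X %*% lambda) that are
# close to d but reproduce the known totals exactly.
lambda <- solve(t(X) %*% (d * X), totals - colSums(d * X))
w <- d * as.vector(1 + X %*% lambda)

# The weighted sample estimates of the auxiliaries now match the totals.
calibrated_totals <- colSums(w * X)
print(calibrated_totals)
```

The calibrated weights `w` would then be applied to the response variable in place of the design weights `d`.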
7
Surveys chosen and why...
Business surveys
- QSI: cut-off sample
- BRES: separate calibration totals; set thresholds for Winsorisation
- ABS: biggest survey, with 4,000 strata; externally calibrated weights
8
Surveys chosen and why...
Social surveys
- LFS: biggest survey, resource intensive
- LOS: longitudinal
- IPS: 2-stage calibration
9
Quarterly Stock Inquiry
- Cut-off sampling
- Combined ratio estimation
- Calibration to one auxiliary
- Estimates and variance estimates
- GES: seven separate input files
- ReGenesees: six simple commands
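As a rough illustration of combined ratio estimation with a single auxiliary, the base R sketch below pools one ratio across strata and shows that it is equivalent to scaling the design weights by a single calibration factor. The data frame, the variable names and the population total are hypothetical; this is not QSI data and not the GES or ReGenesees implementation, and variance estimation is left out.

```r
# Hypothetical stratified sample: all values are invented for illustration.
sample_df <- data.frame(
  stratum = c("A", "A", "B", "B", "B"),
  d = c(5, 5, 10, 10, 10),       # design weights
  x = c(100, 120, 30, 40, 50),   # auxiliary variable
  y = c(20, 26, 5, 9, 10)        # survey variable
)
x_total <- 2500                  # assumed known population total of x

# Combined ratio estimator: one ratio pooled over all strata.
r_hat <- with(sample_df, sum(d * y) / sum(d * x))
y_hat <- x_total * r_hat

# Equivalent view: calibrate to one auxiliary with a single factor g.
g <- x_total / with(sample_df, sum(d * x))
w <- sample_df$d * g
y_hat_cal <- sum(w * sample_df$y)
```

Here `y_hat` and `y_hat_cal` agree, and the weights `w` reproduce `x_total` when applied to `x`.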
10
Quarterly Stock Inquiry - GES
11
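Quarterly Stock Inquiry - ReGenesees
# Argument values are left blank, as on the original slide.
# 1. Define the sampling design
design <- e.svydesign(data= ids= strata= weights= fpc=)
# 2. Build an empty template for the known population totals
template <- pop.template(data= calmodel= partition=)
# 3. Fill the template from the universe (frame) data
pop <- fill.template(universe= template=)
# 4. Check that the totals match the calibration model
population.check(df.population= data= calmodel= partition=)
# 5. Calibrate the design weights
cal <- e.calibrate(design= df.population= sigma2=)
# 6. Estimate totals or means, optionally by domain
est <- svystatTM(design= y= by=)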
12
What we found...
- Software comparison
- Time
- Missing values
- Programming
13
Conclusions/Recommendations
- ReGenesees successfully used in place of GES
- ReGenesees easier: less risk!
- GES more capable for some aspects and vice versa
- Recommend to explore further!
14
Questions