SAS Mapping functionality to measure and present the Veracity of Location Data
2 University of Derby, UK
  Richard J Self, Senior Lecturer in Analytics and Governance
  Vishal Patel, Final Year Student, University of Derby
  Daniel Corah, Final Year Student, University of Derby
  Viktor Horecny, Final Year Student, University of Derby
3 Objectives
  SAS: Exploring Mapping Functionality to Visualise Veracity of Location Data
  Lessons Learned about Location Data Veracity and SAS Visualisations
4 Context
  Smart-device location services are seen as reliable
  This may not be true, and the consequences are many:
    Retail LBS-based marketing
    Social network apps
    Photo locations in social media and Google Maps
    Forensics
    Criminal justice system
  Research question: To what extent is A-GPS reliable, and in what circumstances?
5 Triggers to Research Project
6 Final Year Student Project
  12 students researching
  3 are co-authors, contributing valuable analyses
  7 students contributed data to this presentation (2460 data points):
    Daniel Corah
    Vishal Patel
    Amna Almutawa
    Ishwa Khadka
    Viktor Horecny
    Shehzaad Kashmiri
    Farondeep Bains
7 Critical Questions
  Levels of accuracy in different conditions
    Indoors / outdoors
    Rural / residential / urban
    Weather conditions
  Stability of indicated location
  Differences between devices (make / model / operating system)
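8 V Patel - Key Insight - Models Vary
  [t-test output. First table, columns: phone, N, Mean, Std Dev, Std Err; rows: Nexus, iPhone, Diff (1-2). Second table, columns: Method, Variances, DF, t Value, Pr > |t|; rows: Pooled (Equal), Satterthwaite (Unequal).]
  Proc Univariate - Histogram issues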
9 D Corah - Key Insight - Stone Built Houses
  Proc SGPLOT
10 V Horecny - Key Insight - Chipsets
  HTC-M8 (blue): modern chipset
  HTC-Desire S (pink): early version chipset
  Uses XL/JMP®
11 Other Insights
  Cloud conditions affect accuracy
  Accuracy variable with time
12 Overall Accuracy of LBS
  85% <= 25 metres
  2364 out of 2420 (about 97.7%) <= 500 m
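A minimal sketch of how threshold shares like these could be tabulated, assuming a data set with one numeric `error` column in metres. The data set name `work.errors` and the eight values are invented for illustration and are not the study data.

```sas
/* Hypothetical error values in metres (illustration only) */
data work.errors;
  input error @@;
  datalines;
3 12 24 25 40 180 320 510
;
run;

/* In PROC SQL a comparison evaluates to 1 (true) or 0 (false),  */
/* so SUM counts the matching rows and MEAN gives their share.   */
proc sql;
  select count(*)           as n_points,
         sum(error <= 25)   as n_within_25m,
         mean(error <= 25)  as share_within_25m format=percent8.1,
         sum(error <= 500)  as n_within_500m,
         mean(error <= 500) as share_within_500m format=percent8.1
  from work.errors;
quit;
```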
13 Accuracy Variable with Time
  At start-up of location services (LS), max error 360 m
  Uses Annotate coding and macros
14 Accuracy Variable with Time
15 Consolidated Data - 2420 points
  Red = error > 300 m
16 Annotate for Time Based Accuracy
  Challenges
    Auto-scaling and boundaries
    Data System
    ANNOMAC coding for labels
17 Raw Data
  Columns: Lat_True_Deg, Long_True_Deg, Lat_Xif_Deg, Long_Xif_Deg, Loc_Ind, Image_Path, Date_Time_Stamp, Phone_type

  Loc_Ind     Image_Path      Date_Time_Stamp   Phone_type
  Open        IMG_0464.JPG    06/09/ :24:24     iPhone 5C
  Open        IMG_0466.JPG    17/08/ :56:51     iPhone 5C
  Open        IMG_0465.JPG    17/08/ :52:47     iPhone 5C
  circuit 1   IMG_01102.jpg   23/03/ :25:46     iPhone 5C
  circuit 1   IMG_01103.jpg   23/03/ :25:47     iPhone 5C
  circuit 1   IMG_01104.jpg   23/03/ :25:48     iPhone 5C
  circuit 1   IMG_01105.jpg   23/03/ :25:49     iPhone 5C
  circuit 1   IMG_01106.jpg   23/03/ :25:50     iPhone 5C
  circuit 1   IMG_01107.jpg   23/03/ :25:51     iPhone 5C
  circuit 1   IMG_01108.jpg   23/03/ :25:52     iPhone 5C
  circuit 1   IMG_01109.jpg   23/03/ :25:53     iPhone 5C

  Lat_True_Deg and Long_True_Deg found through Google Maps
  Lat_Xif_Deg, Long_Xif_Deg and Date_Time_Stamp read from images using IrfanView
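The slides do not show how the `error` value used in the later dot-generation step was derived from these columns. One plausible route, offered only as a sketch, is the SAS GEODIST function, which returns the great-circle distance between two latitude/longitude pairs. The data set name `work.sample` and the coordinates below are invented for illustration.

```sas
/* Hypothetical rows: the coordinate values are invented, not study data */
data work.sample;
  input Lat_True_Deg Long_True_Deg Lat_Xif_Deg Long_Xif_Deg;
  /* GEODIST takes degrees by default; 'K' asks for kilometres. */
  /* Multiply by 1000 to express the error in metres.           */
  error = geodist(Lat_True_Deg, Long_True_Deg,
                  Lat_Xif_Deg,  Long_Xif_Deg, 'K') * 1000;
  datalines;
52.9380 -1.4960 52.9381 -1.4962
52.9380 -1.4960 52.9400 -1.5000
;
run;

proc print data=work.sample;
run;
```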
18 Boundaries
  proc means data=work_derby min max noprint;
    output out=means_derby;
    var x y;
  run;

  /* deduce and output corner coordinates (in Lat (Y) / Long Degrees (X)) */
  /* and output using symput */
  data _null_;
    set means_derby;
    if _stat_ = 'MIN' then do;
      call symput('min_x', x);
      call symput('min_y', y);
    end;
    if _stat_ = 'MAX' then do;
      call symput('max_x', x);
      call symput('max_y', y);
    end;
  run;
19 Auto-Scaling
  xsys = '1';      /* using Frame area */
  ysys = '1';
  hsys = '3';
  dotsize = 0.5;   /* basic size of plotted error dot, % of frame */

  /* plot data in centered 90% of Frame Area */
  /* min_x etc. set from previous section */
  x = (90 - (x - symget('min_x')) * 90 / (symget('max_x') - symget('min_x'))) + 5;
  y = (y - symget('min_y')) * 90 / (symget('max_y') - symget('min_y')) + 5;
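As an alternative sketch, not the authors' method, the same 5%-to-95% rescaling can be done without macro variables by merging the bounds into the data step. That avoids the character-to-numeric conversion that SYMGET implies. The data set names `work.pts`, `work.bounds` and `work.scaled`, the output variables `x_pct` and `y_pct`, and the three sample points are assumptions for illustration.

```sas
/* Hypothetical points: x = longitude, y = latitude, in degrees */
data work.pts;
  input x y;
  datalines;
-1.50 52.90
-1.48 52.92
-1.46 52.95
;
run;

/* One output row holding the four bounds as named numeric variables */
proc means data=work.pts noprint;
  var x y;
  output out=work.bounds min=min_x min_y max=max_x max_y;
run;

data work.scaled;
  /* Read the bounds once; variables from SET are retained across rows */
  if _n_ = 1 then set work.bounds(keep=min_x min_y max_x max_y);
  set work.pts;
  /* Same formulas as the slide: x is inverted, y is not.      */
  /* Assumes max > min on both axes (no division by zero).     */
  x_pct = (90 - (x - min_x) * 90 / (max_x - min_x)) + 5;
  y_pct = (y - min_y) * 90 / (max_y - min_y) + 5;
run;
```

Keeping the bounds as numeric variables also preserves full precision, whereas values passed through macro variables are formatted as text first.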
20 Dot Generation - using annomac macros
  if error < 1 then do;
    dotsize = dotsize * 1;     /* small dot for high accuracy */
    %slice(x,y,0,360,dotsize,darkgreen,solid,3);   /* different colors for different errors */
  end;
  else if error >= 1 and error < 10 then do;
    dotsize = dotsize * 1.5;
    %slice(x,y,0,360,dotsize,mediumgreen,solid,3);
  end;
  else if error >= 10 and error < 100 then do;
    dotsize = dotsize * 2;
    %slice(x,y,0,360,dotsize,mediumyellow,solid,3);
  end;
  else if error >= 100 and error < 200 then do;
    dotsize = dotsize * 2.5;   /* large dot for big error */
    %slice(x,y,0,360,dotsize,darkyellow,solid,3);
  end;
21 Adding Sequence Labels
  length color $64 number $4 posn 8;
  retain posn;
  if _n_ = 1 then posn = 0;
  posn = posn + 1;
  if posn = 10 then posn = 1;   /* similar to using the MOD( ) function base 9 */

  /* Position cannot be added from a variable in the label macro */
  /* (last macro parameter), so each position needs its own call */
  if posn = 1 then do;
    %label(x,y,number,white,0,0,3,times new roman,1);
  end;
  else if posn = 2 then do;
    %label(x,y,number,white,0,0,3,times new roman,2);
  end;
  /* etc. */
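The slide's comment notes that the retained counter behaves like the MOD function. A small sketch of that equivalence, assuming a loop index `i` stands in for `_n_` and using the hypothetical name `posn_mod` for the MOD-based value:

```sas
data _null_;
  posn = 0;
  do i = 1 to 20;
    /* Counter from the slide: cycles 1, 2, ..., 9, 1, 2, ... */
    posn = posn + 1;
    if posn = 10 then posn = 1;

    /* MOD-based equivalent, with no retained state needed */
    posn_mod = mod(i - 1, 9) + 1;

    /* Report any row where the two approaches disagree */
    if posn ne posn_mod then put 'Mismatch at ' i= posn= posn_mod=;
  end;
run;
```

In the original data step the same expression would read `posn = mod(_n_ - 1, 9) + 1;`, which removes the need for the RETAIN statement and the `_n_ = 1` initialisation.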
22 Final Output - Using Proc GANNO
  goptions reset=all border cback=black ctitle=white;
  proc ganno annotate=workanno;   /* from previous Data Step */
  run;
23 Conclusions
  Mapping relies on using Annotate
  Can be displayed in Proc GMAP or GANNO
  GANNO allows simple scaling
24 Session ID #3202