Presentation is loading. Please wait.

Presentation is loading. Please wait.

MODIFY your way of thinking when it comes to anomalous data formats Steve Simon State Street Corporation.

Similar presentations


Presentation on theme: "MODIFY your way of thinking when it comes to anomalous data formats Steve Simon State Street Corporation."— Presentation transcript:

1 MODIFY your way of thinking when it comes to anomalous data formats Steve Simon State Street Corporation

2 What we shall examine during this hour Data files of different formats. Examine ways and means of massaging the different formats into one ‘usable’ format. Examine ways of “manufacturing” records to facilitate generating end user reports.

3 A bit of history While working at a major airline a few years back, I encountered problem where one of our databases contained flight information with a start date of the service and a planned termination date for the service”.

4 OrigDestOrigin CityDest City Start Date End Date ALBSDFALBANY NY LOUISV ILLE KY 2007 0122 2008 0403 ABQLBBLUBBOCK TX ALBUQ UERQU E NM 2007 0101 2007 1031

5 Our booking database on the other hand contained ‘daily records’ of the seating status of each class, for each flight segment (which could consist of one or more legs). This necessitated the break down of the data shown above into “a record per day” format.

6 DateOrigDestThe Key 20070101ABQLBB20070101ABQLBB 20070102ABQLBB20070102ABQLBB 20070103ABQLBB20070103ABQLBB 20070104ABQLBB20070104ABQLBB

7 So that we could effect a join to the “Available Seating” database.

8 KEYFYMNQS 20070101ABQLBB981239 20070102ABQLBB252734 20070103ABQLBB441895 20070104ABQLBB921134 Available Seating

9

10

11

12

13 The raw data

14

15 FILEDEF RAW DISK C:/ibi/apps/steve/AirlineSchedule.txt FILEDEF AIRLINE DISK C:\ibi\apps\steve\AIRLINE.FOC -RUN CREATE FILE AIRLINE -RUN MODIFY FILE AIRLINE FIXFORM DEPARTURE/3 DEPARTURECITY/50 ARRIVAL/3 ARRIVALCITY/50 FIXFORM STARTDATE/A8 ENDDATE/A8 MATCH WITH-UNIQUES DEPARTURE ARRIVAL ON MATCH REJECT ON NOMATCH INCLUDE DATA ON RAW END

16

17 Creating that “record per day”

18 FILEDEF AIRLINE1 DISK C:/ibi/apps/steve/AIRLINE.OUTTT -RUN MODIFY FILE AIRLINE COMPUTE STARTDATE1/YYMD = 0; COMPUTE ENDDATE1/YYMD = 0; COMPUTE STARTCITY/A50=; COMPUTE ENDCITY/A50=; COMPUTE STARTCODE/A3=; COMPUTE ENDCODE/A3=; COMPUTE TEMPDATE/YYMD=0; PERFORM EXTRACT1 Filedef’s and variable initialization

19 We shall utilize the Scratch Pad Area (SPA)

20 Get the data from the database record by record CASE EXTRACT1 NEXT WITH-UNIQUES DEPARTURE ARRIVAL ON NEXT ACTIVATE DEPARTURECITY ARRIVALCITY STARTDATE ENDDATE ON NEXT COMPUTE STARTDATE1= D.STARTDATE; ON NEXT COMPUTE ENDDATE1 = D.ENDDATE; ON NEXT COMPUTE STARTCITY = D.DEPARTURECITY; ON NEXT COMPUTE ENDCITY = D.ARRIVALCITY; ON NEXT COMPUTE STARTCODE = D.DEPARTURE; ON NEXT COMPUTE ENDCODE = D.ARRIVAL; ON NEXT COMPUTE TEMPDATE = D.STARTDATE; ON NEXT PERFORM EXTRACT2 ON NONEXT GOTO EXIT ENDCASE

21 Start date greater than end date? Yes: quit case No: write the record to file CASE EXTRACT2 IF TEMPDATE GT ENDDATE1 THEN PERFORM EXTRACT1; TYPE ON AIRLINE1 " " COMPUTE TEMPDATE = TEMPDATE + 1; GOTO EXTRACT2 ENDCASE DATA END -RUN

22 The output

23

24 Where do we go from here? The available seating table resides in a SQL Server database

25 Load this data into our SQL Server data repository

26 Create INSERT statements

27 FILEDEF ROUTECOUNT DISK C:/ibi/apps/steve/AIRLINE.OUTTT -RUN APP HOLD steve TABLE FILE ROUTECOUNT PRINT * ON TABLE HOLD AS RECCOUNT END -SET &LLINES = &LINES; -START111 -SET &FILENUM = 1; -SET &CURRENTCTR =0; -SET &FIRSTLINE = 'INSERT INTO DailyFlights(Date,Start,Destination,'; -SET &FIRSTLINE1 = 'StartCity,DestinationCity)'; -SET &SECONDLINE =; -SET &THIRDLINE = ; -SET &APOST = HEXBYT(39,'A1'); -SET &DATEE=; -SET &STARTC=; -SET &DESTC=; -SET &SCDEST=; -SET &ECDEST=;

28 Write the SQL “Use”Statements FILEDEF ROUTECOUNT1 DISK C:/ibi/apps/steve/AIRLINE.OUTTT FILEDEF SCHEDULE DISK C:/ibi/apps/steve/AIRLINE.SQL1 -RUN -WRITE SCHEDULE USE FUSE2007 -WRITE SCHEDULE GO -WRITE SCHEDULE BEGIN TRANSACTION

29 Read all records & write to file -REPEAT LOOPER FOR &I FROM 1 TO &LLINES STEP 1 -READ ROUTECOUNT1 &A.2 &DATEE.10 &C.1 &STARTC.3 &A.1 &DESTC.3 &B.1 &SCDEST.50, - &CA.1 &ECDEST.50 -SET &SECONDLINE = ' VALUES (' || &APOST || &DATEE || &APOST; -SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &STARTC || &APOST; -SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &DESTC || &APOST; -SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &SCDEST || &APOST; -SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &ECDEST || &APOST; -SET &SECONDLINE = &SECONDLINE || ');'; -WRITE SCHEDULE &FIRSTLINE -WRITE SCHEDULE &FIRSTLINE1 -WRITE SCHEDULE &SECONDLINE -LOOPER -WRITE SCHEDULE COMMIT TRANSACTION

30 The Insert Statements USE FUSE2007 GO BEGIN TRANSACTION INSERT INTO DailyFlights(Date,Start,Destination, StartCity,DestinationCity) VALUES ('2006/11/30','ABE','MHT','ALLENTOWN, PA','MANCHESTER, NH'); INSERT INTO DailyFlights(Date,Start,Destination, StartCity,DestinationCity) VALUES ('2006/12/01','ABE','MHT','ALLENTOWN, PA','MANCHESTER, NH'); ….. COMMIT TRANSACTION

31 The 50 million foot view

32 Raw Data Sequential File SQL Statements File System Watcher & SSIS Load Package File System Watcher & SSIS Load Package

33

34

35

36

37 cd C:\Program Files\Microsoft SQL Server\90\DTS\Binn DTExec /f "C:\AirlineScheduleLoad\AirlineSc hedule\AirlineSchedule\bin\LoadS chedule.dtsx"

38

39

40

41

42 Query created by join

43

44 JOIN DATEM AND START AND DESTINATION IN DAILYFLIGHTS TO DATEE AND START AND DESTINATION IN BOOKINGS AS J1 -RUN DEFINE FILE DAILYFLIGHTS CITYPAIR/A10 = START || '-'|| DESTINATION; END TABLE FILE DAILYFLIGHTS PRINT DATEM AS 'Date‘ CITYPAIR AS 'City Pair‘ STARTCITY AS 'Origin' DESTINATIONCITY AS 'Destination‘ FCLASS AS 'F’ YCLASS AS 'Y‘ MCLASS AS 'M' NCLASS AS 'N‘ QCLASS AS 'Q‘ SCLASS AS 'S' BY START NOPRINT BY DESTINATION NOPRINT BY DATEM NOPRINT ON TABLE SUBHEAD "Orange Free State Airlines" "Flight Schedule“ …..

45

46

47 During this hour we Examined data files of different formats. Examined ways and means of massaging the different formats into one ‘usable’ format. Examined ways of “manufacturing” records to facilitate generating end user reports. Verified that the data was correct.

48 During this hour we Saw that there were many “different” ways to modify anomalous data into the format of your choice, to produce the reports that you require. Which really goes to show that you can..

49 MODIFY your way of thinking when it comes to anomalous data formats Steve Simon State Street Corporation PowerPoint presentation & code samples may be found at: http://cid- 4c765fc825912e4d.skydrive.live.com/browse.aspx/Public or by email ssimon@statestreet.com


Download ppt "MODIFY your way of thinking when it comes to anomalous data formats Steve Simon State Street Corporation."

Similar presentations


Ads by Google