MODIFY your way of thinking when it comes to anomalous data formats. Steve Simon, State Street Corporation.

What we shall examine during this hour: data files of different formats; ways and means of massaging the different formats into one ‘usable’ format; and ways of “manufacturing” records to facilitate generating end-user reports.

A bit of history: While working at a major airline a few years back, I encountered a problem in which one of our databases contained flight information with a start date for the service and a planned termination date for the service.

Orig  Dest  Origin City  Dest City       Start Date  End Date
ALB   SDF   ALBANY NY    LOUISVILLE KY
ABQ   LBB   LUBBOCK TX   ALBUQUERQUE NM

Our booking database, on the other hand, contained ‘daily records’ of the seating status of each class for each flight segment (which could consist of one or more legs). This necessitated breaking the data shown above down into a “record per day” format.

Date  Orig  Dest  The Key
      ABQ   LBB
      ABQ   LBB
      ABQ   LBB
      ABQ   LBB
      ABQ   LBB
      ABQ   LBB
      ABQ   LBB
      ABQ   LBB

So that we could effect a join to the “Available Seating” database.

Available Seating
KEY     F  Y  M  N  Q  S
ABQLBB
ABQLBB
ABQLBB
ABQLBB

The raw data

FILEDEF RAW DISK C:/ibi/apps/steve/AirlineSchedule.txt
FILEDEF AIRLINE DISK C:\ibi\apps\steve\AIRLINE.FOC
-RUN
CREATE FILE AIRLINE
-RUN
MODIFY FILE AIRLINE
FIXFORM DEPARTURE/3 DEPARTURECITY/50 ARRIVAL/3 ARRIVALCITY/50
FIXFORM STARTDATE/A8 ENDDATE/A8
MATCH WITH-UNIQUES DEPARTURE ARRIVAL
ON MATCH REJECT
ON NOMATCH INCLUDE
DATA ON RAW
END
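For readers who do not work in FOCUS, the load step can be sketched in Python. This is only an illustration, not the presenter's code: the field widths (3, 50, 3, 50, 8, 8) come from the FIXFORM statements, while the names `Route`, `parse_line` and `load_routes` are hypothetical. It assumes one fixed-width record per line and mimics ON MATCH REJECT / ON NOMATCH INCLUDE by keeping only the first record for each departure/arrival pair.

```python
from typing import NamedTuple

# Field widths taken from the FIXFORM statements: 3, 50, 3, 50, 8, 8.
WIDTHS = (3, 50, 3, 50, 8, 8)


class Route(NamedTuple):
    departure: str
    departure_city: str
    arrival: str
    arrival_city: str
    start_date: str  # kept as 8-character text, as in STARTDATE/A8
    end_date: str


def parse_line(line: str) -> Route:
    """Slice one fixed-width record into its six fields."""
    fields = []
    pos = 0
    for width in WIDTHS:
        fields.append(line[pos:pos + width].strip())
        pos += width
    return Route(*fields)


def load_routes(lines):
    """Keep the first record per (departure, arrival); reject later duplicates."""
    routes = {}
    rejected = []
    for line in lines:
        route = parse_line(line)
        key = (route.departure, route.arrival)
        if key in routes:
            rejected.append(route)  # corresponds to ON MATCH REJECT
        else:
            routes[key] = route     # corresponds to ON NOMATCH INCLUDE
    return routes, rejected
```

Slicing by position rather than splitting on whitespace matters here because the city names contain blanks.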

Creating that “record per day”

FILEDEFs and variable initialization:
FILEDEF AIRLINE1 DISK C:/ibi/apps/steve/AIRLINE.OUTTT
-RUN
MODIFY FILE AIRLINE
COMPUTE STARTDATE1/YYMD = 0;
COMPUTE ENDDATE1/YYMD = 0;
COMPUTE STARTCITY/A50=;
COMPUTE ENDCITY/A50=;
COMPUTE STARTCODE/A3=;
COMPUTE ENDCODE/A3=;
COMPUTE TEMPDATE/YYMD=0;
PERFORM EXTRACT1

We shall utilize the Scratch Pad Area (SPA)

Get the data from the database record by record:
CASE EXTRACT1
NEXT WITH-UNIQUES DEPARTURE ARRIVAL
ON NEXT ACTIVATE DEPARTURECITY ARRIVALCITY STARTDATE ENDDATE
ON NEXT COMPUTE STARTDATE1= D.STARTDATE;
ON NEXT COMPUTE ENDDATE1 = D.ENDDATE;
ON NEXT COMPUTE STARTCITY = D.DEPARTURECITY;
ON NEXT COMPUTE ENDCITY = D.ARRIVALCITY;
ON NEXT COMPUTE STARTCODE = D.DEPARTURE;
ON NEXT COMPUTE ENDCODE = D.ARRIVAL;
ON NEXT COMPUTE TEMPDATE = D.STARTDATE;
ON NEXT PERFORM EXTRACT2
ON NONEXT GOTO EXIT
ENDCASE

Is the working date (TEMPDATE, which begins at the start date) greater than the end date? Yes: quit the case. No: write the record to the file and advance one day.
CASE EXTRACT2
IF TEMPDATE GT ENDDATE1 THEN PERFORM EXTRACT1;
TYPE ON AIRLINE1
" "
COMPUTE TEMPDATE = TEMPDATE + 1;
GOTO EXTRACT2
ENDCASE
DATA
END
-RUN
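The “record per day” logic of EXTRACT1 and EXTRACT2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a translation of the MODIFY procedure: the function names `expand_route` and `expand_all` are hypothetical, dates are assumed to be already parsed into `datetime.date` values, and the end date is treated as inclusive (a record is written until the working date exceeds it).

```python
from datetime import date, timedelta


def expand_route(start_code, end_code, start_city, end_city, start, end):
    """Yield one (day, start_code, end_code, start_city, end_city) tuple per day."""
    current = start                    # plays the role of TEMPDATE
    while current <= end:              # stop once TEMPDATE GT ENDDATE1
        yield (current, start_code, end_code, start_city, end_city)
        current += timedelta(days=1)   # TEMPDATE = TEMPDATE + 1


def expand_all(routes):
    """Expand every route record in turn, as EXTRACT1 does with NEXT."""
    rows = []
    for route in routes:
        rows.extend(expand_route(*route))
    return rows
```

A route whose start date is later than its end date simply produces no rows, which mirrors the immediate exit from EXTRACT2.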

The output

Where do we go from here? The available seating table resides in a SQL Server database

Load this data into our SQL Server data repository

Create INSERT statements

FILEDEF ROUTECOUNT DISK C:/ibi/apps/steve/AIRLINE.OUTTT
-RUN
APP HOLD steve
TABLE FILE ROUTECOUNT
PRINT *
ON TABLE HOLD AS RECCOUNT
END
-SET &LLINES = &LINES;
-START111
-SET &FILENUM = 1;
-SET &CURRENTCTR =0;
-SET &FIRSTLINE = 'INSERT INTO DailyFlights(Date,Start,Destination,';
-SET &FIRSTLINE1 = 'StartCity,DestinationCity)';
-SET &SECONDLINE =;
-SET &THIRDLINE = ;
-SET &APOST = HEXBYT(39,'A1');
-SET &DATEE=;
-SET &STARTC=;
-SET &DESTC=;
-SET &SCDEST=;
-SET &ECDEST=;

Write the SQL “USE” statements:
FILEDEF ROUTECOUNT1 DISK C:/ibi/apps/steve/AIRLINE.OUTTT
FILEDEF SCHEDULE DISK C:/ibi/apps/steve/AIRLINE.SQL1
-RUN
-WRITE SCHEDULE USE FUSE2007
-WRITE SCHEDULE GO
-WRITE SCHEDULE BEGIN TRANSACTION

Read all records and write to file:
-REPEAT LOOPER FOR &I FROM 1 TO &LLINES STEP 1
-READ ROUTECOUNT1 &A.2 &DATEE.10 &C.1 &STARTC.3 &A.1 &DESTC.3 &B.1 &SCDEST.50,
- &CA.1 &ECDEST.50
-SET &SECONDLINE = ' VALUES (' || &APOST || &DATEE || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &STARTC || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &DESTC || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &SCDEST || &APOST;
-SET &SECONDLINE = &SECONDLINE || ',' || &APOST || &ECDEST || &APOST;
-SET &SECONDLINE = &SECONDLINE || ');';
-WRITE SCHEDULE &FIRSTLINE
-WRITE SCHEDULE &FIRSTLINE1
-WRITE SCHEDULE &SECONDLINE
-LOOPER
-WRITE SCHEDULE COMMIT TRANSACTION
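The same script-generation step can be sketched in Python. The database name, table name and column list are taken from the slides. The helper names `sql_literal` and `build_script` are hypothetical, and the doubling of embedded apostrophes is an addition of this sketch that the Dialogue Manager code does not perform.

```python
def sql_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded apostrophes."""
    return "'" + value.replace("'", "''") + "'"


def build_script(rows, database="FUSE2007"):
    """Build the USE / BEGIN TRANSACTION / INSERT ... / COMMIT script as text."""
    lines = [f"USE {database}", "GO", "BEGIN TRANSACTION"]
    for row in rows:
        values = ",".join(sql_literal(field) for field in row)
        lines.append("INSERT INTO DailyFlights(Date,Start,Destination,")
        lines.append("StartCity,DestinationCity)")
        lines.append(f" VALUES ({values});")
    lines.append("COMMIT TRANSACTION")
    return "\n".join(lines)
```

Each row is expected to hold five strings in the column order shown: date, start code, destination code, start city and destination city. Without the apostrophe handling, a city name containing a quote would break the generated statement.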

The Insert Statements:
USE FUSE2007
GO
BEGIN TRANSACTION
INSERT INTO DailyFlights(Date,Start,Destination,
StartCity,DestinationCity)
 VALUES ('2006/11/30','ABE','MHT','ALLENTOWN, PA','MANCHESTER, NH');
INSERT INTO DailyFlights(Date,Start,Destination,
StartCity,DestinationCity)
 VALUES ('2006/12/01','ABE','MHT','ALLENTOWN, PA','MANCHESTER, NH');
…..
COMMIT TRANSACTION

The 50 million foot view

Raw Data -> Sequential File -> SQL Statements -> File System Watcher & SSIS Load Package

cd C:\Program Files\Microsoft SQL Server\90\DTS\Binn
DTExec /f "C:\AirlineScheduleLoad\AirlineSchedule\AirlineSchedule\bin\LoadSchedule.dtsx"
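The slides do not show how the file system watcher is wired to the load package, so the following Python sketch is only one possible arrangement. It swaps in simple directory polling for an event-based watcher and assumes the DTExec executable sits in the Binn directory shown above. The function names, the `.SQL1` suffix filter and the polling interval are all hypothetical choices.

```python
import os
import subprocess
import time

# Assumed location of the executable, based on the directory in the command above.
DTEXEC = r"C:\Program Files\Microsoft SQL Server\90\DTS\Binn\DTExec.exe"


def build_command(package_path, dtexec=DTEXEC):
    """Return the argument list for running one SSIS package from a file."""
    return [dtexec, "/f", package_path]


def new_files(directory, seen, suffix=".SQL1"):
    """Return files with the given suffix that have not been seen before."""
    found = sorted(
        name for name in os.listdir(directory)
        if name.upper().endswith(suffix.upper()) and name not in seen
    )
    seen.update(found)
    return found


def watch(directory, package_path, poll_seconds=5.0, run=subprocess.run):
    """Poll a directory and run the package whenever a new file shows up."""
    seen = set()
    while True:
        if new_files(directory, seen):
            run(build_command(package_path), check=True)
        time.sleep(poll_seconds)
```

Passing the command as a list avoids quoting problems with the space in "Program Files", and the `run` parameter lets the launcher be replaced with a stub when trying the loop out.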

Query created by join

JOIN DATEM AND START AND DESTINATION IN DAILYFLIGHTS TO DATEE AND START AND DESTINATION IN BOOKINGS AS J1
-RUN
DEFINE FILE DAILYFLIGHTS
CITYPAIR/A10 = START || '-'|| DESTINATION;
END
TABLE FILE DAILYFLIGHTS
PRINT DATEM AS 'Date'
CITYPAIR AS 'City Pair'
STARTCITY AS 'Origin'
DESTINATIONCITY AS 'Destination'
FCLASS AS 'F'
YCLASS AS 'Y'
MCLASS AS 'M'
NCLASS AS 'N'
QCLASS AS 'Q'
SCLASS AS 'S'
BY START NOPRINT
BY DESTINATION NOPRINT
BY DATEM NOPRINT
ON TABLE SUBHEAD
"Orange Free State Airlines"
"Flight Schedule"
…..
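The effect of the three-field join can be illustrated with a small Python sketch. It assumes both tables are available as lists of dictionaries, that unmatched daily flights are dropped (inner-join behaviour), and that each date/start/destination key appears at most once in the bookings. The dictionary keys and the function name are hypothetical stand-ins for the FOCUS field names.

```python
def join_flights_to_bookings(daily_flights, bookings):
    """Inner-join two lists of dicts on (date, start, destination)."""
    index = {}
    for booking in bookings:
        key = (booking["date"], booking["start"], booking["destination"])
        index[key] = booking

    joined = []
    for flight in daily_flights:
        key = (flight["date"], flight["start"], flight["destination"])
        booking = index.get(key)
        if booking is not None:
            row = dict(flight)
            # Same idea as the CITYPAIR define: start code, hyphen, destination code.
            row["city_pair"] = flight["start"] + "-" + flight["destination"]
            for seat_class in "FYMNQS":
                row[seat_class] = booking[seat_class]
            joined.append(row)
    return joined
```

Building a dictionary over the bookings first keeps the lookup for each daily flight constant-time, which matters once every route has been expanded to one row per day.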

During this hour we: examined data files of different formats; examined ways and means of massaging the different formats into one ‘usable’ format; examined ways of “manufacturing” records to facilitate generating end-user reports; and verified that the data was correct.

During this hour we also saw that there are many “different” ways to modify anomalous data into the format of your choice, to produce the reports that you require. Which really goes to show that you can...

MODIFY your way of thinking when it comes to anomalous data formats. Steve Simon, State Street Corporation.
PowerPoint presentation & code samples may be found at: 4c765fc825912e4d.skydrive.live.com/browse.aspx/Public or by