The Data Large Number of Workbooks Each Workbook has multiple worksheets Transaction worksheets have large (LARGE) number of lines (millions of records.

Slides:



Advertisements
Similar presentations
Data Mining and Governmental Accounting Applications
Advertisements

Quality and Outcomes Framework Assessor Training Skills in note-making, summarising and report writing Module S5.
Skills in note-making, summarising and report writing Quality and Outcomes Framework Assessor Training.
Build a database I: Design tables for a new Access database
Chapter 6 Computer Assisted Audit Tools and Techniques
Test process essentials Riitta Viitamäki,
Rfc4474bis-01 IETF 89 (London) STIR WG Jon & Cullen.
General Ledger and Journals. Financial Services - GL and Journals presentation What are journals? A journal [document] is used to record accounting.
© 2005 myCompany General Ledger Instructor Name. 2 This course covers General Ledger Accounting in SAP, including:  General ledger account master records.
The Accounting Cycle As we have learned, accounting is a systematic process of recording the financial transactions of a business and arranging the subsequent.
1 CA202 Spreadsheet Application Combining Data from Multiple Sources Lecture # 6.
Master Data for SCM (1) Master Data for Demand Planning & Fulfillment Processes EGN 5623 Enterprise Systems Optimization (Professional MSEM) Fall, 2012.
Copyright © 2003 Americas’ SAP Users’ Group Custom Archiving 101 Session Code 108 Karin Tillotson Sr. Basis Administrator Tuesday, May 20 th, 2003.
University of Southern California Enterprise Wide Information Systems Financial Modules Instructor: Richard W. Vawter.
IMS1805 Systems Analysis Topic 4: How do you do it? Guidelines for doing analysis.
Integration of Applications MIS3502: Application Integration and Evaluation Paul Weinberg Adapted from material by Arnold Kurtz, David.
Database Design Concepts INFO1408 Term 2 week 1 Data validation and Referential integrity.
Workflow & Event Derivation Workshop
Introduction to SAP R/3.
SAP R/3 Materials Management Module
Troy Eversen | 19 May 2015 Data Integrity Workshop.
Statewide Financial System Program 1 AR 215 Creating and Maintaining Receivables AR 215 Creating and Maintaining Receivables Welcome.
Database Design IST 7-10 Presented by Miss Egan and Miss Richards.
Designing Starbuilder Custom Financials Brian Bixby Bixby Consulting Services, Inc.
Module 3: Table Selection
Worksheet for a Service Business
TESTING.
What is Sure BDCs? BDC stands for Batch Data Communication and is also known as Batch Input. It is a technique for mass input of data into SAP by simulating.
McGraw-Hill/Irwin Copyright © 2005 by The McGraw-Hill Companies, Inc. All rights reserved. ENTERPRISE INFORMATION SYSTEMS A PATTERN BASED APPROACH Chapter.
Master Data for SCM (1) Master Data in Demand Planning & Fulfillment Processes EGN 5346 Logistics Engineering (MSEM, Professional) Fall, 2013.
Automatic Rendering Tool and Digital Audit Process Innovation Nov 2005 YongMoon Lee KICPA & AICPA.
Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.
Clients (and the interface level) Application Server (and the application level) Database Server (and the Database level)
Data and its manifestations. Storage and Retrieval techniques.
Management Information Systems MS Access MS Access is an application software that facilitates us to create Database Management Systems (DBMS)
INTERLEGES AGM KIEV THE “ESSENTIALS” OF LAW FIRM WEBSITES.
(Business) Process Centric Exchanges
Slideshow 2 Setting Up Bank Services, Tax Services and Schedule Codes.
Colleague, Excel & Word Best of Friends Presented by: Joan Kaun & Yvonne Nelson College of the Rockies.
MS Access 2007 Management Information Systems 1. Overview 2  What is MS Access?  Access Terminology  Access Window  Database Window  Create New Database.
It’s month end, now comes the reconciliation crunch Richard Byrom Oracle Applications Consultant Appsworld January 2003.
Handling Many to Many Relationships. 2 Handling Many:Many Relationships Aims: To explain why M:M relationships cannot be implemented in relational database.
Word problems DON’T PANIC! Some students believe they can’t do word problem solving in math. Don’t panic. It helps to read the question more than once.
Master Data for SCM (1) Master Data in Demand Planning & Fulfillment Processes EGN 5623 Enterprise Systems Optimization (Professional MSEM) Fall, 2011.
Key Applications Module Lesson 17 — Organizing Worksheets Computer Literacy BASICS.
1 IETF 88 (Vancouver) November 6, 2013 Cullen Jennings V3.
Verification & Validation. Batch processing In a batch processing system, documents such as sales orders are collected into batches of typically 50 documents.
We provide web based benchmarking, process diagnostics and operational performance measurement solutions to help public and private sector organisations.
Software Engineering 2004 Jyrki Nummenmaa 1 BACKGROUND There is no way to generally test programs exhaustively (that is, going through all execution.
Writing. Academic Writing Allow about 20 minutes In TASK 1 candidates are presented with a graph, table,chart or diagram and are asked to describe, summarise.
Files Tutor: You will need ….
13- 1 Chapter 13.  Overview of Sequential File Processing  Sequential File Updating - Creating a New Master File  Validity Checking in Update Procedures.
Programming Configuration Files in LabVIEW David Thomson Systems Integrator Original Code Consulting National Instruments Alliance Member Research Scientist.
Controlling (CO) SAP University Alliances Version 1.0
PestPac Software. Overview: Effective with the release of Version 1.38, all payment entry will be done through the QuickPayment Entry screen. The design.
Add and Edit Patients. When you see a Red circle or a next button, like this, that means we want you to click on something. These are the same steps you’ll.
Info Read SEGY Wavelet estimation New Project Correlate near offset far offset Display Well Tie Elog Strata Geoview Hampson-Russell References Create New.
INF3110 Group 2 EXAM 2013 SOLUTIONS AND HINTS. But first, an example of compile-time and run-time type checking Imagine we have the following code. What.
Implementing Multicurrency in an Existing Dynamics GP Environment Rod O’Connor 20-NOV-2014.
Microsoft Office Tips Pivot tables. Agenda Learn how to create and use PivotTables Q&A Excel 2010 is very similar to 2007, I have tried to demonstrate.
SAP R/3 User Administration1. 2 User administration in a productive environment is an ongoing process of creating, deleting, changing, and monitoring.
SAP MATERIALS MANAGEMENT(MM) TRAINING IN SOUTHAFRICA,AUSTRALIA
 Tata Consultancy Services 1 Financial Information.
 TATA CONSULTANCY SERVICES MM - INVOICE VERIFICATION.
View Integration and Implementation Compromises
Key Applications Module Lesson 17 — Organizing Worksheets
Presented by: Andy Vitullo Principle, Logan Consulting
Processing Integrity and Availability Controls
Fundamentals of Data Representation
Joins and other advanced Queries
Presentation transcript:

The Data Large Number of Workbooks Each Workbook has multiple worksheets Transaction worksheets have large (LARGE) number of lines (millions of records across the set ) Conclusion - Only IDEA is going to be any use

The Scope Source system is a multi-entity SAP ERP where the entities interact to form Supply Chains Each Workbook contains all FI postings in a period for an entity in separate Inputs and Outputs worksheets Aim is to combine all the entities and all periods and to profile Supply Chains

Supply Chains Image Here

Initial IDEA Challenges The Workbooks have passwords and imports failed The Workbooks have column totals in value fields at their foot which add non-standard records We decide to produce a cloned set of workbooks, check their totals but remove the total rows

First Steps 1 Controls !!! If we don’t know where we started we will never realise if we mess up the import and subsequent work Thankfully these are present as value column totals and check out Source References - if we don’t reference the source entity and period we can’t refer back to original data (eg date corruption on import) - so we added these

Next IDEA Challenges We go for broke - load everything into IDEA but.... Append ---The Workbooks are discovered to differ between entities - some have additional columns, some missing columns Import Error - Some Workbooks have dodgy date formats and bad data (vlookup failures) so corrupted data is imported

First Steps -1 Approach now was to take a step back and fix data problems (eg loss of vlookup references and dates ) in the workbooks first The “go for broke” exercise had also identified that in addition to additional/missing columns we had different field types and field lengths being imported in what appeared to be identical columns... Oh well....

Back to Square One Then we had a think We at least had discovered most if not all of the data problems In the absence of some automatic normalisation tools we were going to have to use cunning and guile to get where we need to be... To start !

Pilot the Solution Since aggregating all the different worksheet formats was too difficult but seemed (fingers crossed) mainly to be a problem between entity workbook sets, if we could identify the subset of records we needed on the basis of a pilot exercise on one entity, we could load the sets independently and create extract subsets towards a final aggregation

Welcome to SAP SAP is a highly integrated ERP with many sub-modules Once transactions in the sub-modules cause an accounting event, a posting ends up in the accounting module FI There are lots of postings and we had buckets of them labelled “inputs” and “outputs”

SAP Organisational Model Image here

What we knew 1 The entities for which we had FI postings were “company codes” which are legal entities with their own P&L and Balance Sheet for external accounting In SAP “company” is something different and corresponds to the optional grouping of company codes for consolidation reporting - think accounting groups

What we knew 2 This organisation also implemented the optional “Business Areas” which can be used across “Company Codes”, so a coding of business activities shared by multiple legal entities Thankfully many of the other optional SAP organisational types eg “Sales Areas” did not make an appearance (...so far...)

The Summarisation Trap Using summarisation on key fields to get a handle on the data is an exploratory data technique but with large numbers of diverse transaction types you need to have a model of the data to work against Sometimes you keep summarising in the hope that a pattern will emerge - unfortunately with big data sets there are lots of patterns.... But what do they mean ?

The Model Image Here

More is sometimes less !!

Finding Signposts Happily we were now in a position to ask some sensible questions about the codings used for key fields (SAP provides a default set but they are often extensible) Business Area Document Type Document Number (the first 2 digits are significant for which ledger is posted)

Directing Extraction Using the model and the codes we were able to identify the elements of the model in the pilot entity inputs and outputs We had initially tried the Vendor/Customer Fields but suspected they were only part of the story We needed to leave all internal postings behind but keep relevant intercompany ones

Not There Yet But.... So for the pilot entity we had developed a relationship between input and output sets following the model which can be reconciled to the original controls and through them to the analysis of P&L and Balance Sheet postings for periods So positive progress

Codes to use. All Records with Vendor/Customer Names (External & Internal) PLUS Business Area InwardOutward Documents TypeInwardOutward DOCFIRST 2InwardsOutwards No CINoYes16Yes 0No CQNoYes 21 Yes 10 Yes DPYesNo 22 Yes 11 Yes GAYes 31 Yes 14 Yes GDYesNo 32 Yes 2 GEYes 36 Yes 20No RRYesNo 37 Yes 3 RVNoYes 44 Yes 4 V3YesNo51Yes 5 No VLYesNo52Yes 6 VQYesNo54Yes 8 55Yes 9 59Yes

Further IDEA Challenges We now have to apply the extraction criteria to the other workbook sets individually (mostly avoiding the inconsistent fields) and then add these extracts together (overcoming the field type and length problems with append).... Maybe a job for a batch process ???