tFileInputEBCDIC Bug Report & Design Recommendation


tFileInputEBCDIC Bug Report & Design Recommendation July 2009

Introduction Describe the bug that crashes a job when the tFileInputEBCDIC component reads a bad data record. Describe how the EBCDIC component is very inefficient as currently released.

Versions Using TALEND Open Studio Version: 3.1.3.r26090 Using Cobol2j

BUG - Setup Took a CHARTIS Cobol Copy Book. Translated it to an xc2j file using Cobol2j. Translated the xc2j file to a TALEND schema XML file using the xc2j2talend.xsl style sheet. Created a job using the tFileInputEBCDIC component for input.

BUG - Setup 2 Configured the tFileInputEBCDIC component. Point to a data file that is in the Copybook format. Point to the xc2j file. Add a schema, import the TALEND schema generated from the XSL style sheet of the xc2j file. Connect the tFileInputEBCDIC input to a tLogRow component.

BUG - Result Bad record crashes the Job.

BUG - Error dump from TALEND run console
Starting job OGISRealignmentTest_1 at 14:03 30/07/2009.
Jul 30, 2009 2:03:47 PM net.sf.cobol2j.RecordSet next
SEVERE: Cannot parse field: FLAT-RESERVE-TYPE-N. Data: ' ', Picture: 9(01), Type: 9, Size: 1
SEVERE: Total bytes processed before error: 102
Exception in component tFileInputEBCDIC_1
net.sf.cobol2j.RecordParseException: Couldn't parse record nr: 1.
    at net.sf.cobol2j.RecordSet.next(RecordSet.java:107)
    at talenddemosjava.ogisrealignmenttest_1_0_1.OGISRealignmentTest_1.tFileInputEBCDIC_1Process(OGISRealignmentTest_1.java:2564)
    at talenddemosjava.ogisrealignmenttest_1_0_1.OGISRealignmentTest_1.runJobInTOS(OGISRealignmentTest_1.java:3831)
    at talenddemosjava.ogisrealignmenttest_1_0_1.OGISRealignmentTest_1.main(OGISRealignmentTest_1.java:3747)
Caused by: net.sf.cobol2j.FieldParseException:
    at net.sf.cobol2j.RecordSet.readZoned(RecordSet.java:466)
    at net.sf.cobol2j.RecordSet.getFieldsValues(RecordSet.java:189)
    at net.sf.cobol2j.RecordSet.getFieldsValues(RecordSet.java:244)
    at net.sf.cobol2j.RecordSet.next(RecordSet.java:89)
    ... 3 more
Caused by: java.lang.NumberFormatException
    at java.math.BigDecimal.<init>(Unknown Source)
    at net.sf.cobol2j.RecordSet.readZoned(RecordSet.java:464)
    ... 7 more
Job OGISRealignmentTest_1 ended at 14:03 30/07/2009. [exit code=1]

BUG - Explanation java.math.BigDecimal throws a NumberFormatException. The exception is valid: the data in the field of the first record is garbage. The data record contains three bytes of binary zero [0x00, 0x00, 0x00], not three bytes of the EBCDIC character that represents zero [0xF0, 0xF0, 0xF0].
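The difference between binary zero bytes and EBCDIC digit bytes can be checked before a value ever reaches BigDecimal. The following is a minimal sketch, not the cobol2j implementation: the class and method names are hypothetical, and it handles only unsigned zoned digits (0xF0 to 0xF9), which is what a PIC 9(01) field holds. Signed fields, which carry a sign in the zone of the last byte, are not covered.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of cobol2j or TALEND.
final class ZonedDigitCheck {

    /** Returns true if the field is non-empty and every byte is an EBCDIC digit (0xF0..0xF9). */
    static boolean isUnsignedZoned(byte[] field) {
        for (byte b : field) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v < 0xF0 || v > 0xF9) {
                return false;
            }
        }
        return field.length > 0;
    }

    /** Decodes the digits, or returns null instead of throwing when the field is invalid. */
    static BigDecimal decodeOrNull(byte[] field) {
        if (!isUnsignedZoned(field)) {
            return null;
        }
        StringBuilder digits = new StringBuilder(field.length);
        for (byte b : field) {
            digits.append((char) ('0' + (b & 0x0F))); // low nibble is the digit value
        }
        return new BigDecimal(digits.toString());
    }

    public static void main(String[] args) {
        byte[] ebcdicZeros = {(byte) 0xF0, (byte) 0xF0, (byte) 0xF0};
        byte[] binaryZeros = {0x00, 0x00, 0x00};
        System.out.println(decodeOrNull(ebcdicZeros)); // a valid number
        System.out.println(decodeOrNull(binaryZeros)); // null: flagged, no exception
    }
}
```

Returning null here is only one possible signal; a reader component could equally route the raw bytes to a reject output.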

BUG - Why this is a BUG A bad record should not crash a job. One reason to use an ETL tool is to scrub bad records. Crashing on a bad record does not allow scrubbing, and it undermines the robustness of a tool that needs to be reliable in the face of bad data.
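One way to picture the expected behavior is a read loop that catches the parse failure per record and routes the record to a reject list while the job keeps going. This is a sketch under simplifying assumptions: each "record" is a single numeric field already in text form, and the class, method, and message format are hypothetical rather than anything TALEND or cobol2j provides.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a reject flow; not TALEND or cobol2j code.
final class RejectFlowSketch {

    /** Holds the records that parsed and descriptions of the ones that did not. */
    static final class Result {
        final List<BigDecimal> good = new ArrayList<>();
        final List<String> rejects = new ArrayList<>();
    }

    /** Parses every record; a bad record is recorded as a reject instead of ending the run. */
    static Result parseAll(List<String> rawRecords) {
        Result result = new Result();
        for (int i = 0; i < rawRecords.size(); i++) {
            try {
                result.good.add(new BigDecimal(rawRecords.get(i)));
            } catch (NumberFormatException e) {
                // Record numbers are 1-based, as in the error dump.
                result.rejects.add("record " + (i + 1) + ": unparseable field");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = parseAll(List.of("1", " ", "42"));
        System.out.println("good=" + r.good.size() + " rejects=" + r.rejects);
    }
}
```

With a reject list in hand, the job can log, repair, or discard the bad records downstream, which is the scrubbing step the crash currently prevents.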

Design – How it works The error log for the BUG highlights a few flaws in the existing design. The cobol2j package builds a translator that takes a buffer of bytes the length of the Cobol copybook record and tries to convert it to a set of Java objects. This particular bug crashes the job when one of these objects can’t be created.

Design – How it works Removing even one field from the TALEND schema XML file generated by the xc2j XSL translation causes a runtime error in TALEND that prevents the job from running.

Design – What it means The design generates a lot of excess Java objects. Just because data is in a record doesn’t mean you need it in the TALEND job. Creating objects has a performance impact because the Java VM performs internal housekeeping (new and gc) on every object. The design is also not graceful with bad data. It should allow the ETL job to do something graceful with bad records, not crash the job.

Design – Recommendation Fix the tFileInputEBCDIC component. Option 1: Rework the schema generator to include only the fields required for the job instead of every field. On a bad field exception, mark the field as an error in the stream of records (TALEND schema), but continue the job. Option 2: Switch to JRecord. JRecord does not suffer the same crash on bad data in its library. It uses copybooks directly without the intermediate XML format. It makes it easy to translate only the fields required for the job from each record. It has a Java API to access copybook metadata in a .jet template during the job generation phase in TALEND.
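Option 1 can be sketched as a decoder that is handed only the fields the job needs, addressed by byte offset and length, and that marks a bad field instead of throwing. Everything in this sketch is an assumption for illustration: the FieldSpec type, the field names and offsets, and the use of null as the error marker. It handles unsigned zoned digits only and is not how cobol2j, JRecord, or TALEND implement field access.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the names and layout are illustrative only.
final class SelectiveDecodeSketch {

    /** Hypothetical descriptor: field name plus byte offset and length within the record. */
    static final class FieldSpec {
        final String name;
        final int offset;
        final int length;

        FieldSpec(String name, int offset, int length) {
            this.name = name;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Decodes only the wanted fields; a bad field maps to null and decoding continues. */
    static Map<String, String> decode(byte[] record, List<FieldSpec> wanted) {
        Map<String, String> out = new LinkedHashMap<>();
        for (FieldSpec f : wanted) {
            boolean ok = f.offset >= 0 && f.length > 0 && f.offset + f.length <= record.length;
            StringBuilder digits = new StringBuilder();
            for (int i = 0; ok && i < f.length; i++) {
                int v = record[f.offset + i] & 0xFF;
                if (v >= 0xF0 && v <= 0xF9) {
                    digits.append((char) ('0' + (v & 0x0F))); // EBCDIC digit to ASCII digit
                } else {
                    ok = false; // mark the field as bad, do not abort the record
                }
            }
            out.put(f.name, ok ? digits.toString() : null);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] record = {(byte) 0xF1, (byte) 0xF2, 0x00, (byte) 0xF3};
        List<FieldSpec> wanted = List.of(new FieldSpec("FIELD-A", 0, 2), new FieldSpec("FIELD-B", 2, 1));
        System.out.println(decode(record, wanted)); // the last byte is never decoded
    }
}
```

Bytes outside the requested fields are never turned into objects, which addresses the excess-object concern, and a null value gives the job a place to branch on bad data.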