1
LOCO Extract – Transform - Load
Installation
2
Component Structure - Server
Extractor: from Source System
Transformation Engine: including Translation Engine and Transformers
Loader: to Target System
Response Translator: including stylesheet translation, from Target System to Source System
The Response Translator is only configured for Source Systems that require a response.
3
Pre - Installation Requirements
Loco ETL is a Java-based application which requires the following to be installed or available prior to installation:
Java 6 SDK or later
Apache Tomcat version 6 or later
A directory into which the ETL will be installed, referred to as the ETL_HOME directory
An options directory for the Derby database used by the application, referred to as the OPTIONS_DIRECTORY
During the installation, OpenJMS and ActiveMQ can be selectively installed if required; these will be installed into the OPTIONS_DIRECTORY along with Derby. They can also be downloaded freely from their respective sources via the links provided on the LOCO website. The download sites also provide links to their documentation.
4
Installation
Confirm that the ETL_HOME and OPTIONS_DIRECTORY directories exist and are writeable by the operator.
Download the zip or tar file, using the link provided by LOCO-ETL, into a temporary folder and unpack it.
Run the install script. It prompts the operator for the two required directories and then installs the application into them.
The Configuration Client is provided as a war file, so a few manual steps are needed:
  Copy the etl.war file from the webapp directory to the tomcat/webapps directory.
  If Tomcat is already running, it will explode the etl.war file into a webapps/etl directory. If Tomcat is not running, start it by running the tomcat/bin/startup script.
  Edit the webapps/etl/WEB-INF/classes/log4j.xml file to set the etlFileLogger appender to point to the required log directory.
  Edit the webapps/etl/WEB-INF/spring/service.xml file to set the baseDirectory property to ETL_HOME and the etlHome property to ETL_HOME/mapper.
  Restart Tomcat.
The Derby configuration database server can be run manually using the scripts provided in the ETL_HOME/mapper/bin directory, but it should really be configured by your system administrator to run automatically.
Before proceeding to the next step, ensure Derby is running.
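The first installation step (confirming that both directories exist and are writeable) can be checked before running the install script. The following is a minimal sketch only: the function name check_install_dirs is hypothetical and is not part of the LOCO package, and the temporary directories in the usage section are placeholders for the real ETL_HOME and OPTIONS_DIRECTORY.

```python
import os
import tempfile


def check_install_dirs(etl_home, options_directory):
    """Return a list of problems; an empty list means both directories look usable."""
    problems = []
    for label, path in (("ETL_HOME", etl_home), ("OPTIONS_DIRECTORY", options_directory)):
        if not os.path.isdir(path):
            problems.append(f"{label} does not exist or is not a directory: {path}")
        elif not os.access(path, os.W_OK):
            problems.append(f"{label} is not writeable by the current user: {path}")
    return problems


if __name__ == "__main__":
    # Placeholder locations for illustration only; substitute the real directories.
    with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as options:
        for problem in check_install_dirs(home, options):
            print(problem)
```

The check runs as the current user, so it should be executed by the same operator account that will run the install script.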
5
Configuration
Open a browser window and enter the URL; this will connect to the ETL Configuration Client.
Open the Tools/ETL Licence tab, enter the provided licence key into the New ETL Licence Key field and save.
Open the ETL Configuration/Environment tab, ensure the ETL Home field is correct and select Save; this will update the directory structure fields.
Before continuing with the configuration, the user should be familiar with the Component Structure outlined earlier.
6
Configuration - Overview
The ETL configuration is encapsulated in an object referred to as a Server, which includes the Extractor, the Transformation Engine (including Transformer objects), the Loader and the Response Translator.
The Server object is multi-threaded to achieve maximum performance, but take care when choosing the number of threads: with too many, the JVM can spend more time switching between threads than actually running them.
Multiple servers can be run simultaneously. For instance, one server can run as an on-line service handling financial trades while another operates as a batch process loading CSV files from a different source system.
If the source system requires a response from the target system, even if it is only pass or fail, a Response Translator should be configured. If no response is required, e.g. for a batch process, the Response Translator does not need to be configured.
7
Configuration – Templates and Maps
The first step is to define the data format provided by the source system and the format required by the target system. To do this, Templates must be created.
Where either data format is XML, the template can be created automatically using the Tools -> Template Import feature. To use this feature, example files must be placed in specific directories, i.e. ETL_HOME/template/source or ETL_HOME/template/target. Currently these are set up for financial trades sub-divided into asset classes, but this can easily be changed as required.
Where the source system is a database, the template relates to the columns returned by the SQL query that extracts the data. Refer to the guidelines for more detail, but essentially aliases should be used for the column names, as this allows the structure of the query to be altered without affecting the template.
Where the target is a database, the column names of the single target table should be entered into the template.
Where the source or target is a delimited file, the column position and description should be entered into the template. The description is for information only; the column position must start from 0.
Where the source or target is a Fixed Field Width file, there is a screen under the Tools -> Configuration Assistant tab to help with this.
Once the templates have been defined, the next step is to create a Map, which simply identifies the Source and Target templates and which fields map to each other.
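The relationship between delimited-file templates and a Map can be illustrated with a small sketch. This is not how the ETL stores or applies its configuration; the template contents, the field names and the apply_map helper below are all hypothetical, chosen only to show zero-based column positions and a field-to-field mapping.

```python
import csv
import io

# Hypothetical delimited-file templates: column position (starting at 0) -> description.
SOURCE_TEMPLATE = {0: "trade id", 1: "counterparty", 2: "notional"}
TARGET_TEMPLATE = {0: "counterparty", 1: "trade id", 2: "notional"}

# Hypothetical map: source column position -> target column position.
FIELD_MAP = {0: 1, 1: 0, 2: 2}


def apply_map(source_row, field_map, target_width):
    """Rearrange one source row into target column order."""
    target_row = [""] * target_width
    for source_pos, target_pos in field_map.items():
        target_row[target_pos] = source_row[source_pos]
    return target_row


def transform_csv(text):
    """Parse delimited text and map every row from source to target layout."""
    reader = csv.reader(io.StringIO(text))
    return [apply_map(row, FIELD_MAP, len(TARGET_TEMPLATE)) for row in reader]


if __name__ == "__main__":
    for row in transform_csv("T1,ACME,1000\nT2,GLOBEX,2500\n"):
        print(row)
```

The descriptions in the templates are never read by the mapping code, which mirrors the point above that they are for information only.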
8
Configuration – Server
The server object encapsulates the ETL process as outlined below, so this section brings it all together. The important thing to note is that the datatype is the key to how data is processed.
Basic
This section sets the basic environment the server runs in and is largely self-explanatory; however, the Error logging section needs some explanation. The error logging referred to here is in addition to the normal logging and is used to capture failed XML format messages, which can be modified and resubmitted via the Message mechanic available from the Tools menu.
Extractor
This section defines the type and connection properties of the Extractor object, i.e. the object that links the source system and the ETL process. Where appropriate, e.g. when the source is a database, the operator will be prompted to enter the details of the datatype, i.e.:
  The processing order, applicable when this server processes multiple datatypes from the same source
  The specific datatype
  The format of DATE data, in Java style; examples are given in the Guidelines tab
  The map to be used for this datatype
  The SQL query filename to be used to extract the data
9
Configuration - Server
Loader
This section defines the type and connection properties of the Loader object, i.e. the object that links the back office (target) system and the ETL process. Where the source system requires a response, the Response Translator object is also defined here. Where appropriate, the operator will be prompted to provide details of the datatype, i.e.:
  The specific datatype
  The format of DATE data, in Java style; examples are given in the Guidelines tab
  The name of the file to be generated
When a Response Translator is to be configured, note that its communication link type can be different from the link between the Loader object and the back office system.
Response Translator Stylesheet
A configured Response Translator may or may not require a stylesheet (Java XSLT) to be defined. The stylesheet is used to translate the response from the back office system format into the format required by the source system. Note that complex stylesheets can have a performance impact.
10
Configuration - Server
Datatype Specification
This defines how the datatype of a message is determined. During the configuration the operator will be presented with one of two sets of parameters, depending on the type of datasource. Between them, the two sets cover the following:
Source Datatype Field - The field in the Source system data that identifies the datatype of the data
Datatype - The datatype used throughout the ETL to identify the correct Map etc.
Use Value of Source Datatype Field - If the field exists, use its value to identify the datatype
Source Datatype Field has Specific Value - If the field exists, does it have a specific value
Source Datatype Field has Value Containing Specific Value - If the field exists, does it contain a specific value
Source Datatype Field does NOT Contain Specific Value - If the field exists, does it NOT contain a specific value
Source Datatype Field exists a specific number of times - Does the field exist a specific number of times
Source Base Element does NOT Exist - The source does not contain an element whose Base Name is specified
Source Datatype Field Specific Value or Count or Base Name - The specific value required, the number of times the field exists (e.g. 1), or the Base Name
Data Map - The map used to transform Source format data to Target format data
Internal Message ID Field - The field in the Mapped Message that contains the unique data identifier supplied by the Source system
Target Response Message ID Field - Where the Source requires a response, this identifies the field in the Recipient system response containing the unique data identifier
Target Response Datatype Field - Where the Source requires a response, this identifies the field in the Recipient system response containing the datatype
NOTE: the Base Name of an element is the last sub-element, i.e. given an element with a name of customer.address, the Base Name is address.
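The Base Name rule and the datatype tests listed above can be sketched in a few lines. This is an illustration of one possible reading of those options, not the ETL's own logic: the message representation (a list of name/value pairs), the rule names and the matches function are all hypothetical.

```python
def base_name(element_name):
    """Return the last sub-element of a dotted element name."""
    return element_name.rsplit(".", 1)[-1]


def field_values(message, field):
    """Collect the values of every (name, value) pair whose name equals `field`."""
    return [value for name, value in message if name == field]


def matches(message, rule, field, expected=None):
    """Evaluate one hypothetical datatype rule against a message."""
    values = field_values(message, field)
    if rule == "has_specific_value":
        return expected in values
    if rule == "contains_specific_value":
        return any(expected in value for value in values)
    if rule == "does_not_contain_specific_value":
        # Assumed reading: the field must exist and no value may contain `expected`.
        return bool(values) and all(expected not in value for value in values)
    if rule == "exists_n_times":
        return len(values) == int(expected)
    if rule == "base_element_does_not_exist":
        # Here `field` is treated as a Base Name rather than a full element name.
        return all(base_name(name) != field for name, _ in message)
    raise ValueError(f"unknown rule: {rule}")


if __name__ == "__main__":
    message = [("trade.type", "FX_SPOT"), ("trade.leg", "1"), ("trade.leg", "2")]
    print(base_name("customer.address"))
    print(matches(message, "exists_n_times", "trade.leg", "2"))
```

The count is passed as a string because the configuration describes a single field holding either a value, a count or a Base Name.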
11
Configuration - Server Transformers
The package ships with a selection of Transformers, which are used to massage the message as it proceeds through the ETL process. The available transformers can be accessed from the ETL Configuration -> Transformers tab. Each transformer has its own set of applicable parameters.
The transformer list includes two database lookups, i.e. static table data and SQL generated data. Both can be created using the Tools -> Configuration Tools -> Lookup screens.
When creating a server configuration, transformers can be allocated to a particular datatype, and the appropriate value for each parameter defined, from the ETL Configuration -> Server screen.
To add a new Transformer, the operator should create a new Java package containing the Transformer class and add the resulting jar file to the ETL Classpath. Source code for a transformer will be made available to be used as an example.
12
Configuration – Directory Structure
The LOCO – ETL tool uses several pre-defined directories, which are sub-directories of the ETL_HOME directory:
Under the Tools tab there is a facility to import Source and Target system XML format data templates. These templates should be placed in the ETL_HOME/template/source (or ETL_HOME/template/target) directory.
The Response Translator uses XSLT stylesheets. These should be placed in ETL_HOME/mapper/xsl.
Database source extractors use SQL queries. These can be developed anywhere but should then be placed in the ETL_HOME/mapper/data/requests directory.
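As a quick reference, the pre-defined locations above can be derived from ETL_HOME as follows. This is a sketch only: the etl_directories function and the dictionary keys are hypothetical names, and /opt/etl is a placeholder rather than a recommended install path.

```python
from pathlib import PurePosixPath


def etl_directories(etl_home):
    """Build the pre-defined sub-directory paths described above from ETL_HOME."""
    home = PurePosixPath(etl_home)
    return {
        "source_templates": home / "template" / "source",
        "target_templates": home / "template" / "target",
        "stylesheets": home / "mapper" / "xsl",
        "sql_requests": home / "mapper" / "data" / "requests",
    }


if __name__ == "__main__":
    # "/opt/etl" is a placeholder ETL_HOME used for illustration.
    for purpose, path in etl_directories("/opt/etl").items():
        print(purpose, path)
```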