Slide 1: The File Is Too Large
Cesare Petrizio, June 2008
Copyright 2007, Information Builders.
Slide 2: Agenda
- What Is Too Large?
- To Partition or Not to Partition
- Horizontal Partitioning
- Vertical Partitioning
- Access File
- Intelligent Partitions for Reporting
- JOINs and External Indexes
- Partitions and MODIFY
- XFOCUS Alternative
Slide 3: Human Resources Database
Lots of Employees
Slide 4: Employee File

         EMPINFO
 01      S1
**************
*EMP_ID      **I
*LAST_NAME   **
*FIRST_NAME  **
*HIRE_DATE   **
*            **
***************
 **************
       I
       +-----------------+-----------------+
       I                 I                 I
       I FUNDTRAN        I PAYINFO         I SALINFO
 02    I U         03    I SH1       08    I SH1
**************    **************    **************
*BANK_NAME   *    *DAT_INC     **   *PAY_DATE    **
*BANK_CODE   *    *PCT_INC     **   *GROSS       **
*BANK_ACCT   *    *SALARY      **   *NET         **
*EFFECT_DATE *    *JOBCODE     **   *CHECK_NO    **I
*            *    *            **   *            **
**************    ***************   ***************
                   **************    **************
Slide 5: What Is Too Large? FOCUS Database
Limits for FOCUS/FUSION files:
- Number of pages: FOCUS (512K 4K pages), 2 GB
- Number of segments: 64
- Number of indexes + text fields + segments: 189
- Number of fields: 3072
- Segment size (data + pointers): 3968 bytes
Maximum efficiency: all instances of a single chain fit on a page.
Slide 6: Estimating File Size
Each instance is composed of data and pointers.

Data Type          Storage
An                 n bytes
AnV                n bytes + 2 bytes for length
In                 4 bytes
Dn.m               8 bytes
Fn.m               4 bytes
Pn.m (n <= 15)     8 bytes
Pn.m (n > 15)      16 bytes
Smart dates        4 bytes
TXn                8-byte pointer to separate pages with text data

Plus filler to pad the segment to a full word (up to 4 bytes).
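The storage table above can be turned into a small lookup for quick estimates. The following is a minimal Python sketch, not part of the original deck: the function names `focus_storage_bytes` and `pad_to_word` are hypothetical, the format parsing covers only the formats listed on the slide (plus trailing alphabetic edit options), and the 4-byte word size used for the filler is an assumption.

```python
import re

WORD_BYTES = 4  # assumption: a "full word" is 4 bytes


def focus_storage_bytes(fmt: str) -> int:
    """Storage bytes for one field, following the table on the slide."""
    f = fmt.strip().upper()
    if re.fullmatch(r"TX\d+", f):
        return 8  # pointer to separate pages holding the text
    m = re.fullmatch(r"A(\d+)V", f)
    if m:
        return int(m.group(1)) + 2  # data plus 2-byte length
    m = re.fullmatch(r"A(\d+)", f)
    if m:
        return int(m.group(1))
    if re.fullmatch(r"I\d+[A-Z]*", f):
        return 4
    if re.fullmatch(r"D\d+(\.\d+)?[A-Z]*", f):
        return 8
    if re.fullmatch(r"F\d+(\.\d+)?[A-Z]*", f):
        return 4
    m = re.fullmatch(r"P(\d+)(\.\d+)?[A-Z]*", f)
    if m:
        return 8 if int(m.group(1)) <= 15 else 16
    if re.fullmatch(r"[YMDQ]+", f):
        return 4  # smart date such as YYMD or MDYY
    raise ValueError(f"format not covered by this sketch: {fmt}")


def pad_to_word(n_bytes: int) -> int:
    """Round a byte count up to the next multiple of WORD_BYTES."""
    return -(-n_bytes // WORD_BYTES) * WORD_BYTES
```

Summing `focus_storage_bytes` over a segment's fields and passing the total to `pad_to_word` gives the data portion of one instance; pointers are counted separately.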
Slide 7: Estimating File Size: Pointers
Each pointer is 4 bytes and consists of: type, page number, word offset.
Pointer types:
- Parent to real child
- Parent to KU child
- Parent to KM child
- Child to parent
- Forward chain
- Deleted, free, end-of-chain
Slide 8: Estimating File Size: Indexes
INDEX=I is an internal index, updated as the file is updated.
Per entry:
- Value
- Address of data instance
Additional pages are used for access speed.
Pages may be ½ full.
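A rough lower bound on index size follows from the per-entry layout above. The sketch below is illustrative only and rests on assumptions that the slide does not state: that the address is one 4-byte pointer, that an index page holds 3968 usable bytes like a data page, and that pages are half full. The function name `index_leaf_pages` is hypothetical, and the additional pages kept for access speed are not counted.

```python
import math


def index_leaf_pages(entries: int, key_bytes: int,
                     pointer_bytes: int = 4,   # assumption: one 4-byte address
                     page_bytes: int = 3968,   # assumption: same as a data page
                     fill: float = 0.5) -> int:
    """Estimate the pages needed to hold the index entries themselves."""
    entry_bytes = key_bytes + pointer_bytes
    # Entries that fit on a page at the assumed fill factor (at least one).
    per_page = max(1, int((page_bytes // entry_bytes) * fill))
    return math.ceil(entries / per_page)
```

For example, `index_leaf_pages(1_000_000, 9)` estimates the entry pages for one million A9 keys such as EMP_ID under these assumptions.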
Slide 9: Estimating File Size: Example

SEGNAME=EMPINFO, SEGTYPE=S1                                     Bytes
 FIELDNAME=EMP_ID,       ALIAS=EID,  FORMAT=A9, INDEX=I, $         9
 FIELDNAME=LAST_NAME,    ALIAS=LN,   FORMAT=A15,         $        15
 FIELDNAME=FIRST_NAME,   ALIAS=FN,   FORMAT=A10,         $        10
 FIELDNAME=HIRE_DATE,    ALIAS=HDT,  FORMAT=YYMD,        $         4
 FIELDNAME=DEPARTMENT,   ALIAS=DPT,  FORMAT=A10,         $        10
 FIELDNAME=CURR_SAL,     ALIAS=CSAL, FORMAT=D12.2M,      $         8
 FIELDNAME=CURR_JOBCODE, ALIAS=CJC,  FORMAT=A3,          $         3
 FIELDNAME=ED_HRS,       ALIAS=OJT,  FORMAT=F6.2,        $         4
                                                    filler        +1

Pointers: 4 (3 parent-to-child, 1 chain) x 4 bytes = 16 bytes
Data: 63 bytes + 1 filler = 64 bytes
Total: 80 bytes
Instances per page: 3968 / 80 = 49.6, so 49

EX CALCFILE filename

(Segment structure as in the Employee File: EMPINFO (01, S1) is the parent of FUNDTRAN (02, U), PAYINFO (03, SH1), and SALINFO (08, SH1).)
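The arithmetic on this slide can be checked with a few lines of Python. This is a sketch using the byte counts given for EMPINFO; the variable and function names are hypothetical, and the 4-byte word used for the filler is an assumption consistent with the 63 + 1 figure.

```python
PAGE_DATA_BYTES = 3968   # usable bytes per page, from the slide
POINTER_BYTES = 4        # each pointer is 4 bytes

# Storage bytes per EMPINFO field, as annotated on the slide.
empinfo_fields = {
    "EMP_ID": 9, "LAST_NAME": 15, "FIRST_NAME": 10, "HIRE_DATE": 4,
    "DEPARTMENT": 10, "CURR_SAL": 8, "CURR_JOBCODE": 3, "ED_HRS": 4,
}


def pad_to_word(n_bytes: int, word: int = 4) -> int:
    """Round up to a multiple of the word size (assumed to be 4 bytes)."""
    return -(-n_bytes // word) * word


def pages_needed(instances: int, per_page: int) -> int:
    """Pages required if every page is filled with per_page instances."""
    return -(-instances // per_page)


data_bytes = pad_to_word(sum(empinfo_fields.values()))   # 63 plus filler
pointer_bytes = (3 + 1) * POINTER_BYTES                  # 3 children + 1 chain
instance_bytes = data_bytes + pointer_bytes
instances_per_page = PAGE_DATA_BYTES // instance_bytes

print(data_bytes, pointer_bytes, instance_bytes, instances_per_page)
```

`pages_needed(n, instances_per_page)` then gives a best-case page count for n root instances; real files need more pages because child segments and indexes share the same page limit.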
Slide 10: What to Do?
Change to an XFOCUS database:
- 16K page size
- 1024 pages per filename
- Internal, external, or MDI indexes
Partition:
- Horizontal partitioning
  - Partition by segment
  - MODIFY / TABLE: no changes
- Vertical partitioning
  - Partition by value
  - "Intelligent Partitioning"
  - Only the first partition is used in MODIFY
  - May need to use COMBINE for MODIFY
  - External index needed for JOIN
Slide 11: Horizontal Partitioning
Slide 12: HEmploye File
(Same segment structure as the Employee File: EMPINFO (01, S1) is the parent of FUNDTRAN (02, U), PAYINFO (03, SH1), and SALINFO (08, SH1).)
Slide 13: ACCESS File Points to the FOCUS Files
MASTER File:

FILENAME=HEMPLOYE, SUFFIX=FOC, ACCESS=HEMPLOYE,$
SEGNAME=EMPINFO, SEGTYPE=S1
 FIELDNAME=EMP_ID, ALIAS=EID, FORMAT=A9, INDEX=I,$
 ...
SEGNAME=FUNDTRAN, SEGTYPE=U, PARENT=EMPINFO, LOCATION=FUNDS
 FIELDNAME=BANK_NAME, ALIAS=BN, FORMAT=A20, $
 ...
SEGNAME=PAYINFO, SEGTYPE=SH1, PARENT=EMPINFO
 FIELDNAME=DAT_INC, ALIAS=DI, FORMAT=MDYY, $
 ...
SEGNAME=SALINFO, SEGTYPE=SH1, PARENT=EMPINFO, LOCATION=SALES
 FIELDNAME=PAY_DATE, ALIAS=PD, FORMAT=MDYY, $
 ...
Slide 14: ACCESS File Points to the FOCUS Files
ACCESS File:

MASTER = HEMPLOYE
 DATA = IBIBJS.HEMPLOYE.FOCUS,$
 LOCATION = FUNDS
  LOCATIONDATA = IBIBJS.FUNDS.DATA,$
 LOCATION = SALES
  LOCATIONDATA = IBIBJS.SALES.DATA,$
Slide 15: Vertical Partitioning
Slide 16: USEmploye File
Physical files: USEMP FOCUS, USSALS FOCUS, USFUND FOCUS
(Same segment structure as the Employee File: EMPINFO (01, S1) is the parent of FUNDTRAN (02, U), PAYINFO (03, SH1), and SALINFO (08, SH1).)
Slide 17: CAEmploye File
Physical files: CAEMP FOCUS, CASALS FOCUS, CAFUND FOCUS
(Same segment structure as the Employee File: EMPINFO (01, S1) is the parent of FUNDTRAN (02, U), PAYINFO (03, SH1), and SALINFO (08, SH1).)
Slide 18: EUEmploye File
Physical files: EUEMP FOCUS, EUSALS FOCUS, EUFUND FOCUS
(Same segment structure as the Employee File: EMPINFO (01, S1) is the parent of FUNDTRAN (02, U), PAYINFO (03, SH1), and SALINFO (08, SH1).)
Slide 19: ACCESS File Points to the FOCUS Files
ACCESS File, one DATA entry per partition:

MASTER = HEMPLOYE,$
 DATA = c:\ibi\apps\hr\hemploye.foc,$
  LOCATION = FUNDS
   LOCATIONDATA = c:\ibi\apps\hr\funds.foc,$
  LOCATION = SALES
   LOCATIONDATA = c:\ibi\apps\hr\sales.foc,$
 DATA = c:\ibi\apps\hrca\hemploye.foc,$
  LOCATION = FUNDS
   LOCATIONDATA = c:\ibi\apps\hrca\funds.foc,$
  LOCATION = SALES
   LOCATIONDATA = c:\ibi\apps\hrca\sales.foc,$
 DATA = c:\ibi\apps\hreu\hemploye.foc,$
  LOCATION = FUNDS
   LOCATIONDATA = c:\ibi\apps\hreu\funds.foc,$
  LOCATION = SALES
   LOCATIONDATA = c:\ibi\apps\hreu\sales.foc,$
Slide 20: ACCESS File Points to the FOCUS Files
ACCESS File, with a WHERE test on each partition:

MASTER = HEMPLOYE,$
 DATA = c:\ibi\apps\hr\hemploye.foc,$
  WHERE=DEPARTMENT EQ 'MIS' OR 'PRODUCTION',$
  LOCATION = FUNDS
   LOCATIONDATA = c:\ibi\apps\hr\funds.foc,$
  LOCATION = SALES
   LOCATIONDATA = c:\ibi\apps\hr\sales.foc,$
 DATA = c:\ibi\apps\hrca\hemploye.foc,$
  WHERE=DEPARTMENT EQ 'CANADA',$
  LOCATION = FUNDS
   LOCATIONDATA = c:\ibi\apps\hrca\funds.foc,$
  LOCATION = SALES
   LOCATIONDATA = c:\ibi\apps\hrca\sales.foc,$
 DATA = c:\ibi\apps\hreu\hemploye.foc,$
  WHERE=DEPARTMENT EQ 'EUROPE',$
  LOCATION = FUNDS
   LOCATIONDATA = c:\ibi\apps\hreu\funds.foc,$
  LOCATION = SALES
   LOCATIONDATA = c:\ibi\apps\hreu\sales.foc,$
Slide 21: Intelligent USE (no ACCESS File)

-IF &DEPARTMENT NE 'ALL' GOTO USESOME;
-USEALL
USE
USEMP AS HEMPLOYE
USFUND AS FUNDS
USSALS AS SALES
CAEMP AS HEMPLOYE
CAFUND AS FUNDS
CASALS AS SALES
EUEMP AS HEMPLOYE
EUFUND AS FUNDS
EUSALS AS SALES
END
-GOTO SKIPIT
-USESOME
USE
-SET &PREF = IF &DEPARTMENT EQ 'CANADA' THEN 'CA' ELSE
- IF &DEPARTMENT EQ 'EUROPE' THEN 'EU' ELSE 'US';
&PREF|EMP AS HEMPLOYE
&PREF|FUND AS FUNDS
&PREF|SALS AS SALES
END
-SKIPIT
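The branching in this procedure amounts to a lookup from department to file prefix. The Python sketch below restates that logic for illustration only: it is not FOCUS code, the function name `use_lines` is hypothetical, and it simply generates the text of the USE block that the Dialogue Manager procedure would issue.

```python
# Prefix for each department value; anything else falls back to "US",
# mirroring the -SET &PREF logic on the slide.
PARTITION_PREFIXES = {"CANADA": "CA", "EUROPE": "EU"}

# (physical file suffix, logical name) pairs from the USE block.
LOGICAL_FILES = [("EMP", "HEMPLOYE"), ("FUND", "FUNDS"), ("SALS", "SALES")]


def use_lines(department: str) -> list[str]:
    """Build the USE ... END lines for one department, or for 'ALL'."""
    if department == "ALL":
        prefixes = ["US", "CA", "EU"]
    else:
        prefixes = [PARTITION_PREFIXES.get(department, "US")]
    lines = ["USE"]
    for prefix in prefixes:
        for suffix, logical in LOGICAL_FILES:
            lines.append(f"{prefix}{suffix} AS {logical}")
    lines.append("END")
    return lines


print("\n".join(use_lines("CANADA")))
```

Requesting a single department opens only that partition's three files, which is the point of the technique: a report for one department never touches the other partitions.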
Slide 22: Vertical Partition Concerns
A vertically partitioned database cannot be updated as a single file with MODIFY. Only one partition (the first in the ACCESS file) is updateable.
Solution: have separate Masters for MODIFY purposes.

COMBINE USEMP AND CAEMP AND EUEMP AS COMBO
MODIFY FILE COMBO
FIXFORM DEPARTMENT/A10 …
IF DEPARTMENT EQ 'CANADA' GOTO CANADACASE;
IF DEPARTMENT EQ 'EUROPE' GOTO EUROPECASE;
MATCH EMP_ID
…
CASE CANADACASE
COMPUTE CADEPARTMENT = DEPARTMENT;
        CAEMP = EMP_ID;
…
MATCH CAEMP
…
ENDCASE
CASE EUROPECASE
COMPUTE EUDEPARTMENT = DEPARTMENT;
        EUEMP = EMP_ID
Slide 23: Vertical Partition Concerns
- Internal indexes are separate for each vertical partition.
- You cannot JOIN to a concatenated file.
Solution:
- Keep archived data in several partitions, with an external index (or MDI).
- Keep active data in a separate, modifiable database.
- Periodically create a new archive partition and add it to the external index.
- Use MORE to concatenate active data with archived data for reporting.
Slide 24: Vertical Partition Concerns
One customer file. Orders for the current year are "active". Goal: find all orders, current and prior, for a given customer.
Two Masters, Same Dataset

CUSTOMER.MAS:
FILE=CUSTOMER, SUFFIX=FOC, DATA=c:\ibi\apps\customer.foc,$
SEGNAME=CSEG, SEGTYPE=S1
 FIELD=CUSTOMER_ID,,A9,$
 FIELD=CUSTOMER_NAME,,A20,INDEX=I,$
 …

ACUSTMER.MAS:
FILE=ACUSTMER, SUFFIX=FOC, DATA=c:\ibi\apps\customer.foc,$
SEGNAME=CSEG, SEGTYPE=S1
 FIELD=CUSTOMER_ID,,A9,$
 FIELD=CUSTOMER_NAME,,A20,INDEX=I,$
 …
Slide 25: Vertical Partition Concerns
Two Masters, Multiple Datasets

AORDERS.MAS:
FILE=AORDERS, SUFFIX=FOC, DATA=c:\ibi\apps\aorders.foc,$
SEGNAME=OSEG, SEGTYPE=SH1
 FIELD=ORDER_NO,,A9,$
 FIELD=CUSTOMER,,A9,INDEX=I,$
 FIELD=YEAR, ORDER_YEAR,,YY,$
 …

ORDERS.MAS:
FILE=ORDERS, SUFFIX=FOC,$
SEGNAME=OSEG, SEGTYPE=SH1
 FIELD=ORDER_NO,,A9,$
 FIELD=CUSTOMER,,A9,INDEX=I,$
 FIELD=YEAR, ORDER_YEAR,,YY,$
 …
Slide 26: Vertical Partition Concerns

-* External Indexes USE -- MDI specified in ACX
USE
c:\apps\ord2007 AS ORDERS
c:\apps\ord2006 AS ORDERS
c:\apps\ord2005 AS ORDERS
c:\apps\ordhist AS ORDERS
C:\apps\ordidx WITH ORDERS
END
JOIN CUSTOMER_NUMBER IN CUSTOMER TO ALL CUSTOMER_NUMBER IN ORDERS AS AJ
JOIN CUSTOMER_NUMBER IN ACUSTMER TO ALL CUSTOMER_NUMBER IN AORDERS AS BJ
TABLE FILE CUSTOMER
PRINT …
BY CUSTOMER_NUMBER
BY HIGHEST ORDER_DATE
MORE
FILE ACUSTMER
END
…
Slide 27: The File Is Too Large