Design and ETL 2017. 6
Loading a Star: Load the dimension tables first, then the fact table data. If needed, dimension and fact tables can also be loaded at the same time, provided referential integrity is satisfied. There is almost no dependency between dimension tables.
Load a Dimension Table: Incremental process. It is necessary to inspect the data sources of dimension tables for new and changed information on a regular basis. What a dimension load must achieve
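The regular inspection for new and changed information can be sketched as a comparison of each source row against the dimension, keyed by natural key. This is only a minimal illustration under stated assumptions: the function name `classify_source_rows`, the column names (`sku`, `brand_name`, `product_key`) and the sample values are all hypothetical, and the dimension is assumed to hold one row per natural key.

```python
def classify_source_rows(source_rows, dimension_rows, natural_key):
    """Split source rows into new rows and changed rows relative to the dimension."""
    # Index the dimension by natural key (assumes one row per natural key).
    existing = {row[natural_key]: row for row in dimension_rows}
    new_rows, changed_rows = [], []
    for src in source_rows:
        current = existing.get(src[natural_key])
        if current is None:
            # Natural key not seen before: a new dimension record.
            new_rows.append(src)
        elif any(current.get(col) != val for col, val in src.items()):
            # Natural key known, but at least one attribute differs.
            changed_rows.append(src)
    return new_rows, changed_rows


# Hypothetical sample data.
source = [
    {"sku": "A-1", "brand_name": "Acme"},
    {"sku": "B-2", "brand_name": "Bolt Deluxe"},
    {"sku": "C-3", "brand_name": "Crest"},
]
dimension = [
    {"product_key": 1, "sku": "A-1", "brand_name": "Acme"},
    {"product_key": 2, "sku": "B-2", "brand_name": "Bolt"},
]
new_rows, changed_rows = classify_source_rows(source, dimension, "sku")
```

Changed rows would then be routed to Type 1 or Type 2 handling depending on which attribute differs.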
Load a Dimension Table: Preparing records for processing. Row-wise processing. Source data pivoted, transposed.
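As one possible illustration of pivoting during record preparation, the sketch below assumes the source delivers one row per attribute as (natural key, attribute, value) and turns those rows into one record per natural key. The function name `pivot_attribute_rows`, the `sku` column, and the sample values are assumptions made for this example.

```python
from collections import defaultdict


def pivot_attribute_rows(rows, natural_key="sku"):
    """Turn (natural_key, attribute, value) rows into one record per natural key."""
    records = defaultdict(dict)
    for key, attribute, value in rows:
        records[key][attribute] = value
    # Emit one flat record per natural key, ready for row-wise processing.
    return [{natural_key: key, **attrs} for key, attrs in records.items()]


# Hypothetical sample data: one source row per attribute.
source_rows = [
    ("A-1", "brand_name", "Acme"),
    ("A-1", "color", "red"),
    ("B-2", "brand_name", "Bolt"),
]
pivoted = pivot_attribute_rows(source_rows)
```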
Load a Dimension Table (Type 1 change): brand_name (a Type 1 attribute) is not fully dependent on the natural key: sku -> brand_code (Type 2) -> (weakly determines) brand_name (Type 1).
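A Type 1 change overwrites the attribute in place and keeps no history. Because brand_name is determined by brand_code rather than directly by sku, one reading of this slide is that the overwrite should reach every row sharing that brand_code, not just one sku. The sketch below follows that reading; the function name `apply_type1_change`, the `product_key` column, and all sample values are hypothetical.

```python
def apply_type1_change(dimension_rows, match_column, match_value, attribute, new_value):
    """Overwrite a Type 1 attribute on every matching row; return the number updated."""
    updated = 0
    for row in dimension_rows:
        if row[match_column] == match_value and row[attribute] != new_value:
            row[attribute] = new_value  # overwrite in place: no history is kept
            updated += 1
    return updated


# Hypothetical sample data: two SKUs share brand_code B10.
product_dim = [
    {"product_key": 1, "sku": "A-1", "brand_code": "B10", "brand_name": "Acme"},
    {"product_key": 2, "sku": "A-2", "brand_code": "B10", "brand_name": "Acme"},
    {"product_key": 3, "sku": "C-3", "brand_code": "B20", "brand_name": "Crest"},
]
updated = apply_type1_change(product_dim, "brand_code", "B10", "brand_name", "Acme Foods")
```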
Load a Dimension Table (Type 2 change): Cook, Dan insertion
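A Type 2 change is handled by inserting a new row for the same natural key under a new surrogate key, leaving the earlier row untouched. The sketch below is a minimal illustration of that insertion. Apart from the name "Cook, Dan", everything here is an assumption: the function name `apply_type2_change`, the columns `customer_key`, `customer_id` and `state`, the sample values, and the rule that surrogate keys are assigned in increasing order.

```python
def apply_type2_change(dimension_rows, natural_key, key_value, changes, surrogate_key):
    """Insert a new row version for a Type 2 change; existing rows are not modified."""
    versions = [r for r in dimension_rows if r[natural_key] == key_value]
    if not versions:
        raise KeyError(f"no dimension row for {natural_key}={key_value!r}")
    # Use the version with the highest surrogate key as the template
    # (assumes surrogate keys are assigned in increasing order).
    latest = max(versions, key=lambda r: r[surrogate_key])
    new_row = {**latest, **changes}
    new_row[surrogate_key] = max(r[surrogate_key] for r in dimension_rows) + 1
    dimension_rows.append(new_row)
    return new_row


# Hypothetical sample data.
customer_dim = [
    {"customer_key": 101, "customer_id": "C-7", "customer_name": "Cook, Dan", "state": "NY"},
]
new_row = apply_type2_change(customer_dim, "customer_id", "C-7", {"state": "CA"}, "customer_key")
```

Both versions now share the natural key `C-7`, so existing facts keep pointing at the old row while new facts can reference the new one.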
Loading the Fact Table: What a fact table load must achieve
Loading the Fact Table: Restructuring. A single record
Loading the Fact Table: Aggregation. If the source data is given at a finer grain than the fact table requires, it must be aggregated: records that share the same natural keys (NK) are merged and summarized.
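The aggregation step can be sketched as grouping source rows by the natural keys that define the fact table's grain and summing the facts. This is a minimal sketch that assumes every fact is additive; the function name `aggregate_to_grain`, the column names, and the sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_to_grain(source_rows, grain_columns, fact_columns):
    """Merge rows that share the same natural keys and sum their additive facts."""
    totals = defaultdict(lambda: dict.fromkeys(fact_columns, 0))
    for row in source_rows:
        key = tuple(row[col] for col in grain_columns)
        for fact in fact_columns:
            totals[key][fact] += row[fact]
    # Rebuild one output record per distinct combination of natural keys.
    return [dict(zip(grain_columns, key), **facts) for key, facts in totals.items()]


# Hypothetical order lines, finer-grained than a daily-by-product fact table.
order_lines = [
    {"order_date": "2017-06-01", "sku": "A-1", "quantity": 2, "sales_amount": 20},
    {"order_date": "2017-06-01", "sku": "A-1", "quantity": 1, "sales_amount": 10},
    {"order_date": "2017-06-01", "sku": "B-2", "quantity": 5, "sales_amount": 75},
]
daily_facts = aggregate_to_grain(
    order_lines, ["order_date", "sku"], ["quantity", "sales_amount"]
)
```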
Loading the Fact Table: Identification of surrogate keys. Source records are extracted with their natural keys; when an extracted record goes into the fact table, it needs surrogate keys.
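Swapping natural keys for surrogate keys can be sketched as a lookup per dimension. The sketch below assumes each lookup is a plain natural-key-to-surrogate-key mapping (for a Type 2 dimension it would need to point at the appropriate row version) and sets aside any row whose natural key is not found. The function name `assign_surrogate_keys`, the column names, and the sample values are hypothetical.

```python
def assign_surrogate_keys(fact_rows, lookups):
    """Replace natural keys with surrogate keys.

    lookups maps natural_key_column -> (surrogate_key_column, {natural: surrogate}).
    Returns (loadable_rows, rejected_rows).
    """
    loadable, rejected = [], []
    for row in fact_rows:
        out = dict(row)
        try:
            for nk_col, (sk_col, mapping) in lookups.items():
                # Drop the natural key column and add the surrogate key column.
                out[sk_col] = mapping[out.pop(nk_col)]
        except KeyError:
            # No matching dimension row: keep the original row for follow-up.
            rejected.append(row)
            continue
        loadable.append(out)
    return loadable, rejected


# Hypothetical lookups and source rows.
lookups = {
    "sku": ("product_key", {"A-1": 1, "B-2": 2}),
    "order_date": ("day_key", {"2017-06-01": 20170601}),
}
source_facts = [
    {"order_date": "2017-06-01", "sku": "A-1", "quantity": 3},
    {"order_date": "2017-06-01", "sku": "Z-9", "quantity": 1},
]
loadable, rejected = assign_surrogate_keys(source_facts, lookups)
```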
Optimizing the Load: Eliminating lookups and caching lookups. Eliminating lookup: the Type 1 change check against the dimension table is simply not performed. Caching lookup: the entire dimension table, or only the needed columns, is loaded into memory.
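A cached lookup can be sketched by reading only the needed columns of a dimension into a dictionary once, then resolving keys from memory instead of querying per fact row. The example uses Python's standard `sqlite3` module with an in-memory database purely for illustration; the table `product`, its columns, the sample rows, and the function name `build_lookup_cache` are assumptions, and the dimension is assumed to hold one row per natural key.

```python
import sqlite3


def build_lookup_cache(conn, table, natural_key, surrogate_key):
    """Read only the two key columns of a dimension into an in-memory dict."""
    # Identifiers are interpolated directly, so they must come from trusted
    # ETL configuration, never from user input.
    query = f"SELECT {natural_key}, {surrogate_key} FROM {table}"
    return dict(conn.execute(query).fetchall())


# Hypothetical dimension table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_key INTEGER PRIMARY KEY, sku TEXT, brand_name TEXT)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "A-1", "Acme"), (2, "B-2", "Bolt")],
)

# One query up front; every later lookup is a dictionary access.
product_cache = build_lookup_cache(conn, "product", "sku", "product_key")
product_key = product_cache["B-2"]
```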
Cleansing the Data: Cleansing the dimensional data. Standard codes
Cleansing the Data: Facts with invalid details
Housekeeping Columns: Housekeeping columns for the ETL process
How to Design and Document a Dimensional Model
Dimensional Modeling: Kimball’s guideline. Each star corresponds to a discrete process.
Dimensional Modeling may be grouped into a fact table !
Dimensional Modeling (detailed)