Tuesday, February 5, 2013

Understanding the types of tables in Oracle Business Intelligence Data Warehouse

The Oracle Business Analytics Warehouse (OBAW) is based on the dimensional modeling technique where fact tables are joined to multiple related dimension tables to form a "star schema".   While at the most basic level the star schema can be represented by dimensions and facts, the process of building those tables in the Oracle Business Analytics Warehouse requires several more table types.   

The standard OBAW ETL (Extract, Transform, Load) process includes the following tables:

Dimension Staging  (_DS) contains data loaded from a source application database.

Dimension Target   (_D)  contains the transformed dimension data for the final star schema model.

Aggregate tables (_A) sontains pre-summarized data at a higher grain than the base fact table.  Aggregate tables are used to improve query performance on front end tools.

Fact Staging tables (_FS) contains staged metric data prior to ETL transformation.   Foreign key values for the dimensions have not yet been translated to WID (warehouse ID) key values.

Fact tables (_F) contains the metrics to be analyzed along with foreign key WID (warehouse id) values pointing to the associated dimensions.

Delete tables (_DEL)  contains the primary keys of rows that were physically deleted from the source application.  These rows are either flagged or deleted from the data warehouse during the ETL process.

Dimension Hierarchy tables (_DH) contains dimension hierarchical structure information.  These are flattened hierarchies with multiple columns representing each level of the hierarchy, typically a code and name pair at each level and allow for rolling up data at various summary group levels.

Staging tables for Dimension (_DS) contains dimension hierarchy information that has not been processed by final ETL transformations.

Internal tables (_G, _GS, _S) Internal tables are referenced by ETL mappings for data transformation purposes as wells as controlling ETL execution.

Mini dimension tables (_MD) Include combinations of the most queried attributes of
their parent dimensions. These smaller "combo" tables are then joined to the fact tables in order to improve query performance.

Persisted staging tables (_PS) contains a static copy of source table data with additional calculated columns in order to improve ETL processing.    These tables are useful when there isn't a clear incremental extract date available to handle changes that affect multiple records.  These tables are never truncated except during a full load.

Pre-staging temporary table (_TMP) Source-specific tables used as part of the ETL processes to make the source data match the standard staging table structure. These tables contain intermediate results that are created as part of the conforming process.

No comments:

Post a Comment