Fundamental Concepts
Gather Business Requirements and Data Realities
samples in the book
Chapter 1 DW/BI and Dimensional Modeling Primer , p 5
Chapter 3 Retail Sales , p 70
Chapter 11 Telecommunications , p 297
Chapter 17 Lifecycle Overview , p 412
Chapter 18 Dimensional Modeling Process and Tasks , p 431
Chapter 19 ETL Subsystems and Techniques ,p 444
Collaborative Dimensional Modeling Workshops
Dimension models should be designed by folks who fully understand the business and their needs.
Four-Step Dimensional Design process
- Select the business Process
- Declare the Grain
- Identify the Dimensions
- Identify the facts
Business Processes
Operational Activities
Grain
The grain establishes exactly what a single fact table row represents.
Dimensions for Descriptive Context
Facts for Measurements
Star Schemas and OLAP Cubes
Graceful Extensions to Dimensional Models
- Add column to Fact table to describe FACT
- Add column to Fact table to contain foreign key to new dimension table
- Add column to Dimension table to add Attributes
Basic Fact Table Techniques
Fact Table Structure
A fact table contains the numeric measure produced by an operational measurement event in the real world.
Additive, Semi-Additive, Non-Additive Facts
Balance amounts are common semi-additive facts because they are additive across all dimensions except time.
Some measures are completely non-additive, such as ratios.
Nulls in Fact Tables
nulls must be avoided in the fact table’s foreign keys
Conformed facts
Same fact across different table, must use same name
Transaction Fact Tables
Periodic Snapshot Fact Tables
Factless Fact Tables
Samples : table containing students attend school or not.
Aggregate Fact Tables
For accelerate the query performance.
Consolidated Fact Table
Sales actual and sales forcast being saved into same table, this design will make it easy to analysis but hard to ETL.
Basic Dimension Table Techniques
Dimension Surrogate Key
- Structure: wide, flat, denormalized tables with many low-cardinality text Attributes.
- Single primary keys
- Can’t use operational system’s natural key
- Recommend to use anonymous integer primary key; Date dimension is exempt from this rule.
Natural, Durable and Supernatural key
-
Natural key is generated from business System
-
Durable / Supernatural key is generated by DW to indicate although Natural Key changed but it’s the same object. (for example an employ rejoined.)
-
Drilling down: fundamental data analysis method
-
Degenerate Dimensions
example : an invoice with multiple items. Items fact table has all the dimensions as foreign key. Then invoice number become a dimension for item fact table ; but the invoice number dimension do not has any attribute with it. So the invoice number dimension table became a Degenerate Dimension. And this kind of dimension would be helpful with transaction and accumulating snapshot fact tables.
- Use text words in dimension attribute instead of crypic abbreviations , flags etc
- Why to use Date Dimension instead of using SQL compute: because Date Dimension has more attributes like: week number, holiday , fiscal period etc.
- DateTime dimension table also need default row as normal dimension table
- Role playing dimension: means dimension being defined once but being referenced mulitple times in one fact table and each time has different meaning. For example , Time dimension.
- Junk Dimension: when transaction has loads of dimension that don’t have a lot of value, we can combine some of them as one dimension.
- Snowflaked Dimensions: when you normalized all the dimension table.
- Outtrigger Dimensions: when dimension reference another dimension.
- for example a dimension refer to date dimension.
- The baseline is dimensions are all supporting fact table. There shouldn’t be a case that fact table need one dimension to get the key of another dimension.