Data Warehousing models

May 27, 2020

Data Warehousing models - Star schema

Multidimensional Schema is especially designed to model data warehouse systems. The schemas are designed to address the unique needs of very large databases designed for the analytical purpose (OLAP).

Star schema is mature modeling approach widely adopted by relational data warehouses. It requires modelers to classify their model table as either dimension or fact. Dimension tables describe the business entities(the things you model). Entities can include products,people, places and concepts including time itself. The most consistent table you’ll find in a star schema is a date dimension table. A dimension table contains a key column(or columns) that act as a unique identifier and descriptive columns.

Fact tables store observations or evets and can be sales orders, stock balances, exchange rates, temperature, etc. A fact table contains dimension key columns that relate to dimension tables and numeric measure columns. The dimension key columns determine the dimensionality of a fact table while the dimension key values determine the granularity of a fact table. Let’s consider a fact table designed to store sale targets that has two dimension key columns Date and ProductKey. It’s easy to understand that the table has two dimensions. The granularity can’t be determined without considering the dimension key values. As in the example, consider that the values stored in the Date column are the first day of each month. In this case, the granularity is at month-product level. Generally dimension table contain a relatively small number of rows. Fact tables can contain a very large number of rows and continue to grow over time.

There are several characteristics of star schema:

· Every dimension in a star schema is represented with the only one-dimension table.

· The dimension table should contain the set of attributes.

· The dimension table is joined to the fact table using a foreign key

· The dimension table are not joined to each other.

· Fact table would contain key and measure

· The star schema is easy to understand and provides optimal disk usage.

· The dimension tables are not normalized.

· The schema is widely supported by BI tools.

There are several advantages and disadvantages in this star schema modeling.

Advantages	Disadvanatages
A simplified schema means that we won’t have to write confusingly long queries every time we want some information from our database.	Denormalizing our data means that data anomalies could arise from one-off inserts or updates. In practice, star schemas are implemented via “trickle feeds” or batch processing to compensate for this issue.
We optimized for reads. Now that we can write fewer `JOINS`, our results will be returned more quickly.	We have limited analytical flexibility. A star schema is generally designed for a particular purpose. Since we have fewer features in the star schema than in the full database, we are restricted to just what this star schema contains.
Also, it will business logic for reporting. We won’t have to explain to stakeholders all the crazy joins that went into making the schema, just maybe .

References

“Star and SnowFlake Schema in Data Warehouse,” Guru99. [Online]. Available: https://www.guru99.com/star-snowflake-data-warehousing.html. [Accessed: 27-May-2020].

Peter-Myers, “Understand star schema and the importance for Power BI - Power BI,” Power BI | Microsoft Docs. [Online]. Available: https://docs.microsoft.com/en-us/power-bi/guidance/star-schema. [Accessed: 27-May-2020].

Search This Blog

Geeky Blinders

Data Warehousing models - Star schema

Comments

Post a Comment

Popular Posts