Tuesday, 13 August 2013

Principles of Dimensional Modeling

Overview

The requirements definition completely drives the data design for the data warehouse.

Data design consists of putting together the data structures. A group of data elements form a data structure. Logical data design includes determination of the various data elements that are needed and combination of the data elements into structures of data. Logical data design also includes establishing the relationships among the data structures.  An essential component of this document is the set of information package diagrams. Remember that these are information matrices showing the metrics, business dimensions, and the hierarchies within individual business dimensions. The information package diagrams form the basis for the logical data design for the data warehouse. The data design process results in a dimensional data model.


So far we have formed the fact table and the dimension tables. How should these tables be arranged in the dimensional model? What are the relationships and how should we mark the relationships in the model? The dimensional model should primarily facilitate queries and analyses. What would be the types of queries and analyses? These would be queries and analyses where the metrics inside the fact table are analyzed across one or more dimensions using the dimension table attributes. Before we decide how to arrange the fact and dimension tables in our dimensional model and mark the relationships, let us go over what the dimensional model needs to achieve and what its purposes are. Here are some of the criteria for combining the tables into a dimensional model.
1.       The model should provide the best data access.
2.       The whole model must be query-centric.
3.       It must be optimized for queries and analyses.
4.       The model must show that the dimension tables interact with the fact table.
5.       It should also be structured in such a way that every dimension can interact equally with the fact table.

The model should allow drilling down or rolling up along dimension hierarchies. With these requirements, we find that a dimensional model with the fact table in the middle and the dimension tables arranged around the fact table satisfies the conditions. In this arrangement, each of the dimension tables has a direct relationship with the fact table in the middle. This is necessary because every dimension table with its attributes must have an even chance of participating in a query to analyze the attributes in the fact table. Such an arrangement in the dimensional model looks like a star formation, with the fact table at the core of the star and the dimension tables along the spikes of the star. The dimensional model is therefore called a STAR schema.

FACT and DIMENSIONAL Tables in Start Schema representation




No comments:

Post a Comment