Overview
The requirements definition completely drives the data design for
the data warehouse.
Data design consists of putting together the data structures. A
group of data elements form a data structure. Logical data design includes
determination of the various data elements that are needed and combination of
the data elements into structures of data. Logical data design also includes
establishing the relationships among the data structures. An essential component of this document is the
set of information package diagrams. Remember that these are information
matrices showing the metrics, business dimensions, and the hierarchies within
individual business dimensions. The information package diagrams form the basis
for the logical data design for the data warehouse. The data design process
results in a dimensional data model.
So far we have formed the fact table and the dimension tables. How
should these tables be arranged in the dimensional model? What are the
relationships and how should we mark the relationships in the model? The
dimensional model should primarily facilitate queries and analyses. What would
be the types of queries and analyses? These would be queries and analyses where
the metrics inside the fact table are analyzed across one or more dimensions
using the dimension table attributes. Before we decide how to arrange the fact
and dimension tables in our dimensional model and mark the relationships, let
us go over what the dimensional model needs to achieve and what its purposes
are. Here are some of the criteria for combining the tables into a dimensional
model.
1. The model
should provide the best data access.
2. The whole
model must be query-centric.
3. It must be
optimized for queries and analyses.
4. The model
must show that the dimension tables interact with the fact table.
5. It should
also be structured in such a way that every dimension can interact equally with
the fact table.
The
model should allow drilling down or rolling up along dimension hierarchies.
With these requirements, we find that a dimensional model with the fact table
in the middle and the dimension tables arranged around the fact table satisfies
the conditions. In this arrangement, each of the dimension tables has a direct
relationship with the fact table in the middle. This is necessary because every
dimension table with its attributes must have an even chance of participating
in a query to analyze the attributes in the fact table. Such an arrangement in
the dimensional model looks like a star formation, with the fact table at the
core of the star and the dimension tables along the spikes of the star. The
dimensional model is therefore called a STAR schema.
FACT and
DIMENSIONAL Tables in Start Schema representation
No comments:
Post a Comment