Problem Statement: Assuming the users of the Auto Sales data
warehouse is the Sales departments.
The Executives might want to do the analysis as follows:
1.
How many Sedan class cars were sold in the
Northern Zone for the year 2006?
2.
Quarter wise Analysis of the above query.
3.
Quarter and State (in the Zone) wise analysis of
the above query.
4.
Quantity of cars by make within the Sedan Class
(e.g.: Swift, Maruti Dx etc.) by State (in the Zone) and by Quarter for the
year 2006.
We see that in every query, the level of granularity
increases.
Characteristics of Dimensional tables
1.
Primary Key of the table uniquely identifies
each row.
2.
Large number of attributes.
3.
Attributes are generally textual in nature.
4.
Attributes in the dimensional table are not
directly related to other attributes of the table.
5.
Dimensional table is not Normalized or Low level
of normalization (for efficient query performance, query should pick the
attribute and directly go to fact table)
6.
Compared to the fact table, it has fewer
records.
7.
Facilitates drilling down and rolling up of data
since the data is hierarchal. E.g.: Year, Month, Day etc.
Characteristics of a Fact Table
1.
Concatenated Key: A row in the fact table relates to a
combination of rows from all dimensional tables.
2.
Data Grain: Level of detail for the measurement
or metrics.
3.
Fully Additive Measures: The attributes, if can
be summed up by simple addition are known as fully additive. When we run a
query to aggregate the measures in a fact table, the output will be correct
only, if the measures are fully additive in nature.
4.
Semi additive Measures: These are attributes like percentage etc.
which have been derived from the additive measures.
5.
Fewer attributes but more rows compared to
dimensional tables.
6.
Sparse Data: for a particular combination of
dimensional table, it might be the case that the fact table doesn’t have any
data. E.g. : Form the month of February 2006, there might be no sale of Maruti
Swift in Rajasthan, hence no entry would be present in the fact table This is
an example of sparse data.
7.
Degenerate Dimensions: When we are selecting
facts and dimensions from the operational systems, there may be some
attributes, which are neither measures (facts) nor belong to dimension tables. E.g.:
invoice number, Order number. They are useful for analysis and kept in the fact
table.
The Fact less fact table
Assuming that a fact table is made for recording the
attendance of the students. Dimensions of this model would be student, course
and time. If the student is present then
it will be recorded as ‘1’ in the fact table.
The presence of any entry in the fact table would represent that that
the student is present and the fact hence would not contain any specific
measure i.e. it is fact less. Such types of situations arise when the fact is
measuring /recording events.
Conclusion
The entity relationship model is not suitable for Decision
support System. Dimensional modeling is apt for designing Decision support
systems as they facilitate more analysis. It not only is easy for users to
understand but also optimizes navigation and is most suitable for query
processing. Hierarchies within different dimensions facilitate Drill down and
roll up analysis of data, providing more flexibility for analysis.
Your blog is really useful for me, and I gathered some information from this blog. I did SAP Course in Chennai, at FITA Academy. It's really useful for me to make a bright career in IT industry.
ReplyDeleteThanks rebeka
ReplyDelete