Dimension and Fact Tables

Data Warehousing : Dimension and Fact Tables

data_warehousing_:_dimension_and_fact_tables

Dimension table

Dimension Table is a key component for Start Schema.
A dimension table contains the attributes which represent dimensions, along which the measurement is taken in fact table.

Attributes and Keys

Every Dimension table must have a primary key that uniquely identifies each record of the table. It is commonly observed that the dimension table contain many attributes. Hence, it appears to be wide i.e. when you create a dimension table you will find it spreading horizontally.

Attribute values

The values of the attributes in dimension table are rarely numeric, most of the times you will find the values in attributes are in textual format. For example product name, brand, category, sub-category, etc.

Relation among Attributes

Frequently you can observe, the attributes you come across in a dimension table are not directly related. Like, Product_brand has to do nothing with the package_date but still both could be the attributes of Product dimension table.

Normalization

The dimension table is not supposed to be Normalized. This is because normalizing a table would create many intermediate tables. When a query picks up an attribute from dimension table and recovers measurements along that for the fact table, the query has to go through those intermediate tables which become inefficient. Hence, dimension tables are not Normalized.

See Database Normalization

For example, the time dimension would contain every hour, day, week, month, quarter and year that has occurred since you started your business operations. Product dimension could contain a name and description of products you sell, their unit price, color, weight and other attributes as applicable. Attributes would be a customer?s first and last name, age, gender etc.

Types of Dimensions:

Conformed Dimensions

In data warehousing, a conformed dimension is a dimension that has the same meaning to every fact with which it relates. The date dimension is an example of a shared dimension.

Outrigger Dimensions

An outrigger is a dimension table or entity that is joined to other dimension tables in a star schema. Outriggers are used when a dimension table is snowflaked. Outriggers are tables or entities that are shared by more than one dimension.

Shrunken Rollup Dimensions

A shrunken dimension is a subset of a dimension's attributes that apply to a higher level of summary.
For example, a Month dimension would be a shrunken dimension of the Date dimension.

Dimension-to-Dimension Table Joins

Dimensions may have references to other dimensions. However, these relationships can be modeled with outrigger dimensions.

Role-Playing Dimensions

A table with multiple valid relationships between itself and another table is known as a role-playing dimension. This is most commonly seen in dimensions such as Time and Customer.
For example, the Sales fact has multiple relationships to the Time query subject on the keys Order Day, Ship Day, and Close Day.

Junk Dimensions

A Junk Dimension is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. The nature of these attributes is usually text or various flags, e.g. non-generic comments or just simple yes/no or true/false indicators.

Degenerate Dimensions

Sometimes you may come across some dimensions in fact table, which are not additive at all. For example order_number, customer_id, you can not add these types of dimensions. However, in case, you need to find order made by a particular customer in this month; then you will need the customer_id to relate your search. These types if attributes or dimensions of fact table are called Degenerated Dimension.

This kind of dimension does not have its dimension as it is derived from the fact table.

Swappable Dimensions

They are used when the same fact table is paired with different versions of the same dimension.

Step Dimensions

Sequential processes, like web page events, mostly have a separate row in a fact table for every step in a process. It tells where the specific step should be used in the overall session.

Fact Table

A Fact table is a table that contains measurements along the attributes of dimension tables. It can contain the information at lowest possible level. Some fact table just contains summary data, called as Aggregated Fact Table. The fact table almost contains the date stamped data.

Fact table contains Concatenated key which is the concatenation of primary keys of all the dimension tables. The concatenated key of fact table must uniquely identify the row in a fact table.

Type of facts

Additive

Fully additive measures are those that can be easily summed up for all dimensions in fact table.
For example quantity_ordered, is an attribute that can be summed up for all dimensions. Like, we can take out total quantity_order, for a particular customer, region, date, brand, etc.

Semi-Additive

Semi-additive measures are those which can be summed along some dimensions of fact table but not all dimensions. Like, balance amount can not be summed up over time dimension as it changes over the time.

You can aggregate department headcounts to give an organization total, but you cannot aggregate them over time, so the Sales department headcount for March 31 may be 20 employees, and for April 30 the headcount may be 23, but that does not mean that the total headcount at the end of April is 43.

Non-Additive

Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
Profit margins are non-additive. If a department has two employees, and one employee has sold an item with a 55% profit margin and the other has sold an item with a 45% profit margin, the profit margin for the department is not 100%.

Key Differences Between Fact Table and Dimension Table

  • Fact table contains measurement along the dimension/attributes of a dimension table.
  • Fact table contains more records and less attribute as compared to dimension table whereas, dimension table contain more attributes and fewer records.
  • The table size of fact table grows vertically whereas, table size of dimension table grows horizontally.
  • Each dimension table contains a primary key to identify each record in the table whereas, fact table contains concatenated key which is a combination of all primary keys of all dimension table.
  • Dimension table has to be recorded before the creation of fact table.
  • A Schema contains fewer fact tables but more dimension tables.
  • Attributes in fact table are numeric as well as textual, but attributes of dimension table have textual attributes only.

See also ETL Process


If you like dEexams.com and would like to contribute, you can also write your article here or mail your article to admin@deexams.com . See your article appearing on the dEexams.com main page and help others to learn.


Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.


Python if , elif and else

Python Conditions and If statements

  • 0
Python for beginners

Learning Python Part 1

  • 3
Struct Alignment and Padding

Struct Alignment and Padding in C++ And C

  • 0
Friend function

Friend function C++

  • 0
Pointers

C++ Pointers

  • 0
Structures

C++ Structures

  • 0
Types of Inheritance in C++

Inheritance and access specifiers C++

  • 0
Java date pattern

Java Date Pattern Syntax

  • 0
Java Date and Calendar

Java Date formats

  • 0
JAVA Data Type

Data types in Java

  • 0
Java unreachable code

Unreachable Code Error in Java

  • 0

Post Comment

Comments(0)

WEB TECHNOLOGY

Articles

×

Forgot Password

Please enter your email address below and we will send you information to change your password.