In Power BI, there are dimensions of data that are not updated regularly. Instead, they are slowly and unpredictably updated over time. Those dimensions are known as slowly changing dimensions (SCD).
The SCDs basically belong to the concept of data warehousing. They are an important part of any data warehouse implementation. The concept of SCDs relies on moving a particular set of data from one state to another. Slowly changing dimensions depict how all the changes happening in the source system impact the data in the data warehouse. All these changes happening in the source system do not happen very frequently. These changes take place slowly. Accordingly, they were named “slowly changing dimensions.”
For many years, many different types of SCDs have been developed.
Six types of SCDs
1. Slowly Changing Dimension type Zero (SCD 0), The Fixed Method
Slowly Changing Dimension Type Zero is also known as Fixed Dimensions. With the type zero SCD, we tend to ignore all the changes happening in a dimension. Whenever there are certain changes in the source system, the landing dimensions in our data warehouse are not supposed to be changed. In other words, we tend to ignore the changes happening within the data source.
Suitability: SCD 0 is suitable for the attributes that will not be updated at all like joining date, purchase date, gender etc.
2. Slowly Changing Dimension type One (SCD 1), Canceling the Old Values with New ones
Slowly Changing Dimension Type One is also referred to as Overwriting. It basically includes substituting old dimensions with new ones. With a type 1 SCD, we tend to replace the old data with the new one.
Let’s understand this with an example. Assume that a customer has changed his residential address. In this case, the old address of the customer is not required. As a result, the company will replace the old residential address with the new one. This is one of the simplest yet most excellent examples of SCD type 1.
Suitability: SCD 1 is suitable for the attributes where historical value is not required to be kept. For example, a change in a contact number.
3. Slowly Changing Dimension type Two (SCD 2), The Row Versioning Method
Slowly Changing Dimension Type Two is often referred to as Row Versioning. With type two SCD, businesses are able to secure the old data and the current data as well. In other words, we tend to keep the history of all the data changes happening in the data warehouse.
As discussed in the above paragraph, in type 2 SCD, we have to maintain the history of data. So, we have to insert a new row of data into the data warehouse, whenever there is a transactional change in the system.
There may be one of the following transactional changes in the system:
- Insertion: Whenever a new row is inserted into a table, it is known as an Insertion.
- Updating: Whenever an existing row of data is updated with new data, it is known as updating.
- Deletion: Whenever a row of data is removed from a table, it is known as deletion.
Suitability: SCD 2 is the most popular one among all the types. It can easily handle the attributes changing their values multiple times. But yes, very rapid changes will not be scalable in this type.
4. Slowly Changing Dimension type Three (SCD 3), Adding A New Column to Depict the Previous Value
In simple words, Slowly Changing Dimensions Type three can be defined as adding a new column to show the previous value. With a type 3 SCD, we add a previous value column to the table. Additionally, in slowly changing dimension type 3, all the versions—the previous one and the current one—are maintained in a single row.
Suitability: SCD 3 is suitable for cases where you need to keep only the last version of historical data. It is not suitable for cases where an attribute value might change multiple times. This type is not considered scalable if you want to preserve history. For Example, female employees will change their names after their marriage. Here, SCD 3 can be used because these types of changes will not occur rapidly.
5. Slowly Changing Dimension type Four (SCD 4), Using Historical Table
The Slowly Changing Dimension Type four basically includes the usage of historical tables. With type 4 SCD, the dimension table contains all the latest values. On the other hand, its history is maintained in a separate table.
Yes, you guessed it right. SCD type 4 is similar to SCD type 2. But the only difference is that in SCD type 4, there are two separate files or tables that are maintained. Whenever there are the latest values added to the dimension, the old values automatically get stored in the historical tables.
Suitability: If the changes are rapid in nature then SCD 2 will not be scalable. In that case, SCD 4 will come into the picture. In this technique, a rapidly changing column or attribute is moved out of the dimension and moved to a new dimension table.
6. Slowly Changing Dimension type Six (SCD 6), The Combined Approach or The Hybrid SCD
The Slowly Changing Dimension Type Six is also known as the Combined Approach or the Hybrid SCD. Type 6 SCD is a combination of type 1 SCD, type 2 SCD, and type 3 SCD. You can also understand or remember it as 1+2+3=6.
SCD type 6 uses a combination of SCD types 1, 2, and 3 for tracking the changes happening in the dimensions. In some cases, multiple parts of a record’s dimension change gradually. In this case, implementing a single type, multiple times may lead to issues with rapid inflation of the table size. Hence, in situations like these, it is preferred to use a combination of different SCDs, that is, SCD type 6.
Suitability: SCD 6 is the most suggested type. Without this, complex queries have to be used. It offers a mixed approach.
Conclusion: Which is the best type to keep historical information?
Depending on the kind of change anyone is looking for, the SCD type is selected. The suitability of each type has been mentioned in the respective sessions above.