Placeholder Dimension for Early Arriving Facts and Late Arriving Dimensions

Early arriving facts and late arriving dimensions is a concept in data warehousing that refers to the facts that arrive in the data warehouse before the related dimension information has been loaded into the data warehouse. For example, in a retail data warehouse, orders details might arrive earlier than the corresponding customer’s information who placed the order. Since the dimension is not available at the time the fact table is loaded, the foreign key in the fact table will not have a corresponding primary/surrogate key in the dimension table, thereby causing an error.

Placeholder dimensions are a way to handle early arriving facts and late arriving dimensions in a data warehouse. Placeholder dimensions are temporary dimension records that are created with the specified default values until the real dimension data becomes available. Once the real dimension data arrives, the placeholder dimension is replaced with the real dimension data.

Here’s how you can create a placeholder dimension in Astera Data Warehouse Builder.

1. Right-click the dimension entity for which you want to specify the placeholder dimension, select properties, and navigate to the layout builder.

image-20230705153750164

2. Add a new field – in this case the ‘IsPlaceholder’ field - of Boolean data type and mark the Dimension Role as ‘Placeholder Dimension’.

image-20230705153809418

3. Click OK.

Note: Make sure to specify default values for all the fields within the dimension entity which do not allow nulls. These default values will be used to create the placeholder dimension record in case of early arriving facts and late arriving dimensions. If the default value is not specified for these fields, a warning will be shown prior to the data model deployment, and an error will be thrown at runtime.

You can also specify default values for fields that do allow nulls, but it is not a requirement.

image-20230705153819187

USE CASE

Consider this use case where we have specified a placeholder dimension for the Customers Dimension.

image-20230705153834681

Let’s consider a scenario of early arriving facts and late arriving dimensions. An order is placed by a customer with Customer ID ‘ALFKI’. However, the information of the Customer with Customer ID ‘ALFKI’ is not available in the dimension table yet.

When the fact is loaded, it looks for the corresponding dimension in the customer dimension table. Since the record for the corresponding dimension (CustomerID ‘ALFKI’) is missing and has not arrived yet, a temporary dummy Placeholder Dimension record is created in the customer dimension – using the default values specified earlier - and its surrogate key is used to populate the foreign key in the fact table.

The placeholder dimension is identified by the ‘IsPlaceholder’ field, if the value of this field is ‘1’ (which indicates that the Placeholder field is True), it shows that the record is a placeholder in the dimension table.

image-20230705153843673

Customer Dimension Table – Placeholder Record created with CustomerKey (Surrogate key) 180

image-20230705153851651

Order_Details Fact Table – CustomerKey (Surrogate key) 180 of the placeholder record acting as a foreign key

When the actual data for the customer where CustomerID = ‘ALFKI’ arrives, the placeholder dimension record is updated with the new information and the ‘IsPlaceholder’ value changes to zero indicating that it is no longer a placeholder record, and therefore the Placeholder field is ‘False’.

image-20230705153858982