For example "Order Delivery Date" key in "Orders" fact
table, is a future date, as the order is still under processing. In this
case the date is not available. The best way to handle it is to give a surrogate
key, and have that surrogate key link to a record having a
description (say) this date is yet not applicable. In case of an accumulating
fact table, when the date happens, one can over-write the record.
Sometimes that foreign key is not available, due to the data quality issues
of the source systems. In this case the approach is similar, and that is
to assign a special surrogate key linking to a record in the dimension table.
For example, if the invoice number is not available, you can assign a special
surrogate key, which will link to a special record in the dimension table,
which will have the description the "invoice number is not available". If
you want to assign more specific reasons related to the invoice number not
being available, you can think of having multiple special surrogate keys
linking to multiple special records, with each carrying a different description.
That helps you at a later stage in case you want to do some data enhancement/fixing.
The same approach will be applicable in case the fact. |