SAS SCD2 Loader macro code to keep a full history for a given table.
About
In todays article we will be looking at bespoke code written in SAS to capture data from a source using the slowly changing dimension variant 2 method (SCD2). The SCD2 loader is helpful when you want to capture all changes to a key/entity stored in a source which only retains the current view (SCD1 – update existing).
The basic ideas behind this SAS code are the following:
- Read in the source data and identify new records, changed records, deleted records
- Update records in target table based on finding from above setp
- Insert records
Code
Below is the macro code which defines the SCD2 logic.
Explanation
The above code takes advantage of the built in sha256() function which is used to create a unique hash object based on the input columns form the source table. This object can then be used in conjunction with the unique key to identify any chagnes to the source table compared with the target table.
The code creates and stores a number of process specific columns in the target table prefixed with ‘db_’. These include columns to hold time stamps, tpye of action taken by the process on the record and flags which can be used to quickly identify the current record for a given key/entity.
The macro has the following limitations: the source table must contain a single unique key used to identify an entity. The source table must follow the SCD1 logic i.e. 1 reocrd per unique entity. The code updates the taget table directly and so while the process is running users should not have access to the table being updated.
Unit test results
Standard tests have been carried out on the macro and to view the unit test script click here and the results here. If you come across any bugs or have suggestions for further tests please let me know in the comments below.
References
- SAS sha256() function (link)