Database Write Strategies¶
The Database Write Strategies available in Astera Centerprise provide a convenient way for you to make updates and insertions in a database table. They can be used to implement advanced logic for handling data updates, thus promoting seamless data integration and warehousing.
In this article, we will be discussing the Database Write Strategies available in Astera Centerprise. Let’s take a look at a sample use case.
Here, we are writing data from a source table in one database to a destination table in another database. The source data consists of customer information from a fictitious organization, and the organization would like to have control over which records are written to the destination table.
Centerprise supports the following database write strategies:
- Data Driven
- Database Diff Processor
- Slowly Changing Dimensions
- Source Diff Processor
Data Driven¶
The Data Driven write strategy processes records based on some predefined criteria expressed in the form of rules. For example, insert records where LTV > 80, or delete records where CreatedDtTm < UpdatedDtTm.
Each rule has a database action associated with it, such as Insert, Update, Delete or Error. If a record does not pass a rule, the record will be tried against the next rule on the list, until all rules have been exhausted.
To add a Data Driven write strategy, open the flow Toolbox, expand the Database Write Strategy group, and drag and drop the Data Driven object onto the dataflow designer.
An example of what a Data Driven object looks like is shown below.
The following properties are available to help you configure the logic of your database update:
The Layout Builder screen allows you to add or remove fields in the field layout, as well as select their data type.
Note: To quickly add fields to the layout, drag and drop the node Output port of the object whose layout you wish to replicate into the node Input port of the Data Driven object. The fields added this way will show in the list of fields inside the node and as well as in the Layout Builder.
Data Driven Write Strategy Conditions screen allows you to manage rules and associate them with database actions.
To add a new rule, follow these steps:
Click the Add Condition icon to create a new rule.
Type a descriptive name for the rule in the Description field.
Activate the rule by checking the Active checkbox.
Select a database action, such as Insert, Update, Delete or Error, from the Perform Database Action dropdown list. This action will be applied when the rule holds.
In the When the Following Condition Holds input, write an expression for the rule. For example, LoanAmount < 320000. Or click … to open the Expression Builder where you can visually build your rule using Record tree and IntelliSense.
Click Compile to check for any syntax errors in your rule. The status should read “Successful” for a successful compilation. Close the Expression Builder.
The new rule has been successfully configured.
You can add multiple rules if necessary. To delete an existing rule, select it and click the Remove Condition icon.
General Options screen
The Comments input allows you to enter comments associated with this object.
Database Diff Processor¶
The Diff Processor database write strategy synchronizes records between two tables. The destination table is compared against a diff table, and any differences between the two tables are reconciled in the destination table.
To apply Diff Processor write strategy, open the Flow toolbox, expand the Database Write Strategy group, and drag and drop the Diff Processor object onto the dataflow designer.
An example of what a Diff Processor object may look like is shown below.
The following properties are available to help you configure the logic of your database update:
Database Connection screen – allows you to enter the connection information for the diff table, such as server name, database and schema, as well as credentials for connecting to the selected diff table.
Pick Table screen:
Using the Pick Table screen, select the diff table for this database write strategy.
Select Field for Matching Database Record – specifies a field (or a combination of fields) which will be used to match records from the diff table with records that exist in the destination table.
Output Options:
- Single Output: This is the default selection when using a database write strategy. Single Output means that the database action, such as Insert, Update, Skip or Error will be chosen by the database write strategy’s logic rather than specified by the user. Using a single output is recommended when a database write strategy is applied.
- One Port for Each Action – This adds extra flexibility in letting you specify maps for each of the database write actions, such as Insert, Update, Skip or Error. Each set of maps is completely independent of another, and will be applied according to logic of the database write strategy.
Slowly Changing Dimensions¶
The Slowly Changing Dimensions (SCD) database write strategy addresses scenarios where a field’s value for a record varies over time. Centerprise supports industry-standard Type 1 and Type 2 SCD methods to handle slowly changing values in the selected fields of a dimension table.
The Type 1 method is used when no historical data is required on how values changed in the SCD field over time. Type 1 method simply updates a record in the dimension table, overwriting an old SCD value with a new value.
The Type 2 method makes it possible to track historical data by creating multiple records in the dimension table. The records are identified by using an extra surrogate key in addition to any other keys already present in the table. With Type 2, you have unlimited history as a new record is inserted each time a change to the SCD value is made. Additional fields, such as EffectiveDate, ExpirationDate, or Version may be used to track the timeline of SCD value updates, as well as identify the current SCD value.
To add the Slowly Changing Dimensions write strategy, open the Flow toolbox, expand the Database Write Strategy group, and drag and drop Slowly Changing Dimensions object on the dataflow.
An example of what an SCD object might look like is shown below.
The following properties are available to help you configure the logic of your database update:
Database Connection screen – Here you need to enter the connection information for the SCD table, such as server name, database and schema, as well as credentials for connecting to the selected SCD table.
Pick Table screen – Using Pick Table screen, select the SCD table that will be updated by this database write strategy.
Output Options:
- Single Output: This is the default selection when using a database write strategy. Single Output means that the database action, such as Insert, Update, Skip or Error will be chosen by the database write strategy’s logic rather than specified by the user. Using a single output is recommended when a database write strategy is applied.
- One Port for Each Action: Adds extra flexibility in letting you specify maps for each of the database write actions, such as Insert, Update, Skip or Error. Each set of maps is completely independent of another, and will be applied according to logic of the database write strategy.
Layout Fields screen:
Using the SCD Field Type dropdown, select among the following designations for each field in the layout:
For SCD Type 1 updates:
- Business Key - designates the field holding the key that is normally used to identify records in the table
- Not Used - designates a field that is not used by the SCD update logic. However, this field may still be updated if it is mapped.
- Update Not Allowed - designates a field that is not used by the SCD update logic. This field may not be updated even if it is mapped. An attempt to update the field will result in an error status for the entire record.
- SCD1 - Update - designates a field that stores an SCD value. This is the value that slowly changes over time.
- Audit - Last Changed - stores the date and time when the record with the SCD value was last updated.
- Audit - SCD1 Change - stores the date and time when the record with the SCD Type 1 value was last updated.
For SCD Type 2 Updates:
- Business Key - designates the field holding the key that is normally used to identify records in the table.
- Surrogate Key - designates the field holding an extra key that identifies versions of the SCD value with the same business key.
- Not Used - designates a field that is not used by the SCD update logic. However, this field may still be updated if it is mapped.
- Update Not Allowed - designates a field that is not used by the SCD update logic. This field may not be updated even if it is mapped. An attempt to update the field will result in an error status for the entire record.
- SCD2 - Update and Insert - this field stores an SCD value changing over time. A new record will be added each time the SCD value changes. The existing record may be updated with the new ExpirationDate.
- Current Record Designator - stores the Active Value if the record has the current version of the SCD value. Otherwise it stores the Inactive Value. Active Value and Inactive Value should be entered in the appropriate cells in the grid next to the Current Record Designator.
- SCD2 Effective Date - stores the effective date of the SCD value in the record.
- SCD2 Expiration Date - stores the expiration date of the SCD value in the record. Expiration Date is Null in the record storing the current SCD value.
- Audit - Created - stores the date and time when a new record with a new SCD value was created.
- Audit - Last Changed - stores the date and time when the record with an SCD value was last updated.
- Audit - SCD2 Change - stores the date and time when the record with an SCD Type 2 value was last updated.
- Version - stores the version number of the SCD value.