Database Table Source¶
Centerprise Database Source provides the functionality to retrieve data from a database table. Additionally, Database Table Source provides change data capture functionality to perform incremental reads. Database Table sources supports multi-way partitioning which partition database table into multiple chunks and reads these chunks in parallel. This features brings about major performance increase for database reads. Database source enables you to specify where clause and sort order to control the result set.
Database Table Source provides incremental read functionality based in the concept of audit fields. Incremental read is one of the three change data capture approaches supported by Centerprise. Audit fields are fields that are updated when a record is created or modified. Examples of audit fields include created date time, modified date time, and version number.
Incremental read works by keeping track of the highest value for the specified audit field. On the next run, only the records with value higher than the saved value are retrieved. This feature is useful in situations where two applications need to be kept in sync and the source table maintains audit field values for rows.
Database Table Partitioning¶
Database table partitioning feature partitions database table in a number of user-specified chunks. These chunks are then retrieved in parallel to improve the overall throughput for the job. When combined with the highly parallel Centerprise engine, this feature substantially improves the performance of large data movement jobs. Partitioning is done by selecting a field and defining value ranges for each partition. At runtime, Centerprise generates and runs multiple queries against the source table and processes the result set in parallel.
Adding a database table source object allows you to transfer data from a database table. An example of what a database table source object looks like is shown below.
To configure the properties of a Database Table Source object after it was added to the dataflow, right-click on it and select Properties from the context menu. The following properties are available:
Source Connection screen – allows you to enter the connection information for your source, such as server name, database and schema, as well as credentials for connecting to the selected source.
Pick Source Table screen:
Select a source table using the Pick Table dropdown.
- Select Full Load if you want to read the entire table.
- Select Incremental Load Based on Audit Fields to perform an incremental read starting at a record where the previous read left off.
Incremental load is based around the concept of Change Data Capture (CDC), which is a set of reading and writing patterns designed to optimize large-scale data transfers by minimizing database writing in order to improve performance. CDC is implemented in Centerprise using Audit Fields pattern. The Audit Fields pattern uses create time or last update time to determine the records that have been inserted or updated since the last transfer and transfers only those records.
- Most efficient of CDC patterns. Only records that were modified since the last transfer are retrieved by the query thereby putting little stress on the source database and network bandwidth
- Requires update date time and/or create date time fields to be present and correctly populated
- Does not capture deletes
- Requires index on the audit field(s) for efficient performance
To use the Audit Fields strategy, select the Audit Field and an optional Alternate Audit Field from the appropriate dropdown menus. Also, specify the path to the file that will store incremental transfer information.
WHERE Clause screen:
You can enter an optional SQL expression serving as a filter for the incoming records. The expression should start with the WHERE word followed by the filter you wish to apply.
For example, WHERE CreatedDtTm >= ‘2001/01/05’
General Options screen:
The Comments input allows you to enter comments associated with this object.