Distinct Transformation¶
Distinct¶
The Distinct transformation removes duplicate records from a dataset. To identify duplicates, you can compare all fields in the layout or only a subset of fields, called key fields; records with the same combination of key field values are treated as duplicates.
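The behavior described above can be sketched in a few lines of Python. This is only an illustration of the concept, not the product's implementation: the `distinct` function and its `key_fields` parameter are hypothetical names, records are modeled as plain dictionaries with hashable values, and the sketch assumes the first record seen for each key is the one kept, which the documentation does not state.

```python
from typing import Dict, Iterable, Iterator, Optional, Sequence


def distinct(
    records: Iterable[Dict[str, object]],
    key_fields: Optional[Sequence[str]] = None,
) -> Iterator[Dict[str, object]]:
    """Yield records whose key has not been seen before (hypothetical helper)."""
    seen = set()
    for record in records:
        # With no key fields given, every field in the record is part of the key.
        fields = key_fields if key_fields is not None else sorted(record)
        key = tuple((name, record[name]) for name in fields)
        if key not in seen:
            seen.add(key)
            # Assumption: the first occurrence of each key is kept.
            yield record


rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Oslo"},
    {"id": 1, "city": "Oslo"},
]
all_fields = list(distinct(rows))            # the third row repeats the first
by_city = list(distinct(rows, ["city"]))     # only one row per city remains
```

Comparing all fields drops only exact repeats, while a narrower key such as `city` collapses records that differ in their other fields.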
Sample¶
Steps¶
To add a Distinct transformation, drag the Distinct object from the Transformations group in the Flow toolbox and drop it on the dataflow.
An example of what a Distinct object might look like is shown below.
To configure the properties of a Distinct object after it has been added to the dataflow, right-click it and select Properties from the context menu. The following properties are available:
Meta Object Builder screen:
The Meta Object Builder screen lets you add or remove fields in the field layout and select their data types.
Note: To quickly add fields to the layout, drag the node Output port of the object whose layout you wish to replicate and drop it onto the node Input port of the Distinct object. Fields added this way appear both in the list of fields inside the node and in the Meta Object Builder.
Distinct Transformation Properties screen:
Using the grid, select the key field or fields that will be used to identify duplicates, so that only distinct records are returned.
If incoming records are sorted by values in the selected key fields, you can enable the Incoming data is ordered by key fields option to boost performance.
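The documentation does not explain why the Incoming data is ordered by key fields option improves performance. One plausible reason, sketched below as an assumption, is that duplicates are adjacent in sorted input, so each record only needs to be compared with the previous key instead of with every key seen so far. The `distinct_sorted` function is a hypothetical name used for illustration.

```python
from typing import Dict, Iterable, Iterator, Sequence

_NO_KEY = object()  # sentinel that never equals a real key tuple


def distinct_sorted(
    records: Iterable[Dict[str, object]],
    key_fields: Sequence[str],
) -> Iterator[Dict[str, object]]:
    """Drop adjacent duplicates; correct only if input is ordered by key_fields."""
    previous_key = _NO_KEY
    for record in records:
        key = tuple(record[name] for name in key_fields)
        if key != previous_key:
            # Only the last key is remembered, so memory use stays constant.
            previous_key = key
            yield record


ordered = [{"k": 1, "v": "a"}, {"k": 1, "v": "b"}, {"k": 2, "v": "c"}]
unordered = [{"k": 1}, {"k": 2}, {"k": 1}]

ordered_result = list(distinct_sorted(ordered, ["k"]))
unordered_result = list(distinct_sorted(unordered, ["k"]))
```

In this sketch, unsorted input lets the repeated key `1` through because its duplicates are not adjacent, which suggests the option should be enabled only when the incoming data really is ordered by the key fields.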
General Options screen:
This screen contains options that are common to most objects on the dataflow.
Clear Incoming Record Messages
When this option is on, any messages coming in from objects preceding the current object will be cleared. This is useful when you need to capture record messages in the log generated by the current object and filter out any record messages generated earlier in the dataflow.
Do Not Process Records with Errors
When this option is on, records with errors will not be output by the object. When this option is off, records with errors will be output by the object, and a record message will be attached to each such record. This record message can then feed into downstream objects on the dataflow, for example a destination file that captures record messages, or a log that captures the messages and collects their statistics.
The Comments input allows you to enter comments associated with this object.
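The interaction of the two General Options above can be modeled roughly as follows. This is a conceptual sketch under stated assumptions, not the product's code: the `Record` class, the `emit` function, and both parameter names are hypothetical, and an error is represented simply as an optional string set while the current object processed the record.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Optional


@dataclass
class Record:
    values: Dict[str, object]
    messages: List[str] = field(default_factory=list)  # messages from upstream objects
    error: Optional[str] = None  # error raised by the current object, if any


def emit(
    records: Iterable[Record],
    clear_incoming_messages: bool = False,
    skip_records_with_errors: bool = False,
) -> Iterator[Record]:
    """Apply the two options to each record (hypothetical model; mutates records)."""
    for record in records:
        if clear_incoming_messages:
            # Discard messages attached earlier in the dataflow.
            record.messages = []
        if record.error is not None:
            if skip_records_with_errors:
                continue  # option on: the record is not output
            # Option off: the record is output with a message attached.
            record.messages.append(record.error)
        yield record


ok = Record({"id": 1}, messages=["upstream warning"])
bad = Record({"id": 2}, error="bad value")
kept = list(emit([ok, bad], clear_incoming_messages=True))

filtered = list(
    emit(
        [Record({"id": 1}), Record({"id": 2}, error="bad value")],
        skip_records_with_errors=True,
    )
)
```

With clearing enabled, downstream objects see only the messages produced by the current object, which matches the logging use case described for that option.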
Usage¶
An example of a dataflow with a Distinct transformation is shown below.