Creating Field Profile

The Field Profile feature captures statistics for selected fields from one or more objects. Field Profile is essentially a transformation object, since it provides Input and Output ports like other transformations. The Output ports make it possible to feed the collected statistics to another object in the dataflow.

In this document, we will learn how to create a Field Profile in Astera Centerprise.

Using Field Profile

In this case, we have Invoice data extracted from a sample Database Table Source.

0_invoice_data

We want to collect detailed statistics on some of these fields and write them to a Delimited File Destination. For this purpose, we will use the Field Profile feature of Centerprise.

1. To get a Field Profile object from the Toolbox, go to Toolbox > Data Profiling > Field Profile. If you’re unable to see the Toolbox, go to View > Toolbox or press Ctrl + Alt + X.

1_toolbox

2. Drag-and-drop the Field Profile object onto the dataflow designer.

2_dataflow_fieldprofile

You can see that the FieldProfile1 object is empty right now. This is because we haven’t mapped any fields to it yet.

3. Map ShipName, CustomerID, Country, ProductName, UnitPrice, Quantity and OrderDate, one by one, from the source object to the FieldProfile object.

3_mapping

Note: Statistics will be collected only for the fields linked to the Input port of the Field Profile object. This way, you can selectively collect statistics for a subset of fields from the selected field layout.
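To make the idea of selective profiling concrete, here is a minimal Python sketch. It is not how Centerprise works internally; it only mimics the behavior described in the note. The sample rows, the Freight field, the profile_fields function and the measure names (count, null_count, distinct_count) are all assumptions made for illustration.

```python
# Hypothetical sample rows standing in for the Invoice source; the values are made up.
rows = [
    {"ShipName": "Alpha Foods", "Country": "Germany", "UnitPrice": 12.5, "Freight": 3.2},
    {"ShipName": "Beta Market", "Country": "France", "UnitPrice": 8.0, "Freight": 1.1},
    {"ShipName": "Alpha Foods", "Country": None, "UnitPrice": 20.0, "Freight": 5.0},
]


def profile_fields(rows, mapped_fields):
    """Return one summary record per mapped field; unmapped fields are ignored."""
    profiles = []
    for name in mapped_fields:
        values = [row.get(name) for row in rows]
        non_null = [v for v in values if v is not None]
        profiles.append({
            "field": name,
            "count": len(values),
            "null_count": len(values) - len(non_null),
            "distinct_count": len(set(non_null)),
        })
    return profiles


# Only the fields listed here are profiled; Freight is present in the rows but skipped.
profiles = profile_fields(rows, ["ShipName", "Country", "UnitPrice"])
for record in profiles:
    print(record)
```

Each mapped field produces exactly one record, which matches the way the statistics appear later in the Data Preview window.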

Configuring the Field Profile Object

1. To configure the Field Profile object, right-click on its header and select Properties from the context menu.

4_properties

A configuration window will open. The first screen is the Layout Builder screen. This is where we can create or delete fields and change their names and data types.

5_layout_builder

2. Click Next. On the Properties screen, specify the Statistics Type from the dropdown list.

6_config_window

The Field Statistics dropdown lets you select the level of detail of the statistics to collect. Choose one of the following levels:

  • Basic Statistics: This is the default mode. It captures the most common statistical measures for the field’s data type.
  • No Statistics: No statistics are captured by the Data Profile.
  • Detailed Statistics – Case Sensitive Comparison: Additional statistical measures, for example Mean, Mode and Median, are captured by the Data Profile, using case-sensitive comparison for strings.
  • Detailed Statistics – Case Insensitive Comparison: Additional statistical measures are captured by the Data Profile, using case-insensitive comparison for strings.
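The difference between the two Detailed Statistics modes is easiest to see with string values that differ only by case. The following Python sketch is an illustration only, not Centerprise's implementation; the function names, the sample values and the exact set of measures are assumptions.

```python
from collections import Counter
from statistics import mean, median


def string_stats(values, case_sensitive=True):
    """Distinct count and mode for strings, optionally folding case before comparing."""
    keys = values if case_sensitive else [v.casefold() for v in values]
    counts = Counter(keys)
    mode_value, mode_frequency = counts.most_common(1)[0]
    return {
        "distinct_count": len(counts),
        "mode": mode_value,
        "mode_frequency": mode_frequency,
    }


def numeric_stats(values):
    """A few numeric measures of the kind a detailed profile might report."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Made-up sample values for a Country field and a UnitPrice field.
countries = ["USA", "usa", "Usa", "Germany", "Germany"]
sensitive = string_stats(countries, case_sensitive=True)
insensitive = string_stats(countries, case_sensitive=False)
prices = numeric_stats([10.0, 20.0, 60.0])

print(sensitive)
print(insensitive)
print(prices)
```

With case-sensitive comparison, USA, usa and Usa count as three separate values, so Germany is the most frequent value. With case-insensitive comparison, they collapse into a single value that occurs three times.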

In this case, we will select Detailed Statistics – Case Sensitive Comparison.

![7_statistics type](creating-field-profiles.assets/7_statistics type.png)

Click OK.

3. Right-click on the Field Profile object’s header and select Preview Output from the context menu.

8_preview_output

A Data Preview window will open up showing you the statistics of each mapped field as a record.

9_data_preview

Writing to a Destination

Observe that the Field Profile object contains an Output node. Expand it to see the statistical measures listed as fields, each with its own output node.

10_output_node

We can write these statistical measures to a destination file.

1. Drag-and-drop Delimited File Destination onto the dataflow designer by going to Toolbox > Destinations > Delimited File Destination.

11_delimited_destination

2. Auto-map all fields under the Output node of the FieldProfile object to the DelimitedDestination object.

12_mapping

3. Configure the settings for the Delimited File Destination object.
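Conceptually, the Delimited File Destination receives one record per profiled field and writes it as a row of delimited text. A rough Python equivalent, using the standard csv module, might look like the sketch below. The record contents, the pipe delimiter and the write_delimited helper are hypothetical; in Centerprise, these settings are chosen in the destination object's configuration screens.

```python
import csv
import io

# Hypothetical profile records, one per profiled field; names and values are made up.
profile_records = [
    {"field": "Country", "count": 3, "distinct_count": 2},
    {"field": "UnitPrice", "count": 3, "distinct_count": 3},
]


def write_delimited(records, handle, delimiter=","):
    """Write the records as delimited text with a header row."""
    writer = csv.DictWriter(
        handle,
        fieldnames=list(records[0].keys()),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)


# An in-memory buffer stands in for the destination file.
buffer = io.StringIO()
write_delimited(profile_records, buffer, delimiter="|")
output_text = buffer.getvalue()
print(output_text)
```

To write to an actual file instead, the same helper could be given a handle opened with open(path, "w", newline="").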

Executing the Task

1. After configuring the settings for the Delimited File Destination object, click the Start Dataflow icon (9_run_dataflow) in the toolbar at the top.

A Job Progress window will open and show you the trace of the job.

13_Job_progress

You can open the delimited file that contains the statistics from the link provided in the Job Progress window.