UI Walkthrough - Astera Centerprise Data Integrator

What is Astera Centerprise?

Astera Centerprise is designed to support the integration of complex and high-volume data. It is a powerful ETL (Extract, Transform, Load) tool that provides connectivity to all leading databases, flat and hierarchical file sources, and even supports legacy systems such as COBOL. Moreover, the PDF Form, Email, and Report sources enable users to extract data from PDF files, emails, and report models in Centerprise.

It combines data profiling, data transformation, and data reporting in a single, seamless user interface. For further information, watch the Centerprise demo or read the Centerprise Data Integrator product brochure.

Centerprise Home Screen

1_start_page

The screenshot above shows the start page layout of Astera Centerprise 7.5. The items on the start page may vary; however, the menu bar and icon bar follow the arrangement shown.

On the top-left side of this page, there are several tabs and icons, which are described in the coming sections.

2_menu_icon_bar

Icon Bar

1.1_main_iconbar_table

Dataflows

13_dataflow

The screenshot above shows the main screen of a Dataflow. To open a Dataflow, go to File > New > Dataflow. Here, you can see an additional menu item on the Menu Bar – Dataflow. There is also a secondary Icon Bar and a Toolbox panel on the left side of the screen.

If the Toolbox is hidden, you can access it by going to View > Toolbox or using the shortcut Ctrl + Alt + X. The Toolbox has different categories, which we will discuss in detail in the following sections.

You can hide or close the Toolbox panel using these icons 37_hide, respectively.

Read more on Dataflows here.

Dataflow Menu Item

14_dataflow_menu

The Dataflow menu item provides options that allow users to:

  • Change the layout of various objects in the dataflow
  • Expand and collapse the objects’ view in the dataflow
  • Zoom in and zoom out of the designer
  • Change Links to orthogonal links
  • Replace parameter information
  • Run a task in Data quality mode

You can quickly access these options from the secondary icon bar.

Secondary Icon Bar

1.2_secondary_iconbar

Toolbox - Dataflow

15_toolbox

The items in the Toolbox are arranged into expandable sections. From each section, objects can be dragged and dropped onto the dataflow designer. In this section, we will discuss these items in detail.

Sources

16_sources

Data is extracted from a source and brought into the Centerprise client for further transformation and integration. Source objects from the Toolbox can be added to the dataflow designer through a simple drag-and-drop action. Read more on setting up sources here.

The following types of sources are supported by Centerprise:

Items Extension Description of Sources
COBOL Source .cbl COBOL (COmmon Business-Oriented Language) source files are fixed-width files containing text and/or binary data.
Database Table Source .dbo Database files store data in a series of tables, table fields, and field data values, organized according to a data model.
Delimited File Source .csv A delimited file is a text file that stores data in fields separated by a delimiter.
Excel Workbook Source .xls, .xlsx An Excel file is a spreadsheet file.
File System Items Source File System Items Source provides metadata information about the files found in a particular folder.
Fixed Length File Source .txt A fixed-length file is a text file in which every field has a fixed length.
Report Source .rmd A Report Source is a file with structured data extracted from an unstructured file using a Report Model.
Email Source Email Source in Astera Centerprise enables users to retrieve data from emails and process incoming email attachments.
SQL Query Source .sql An SQL (Structured Query Language) Query Source enables the user to retrieve data from a database using an SQL query or a stored procedure.
XML/JSON File Source .xml XML (eXtensible Markup Language) stores data in a hierarchical structure.
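
The difference between the delimited and fixed-length formats described above can be sketched in a few lines of Python (the sample data and field widths here are hypothetical; Centerprise handles this parsing internally):

```python
import csv
import io

# Hypothetical sample data standing in for a source file.
delimited_data = "Name,Age\nAlice,30\nBob,25\n"

# Delimited source: fields are separated by a delimiter (here, a comma).
rows = list(csv.DictReader(io.StringIO(delimited_data)))
print(rows[0])  # {'Name': 'Alice', 'Age': '30'}

# Fixed-length source: every field occupies a fixed number of characters.
fixed_line = "Alice     30"
name, age = fixed_line[0:10].strip(), fixed_line[10:12].strip()
print(name, age)  # Alice 30
```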

Destinations

17_destinations

All these destinations can be added to the dataflow designer through a drag-and-drop interface. Read more on setting up destinations here.

The following destinations are supported by Centerprise:

Items Extension Description of Destinations
Database Table Destination .dbo Database destination provides the functionality to write data to a database table. Users can control how data is written to a database table.
Delimited File Destination .csv Delimited file destination provides the functionality to write data to a delimited file with the ability to control the structure and content of the file.
Excel Workbook Destination .xls, .xlsx Excel File Destination provides the functionality to write data to a Microsoft Excel workbook and does not require Microsoft Excel to be installed on the machine.
Fixed Length File Destination .txt The fixed-length file destination object allows the user to write data to a fixed-length file.
SQL Statement Destination .sql The SQL Statement Destination object offers extra flexibility over database destination objects by applying custom INSERT, UPDATE, or DELETE SQL code that controls what is written to the destination table.
XML/JSON File Destination .xml XML/JSON file destination object allows you to write data to an XML or a JSON file.
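
As a rough illustration of what custom write SQL makes possible, the sketch below uses an in-memory SQLite database as a stand-in destination (the table and column names are hypothetical) and applies a replace-on-conflict statement per record instead of a plain insert:

```python
import sqlite3

# In-memory SQLite database stands in for the destination table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

records = [(1, "Alice"), (2, "Bob"), (1, "Alicia")]  # id 1 arrives twice

# Custom SQL controls what is written: insert new rows, replace existing ones.
for rec in records:
    conn.execute("INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)", rec)

result = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(result)  # [(1, 'Alicia'), (2, 'Bob')]
```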

Transformations

18_transformation

Transformations are used to perform a variety of operations on the data as it flows through a dataflow. Centerprise provides an extensive library of built-in transformations. These transformations are divided into two types:

  • Single Record level - creates derived values by applying a lookup, function, or expression to fields from a single record
  • Set level - operates on a group of records and may result in joining, reordering, elimination, or aggregation of records
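
The distinction between the two types can be sketched in plain Python (field names here are hypothetical; Centerprise expresses this logic through its transformation objects):

```python
records = [
    {"name": "alice", "amount": 10},
    {"name": "bob", "amount": 25},
    {"name": "carol", "amount": 15},
]

# Single Record level: an expression derives a value from one record at a time.
for rec in records:
    rec["name_upper"] = rec["name"].upper()

# Set level: an aggregate operates on the whole group of records.
total = sum(rec["amount"] for rec in records)

print(records[0]["name_upper"], total)  # ALICE 50
```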

The screenshot above shows the Transformations toolbox as it appears in Astera Centerprise. A brief description of each transformation is given in the following table:

Items Transformation type Description
Aggregate Set level Creates aggregations of a dataset, using functions such as Sum, Count, Min, Max, Average, Variance or Standard Deviation.
Apply To All Single Record level Applies an Expression transformation to all mapped elements. This transformation is useful when applying a common Expression transformation to the entire dataset without the need of using multiple Expression transformation objects.
Constant Value Set level Returns a single, prespecified value for all records.
Database Lookup Single Record level Returns a single output field from the database lookup table, or a combination of fields, in which the lookup values match the incoming values.
Denormalize Set level Combines several records into a single record. In other words, it transposes rows into columns.
Distinct Set level Removes duplicate records from the dataset.
Expression Single Record level Defines an expression that can be used to process the incoming value (or values) according to the expression’s logic.
File Lookup Single Record level Looks for certain specified values in the source data, replaces them with the desired information and stores the replaced values in a file.
Filter Set level Filters out data according to a predefined rule.
Function Single Record level Contains a series of built-in mathematical, logical, financial, conversion and encoding functions.
Join Set level Joins records from two record sets with the help of a join key. It combines fields in the data.
List Lookup Single Record level Stores information in the metadata and is used to look for certain values in the source data and replace them with the desired value.
Merge Set level Combines records from two inputs into a single output stream with the same layout as the input streams.
Normalize Set level Creates several records from a single record. In other words, it transposes columns into rows.
Passthru Set level Creates a new dataset based on the elements that were passed to the transformation.
Reconcile Set level Identifies and reconciles new, updated, or deleted entries within an existing data source.
Route Set level Invokes one or more paths in the dataflow, according to some decision logic expressed as a set of rules.
Sequence Generator Single Record level Makes it easy to add sequences of integer values to your dataset.
Sort Set level Sorts values in the dataset – either in ascending or descending order of some key field(s).
Subflow Set level Calls a subflow to run as part of your dataflow.
Switch Single Record level Matches source data against the criteria specified by the user and, wherever the criteria are met, replaces the information in the particular field with the desired output.
Tree Join Set level Enables you to join datasets in a hierarchy and create tree structures.
Union Set level Combines incoming data from two or more inputs into a single output. It combines rows in the dataset.
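
As a concrete illustration of the Normalize/Denormalize pair in the table above, the following Python sketch (with hypothetical field names) transposes quarter columns into rows and back:

```python
# Denormalized record: one row, with quarters as columns.
wide = {"product": "Widget", "q1": 100, "q2": 120}

# Normalize: transpose columns into rows (one record per quarter).
tall = [{"product": wide["product"], "quarter": q, "sales": wide[q]}
        for q in ("q1", "q2")]
print(tall)

# Denormalize: transpose the rows back into columns on a single record.
narrow = {"product": tall[0]["product"]}
for row in tall:
    narrow[row["quarter"]] = row["sales"]
print(narrow)  # {'product': 'Widget', 'q1': 100, 'q2': 120}
```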

Function Transformations

19_functions

This item contains the built-in functions provided by Astera Centerprise. These functions are further classified into various categories (Math, Financial, Date Time, String etc.). Read more about using functions in the Functions Glossary.

Data Profiling

20_dataprofiling

Data profiling essentially involves collecting statistics on fields of data, performing data quality checks on the incoming data, and creating log files for records with errors and warnings.

Object Icon Image Purpose
Data Profile 1571653585377 Provides complete data field statistics – basic and detailed.
Data Quality Rules 1571653618068 Performs validation checks for incoming records.
Field Profile 1571653644972 Provides statistics for specific fields.
Record Level Log 1571653682735 Creates log files for records with errors and warnings.
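
A field profile of the kind described above boils down to simple per-field statistics; a minimal sketch in Python (with hypothetical sample data):

```python
# Hypothetical field values, including a null.
values = [4, 7, None, 2, 9]

non_null = [v for v in values if v is not None]
profile = {
    "count": len(values),
    "nulls": values.count(None),
    "min": min(non_null),
    "max": max(non_null),
    "average": sum(non_null) / len(non_null),
}
print(profile)  # {'count': 5, 'nulls': 1, 'min': 2, 'max': 9, 'average': 5.5}
```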

Resources

21_resources_

This category contains options that enable users to parameterize the dataflows.

Object Icon Image Purpose
Context Information 1571653705225 To use Context Information parameters that take their values dynamically at dataflow run time.
Database Connection 1571653724746 To enter the connection details such as server name, credentials, database name etc.
Variables 1571653744608 To locally declare variables on a flow that can be replaced during runtime with the assigned values.

Database Write Strategy

22_database_write_strategies

Database Write Strategy is used to perform database write actions such as INSERT, UPDATE, UPSERT, or DELETE. These actions are directly performed in the database table destination. Four different Database Write Strategy options are available in Astera Centerprise.

Object Icon Image Purpose
Data Driven 1571653803418 Processes records based on some predefined criteria, which is expressed as a rule.
Database Diff Processor 1571653847321 Synchronizes records between two tables in a database by comparing destination table against a diff table.
Slowly Changing Dimension 1571653877223 Addresses scenarios where field values for a record vary slowly over time.
Source Diff Processor 1571653904897 Adds records incrementally to a destination to avoid reading/writing existing records in the file.
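
The Source Diff Processor idea — writing only the records not already present in the destination — can be sketched like this (the key field and sample data are hypothetical; Centerprise performs the comparison itself):

```python
destination = {1: "Alice", 2: "Bob"}   # existing records, keyed by id
source = [(2, "Bob"), (3, "Carol")]    # incoming records

# Write only the records whose key is not already in the destination.
new_records = [(k, v) for k, v in source if k not in destination]
destination.update(new_records)

print(new_records)   # [(3, 'Carol')]
```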

Text Processors

23_text_processors

Text Processors enable users to:

  • Resolve data into components and write each component to a different field
  • Serialize different field components

Learn more about the Delimited Parser here.
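
Parsing and serializing, as described above, are inverse operations; a minimal Python sketch (with a hypothetical address field):

```python
# Parse: resolve a single field into its components.
address = "221B Baker Street|London|NW1"
street, city, postcode = address.split("|")
print(street, city, postcode)  # 221B Baker Street London NW1

# Serialize: reassemble the components back into one field.
rebuilt = "|".join([street, city, postcode])
print(rebuilt == address)  # True
```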

Services

24_services

Using the SOAP and REST web service connectors, you can easily connect to any data source that uses the SOAP protocol or can be exposed via a REST API.

Object Icon Image Purpose
REST Client 1571653940274 Uses RESTful APIs to access and/or modify data using HTTP methods.
SOAP Transformation 1571653943600 A web service transformation that allows users to call a remote SOAP web service as part of a dataflow.

EDI

25_edi

In this screenshot, you can see the options for EDI (Electronic Data Interchange) supported by Centerprise. EDI is a special file format, just like XML and JSON. Here, Source, Destination, Parser, and Serializer serve the same purposes as defined in the earlier sections.

Subflow

26_subflow

Subflows can be perceived as ‘black boxes’ inside a dataflow, simplifying and streamlining the dataflow design. Subflows can be called in a dataflow by dragging-and-dropping the subflow transformation object onto the dataflow designer.

Toolbox - Subflow

You can view the Subflow option in the toolbox when you open a subflow. To open a subflow, go to File > New > Subflow.

The Toolbox for a subflow offers similar categories of actions and tasks as for a dataflow, with an additional subflow category that expands into the two objects shown below.

27_subflow_input_output

A short description of these objects is given below.

Object Icon Image Description
Subflow Input 1571653970694 A connector controlling the input layout of your subflow.
Subflow Output 1571653988788 A connector controlling the output layout of your subflow.

Workflow

28_workflow

Workflow is designed to orchestrate an automated and iterative execution of ordered tasks. Tasks are performed according to some predefined path and custom logic. For an in-depth understanding of workflows, refer to this article on working with workflows and the help video.

Toolbox - Workflow Tasks

To open a Workflow, go to File > New > Workflow.

The Toolbox for a workflow offers similar categories of actions and tasks as for a dataflow, with an additional category of workflow tasks that expands into several objects, as shown below.

29_workflow_tasks

In the screenshot above, you can see a list of tasks that are included in the Workflow Tasks category. A brief note on their purpose is as follows.

Object Icon Image Purpose
Data Dump Task 1571654039903 Picks up data from one location and dumps it to another location.
Decision 1571654057500 Invokes one of the two paths in the workflow, depending on whether the logical expression inside the Decision object returns a Yes (True) or a No (False).
EDI Acknowledgement 1571654080237 Issues an acknowledgement notification to the sender when an EDI message is received.
File System 1571654105695 Performs some action with a file or a folder. For example, the task can copy a file, or delete all files in a folder.
File Transfer Task 1571654119499 Performs some action on an FTP server. For example, the task can upload a file to the FTP server, rename a file, or delete all files from a remote directory.
Or 1571654133864 Used after multiple Decision tasks; triggers the subsequent workflow task when any one decision returns True.
Run Dataflow Task 1571654150102 Starts a dataflow as part of your workflow.
Run Program Task 1571654166488 Runs an executable, command, or batch file as part of your workflow.
Run SQL File 1571654206063 Runs the SQL code inside a file as part of your workflow.
Run SQL Script 1571654220699 Runs some SQL code as part of your workflow.
Run Workflow Task 1571654237783 Starts another workflow as part of your workflow.
Send Mail 1571654253184 Sends an email at appropriate junctions in your workflow.

Report Model

30_RM

The objective of a Report Model is to convert unstructured data into a structured format. This unstructured data is normally a text file, a PDF file, or an image file; it can even be an Excel file or a Word file, as long as the stored data is unstructured.

The image above shows the main screen of a Report Model in Centerprise. To open a Report Model, go to File > New > Report Model.

In addition, a vertical panel – the Report Browser – containing the Model Layout and Data Export Settings can be seen on the left side of the screen. There is also a toolbar specific to the Report Model interface. Each of these additional attributes is explained in the following sections.

Read more on how to create a Report Model here.

Toolbar - Report Model

1.9_RM_icons

Report Browser

It contains features and layout panels for initiating and building an extraction template and exporting extracted data. There are two main panels in a Report Browser:

  • Model Layout

  • Data Export Settings

Model Layout

31_model_layout

The Model Layout panel serves the purpose of building a data extraction layout. It contains data regions and fields built according to custom extraction logic from an unstructured file.

In the figure above, you can see a hierarchical layout of an extraction model with a single instance region as well as collection regions containing multiple fields.

Icons - Model Layout

2.0_Model_layout_icons

Data Export Settings

32_data_export_settings

Data Export Settings deals with all the settings related to the export of data into an Excel sheet, a CSV file, or a database table. Depending on the file format it is exported to, this data can later be used in a dataflow, subflow, or workflow through an Excel Workbook Source, a Delimited File Source, or a Database Table Source.

Icons - Data Export Settings

2.1_Export_icons

Scheduler

33_scheduler

Centerprise offers a built-in Scheduler to automate recurring tasks. The main screen of the Scheduler, with options to customize a repetitive task, can be seen in the above screenshot. To open the Scheduler window, go to Server > Job Schedules.

To configure the Scheduler, go to the Deployed Job tab and add the status, name, and schedule type. Then, define a file path, server, and the frequency of the scheduled task. The Scheduler also provides an option to run the scheduled dataflow in pushdown mode.

Users can also set up an email notification by going to the Notification Email tab and filling in the necessary details.

Understand how to set up a scheduling task through an example here.

An icon tab is highlighted in the screenshot; details of each icon are as follows:

2.2_scheduler_icons

Deployment

34_deployment

Deployment is a way of setting up Centerprise projects to run on the Scheduler. Deployment enables the use of a config file with a Project Archive (*.car) file, making the selected flow run independently of any local parameters.

To open Deployment window, go to Server > Deployment.

To configure a Deployment, provide a name, an archived copy of the project file (*.car), an optional file specifying project parameters, and a comment.

Understand how to set up a project deployment through an example here.

An icon tab is highlighted in the screenshot; details of each icon are as follows:

2.3_deployent_icons

Job Monitor

35_Monitor

The Job Monitor allows you to monitor the jobs that are executed on the server. To open the Job Monitor, go to Server > Monitor.

In the Job Monitor, you can observe the job type, job execution server, executed file, duration of the job, its status, error records, etc.

A record in green shows a running job, a record in red signifies a job that ended with errors, and a record in blue signifies a job that ended successfully.

An icon tab is highlighted in the screenshot; details of each icon are as follows:

2.4_monitor_icons

Miscellaneous

36_short-cuts

Selecting Toolbox, Server Explorer, Data Source Browser, Job Progress, Verify, Data Preview, or Quick Profile from the View menu generates a shortcut tab for each of these options at the bottom of the main screen. Moreover, tabs on Level 1 expand into a vertical panel on the left side of the main screen, whereas tabs on Level 2 expand into a horizontal panel at the bottom of the screen.