Defining Region End Type as Specific Text and Regular Expression

The Region End Type options define where a particular data region ends. They appear in the Region Details group box of the Region Properties panel, and there are several to choose from.

01-options

In this article, we will discuss the use of Till Regular Expression and Till Specific Text options.

The rest of the options are covered in a separate article.

Sample Use-Case

In this case, we have an invoice containing details of the dealer, Global Cars, and the list of vehicles available for purchase. This is what the data looks like:

02-source-file

Here, we want to extract the dealer’s information by defining the end type of a data region using the Till Regular Expression and Till Specific Text options.

Steps to Create a Report Model

1. Go to File > New > Report Model and load the unstructured document in a new report model. This is what the file looks like in the ReportMiner designer.

03-designer

2. Add a new data region to the Model Layout panel by right-clicking on the Record node and selecting the first option from the menu.

04-add-data-region

3. Define a pattern in the orange bar above the designer to capture the first line of the region. In this case, we are using “DEALER NAME” as the pattern.

05-pattern

Note: Make sure the pattern is vertically aligned with the data on the canvas.

4. At this point, only the line containing the pattern is part of the region. Let’s use the Till Specific Text option in the Region Properties panel to define where the region ends.

06-string-region

We have used “EMAIL” as the specific text to capture all the lines until line 14.

The Line Count option sets the minimum number of lines in the region. ReportMiner starts looking for the specific text or regular expression only after that many lines (the Line Count) from the line where the pattern is matched. If the specific text or regular expression is not found in the document, Line Count takes precedence and determines the end-point of the region.
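The interaction between Line Count and the specific text can be easier to follow as a small simulation. The Python sketch below is not ReportMiner code; it only mimics the behavior described above under stated assumptions: the search for the end text begins on the line after the first Line Count lines, the line containing the end text is included in the region, and the function name `find_region` and the sample lines are hypothetical.

```python
from typing import List, Optional, Tuple


def find_region(lines: List[str], pattern: str, end_text: str,
                line_count: int = 1) -> Optional[Tuple[int, int]]:
    """Return 0-based (start, end) indices of the region, end inclusive.

    Assumes line_count >= 1. This is an illustrative model, not the
    actual ReportMiner implementation.
    """
    # The region starts on the first line that contains the pattern.
    start = next((i for i, line in enumerate(lines) if pattern in line), None)
    if start is None:
        return None

    # Line Count guarantees a minimum region size.
    min_end = min(start + line_count - 1, len(lines) - 1)

    # Assumption: the end text is searched for only after those lines.
    for i in range(start + line_count, len(lines)):
        if end_text in lines[i]:
            return start, i

    # End text never found: Line Count alone decides where the region ends.
    return start, min_end


# Hypothetical sample data, loosely modeled on the invoice in this article.
sample = [
    "GLOBAL CARS INVOICE",
    "DEALER NAME: Global Cars",
    "ADDRESS: 12 Example Street",
    "PHONE: 555-0100",
    "EMAIL: sales@example.com",
    "VEHICLE   PRICE",
]

print(find_region(sample, "DEALER NAME", "EMAIL", line_count=1))  # (1, 4)
print(find_region(sample, "DEALER NAME", "FAX", line_count=2))    # (1, 2)
```

In the second call the end text "FAX" never appears, so the region falls back to the two lines guaranteed by the Line Count.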

5. Now, the entire data region, from the line where the pattern is matched to the line containing the specific text, has been captured. It is highlighted in grey.

07-complete-region

6. Alternatively, you can specify a regular expression to define the end type of the region. Here, we have selected Till Regular Expression from the Region End Type drop-down menu and specified a regular expression that matches the format of an email address.

08-regex-region
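To illustrate why a regular expression can be the better end marker, here is a minimal Python sketch that finds the first line containing something shaped like an email address. The expression is deliberately simple and is an assumption, not the one used in the screenshot; the regular-expression dialect that ReportMiner accepts may also differ from Python's `re` module. The helper name `first_email_line` and the sample lines are hypothetical.

```python
import re
from typing import List, Optional

# Simple email shape: local part, "@", then a domain with at least one dot.
# This is a loose format check, not a complete email validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def first_email_line(lines: List[str]) -> Optional[int]:
    """Return the 0-based index of the first line containing an email."""
    return next((i for i, line in enumerate(lines) if EMAIL_RE.search(line)), None)


# Hypothetical data in which the label is "CONTACT" rather than "EMAIL".
sample = [
    "DEALER NAME: Global Cars",
    "ADDRESS: 12 Example Street",
    "PHONE: 555-0100",
    "CONTACT: sales@example.com",
    "VEHICLE   PRICE",
]

print(first_email_line(sample))                # 3
print(any("EMAIL" in line for line in sample)) # False
```

Because the expression matches the format of the value rather than a fixed label, it still finds the end line when the label text varies between documents, which is a case where a specific text such as "EMAIL" would not match.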

7. Now, create data fields in this region to capture all the required information.

09-add-fields

8. You have successfully created a data region and extracted relevant data fields containing the dealer’s information.

10-model-layout

This is how you can capture data regions using the Till Specific Text and Till Regular Expression options as the Region End Type.