The Line Count option in ReportMiner lets users specify the number of lines that a data region spans. This feature is useful for transposing data that appears in rows in an unstructured file into vertical fields (columns) inside a report model.
In this document, we will explore how the Line Count feature helps with the selection of a data region in Astera ReportMiner.
In this case, we have unstructured data in a PDF file.
Download the sample PDF file from here.
This file contains a customer list report including their account name, contact and address details.
Observe that a single record spans 4 lines in this PDF file. To capture this data and place it into different fields, we will use the Line Count feature.
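The idea behind Line Count can be pictured with a short Python sketch. This is not ReportMiner code or its API; it only mimics the behavior described here, assuming that a data region starts at every line containing the pattern and extends for a fixed number of lines. The function name `find_data_regions` and the sample text are invented for illustration, and the real PDF layout may differ.

```python
def find_data_regions(lines, pattern, line_count):
    """Return one region (a list of lines) for each line that contains `pattern`.

    Hypothetical helper: imitates a pattern match combined with a Line Count.
    """
    regions = []
    for i, line in enumerate(lines):
        if pattern in line:
            # The region starts at the matching line and spans `line_count` lines.
            regions.append(lines[i:i + line_count])
    return regions


# Invented sample text, not the contents of the sample PDF.
sample = """\
ACCOUNT: Example Co
CONTACT: Jane Doe
ADDRESS: 1 Sample Street
         Sampletown
ACCOUNT: Demo Ltd
CONTACT: John Roe
ADDRESS: 2 Test Avenue
         Testville
""".splitlines()

one_line = find_data_regions(sample, "ACCOUNT", 1)    # only the ACCOUNT line per record
four_lines = find_data_regions(sample, "ACCOUNT", 4)  # the whole 4-line record
```

With a line count of 1, each region holds only the line that matched; with a line count of 4, each region holds the complete record.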
First, let’s load this unstructured file into ReportMiner’s designer.
Go to File > New > Report Model. A Report Options window opens. Here, provide the File Path for the unstructured PDF file by clicking on the folder icon.
The Report Options window offers many options to configure how ReportMiner reads the unstructured PDF file, such as the scaling factor, font, tab size and password.
You can read about these options here.
Click OK and the file will open in the report model designer.
Now that the file is open in ReportMiner, we will create an extraction template.
Creating a Report Model
1. Right-click on the Record node under the Model Layout panel. Select Add Data Region from the context menu.
A pattern box, a properties panel and a data node are added to the Report Model screen.
2. Specify a pattern that ReportMiner can match against your file to capture data. You can use a letter, character, number, word, wild card or any combination of these to define your pattern.
In this case, write “ACCOUNT” in the pattern box to match it on the file as shown below.
Observe that the matching pattern captures only one line of each record, which holds a single field. To capture the data spanning 4 lines, we must increase the Line Count value.
3. Increase the Line Count value to 4.
Each data region block contains information of a single record. Let’s create data fields.
4. Highlight the data region after “ACCOUNT:”, right-click on it and select Add Data Field from the context menu. Rename it to Account.
Repeat the process to create more data fields and name them as shown below. You can see the layout of the extraction template in the Model Layout panel.
5. Preview the data by clicking the Preview Data icon in the toolbar at the top of the designer window.
A Data Preview window will open displaying a preview of the extracted data.
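Steps 4 and 5 can also be pictured in Python. The sketch below is an illustration rather than ReportMiner's implementation: it assumes that a data field is defined by a line offset inside the region, a start column and a length, which is one plausible way to model highlighting a span of text. The `DataField` type, the `extract_record` helper, the field names other than Account, and the sample region are all hypothetical.

```python
from typing import NamedTuple


class DataField(NamedTuple):
    """Hypothetical model of a data field highlighted inside a data region."""
    name: str
    line: int    # 0-based line offset inside the region
    start: int   # 0-based start column
    length: int  # number of characters to capture


def extract_record(region, fields):
    """Turn one multi-line region into a single row of named values."""
    record = {}
    for field in fields:
        # Treat a missing line as empty text instead of raising an error.
        text = region[field.line] if field.line < len(region) else ""
        record[field.name] = text[field.start:field.start + field.length].strip()
    return record


# Invented 4-line region, not taken from the sample PDF.
region = [
    "ACCOUNT: Example Co",
    "CONTACT: Jane Doe",
    "ADDRESS: 1 Sample Street",
    "         Sampletown",
]

# Each field starts right after the 9-character label on its line.
fields = [
    DataField("Account", 0, 9, 30),
    DataField("Contact", 1, 9, 30),
    DataField("Street", 2, 9, 30),
    DataField("City", 3, 9, 30),
]

record = extract_record(region, fields)
print(record)
```

Each region yields one row, so the values that sat on four separate lines in the source end up side by side as columns, which is what the Data Preview window displays.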
The source file for the use case discussed above is attached below.