Astera ReportMiner's auto-parsing feature takes an input stream of date, name, and address data and breaks it into its component elements: hour, day, and month for dates; suffix, first name, and last name for names; and city, state, and zip code for addresses. These elements are returned as parsed output.
In this document, we will learn how to automatically parse name and address data fields using the auto-parsing feature.
Sample Use Case
In this case, we have some unstructured data stored in a PDF file.
Download the sample PDF file from here.
This file is a customer list report that contains information such as full contact name, full address and account details of the customers.
In this file, the Account field contains the full name along with a title. We want to extract this information and parse it into Suffix, First Name, Last Name, etc., and to parse the full address into City, Street, Zip Code, etc.
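To make the goal concrete, here is a minimal Python sketch of what splitting a full name and a US address into components could look like. It is not how ReportMiner implements auto-parsing; the title and suffix lists, the regular expression, and the `NameMiddle`, `NameSuffix`, and `AddressStreet` keys are assumptions made for illustration, and the sample values are made up.

```python
import re

# Hypothetical lookup sets; a real parser would recognise many more values.
PREFIXES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

# Assumed layout of the last address line: "City, ST 12345" or "City, ST 12345-6789".
CITY_STATE_ZIP = re.compile(
    r"^\s*(?P<city>[^,]+?),\s*(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def parse_name(full_name):
    """Split a full name into prefix, first, middle, last, and suffix parts."""
    tokens = full_name.replace(",", " ").split()
    parts = {"NamePrefix": "", "NameFirst": "", "NameMiddle": "",
             "NameLast": "", "NameSuffix": ""}
    # A leading title such as "Mr." or "Dr" becomes the prefix.
    if tokens and tokens[0].rstrip(".").lower() in PREFIXES:
        parts["NamePrefix"] = tokens.pop(0)
    # A trailing token such as "Jr." or "III" becomes the suffix.
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        parts["NameSuffix"] = tokens.pop()
    if tokens:
        parts["NameFirst"] = tokens.pop(0)
    if tokens:
        parts["NameLast"] = tokens.pop()
    # Whatever remains between first and last name is treated as the middle name.
    parts["NameMiddle"] = " ".join(tokens)
    return parts


def parse_us_address(street_line, city_line):
    """Split a two-line US address into street, city, state, and zip code."""
    parts = {"AddressStreet": street_line.strip(), "AddressCity": "",
             "AddressState": "", "AddressZipcode": ""}
    match = CITY_STATE_ZIP.match(city_line)
    if match:
        parts["AddressCity"] = match.group("city").strip()
        parts["AddressState"] = match.group("state").upper()
        parts["AddressZipcode"] = match.group("zip")
    return parts


# Made-up sample values, not taken from the sample PDF.
print(parse_name("Mr. John A. Smith Jr."))
print(parse_us_address("123 Main Street", "Springfield, IL 62704"))
```

A rule-based split like this breaks down quickly on real data (multi-word last names, missing commas, and so on), which is the kind of work the built-in name and address fields are meant to take off your hands.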
Creating a Report Model
1. Load the unstructured source file in ReportMiner’s designer.
2. Add a data region. Specify the pattern by typing “ACCOUNT:” in the pattern bar and increase the Line Count to 4.
3. Highlight the text after “ACCOUNT:” in the data region. Right-click on it and select Add Name Field from the context menu.
Observe that the parsed name field components such as NamePrefix, NameFirst, NameLast etc., have been added to the report model under the Model Layout panel.
4. Highlight the text after “CONTACT” in the second line of the data region. Right-click on it and select Add Data Field from the context menu. Rename the field to Contact.
5. Highlight the last two lines of the data region. Right-click on it and select Add Address Field (US) from the context menu.
You can see that this has automatically parsed the address information into different components such as AddressState, AddressCity, AddressZipcode, etc.
Preview the extracted data to make sure that all the fields and their data are in place.
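The report model built in the steps above can be pictured, very roughly, as the following Python sketch: a data region starts at every line containing the pattern “ACCOUNT:” and spans four lines, and the fields are then read from fixed positions inside that region. This only illustrates the idea; the function names, the substring-based pattern match, and the sample text are assumptions, not ReportMiner's actual behaviour or file layout.

```python
def find_data_regions(lines, pattern="ACCOUNT:", line_count=4):
    """Return blocks of `line_count` lines, each starting at a line containing `pattern`."""
    regions = []
    for i, line in enumerate(lines):
        if pattern in line:
            regions.append(lines[i:i + line_count])
    return regions


def extract_records(lines):
    """Read the Account, Contact, and address lines from each data region."""
    records = []
    for region in find_data_regions(lines):
        if len(region) < 4:
            continue  # skip a region that is cut off at the end of the file
        records.append({
            # Text after "ACCOUNT:" on the first line (the name field).
            "Account": region[0].split("ACCOUNT:", 1)[1].strip(),
            # Text after "CONTACT:" on the second line (the Contact data field).
            "Contact": region[1].split("CONTACT:", 1)[-1].strip(),
            # Last two lines of the region (the US address field).
            "AddressLines": [text.strip() for text in region[2:4]],
        })
    return records


# Made-up sample text; the layout of the real report may differ.
sample = """\
ACCOUNT: Mr. John Smith
CONTACT: Jane Doe
123 Main Street
Springfield, IL 62704
ACCOUNT: Dr. Mary Jones
CONTACT: Sam Lee
45 Oak Avenue
Austin, TX 78701
""".splitlines()

for record in extract_records(sample):
    print(record)
```

Printing the records this way plays the same role as the preview step: it is a quick check that every region produced the expected fields before the data is used anywhere else.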
The source file for this use case is attached below so that you can test the auto-parsing feature yourself.