Delimited File Source
Delimited File
The Delimited File Source object reads data from a delimited file. Delimited files are among the most commonly used data sources and appear in a wide variety of situations. The Centerprise Delimited File Source offers extensive functionality; the following subsections discuss some of its key features.
Encodings and Character Sets
Centerprise supports a wide range of encodings and character sets, including double-byte and multibyte character sets.
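The following Python sketch is not Centerprise code; it only illustrates what reading a delimited file in a double-byte encoding involves. The function name `read_delimited` and the sample data are assumptions made for the example.

```python
import csv
import io

def read_delimited(raw: bytes, encoding: str, delimiter: str = ","):
    """Decode raw file bytes with the given encoding, then split into rows."""
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))

# Sample data encoded in Shift JIS, a double-byte character set.
sample = "id,name\n1,東京\n".encode("shift_jis")
rows = read_delimited(sample, "shift_jis")
```

Decoding the same bytes with the wrong encoding would either raise an error or silently produce garbled text, which is why the encoding has to be stated for the source.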
Layout Discrepancies
In many instances, files do not conform to the agreed-upon layout. Centerprise provides the functionality to handle these inconsistencies, which include extra fields, out-of-sequence fields, and inconsistent field headers.
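One common way to tolerate such discrepancies is to match fields by normalized header name rather than by position. The Python sketch below illustrates that idea only; it is not how Centerprise implements it, and `EXPECTED`, `normalize`, and `read_by_header` are hypothetical names.

```python
import csv
import io

EXPECTED = ["id", "name", "amount"]  # hypothetical agreed-upon layout

def normalize(header: str) -> str:
    """Make header comparison tolerant of case, spaces, and underscores."""
    return header.strip().lower().replace(" ", "").replace("_", "")

def read_by_header(text: str, expected=EXPECTED):
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    # Map each normalized header found in the file to its column position.
    position = {normalize(h): i for i, h in enumerate(header)}
    records = []
    for row in reader:
        record = {}
        for field in expected:
            idx = position.get(normalize(field))
            # Missing columns or short rows yield None; extra columns are ignored.
            record[field] = row[idx] if idx is not None and idx < len(row) else None
        records.append(record)
    return records

# Fields are out of sequence, headers are inconsistently written,
# and there is one extra column.
sample = "Amount, Name ,ID,comment\n9.50,Widget,1,extra column\n"
records = read_by_header(sample)
```

Here the record comes back in the expected field order even though the file lists the columns differently and carries an unexpected `comment` column.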
Data Formats
With Centerprise, you can parse dates, numbers, and Boolean values in virtually any format. Centerprise comes with a set of built-in formats for dates, numbers, and Boolean values and lets you define additional formats.
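As a rough illustration of format-driven parsing, the Python sketch below tries a list of candidate formats for each value. The format lists and function names are assumptions for the example and do not reflect the formats that ship with Centerprise.

```python
from datetime import date, datetime

DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]  # illustrative only
TRUE_VALUES = {"true", "yes", "y", "1"}
FALSE_VALUES = {"false", "no", "n", "0"}

def parse_date(value: str, formats=DATE_FORMATS) -> date:
    """Return the date from the first format that matches."""
    for fmt in formats:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")

def parse_number(value: str, thousands: str = ",", decimal: str = ".") -> float:
    """Strip the thousands separator and normalize the decimal mark."""
    cleaned = value.strip().replace(thousands, "").replace(decimal, ".")
    return float(cleaned)

def parse_bool(value: str) -> bool:
    token = value.strip().lower()
    if token in TRUE_VALUES:
        return True
    if token in FALSE_VALUES:
        return False
    raise ValueError(f"unrecognized Boolean: {value!r}")
```

For example, `parse_number("1.234,50", thousands=".", decimal=",")` handles a European-style number, and adding a new date format is a matter of appending another pattern to the list.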
Filtering
Often, files contain rows that must be skipped. With Centerprise, you can specify filter criteria that are evaluated before parsing. This way, if the file contains multiple types of rows, Centerprise processes only the rows that meet your criteria.
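The idea of filtering before parsing can be sketched in Python as follows. This is an illustration under stated assumptions, not Centerprise's implementation: `filtered_rows` is a hypothetical helper, and the sample assumes each record sits on one line with a record-type code in the first field.

```python
import csv

def filtered_rows(lines, keep, delimiter=","):
    """Yield parsed rows only for raw lines that pass the `keep` predicate."""
    wanted = (line for line in lines if keep(line))
    yield from csv.reader(wanted, delimiter=delimiter)

sample = [
    "# exported 2024-01-01",   # comment line
    "H,batch-7",               # header record
    "D,1,Widget",              # detail records
    "D,2,Gadget",
    "T,2",                     # trailer record
]
detail_rows = list(filtered_rows(sample, keep=lambda line: line.startswith("D,")))
```

Because the predicate runs on the raw line, rejected rows are never split into fields. The trade-off is that this simple version would not cope with quoted fields that contain line breaks.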
Hierarchical Files
Businesses often exchange complex transactions in the form of hierarchical delimited files. The Centerprise Delimited File Source supports such files: you can define a hierarchical file layout and process the data file as a hierarchical file. The Centerprise IDE provides extensive user interface capabilities for processing hierarchical structures.
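To make the notion of a hierarchical delimited file concrete, the Python sketch below nests child records under the most recent parent record. The record tags `ORD` and `ITM`, the sample data, and the function `parse_hierarchy` are all hypothetical; Centerprise defines hierarchies through its layout editor rather than through code like this.

```python
import csv

def parse_hierarchy(lines, parent_tag="ORD", child_tag="ITM"):
    """Group child rows under the parent row that precedes them."""
    parents = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        tag, fields = row[0], row[1:]
        if tag == parent_tag:
            parents.append({"fields": fields, "children": []})
        elif tag == child_tag:
            if not parents:
                raise ValueError("child record found before any parent record")
            parents[-1]["children"].append(fields)
    return parents

sample = [
    "ORD,1001,Acme",
    "ITM,Widget,2",
    "ITM,Gadget,1",
    "ORD,1002,Globex",
    "ITM,Sprocket,5",
]
orders = parse_hierarchy(sample)
```

Each entry in `orders` holds the order's own fields plus a list of its line items, which is the parent-child shape a hierarchical layout describes.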
File Partitioning
On multicore machines, you can achieve a major performance increase by partitioning a delimited file into multiple chunks and processing these chunks in parallel. When you specify multiple partitions, Centerprise creates multiple readers that read and parse the source file in parallel, taking advantage of multiprocessor hardware to deliver greatly improved throughput.
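The partitioning technique can be sketched in Python as below: split the file into byte ranges that fall on line boundaries, then hand each range to its own reader. This is a minimal sketch and not Centerprise's reader; the function names are hypothetical, the file is assumed to be UTF-8 with one record per line (no line breaks inside quoted fields), and threads are used only to keep the example short.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def partition_offsets(path, parts):
    """Return (start, end) byte ranges that begin and end on line boundaries."""
    size = os.path.getsize(path)
    cuts = [0]
    with open(path, "rb") as f:
        for i in range(1, parts):
            f.seek(size * i // parts)
            f.readline()  # move forward to the start of the next full line
            cuts.append(min(f.tell(), size))
    cuts.append(size)
    # Drop empty ranges, which occur when the file has fewer lines than parts.
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if a < b]

def read_partition(path, start, end, delimiter=","):
    """Read one byte range and split its lines into fields."""
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(end - start)
    return [line.split(delimiter) for line in data.decode("utf-8").splitlines()]

def read_parallel(path, parts=4):
    ranges = partition_offsets(path, parts)
    with ThreadPoolExecutor(max_workers=max(1, len(ranges))) as pool:
        # map() returns results in partition order, so row order is preserved.
        chunks = pool.map(lambda r: read_partition(path, *r), ranges)
        return [row for chunk in chunks for row in chunk]

# Build a small sample file, read it in four partitions, then clean up.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write("".join(f"{i},item{i}\n" for i in range(100)))
rows = read_parallel(tmp.name, parts=4)
os.remove(tmp.name)
```

In CPython, threads mainly overlap I/O; parsing that is CPU-bound would need separate processes to use several cores, which is the kind of gain the paragraph above describes.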
Automatic Layout Building
Centerprise can build a file layout automatically by reading a sample data file. This feature determines data types correctly most of the time, and you can manually change any data type in the layout grid.
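A simple form of type inference from sample data might look like the Python sketch below. The type names, the single date format, and the functions `infer_type` and `build_layout` are assumptions for illustration; the actual rules Centerprise applies are not documented here.

```python
import csv
import io
from datetime import datetime

def infer_type(values):
    """Pick the narrowest type that fits every non-empty sample value."""
    samples = [v.strip() for v in values if v.strip()]
    if not samples:
        return "string"

    def all_parse(parser):
        for v in samples:
            try:
                parser(v)
            except ValueError:
                return False
        return True

    if all(v.lower() in {"true", "false"} for v in samples):
        return "boolean"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "real"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "string"

def build_layout(text, delimiter=","):
    """Return (field name, inferred type) pairs from a headed sample."""
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    layout = []
    for i, name in enumerate(header):
        column = [row[i] for row in data if i < len(row)]
        layout.append((name, infer_type(column)))
    return layout

sample = (
    "id,name,price,active,created\n"
    "1,Widget,9.50,true,2024-01-15\n"
    "2,Gadget,12,false,2024-02-01\n"
)
layout = build_layout(sample)
```

A heuristic like this also shows why a manual override is useful: a postal code column containing `00123` would be inferred as an integer, although it is better treated as a string.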
Often, layout specifications are defined in Excel files that contain field names, data types, start positions, and lengths. You can import these specifications into Centerprise and quickly build a fixed-length layout from them.
Steps
Adding a Delimited File Source object to a dataflow allows you to transfer data from a delimited file. An example of what a Delimited File Source object looks like is shown below.
To configure the properties of a Delimited File Source object after it has been added to the dataflow, right-click it and select Properties from the context menu.