Creating a Regex Trap

Creating a regex trap begins the same way as creating any other trap type: you look for characteristics that distinguish some sample text, and all other similar instances, from the other lines in a report. A trap is then built to describe these characteristics. You can instruct Data Prep Studio to recognize a regex trap by selecting Regex from the Trap Type drop-down in the Report Design window.

While using regex traps requires some knowledge of how the regex engine of the .NET Framework works, these traps offer a high degree of flexibility when creating templates because they can allow for a variable number of spaces between fields of interest. With a standard trap, you might have to specify six blank trap characters to indicate six spaces, and the template then cannot be used when five or seven spaces come between the two fields of interest. In contrast, \s* in a regex trap matches any number of whitespace characters (including none) between the two fields.
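The effect of \s* can be checked outside Data Prep Studio. The following is a minimal sketch in Python, whose re module is used here only as a stand-in for the .NET engine; \s* behaves the same way in both, but Python writes a named group as (?P<name>...) where .NET uses (?<name>...). The report lines are hypothetical and are not taken from an actual report.

```python
import re

# Python stand-in for the .NET pattern CUSTOMER:\s*(?<customer>.*)
# (Python requires the (?P<name>...) spelling for named groups).
trap = re.compile(r"CUSTOMER:\s*(?P<customer>.*)")

# Hypothetical report lines with five, six, and seven spaces after the label.
lines = ["CUSTOMER:" + " " * n + "Betty's Music Store" for n in (5, 6, 7)]

# The same pattern matches every line, regardless of the number of spaces.
captures = [trap.match(line).group("customer") for line in lines]
print(captures)
```

All three lines yield the same capture, which is the behavior a fixed-width standard trap cannot provide.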

Regex traps may be built in a number of ways and yield similar results. The versatility of these traps lies in the fact that changing the elements used to create the trap changes the data the trap will pick up.

Consider the following regex traps that may be used to trap the Customer field of Classic.prn:

CUSTOMER:\s*(?<customer>[A-Z].*)

CUSTOMER:\s*(?<customer>.*)

While both traps capture the text following the CUSTOMER: label in the report, the first trap picks up only text that begins with a letter from A to Z, while the second trap picks up any text, whatever character it begins with. Thus, the first trap will not capture 1999 Hot Spot as a Customer name but the second trap will. Both traps capture Betty's Music Store as a Customer name.
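The difference between the two traps can be reproduced with a short script. This is a sketch only: it uses Python's re module rather than the .NET engine, so the named group is written (?P<customer>...) instead of (?<customer>...), and the sample lines are assumed for illustration rather than copied from Classic.prn. The capture helper is a hypothetical convenience function, not part of Data Prep Studio.

```python
import re

# Python equivalents of the two traps above.
letters_only = re.compile(r"CUSTOMER:\s*(?P<customer>[A-Z].*)")
any_text = re.compile(r"CUSTOMER:\s*(?P<customer>.*)")

# Assumed sample lines; the real layout of Classic.prn may differ.
sample_lines = [
    "CUSTOMER: Betty's Music Store",
    "CUSTOMER: 1999 Hot Spot",
]

def capture(trap, line):
    """Return the text captured by the 'customer' group, or None if the trap does not match."""
    match = trap.search(line)
    return match.group("customer") if match else None

# Map each line to (capture from first trap, capture from second trap).
results = {
    line: (capture(letters_only, line), capture(any_text, line))
    for line in sample_lines
}

for line, (first, second) in results.items():
    print(f"{line!r}: first trap -> {first!r}, second trap -> {second!r}")
```

The first trap returns no capture for the line that begins with a digit, while the second trap returns a capture for both lines.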

Once the trap has been created, click the Accept icon to apply this trap to the report.

The fields identified by your trap are automatically highlighted. To add a field to your table, highlight the field in the sample line, right-click it, and then select Create Field from this Capture > <Field Name>.

More information regarding regular expressions may be found at the following links:

https://msdn.microsoft.com/en-us/library/hs600312(v=vs.110).aspx

http://www.regular-expressions.info/

http://regexlib.com/